Notes on data, AI, IT and security
No marketing fog. This is how I think through real problems with founders and managers.
Gradient boosting: the machine learning that already works in production
Why ensemble methods - random forests and gradient boosting - became the first real ML for business, and how a manager should think about them.
Machine translation is improving, but the enterprise gap remains
Why impressive results in neural translation do not mean a company can remove translators from its workflows.
Deep learning: what is behind the hype and what is not ready yet
What the current wave of interest in neural networks means for companies that do not have a research lab.
Recommendation systems: what they need before they work
What a recommendation system actually requires to function, and why most projects stumble before they ever reach the algorithm.
Neural translation is entering product territory
What changed in machine translation in 2014 and why it matters for companies dealing with large volumes of text.
Text analysis becomes practical: what it means for business
Natural language processing tools have reached the point where they can be used without a research lab. What to do with that.
Machine learning for mid-size business: what is real, what is not
An honest look at which problems machine learning actually solves for companies without research labs, and which ones remain academic.
ML in fraud detection: where AI saves money and where it only complicates the investigation
A look at the decision loop and the explainability problem in machine-learning-based anti-fraud systems.
Model quality vs data freshness: what matters more in applied analytics
In most real tasks, the winner is not a smarter model but a better-managed operational loop that keeps data current.
Reinforcement learning and Atari: not about games, but about a class of problems
Why DeepMind's results on a games console matter not for entertainment, but as a signal about a whole class of optimization problems in business.
Human in the loop: why automated decisions increasingly need a person nearby
The higher the risk of a decision, the more important deliberate semi-automation becomes. Full system autonomy and full manual control are both losing strategies.
word2vec and the new semantics of search: why text collections will soon think in proximity
Vector word representations open a new level of search, matching, and recommendations. What this means for companies that hold large text collections.