Notes on data, AI, IT, and security
No marketing fog. How I work through real problems together with founders and managers.
The critical data map: which datasets actually hold an operating company together
Management does not need all data - it needs a list of the information pillars of the business. How to find those pillars and what to do with them.
From logs to metrics and back: building observability without noise
Not all events are equally useful. How to design system observability so it produces signal rather than just another stream of garbage.
Near real-time: when a business actually needs streaming, and when 15 minutes is fine
Not every online analytics setup makes economic sense. A look at when streaming data is justified and when it is an expensive illusion of speed.
Amazon Redshift and the new economics of data warehousing
Cloud MPP warehousing changes not just the stack but the psychology of a pilot project. Why the right question is no longer 'can we afford this' but 'where do we start'.
Raw data layer before data lake: when it makes sense and how to keep it from becoming a swamp
Storing everything in one place is not a strategy. Without catalogs, owners, and metadata you do not get a lake - you get a swamp.
Columnar storage and a new pace of analytics: why a months-long data warehouse build is already strange
When analytics can be up and running in days rather than months, both business expectations and the right way to design storage change.
You can no longer store everything: the economics of archives, logs, and history layers
Cheap disk lowers the cost of writing data, but not the cost of searching it, maintaining it, or understanding what is actually inside.
Predictive maintenance without the hype: start with failure history, not neural networks
Repair logs and the cost of downtime matter more than fashionable algorithms. Why failure data is the real first step.
Excel as shadow IT: banning it is pointless, rethinking the process is not
Spreadsheets live where corporate systems are too slow or too rigid. Here is what to do about that.
Data lineage: where the number in the report came from, and who owns it
Tracing metrics back to their source is not a technical nicety - it is the foundation of trust in analytics.
Text analytics without a silver bullet: where the real value is in reviews and tickets
Why processing customer text starts not with understanding language, but with routing messages and classifying the reasons behind them.
Telemetry architecture: collecting sensor data so it is still useful in three years
Why sensor data needs to be designed as a long-term asset, not launched as a one-off pilot that cannot be reused later.