Notes on data, AI, IT and security
No marketing fog. How I work through real problems together with founders and managers.
Three months of remote work revealed the data gaps
Moving to remote work showed where data in companies lives in people's heads rather than in systems. How to read these signals and what to do with them.
Your data stack in a crisis: what to cut, what to hold
When teams shrink and projects freeze, you still need data. How to prioritise your data infrastructure during an unstable period.
Data producer-consumer contracts: the unglamorous fix that works
Why data pipelines break most often at team boundaries, and how formalising the contract between producers and consumers prevents the majority of those breaks.
Customer data in a crisis: how companies lose the signal
Why customer behaviour analytics stops working during a sharp change in context - and what to do about it.
Data backlog triage when the team just went remote
How to decide in the first weeks of a distributed sprint what data work to continue, pause, and drop - without a formal process to fall back on.
A single source of truth in reporting: why it is so hard to create
Why numbers diverge across different reports in the same company, and what actually fixes it.
Event-driven architecture and data contracts: why data is no longer a by-product
Moving from synchronous integrations to event-driven architecture changes how data is treated. Data contracts become a first-class engineering artefact.
Data as a product: why you cannot put one team in charge of all the data
When analytics stops working, the problem is usually not the tools. How to distribute data responsibility across teams.
Why ML teams keep rewriting the same thing over and over
Feature stores and feature management in machine learning: where the duplication comes from and how to get rid of it.
Stream processing is an operations question, not an architecture question
Why the decision to move to streaming should start with understanding operational load, not with choosing a technology.
A data lake without governance is a swamp, not an asset
Why a data lake without access policies and governance turns into unmanageable storage that nobody can trust.
Data ownership matters more than a data platform
Why companies buy expensive data platforms and end up with the same problems - and what needs to be resolved before choosing a tool.