m@ksim.pro
Blog

Notes on data, AI, IT and security

No marketing fog. The way I think about real problems with founders and managers.

Data

A data warehouse without a data team

How a small company can build a manageable data warehouse without hiring a BI department or buying an expensive platform.

Data

Is MapReduce enough: where the batch model starts hurting business

Some business scenarios already demand a different computation tempo. Not because batch is bad, but because the latency has become too expensive.

Data

Spark, Hadoop, or MPP: choose the workload type, not the brand

Different tasks need different computation and storage modes. Getting the platform wrong costs more than it appears at the start.

Data

Technical debt in data pipelines: why "we will rewrite it later" almost never happens

Data pipelines age faster than they appear to, especially when nobody owns the schema. Deferring the refactor has a concrete cost.

Data

KPI worth fighting for: how to tell signal from dashboard decoration

A KPI only matters if it can change a decision. Everything else is decoration on a dashboard.

Data

The event log as source of truth: why event-driven beats point-to-point integration

How shifting from bilateral API calls to a shared event journal simplifies system coupling and removes the fragility of double-writes.

Data

Experiments and A/B thinking: what digital products can teach everyone else

Not every decision needs a year-long project. Some of them should be tested with fast, cheap experiments - even in non-digital environments.

Data

Data scientist, analyst, engineer: it is time to tell the roles apart

Why 'data person' is no longer a precise enough role, and how blurred expectations sink teams before work even begins.

Data

ODS, data marts, DWH: why you cannot build an analytics landscape with one abbreviation

Each layer of an analytics architecture solves a different problem, operates at a different speed, and needs its own owner. Mixing the layers is the source of most analytics problems.

Data

Streaming vs batch: when real-time is justified and when nightly batch is the honest choice

Before choosing a data processing architecture, it is worth honestly calculating the cost of latency - and finding out whether the business actually pays for it.

Data

Sales, service, logistics: where predictive analytics earns its keep

An accurate model is only the beginning. Profit appears where there is an action scenario and a threshold that triggers it.

Data

Processing speed as an argument: what faster data computation changes

Data engineering is finding a new balance between batch and faster processing, and that changes which problems become solvable at all.
