m@ksim.pro
Blog

Notes on data, AI, IT and security

No marketing fog. Just the way I work through real problems with founders and managers.

Data

An ETL pipeline is a production line - monitor it accordingly

Why ETL failures are an operational incident, not a technical glitch, and how to build visibility into data flows.

Data

Why feature engineering still matters in the deep learning era

Deep learning automates feature extraction - but it does not remove the need to think carefully about what data you feed into the model.

Data

Who owns data quality in a company that is not a data company

Data quality problems are common. Accountability for them is rare. A look at how to assign ownership without creating a bureaucratic layer that nobody uses.

Data

Streaming data: when you need it and when batch is enough

How to decide whether your company needs streaming data processing, or whether that is unnecessary complexity for tasks that batch loading handles perfectly well.

Data

Data warehouse or data lake: how to make the right call

A breakdown of two architectural approaches to corporate data storage and the criteria that actually matter for mid-size companies.

Data

Five years of big data: what survived and what did not

A retrospective look at the big data wave: which promises were realised, which turned out to be hype, and what from that period is still worth applying today.

Data

A single source of truth for operational reporting

Why most companies lack a single authoritative number, and what it takes to create one - without a large IT project.

Data

A data pipeline is a production system, not a script

Why companies lose trust in their analytics when they treat data pipelines as one-off tasks rather than operated systems.

Data

Data quality: four metrics that are worth tracking in practice

Most data quality programs stall because the metrics are too abstract. Here are four concrete measurements that surface problems early and connect to business outcomes.

Data

PostgreSQL as your main database: what changed for business

Why PostgreSQL stopped being a niche choice and what to verify before making it the foundation of a corporate architecture.

Data

Data lake: questions to ask before you start building

Why data lake projects often turn into data swamps, and what founders and managers should ask before committing budget.

Data

Real-time data: when the business actually needs it, and when it is over-engineered

How to tell which business tasks genuinely need stream processing, and which ones work perfectly well with regular batch updates.
