m@ksim.pro
Data 3 min read

Data scientist, analyst, engineer: it is time to tell the roles apart

Why 'data person' is no longer a precise enough role, and how blurred expectations sink teams before work even begins.

When a company decides to get serious about data, the first hiring request almost always sounds the same: "we need a data person." Sometimes the title is analyst, sometimes data scientist, sometimes just "BI specialist." The assumption is that one person will do everything - set up collection, build models, create dashboards, and explain the results to the business.

That does not work. And the longer a company clings to this arrangement, the more it costs.

Three different professions

Over the past few years it has become clear that "working with data" has split into three fundamentally different occupations.

A data engineer builds and maintains infrastructure: pipelines, warehouses, integrations. The deliverable is a reliable data flow that runs every day without manual intervention. It is systematic engineering work, close to software development.

A data analyst is responsible for interpretation. They take data that already sits in a convenient place, build reports, look for patterns, and form conclusions for decision-making. That includes being able to say where a number in a report came from and who owns it. The work requires understanding business context and the ability to speak the language of executives and managers.

A data scientist is a separate specialty. They build predictive models, work with statistics, and develop algorithms. It is closer to research, requires a solid mathematical foundation, and demands comfort with uncertainty.

There is overlap between these roles, but they have different centers of gravity, different skills, and different working rhythms.

What happens when roles are blended

When one person is required to do all three, one of two things usually follows. Either they gravitate toward what they are naturally good at, and everything else degrades. Or they spread their time evenly across everything and do none of it well enough.

Infrastructure built by an analyst "on the side" is typically unreliable and does not scale. Analytics written by an engineer in their spare time tends to be technically clean but disconnected from real business questions. Models built without clean data produce unpredictable results - and dirty master data breaks any BI long before you get to the model.

The company spends resources, gets none of the components in a working state, and a year later asks again why "data isn't working."

Why this is a systematic mistake

The confusion has an understandable origin. A few years ago the field was genuinely small, and individuals did do everything. The term "data scientist" emerged precisely to describe the rare person who combined engineering, statistics, and business understanding. Such people are few, they are expensive, and the market quickly started using the label to cover a much broader set of job descriptions.

Now that data volumes are larger, the tooling is richer, and the stakes are higher, that universality stops being an asset. It becomes a risk.

A practical guide

Before hiring or forming a data team, it is worth asking a few specific questions:

  1. What data do we already have, and how reliably does it land in the system every day?
  2. What decisions do we want to make with data - and who will make them?
  3. Do we have a task that requires a predictive model, or is good analytics enough?
  4. Who in the team will be the end user of the result, and what language do they speak?

The answers almost always show who to hire first. More often than not it is an engineer or an analyst - not a data scientist. A data scientist makes sense when there is already clean data and a specific problem waiting. Until then, they spend their time on things that are not their core skill.
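For readers who think in code, the reasoning above can be written down as a rough heuristic. This is only a sketch: the `Answers` fields and the `first_hire` function are hypothetical names invented for illustration, and the yes/no framing is a deliberate simplification of the four questions, not a formal method.

```python
from dataclasses import dataclass


@dataclass
class Answers:
    """Simplified yes/no answers to the four questions (illustrative names)."""
    data_lands_reliably: bool   # Q1: data arrives every day without manual fixes
    decisions_defined: bool     # Q2: we know which decisions, and who makes them
    needs_prediction: bool      # Q3: a concrete task really needs a predictive model
    end_user_identified: bool   # Q4: we know who consumes the result


def first_hire(a: Answers) -> str:
    """Return the role to consider first. A heuristic, not a rule."""
    # Without a reliable daily data flow, nothing downstream can work.
    if not a.data_lands_reliably:
        return "data engineer"
    # A data scientist makes sense only with clean data AND a specific problem.
    if a.needs_prediction and a.decisions_defined and a.end_user_identified:
        return "data scientist"
    # Otherwise the gap is interpretation and business context.
    return "data analyst"


if __name__ == "__main__":
    # Hypothetical company: data is unreliable, but leadership wants "AI".
    print(first_hire(Answers(False, True, True, True)))
```

Note that in this sketch the data scientist branch is reached only when every other precondition already holds, which mirrors the point that this is rarely the first hire.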

Calling all three roles by one name is convenient. But that convenience is expensive in practice.
