m@ksim.pro
Data 4 min read

Big data without the magic: where to start when you have a lot of data and little value

Why inventorying your sources and metrics matters more than buying a platform, and how to get to actual value from data.

Big data is going through a period where everyone talks about it and few people know what to do with it. Conferences promise a revolution. Vendors offer platforms. CIOs get requests from leadership to "sort out big data."

In reality most companies have a different problem. They already have a lot of data - accumulating in CRM, ERP, logs, spreadsheets, email, file servers. It just produces no value. And the answer here is not to buy a platform for even more data.

Where "lots of data, no value" comes from

This is almost always the result of several problems operating at the same time.

The data exists but is fragmented. Each system holds its piece of the picture, the pieces are not connected, and getting a complete answer means assembling data from several places manually - every time, from scratch.

The data exists but no one knows which version to trust. The same metrics are calculated differently across departments. There is no agreed definition of what counts as a "sale," a "customer," or an "active user."
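A small illustration of how this happens, as a sketch on invented records: the same three orders produce two different "sales" totals, depending on whether a department counts orders placed or money actually collected. The field names and both definitions are hypothetical, not taken from any particular CRM or finance system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    amount: float
    paid: bool
    refunded: bool


# Invented sample records, for illustration only.
ORDERS = [
    Order("A-1", 100.0, paid=True, refunded=False),
    Order("A-2", 250.0, paid=False, refunded=False),
    Order("A-3", 80.0, paid=True, refunded=True),
]


def sales_as_booked(orders):
    """Hypothetical sales-team definition: every order placed counts."""
    return sum(o.amount for o in orders)


def sales_as_collected(orders):
    """Hypothetical finance definition: only paid, non-refunded orders count."""
    return sum(o.amount for o in orders if o.paid and not o.refunded)


booked = sales_as_booked(ORDERS)
collected = sales_as_collected(ORDERS)
print(f"booked={booked}, collected={collected}, gap={booked - collected}")
```

Neither number is wrong on its own. The problem is that both get called "sales" in different reports.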

The data exists but there are no questions. Information is collected because "it might be useful someday," not because there are specific management decisions that need answers.

The data exists but no one maintains it. Databases age. Reference tables are not updated. The collection process no longer reflects how the business actually works.

Why a platform does not fix these problems

Hadoop, Cassandra, any other distributed system - these are tools for specific scales and specific load scenarios. They are not tools for bringing order to data.

If data is fragmented, a platform will not unify it on its own. If metric definitions are not agreed across departments, a new warehouse will not change that. If there is no understanding of which questions need answers, any analytical infrastructure will sit idle.

Worse: migrating to a new platform often delays solving the real problems. A company spends several months on implementation, then discovers it is now storing the same chaotic data in a more expensive system.

Where to actually start

Start with an inventory - not of hardware, but of data sources and metrics.

Source inventory: what do you have, where does it physically live, who is responsible for its accuracy, how often is it updated, and how current is it right now?
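A source inventory does not need special tooling. As a minimal sketch, assuming one record per source with the fields listed above, a few lines are enough to flag sources with no owner or with overdue updates. The record layout, the sample entries, and the `problems` helper are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class SourceRecord:
    name: str
    location: str              # where the data physically lives
    owner: Optional[str]       # who answers for accuracy; None means nobody
    update_every_days: int     # expected refresh interval
    last_updated: date         # how current the data actually is


def problems(rec, today):
    """Return a list of inventory issues for one source."""
    issues = []
    if not rec.owner:
        issues.append("no owner")
    if today - rec.last_updated > timedelta(days=rec.update_every_days):
        issues.append("stale")
    return issues


# Invented entries, for illustration only.
INVENTORY = [
    SourceRecord("CRM contacts", "CRM database", "Sales ops", 7, date(2024, 5, 30)),
    SourceRecord("Price list spreadsheet", "shared drive", None, 30, date(2024, 1, 15)),
]

TODAY = date(2024, 6, 1)
report = {rec.name: problems(rec, TODAY) for rec in INVENTORY}
for name, issues in report.items():
    print(name, "->", issues or "ok")
```

A spreadsheet with the same columns would work just as well. What matters is that every source has an answer in every column.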

Metric inventory: which indicators are actually used to make decisions, how is each one calculated, and does everyone who uses them share the same definition?
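The metric side can be sketched the same way. Assuming you have written down, for each department, how it says a metric is calculated, grouping the answers by metric name shows where definitions diverge. The departments and definitions below are invented, and `conflicting_metrics` is a hypothetical helper.

```python
from collections import defaultdict

# (metric, department, definition as that department states it) - invented.
METRIC_USAGE = [
    ("active user", "Product", "logged in within the last 30 days"),
    ("active user", "Marketing", "opened an email within the last 90 days"),
    ("sale", "Sales", "signed contract"),
    ("sale", "Finance", "Signed contract"),
]


def conflicting_metrics(usage):
    """Return metrics that have more than one distinct definition."""
    definitions = defaultdict(set)
    for metric, _department, definition in usage:
        # Ignore differences in case and surrounding whitespace only.
        definitions[metric].add(definition.strip().lower())
    return {m: sorted(d) for m, d in definitions.items() if len(d) > 1}


conflicts = conflicting_metrics(METRIC_USAGE)
for metric, variants in conflicts.items():
    print(f"{metric}: {len(variants)} definitions")
    for variant in variants:
        print("  -", variant)
```

Comparing definitions as text is deliberately crude. It only finds the cases where people already say different things out loud, which is usually enough for a first pass.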

This is unglamorous work. It does not look like a "big data project." But without it, everything that follows is building on a bad foundation.

What a good next step looks like

After an inventory, three or four problems that genuinely block effective use of data usually become visible. These might be: no single source of truth for the customer base; sales figures calculated differently in CRM versus the financial system; no change history for key records.

Each of these can be solved locally - without buying a platform. Normalise the customer directory. Agree on the definition of a sale and write it into a standard. Set up history tracking for the changes you need, in the systems you already have.
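To show how small such a local fix can be, here is a sketch of normalising a customer directory assembled from several systems. It assumes a deliberately simple matching rule (lower-cased email, falling back to the name with whitespace collapsed). The rule, the field names, and the sample rows are hypothetical. Real matching usually needs more care.

```python
import re


def customer_key(name, email):
    """Hypothetical matching rule: lower-cased email, else collapsed name."""
    email = (email or "").strip().lower()
    if email:
        return email
    return re.sub(r"\s+", " ", name).strip().lower()


# Invented rows from three systems, for illustration only.
RAW = [
    {"source": "CRM", "name": "Acme  Ltd", "email": "Billing@Acme.example"},
    {"source": "ERP", "name": "ACME LTD", "email": "billing@acme.example "},
    {"source": "Spreadsheet", "name": "Borealis GmbH", "email": ""},
]


def build_directory(rows):
    """Merge rows that share a key and remember which systems they came from."""
    directory = {}
    for row in rows:
        key = customer_key(row["name"], row["email"])
        entry = directory.setdefault(key, {"name": row["name"].strip(), "sources": []})
        entry["sources"].append(row["source"])
    return directory


directory = build_directory(RAW)
for key, entry in directory.items():
    print(key, "->", entry["sources"])
```

Three rows collapse into two customers, and each entry records where it was seen. That list of sources is often the first hint about which system should become the source of truth.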

Only after basic order exists, and there are specific analytical tasks with real data volumes - only then does it make sense to talk about a platform.

A simple readiness check

Before deciding on a platform or a large data project, I recommend answering honestly:

  1. Can you name three key metrics for your business today and explain precisely how each is calculated?
  2. If two analysts take the same data for the same period, will they produce the same answer?
  3. Do you know who in the company is accountable for the accuracy of your key data?
  4. Do you have a list of specific questions that you want data to answer but that it currently cannot?

If the first three answers are "no" or "not sure" - the project does not start with a platform. It starts with bringing order to what you already have.
