ODS, data marts, DWH: why you cannot build an analytics landscape with one abbreviation
Each layer of an analytics architecture solves a different problem, operates at a different speed, and needs its own owner. Mixing the layers is the source of most analytics problems.
When a company starts talking about building analytics infrastructure, abbreviations arrive quickly. ODS, DWH, data marts. Sometimes they are used as synonyms. Sometimes as interchangeable options from which you must pick one. This leads to mistakes, because each of these layers solves a fundamentally different problem.
Building analytics with "one abbreviation" means either overengineering the system where something simple would do, or underengineering it where serious architecture is required. Both mistakes are expensive.
What each layer actually does
An ODS - operational data store - is a layer that consolidates data from several operational systems in one place, so that consumers do not have to query each source separately. An ODS is typically updated frequently, sometimes in near real time. Data here stays close to its original form: it is not heavily transformed, and little or no history is kept. This is a layer for operational access, not long-term analysis.
A DWH - data warehouse - is a different matter. Its job is to store history, to make reports reproducible, and to allow questions like "what was the situation a year ago?" A DWH is updated on a schedule, usually no more than once a day. Data here is transformed, cleaned, and reconciled across sources. This is an expensive layer to build and maintain, and it is justified when historical data and cross-source consistency are critical for decision-making. Amazon Redshift shifted the economics of building a data warehouse significantly, but it did not change the question of whether you need this layer at all.
Data marts are the layer closest to the end consumer: an analyst, a manager, an automated report. A data mart is a pre-calculated, purpose-built slice of data optimised for a specific business question. It is fast because the data is already prepared. But it depends on what lies below it - ODS or DWH - and does not replace those layers.
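The division of labour between the three layers can be sketched in a few lines. This is a minimal illustration, not a reference design: the source names (crm, shop), the order fields, and the function names are all hypothetical, and plain Python lists stand in for real storage.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows from two operational systems (illustrative data only).
crm_orders = [{"order_id": 1, "customer": "acme", "amount": 120.0}]
shop_orders = [
    {"order_id": 2, "customer": "acme", "amount": 80.0},
    {"order_id": 3, "customer": "globex", "amount": 50.0},
]


def load_ods(*sources):
    """ODS: collect current rows from several systems in one place.

    Rows are only tagged with their source, not otherwise transformed.
    """
    ods = []
    for name, rows in sources:
        for row in rows:
            ods.append({**row, "source": name})
    return ods


def append_to_dwh(dwh, ods_rows, snapshot_date):
    """DWH: keep every dated snapshot so past states stay queryable."""
    for row in ods_rows:
        dwh.append({**row, "snapshot_date": snapshot_date})
    return dwh


def build_revenue_mart(dwh, snapshot_date):
    """Data mart: pre-calculated revenue per customer for one snapshot."""
    mart = defaultdict(float)
    for row in dwh:
        if row["snapshot_date"] == snapshot_date:
            mart[row["customer"]] += row["amount"]
    return dict(mart)


ods = load_ods(("crm", crm_orders), ("shop", shop_orders))
dwh = append_to_dwh([], ods, date(2024, 1, 1))
mart = build_revenue_mart(dwh, date(2024, 1, 1))
print(mart)  # {'acme': 200.0, 'globex': 50.0}
```

The mart is the only structure shaped around a business question. It is derived entirely from the layer below it, which is why it cannot replace that layer.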
Why mixing the layers creates problems
The most common mistake is building data marts directly on top of operational systems, bypassing intermediate layers. It works at the start: fast, no extra steps. But problems emerge over time.
If the operational system changes its structure, the mart breaks. If you need to compare against last year, the source system no longer holds the data as it looked then. If two analysts calculate the same metric from different sources, they get different answers.
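The first failure mode, a structural change in the source, can at least be detected before it silently corrupts a mart. The sketch below assumes a hypothetical column contract (EXPECTED_COLUMNS) and a hypothetical check_schema helper. It illustrates the idea and does not replace an intermediate layer.

```python
# Hypothetical contract: the columns this mart expects from its source.
EXPECTED_COLUMNS = {"order_id", "customer", "amount"}


def check_schema(rows, expected=EXPECTED_COLUMNS):
    """Return the expected columns that are missing from at least one row."""
    missing = set()
    for row in rows:
        missing |= expected - set(row)
    return missing


# Before the change: the source matches the contract.
old_rows = [{"order_id": 1, "customer": "acme", "amount": 120.0}]
# After the change: the source system renamed "amount" to "total".
new_rows = [{"order_id": 1, "customer": "acme", "total": 120.0}]

print(check_schema(old_rows))  # set()
print(check_schema(new_rows))  # {'amount'}
```

A check like this only reports the break. Absorbing the change without touching every mart is what an ODS or DWH layer in between is for.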
Another mistake is building a DWH where an ODS or a data mart would be sufficient. A DWH requires serious investment in data modelling, ETL processes, and ongoing maintenance. If the company is not making decisions based on multi-year historical series, a full DWH may be excessive. The same kind of trade-off appears on the infrastructure side, when deciding whether a cluster makes sense or a regular DWH is enough.
Each layer needs an owner
Technical layers without organisational owners do not last. The ODS gets updated too infrequently or drifts out of sync with its sources. The DWH accumulates broken transformations. Data marts stop matching the actual business questions.
An owner of a layer is not the person who maintains it technically. It is the person who answers the question: "is this data correct?" For the ODS, that is usually IT or the data team. For a data mart, it is the business function that uses it. For the DWH, it is most often a central analytics function or IT with business participation.
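One lightweight way to make this separation explicit is to record, for every layer, who is accountable for correctness separately from who maintains it. The sketch below is only an illustration: the LayerOwner structure, the layer names, and the team names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerOwner:
    layer: str
    accountable: str  # who answers "is this data correct?"
    maintainer: str   # who keeps the layer running technically


# Hypothetical registry; the sales mart has no accountable owner yet.
OWNERS = [
    LayerOwner("ods", accountable="data team", maintainer="platform team"),
    LayerOwner("dwh", accountable="central analytics", maintainer="platform team"),
    LayerOwner("mart_sales", accountable="", maintainer="platform team"),
]


def unowned_layers(layers, owners):
    """Return layers with no one accountable for data correctness."""
    owned = {o.layer for o in owners if o.accountable}
    return [layer for layer in layers if layer not in owned]


print(unowned_layers(["ods", "dwh", "mart_sales"], OWNERS))  # ['mart_sales']
```

In this example the mart has a maintainer but nobody who stands behind its numbers, which is exactly the gap described above.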
Without this separation, data gradually becomes something nobody is willing to stand behind.
Update frequency is not a technical choice
One of the key attributes of each layer is its update cadence. The ODS may refresh several times a day. The DWH typically refreshes once a day or less often. Data marts refresh on a schedule tied to the business cycle.
This is not purely a technical parameter - it is an agreement with the business about how fresh the data needs to be for each task. If an analyst expects data in a mart to be refreshed every half hour, but the update happens once a day, that is not a technical failure. It is a mismatch in expectations that should have been resolved at design time.
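Such an agreement can be written down and checked at design time. The sketch below assumes, as a simplification, that the worst-case data age equals the refresh interval and ignores load duration. The function names and the example values are hypothetical.

```python
from datetime import datetime, timedelta


def cadence_meets_expectation(refresh_interval, max_age):
    """Design-time check: can this cadence ever satisfy the agreed data age?

    Simplifying assumption: worst-case age equals the refresh interval.
    """
    return refresh_interval <= max_age


def is_fresh_enough(last_refresh, now, max_age):
    """Run-time check: is the data currently within the agreed age?"""
    return now - last_refresh <= max_age


# The analyst expects half-hourly data, but the mart refreshes daily.
expected_max_age = timedelta(minutes=30)
mart_refresh_interval = timedelta(days=1)

print(cadence_meets_expectation(mart_refresh_interval, expected_max_age))  # False
```

A False result here is the expectation mismatch itself: no pipeline has failed, but the schedule and the agreement disagree.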
Practical questions before designing
Before making architectural decisions, it is worth answering several questions:
- Which business questions do we need to be able to answer, and how often are they asked?
- Do we need historical data - and for how far back?
- What is the acceptable data age for each task?
- Who will be accountable for data correctness in each layer?
- What happens if an operational system changes its structure?
The answers to these questions determine which layers are needed - not the other way around.