Data November 10, 2022 3 min read

Analytic database versus operational database: when to separate them

Why one database cannot serve both transactions and analytics well - and how to recognise when the time to separate them has arrived.

When a company first starts collecting data, a single database looks like a sensible choice. Everything in one place, queries are straightforward, nothing extraneous. This works while the company is small.

Then things get interesting. Analysts run heavy reports directly against the production database - and this slows down the main application. Or the reverse: transactional load makes it impossible to build a clean report, because by the time the query finishes running, the data has already changed.

This is not a problem with any particular database technology, and it is not an architectural mistake. It is a fundamental conflict between two different tasks.

Why an operational database needs one thing and an analytical database needs another

An operational database (OLTP - Online Transaction Processing) is optimised for fast read and write operations: add an order, update stock, record a payment. Thousands of short transactions per second. The speed of an individual operation is critical.

An analytical database (OLAP - Online Analytical Processing) is optimised for something different: scan millions of rows, aggregate, compare periods, find patterns. Fewer transactions, but each works with large data volumes. The execution time of a complex query is what matters.

These two tasks are optimised by opposite means. An operational database stores data row by row - fast to update individual records. An analytical database stores data column by column - fast to aggregate a single attribute across all rows. One tool cannot be good at both modes simultaneously.

When this starts to hurt

Signs that separation is necessary:

Analytical queries slow down the main application. This is the first symptom - you can try to optimise, but the underlying problem remains.

Reports are built on stale data. To avoid loading the operational database, analysts work from nightly exports and see yesterday's picture.

Analysts are afraid their queries will break something. This is already a cultural problem, signalling that the architecture does not allow the team to work properly.

Historical data cannot be stored. The operational database holds the current state. Analytics needs history: how a metric changed, what things looked like a month ago, where the trend is going.

What happens when you separate them

Separation means having two layers with different responsibilities.

The operational database keeps working as before - transactions, current data, high write speed.

The analytical layer (a DWH, data mart, or cloud analytical database) receives data from the operational database on a schedule - via ETL processes. It is optimised for reading, stores history, and allows complex queries without risk to the main application.

A practical question that always arises: is real-time data needed, or is a nightly sync enough? For most management reporting - nightly is enough. For operational dashboards - that needs a separate conversation.

A practical test

Ask your team a few questions:

Do analytical queries affect the performance of the main application?
How fresh is the data used in management reporting - and are you satisfied with that?
Can you reconstruct the historical state of the data as of a specific date?
Can analysts work independently, without coordinating every query with the development team?

If the answers to these questions are not satisfying - separating the operational and analytical layers is worth considering as an architectural priority for the coming year.

Back to all posts

Contact

Why an operational database needs one thing and an analytical database needs another

When this starts to hurt

What happens when you separate them

A practical test

If this resonated, write to me. I reply personally.