Data November 28, 2016 3 min read

Real-time data and right-time data: the difference and why it matters

Not every task requires real-time data. Getting this choice wrong costs money and complicates architecture without benefit.

When a company starts a serious conversation about analytics and data, sooner or later a request for "real-time data" appears. A manager wants to see the current picture, not yesterday's. That is an understandable desire.

But behind the words "real time" lies a technical and operational decision whose cost depends strongly on how literally we interpret the requirement. The difference between "data updates every 15 minutes" and "data updates every second" is not an implementation detail. It is a fundamentally different architecture, with a different cost and a different level of complexity.

What "right time" means

True real time means the data is available within seconds, or fractions of a second, of an event. This is needed for specific tasks: fraud detection at the moment of a transaction, control of a production process with immediate feedback, trading systems. There, a delay of a few minutes makes the data useless.

Most management tasks work differently. A manager looking at a sales report is not about to make a decision in the next 30 seconds. They need a current picture - but "current" here may mean "updated an hour ago" or "updated this morning".

I use the concept of "right time" - data updates at a frequency that matches the frequency of decision-making. For a daily operational report, that is once per day. For monitoring call centre performance, it might be every 15 minutes. For managing stock levels, once an hour or every few hours.
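To make the idea concrete, here is a minimal sketch of a right-time check. The task names are hypothetical and the intervals are simply the examples above; the only question it answers is whether data is fresh enough for its own task, not whether it is fresh in some absolute sense.

```python
from datetime import datetime, timedelta

# Hypothetical task names; the intervals are the examples from the text.
RIGHT_TIME = {
    "daily_operational_report": timedelta(days=1),
    "call_centre_monitoring": timedelta(minutes=15),
    "stock_levels": timedelta(hours=1),
}


def is_fresh_enough(task: str, last_updated: datetime, now: datetime) -> bool:
    """Return True if the data is no older than the task's right-time interval."""
    return now - last_updated <= RIGHT_TIME[task]


# The same 40-minute-old data is fine for one task and stale for another.
now = datetime(2016, 11, 28, 12, 0)
forty_minutes_ago = now - timedelta(minutes=40)
print(is_fresh_enough("stock_levels", forty_minutes_ago, now))            # True
print(is_fresh_enough("call_centre_monitoring", forty_minutes_ago, now))  # False
```

The point of the example is that "stale" is a property of the task, not of the data.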

Why choosing the right frequency matters

Real-time data architecture is significantly more complex and expensive than batch processing. It requires specialised infrastructure - streaming tools such as Kafka or Spark Streaming, different storage schemes, and different approaches to monitoring and reliability.

If you build real-time architecture for a task that does not require second-level freshness, you are overpaying for infrastructure and adding complexity without practical benefit.

The opposite error also occurs: building a daily batch for a task where decisions have to be made during the day - and ending up with analytics that always describe yesterday.

How to determine the right frequency for a task

A few questions that help:

How quickly after an event occurs does a decision need to be made? For a fraudulent transaction - immediately. For a performance deviation on the production line - within an hour. For a shift in sales dynamics - within a day or even a week.

How often does the person using this data look at the dashboard or report? If once in the morning, there is no point updating more frequently than overnight.

What would happen if the data were delayed by an hour, by three hours, by a day? If the answer is "nothing critical", that is a signal that real-time is not needed.

The practical conclusion

Before investing in streaming infrastructure, it is worth going through the key analytics tasks and honestly answering: what data latency is acceptable for each of them?

It often turns out that 80-90% of tasks are served perfectly well by batch processing at a sensible frequency - and that solution is cheaper, more reliable, and simpler to maintain. For the remaining 10-20%, where there is a genuine need for low latency, build a specialised solution.
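Such an audit can be as simple as a table of tasks and their acceptable latency. The sketch below is illustrative only: the task names are hypothetical, the latencies are taken from the examples in this post (with "immediately" read as one second), and the one-minute cut-off between streaming and batch is my assumption, not a rule.

```python
from datetime import timedelta

# Assumed cut-off: anything that tolerates a minute or more goes to batch.
STREAMING_THRESHOLD = timedelta(minutes=1)

# Hypothetical task list; latencies follow the examples in this post.
ACCEPTABLE_LATENCY = {
    "fraud_detection": timedelta(seconds=1),
    "production_line_deviation": timedelta(hours=1),
    "call_centre_monitoring": timedelta(minutes=15),
    "stock_levels": timedelta(hours=1),
    "sales_dynamics": timedelta(days=1),
    "daily_operational_report": timedelta(days=1),
}


def split_by_architecture(latencies, threshold=STREAMING_THRESHOLD):
    """Split tasks into (streaming, batch) by their acceptable latency."""
    streaming = sorted(t for t, lat in latencies.items() if lat < threshold)
    batch = sorted(t for t, lat in latencies.items() if lat >= threshold)
    return streaming, batch


streaming, batch = split_by_architecture(ACCEPTABLE_LATENCY)
print("streaming:", streaming)
print("batch:", batch)
print(f"batch share: {len(batch) / len(ACCEPTABLE_LATENCY):.0%}")
```

The list here is a made-up sample, so its batch share proves nothing by itself; the value is in writing the latency down per task before choosing the infrastructure.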

Mixing everything into one "real-time" pipeline for the sake of uniformity is one of the most expensive architectural mistakes in analytics.
