Data April 2, 2020 3 min read

Customer data in a crisis: how companies lose the signal

Why customer behaviour analytics stops working during a sharp change in context - and what to do about it.

Companies that have invested in customer behaviour analytics over the past few years are now making an uncomfortable discovery. Models that worked well a month ago are producing strange results. Forecasts have diverged from reality. Segments are behaving unpredictably.

This is not a technical failure. It is distribution shift: the data the models see today no longer follows the distribution they were trained on. The world changed, and the models do not know it yet.
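One way to check whether a feature has drifted is the Population Stability Index (PSI), which compares the binned distribution of a baseline sample with a current one. The sketch below is a minimal, standard-library version. The function name, the equal-width binning and the sample numbers are assumptions made for illustration, not a description of any particular pipeline.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """PSI of one numeric feature: 0 means identical binned distributions."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline sample has no spread to bin")
    width = (hi - lo) / bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range are clamped into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    p = bin_shares(baseline)
    q = bin_shares(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Made-up basket values: the "April" sample is the "February" one shifted up.
february = [float(v) for v in range(100)]
april = [v + 50.0 for v in february]
print(round(population_stability_index(february, april), 2))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that threshold is a convention, not a law. Calibrate it against your own features before acting on it.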

Why analytics loses the signal

Most analytical and ML models are trained on historical data collected under normal conditions. That data reflects how people behave in a stable environment: usual purchases, usual patterns, usual priorities.

Now the conditions have changed fundamentally. People are at home. Many have lost a source of income or are anxious about it. Priorities have shifted. Consumption channels have changed. The patterns the model considered predictive - "someone who did X usually does Y" - no longer hold the same way.

On top of that, the signal itself has changed: data from March-April 2020 is anomalous. If you use it to retrain a model, you risk teaching the model to respond to a crisis that may look nothing like the next one.

What specific problems arise

Recommendation systems. The logic of "people with similar behaviour bought this" breaks when everyone changes their behaviour simultaneously. Recommendations start looking irrelevant.

Demand forecasts. A forecast built on seasonal patterns and trends has stopped working. Demand for some goods has surged abnormally, while for others it has collapsed. Historical averages do not describe the current situation.

Churn models. Signals that used to predict customer churn are now appearing everywhere, even among loyal customers. Distinguishing "temporary behaviour change due to crisis" from "real churn" is hard without additional context.

What to do with analytics during this period

A few practical approaches:

Separate pre- and post-crisis data. Explicitly mark the boundary of the context change in the data. Do not mix February and April patterns in the same models without understanding that they describe different realities.
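As a rough sketch of what marking the boundary can look like, the snippet below tags each record with a regime label and splits the data accordingly. The boundary date, the field names and the sample rows are all hypothetical. Choose the date that matches when the context changed in your own market.

```python
from datetime import date
from typing import Dict, Iterable, List

# Assumption: an illustrative boundary date, not a universal one.
CRISIS_START = date(2020, 3, 15)


def regime(day: date, boundary: date = CRISIS_START) -> str:
    """Label a date as belonging to the pre-crisis or the crisis period."""
    return "pre_crisis" if day < boundary else "crisis"


def split_by_regime(
    rows: Iterable[dict], boundary: date = CRISIS_START
) -> Dict[str, List[dict]]:
    """Group rows (each with a 'date' key) by regime so they are never mixed silently."""
    groups: Dict[str, List[dict]] = {"pre_crisis": [], "crisis": []}
    for row in rows:
        groups[regime(row["date"], boundary)].append(row)
    return groups


# Made-up order counts for illustration.
orders = [
    {"date": date(2020, 2, 10), "orders": 120},
    {"date": date(2020, 4, 1), "orders": 45},
]
by_regime = split_by_regime(orders)
print(len(by_regime["pre_crisis"]), len(by_regime["crisis"]))
```

Keeping the label as an explicit field, and not as a filter buried in a query, makes it harder to mix February and April by accident later on.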

Increase the weight of fresh data or switch to rules. When historical patterns are unreliable, recent observations deserve more weight than older ones in the estimates you still compute. Where even that is not enough, it sometimes makes more sense to temporarily replace predictive models with explicit rules based on a current understanding of the situation.
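Both ideas can be sketched in a few lines: exponentially decaying weights that favour recent days, and a simple rule that overrides the model when its forecast strays too far from recent reality. The half-life, the tolerance and every function name here are assumptions chosen for illustration. Treat it as a starting point, not a recommended configuration.

```python
from typing import List, Sequence


def recency_weights(ages_in_days: Sequence[float], half_life_days: float = 14.0) -> List[float]:
    """Weight halves for every `half_life_days` of age; today's data has weight 1."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def rule_based_estimate(recent_actuals: Sequence[float], half_life_days: float = 14.0) -> float:
    """Explicit rule: recency-weighted mean of daily actuals, given oldest first."""
    n = len(recent_actuals)
    ages = [n - 1 - i for i in range(n)]
    return weighted_mean(recent_actuals, recency_weights(ages, half_life_days))


def forecast_with_fallback(
    model_forecast: float, recent_actuals: Sequence[float], tolerance: float = 0.3
) -> float:
    """Keep the model's forecast only while it stays within `tolerance` of the rule."""
    baseline = rule_based_estimate(recent_actuals)
    if abs(model_forecast - baseline) <= tolerance * abs(baseline):
        return model_forecast
    return baseline


# Made-up numbers: the model still expects 100 daily orders, reality is near 50.
print(forecast_with_fallback(100.0, [52.0, 49.0, 50.0]))
```

The fallback is deliberately blunt. Its purpose is to keep obviously stale forecasts out of decisions while the situation is unclear, not to replace the model for good.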

Look at absolute numbers, not changes. When everything has changed, percentage changes relative to the prior period say little. Absolute levels and comparison against manually adjusted expectations matter more.
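To make the difference concrete, here is a small sketch that reports an absolute level against a manually set expectation, alongside the percentage change it replaces. The function name, the tolerance and the figures are hypothetical.

```python
def compare_to_expectation(actual: float, expected: float, tolerance: float) -> dict:
    """Report the absolute gap to a manually adjusted expectation."""
    gap = actual - expected
    if abs(gap) <= tolerance:
        status = "within"
    else:
        status = "above" if gap > 0 else "below"
    return {"actual": actual, "expected": expected, "gap": gap, "status": status}


# Made-up weekly orders: tripling from 40 to 120 is +200%,
# yet still far below a manually set expectation of 1000.
previous, actual = 40.0, 120.0
pct_change = (actual - previous) / previous * 100
print(pct_change)
print(compare_to_expectation(actual, expected=1000.0, tolerance=100.0))
```

The percentage looks dramatic while the absolute level says the category is still marginal, which is exactly the kind of reading the prior-period comparison hides.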

Avoid major model retraining right now. Data from the crisis period is an anomaly, not a new norm. The decision of whether to include it in training is better made later, when there is more clarity.

The main takeaway

Data analytics does not stop being useful in a crisis, but it demands more care in interpretation. The question "what is happening right now" matters more than "what did the model predict". And this is not the time to trust algorithms automatically in places where you previously did.

Contact

If this resonated, write to me. I reply personally.