Security March 15, 2021 3 min read

Hafnium and Exchange: the patch that waited too long

The March 2021 Microsoft Exchange mass exploitation showed that patch management is not a technical task - it is an organisational one.

In early March 2021, Microsoft released emergency patches for four critical zero-day vulnerabilities in Exchange Server. Within days, tens of thousands of organisations worldwide were compromised. The attacker group that Microsoft named Hafnium had been exploiting these holes quietly for weeks before the patches arrived. Once the patches were public, at least ten other threat groups rushed in before most organisations had applied them.

I watched several mid-size companies go through the incident response scramble that followed. The pattern was the same in every case: the vulnerability was known, the patch was available, and the server was still unpatched two weeks later.

Why patching is slow

The usual explanation is "change management." In practice it is something simpler: no one owns the decision. The security team flags the alert. The infrastructure team says they need a maintenance window. The business says it is a busy quarter. The legal team asks for a risk assessment. Three weeks pass.

Exchange is not a fringe system. It handles email for the whole company. People are afraid to touch it. That fear, without a forcing function, becomes a standing delay.

What the attack actually looked like

The exploit chain Hafnium used let an attacker bypass authentication by impersonating the Exchange server itself, write arbitrary files to the server, and execute code remotely. Once inside Exchange, lateral movement into Active Directory and the broader network was straightforward. Many compromises were discovered not within hours but after weeks, by which time the attacker had already mapped the environment.

The attack surface was not exotic. It was a server sitting on port 443, reachable from the internet, running software with a known critical flaw.

The organisational problem underneath

Patch management fails at the decision layer, not the technical layer. The typical gaps I see:

No defined SLA by severity class. A critical remote-code-execution patch and a low-severity informational update sit in the same queue.
No owner with authority to approve an emergency maintenance window without a committee.
No tested rollback procedure for the systems that matter most.
Inventory that is six months stale - no one knows which Exchange version is running where.

These are not hard problems. They are neglected ones.

What a minimal working process looks like

I do not advocate for a full ITSM overhaul. The minimum that actually works:

Severity tiers with patching deadlines: critical means 72 hours for internet-facing systems, not "next sprint."
One named person per critical system with authority to schedule an emergency window.
A quarterly drill: pick a random server, patch it, confirm it comes back, document the time it took.
A quarterly inventory review: compare the recorded inventory against what is actually reachable from the internet.
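
To show how little it takes to write the first two items down, here is a minimal sketch in Python. Only the 72-hour deadline for critical, internet-facing systems comes from the list above. Every other number in the table, and all the names (PATCH_SLA_HOURS, PendingPatch, patch_deadline, overdue, the host and owner), are assumptions I made up for illustration. Replace them with your own policy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical SLA table keyed by (severity, internet_facing).
# Only the 72-hour value is from the post; the rest are placeholders.
PATCH_SLA_HOURS = {
    ("critical", True): 72,
    ("critical", False): 7 * 24,
    ("high", True): 7 * 24,
    ("high", False): 14 * 24,
    ("low", True): 30 * 24,
    ("low", False): 90 * 24,
}


@dataclass(frozen=True)
class PendingPatch:
    system: str
    owner: str             # the one named person who can schedule an emergency window
    severity: str          # "critical", "high" or "low"
    internet_facing: bool
    released_at: datetime  # when the vendor published the patch


def patch_deadline(patch: PendingPatch) -> datetime:
    """Return the latest acceptable install time under the SLA table."""
    hours = PATCH_SLA_HOURS[(patch.severity, patch.internet_facing)]
    return patch.released_at + timedelta(hours=hours)


def overdue(patches, now):
    """Return the patches whose deadline has already passed at `now`."""
    return [p for p in patches if now > patch_deadline(p)]


# Illustrative data: a critical patch published on 2 March 2021.
exchange = PendingPatch(
    system="exchange-01",
    owner="j.doe",
    severity="critical",
    internet_facing=True,
    released_at=datetime(2021, 3, 2),
)

print(patch_deadline(exchange))  # 2021-03-05 00:00:00
for p in overdue([exchange], now=datetime(2021, 3, 16)):
    print(f"OVERDUE: {p.system} (owner: {p.owner})")
```

The point of keeping the owner on the record is that an overdue entry names a person, not a team. A committee cannot be paged.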

None of this requires expensive tooling. It requires someone to write it down and someone else to check that it happened.
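
The inventory check is a similar case. As a rough sketch, assuming you already have the recorded inventory as a mapping of hostname to version and a list of hostnames that an external scan found reachable (how you produce that list is outside this example), the comparison itself is a few set operations. The function name, hostnames, and version strings below are hypothetical.

```python
from typing import Dict, List, Set


def reconcile(inventory: Dict[str, str], reachable: Set[str]) -> Dict[str, List[str]]:
    """Compare the recorded inventory with hosts seen from the internet."""
    recorded = set(inventory)
    return {
        # Exposed but nobody wrote it down: the most urgent finding.
        "unknown_exposed": sorted(reachable - recorded),
        # Written down but not seen from outside: internal, retired, or stale.
        "recorded_not_seen": sorted(recorded - reachable),
        # Exposed and recorded: check the recorded version against the SLA.
        "exposed_and_recorded": sorted(reachable & recorded),
    }


# Illustrative data only.
inventory = {
    "mail01.example.com": "Exchange 2016 CU19",
    "mail02.example.com": "Exchange 2019 CU8",
}
reachable = {"mail01.example.com", "legacy-owa.example.com"}

for category, hosts in reconcile(inventory, reachable).items():
    print(category, hosts)
```

Anything that lands in unknown_exposed is a server that can be attacked but has no owner and no deadline, which is exactly the gap a stale inventory creates.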

The real cost of waiting

The companies that were compromised in the Hafnium wave did not lack security awareness. They had firewalls, antivirus, and security policies. What they lacked was a process that made patching faster than the attacker's exploitation window.

That window, in March 2021, was measured in days. In future incidents it may be measured in hours.

Contact

If this resonated, write to me. I reply personally.