Business continuity after the pandemic: what to revisit in the architecture
Which conclusions about IT architecture and operational resilience are worth drawing from the first wave, before the normal routine returns.
Lockdown restrictions in various countries are beginning to ease, and for some companies the question of returning to the office is arising. This is a good moment to pause and record what the last quarter revealed, while the observations are still fresh.
The goal is not a polished report. It is a handful of decisions that will improve resilience for the next crisis, whatever form it takes.
What was discovered about dependencies
Every company that went through the sudden shift to remote work now knows a few things about its IT infrastructure that it either did not know before or did not want to acknowledge:
- Where the critical dependencies on physical presence were: printers, rack servers, equipment that has to be touched by hand. And where a dependency on a specific person in a specific place was hidden rather than explicit.
- Which processes were documented and which lived only in verbal handoffs. This became painful as soon as the first people went on sick leave or into isolation.
- How dependent the company is on specific suppliers who ran into trouble of their own.
What to record right now
Every company has its own list, but several categories recur almost everywhere:
Remote manageability of infrastructure. Which parts of the critical infrastructure require physical access and have no remote management? This does not mean everything needs to move to the cloud immediately. It means understanding where that dependency exists and making a conscious decision about it.
Single points of failure in processes. Where in operational processes are there tasks that only one person knows how to do? This risk existed before the pandemic, but the pandemic exposed it.
Process documentation. Which processes are still not documented? After the recent experience, the team has fresh motivation to fix this - before that motivation fades.
System accessibility without VPN. Which systems do employees need when working remotely but can reach only through an overloaded VPN? Is there a case for moving them somewhere directly accessible?
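One way to make the four categories above concrete is a small register that flags each system or process against them. The sketch below is only an illustration: the `Asset` fields, the `Category` names, and the sample entries are all hypothetical, and a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Category(Enum):
    # The four categories from the text; the identifiers are illustrative.
    REMOTE_MANAGEABILITY = "requires physical access"
    SINGLE_POINT_OF_FAILURE = "only one person can do it"
    DOCUMENTATION = "not documented"
    VPN_ONLY_ACCESS = "needed remotely, reachable only via VPN"


@dataclass
class Asset:
    """A system or process to review (hypothetical structure)."""
    name: str
    needs_physical_access: bool = False
    operators: List[str] = field(default_factory=list)  # people able to run it
    documented: bool = True
    needed_remotely: bool = False
    vpn_only: bool = False


def findings(asset: Asset) -> List[Category]:
    """Return the categories in which this asset is a resilience risk."""
    result = []
    if asset.needs_physical_access:
        result.append(Category.REMOTE_MANAGEABILITY)
    if len(asset.operators) <= 1:
        result.append(Category.SINGLE_POINT_OF_FAILURE)
    if not asset.documented:
        result.append(Category.DOCUMENTATION)
    if asset.needed_remotely and asset.vpn_only:
        result.append(Category.VPN_ONLY_ACCESS)
    return result


def build_register(assets: List[Asset]) -> Dict[str, List[Category]]:
    """Keep only the assets that have at least one finding."""
    register = {}
    for asset in assets:
        found = findings(asset)
        if found:
            register[asset.name] = found
    return register


# Sample entries, invented purely for illustration.
inventory = [
    Asset("rack server", needs_physical_access=True, operators=["Ann", "Bo"]),
    Asset("payroll export", operators=["Ann"], documented=False),
    Asset("intranet wiki", operators=["Ann", "Bo"],
          needed_remotely=True, vpn_only=True),
    Asset("cloud mail", operators=["Ann", "Bo"], needed_remotely=True),
]

register = build_register(inventory)
for name, categories in register.items():
    print(name + ": " + "; ".join(c.value for c in categories))
```

The register only records where a dependency exists. Whether to remove it, reduce it, or accept it remains a separate, conscious decision.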
What not to do under the impression of the crisis
A crisis is a bad time for architectural decisions made under pressure. "Let's move everything to the cloud right now" or "let's buy a backup data centre" - these are decisions that require calm analysis, not a reaction to acute stress.
The appetite for change is high right now, and that is good. But the changes should be directed toward specific improvements, not toward large, ill-considered projects.
Three questions for the executive
Before returning to the normal rhythm:
- Which of the things that failed or worked poorly in March-April do we want to fix before next season, regardless of whether there is another pandemic wave?
- Which dependencies did we discover that we had not known about and now want to reduce?
- What did we learn about our operational resilience that is worth writing down before it is forgotten?
Business continuity is not a document for a regulator. It is the ability to operate in conditions that were not planned for. The last few months provided more practical material for this than most theoretical exercises.