IT March 9, 2012 3 min read

Backup after virtualisation: why old schemes no longer hold

Having backup copies and being able to restore from them are two different things. Virtualisation changes both, and old backup approaches often create the illusion of protection where there is none.

In most companies backup is treated as a solved problem. There is an agent on every server, there is a schedule, there is tape or network storage. Box ticked. When virtualisation arrives, that tick is often left in place - and that is a mistake. Virtualisation does not change who is responsible for the infrastructure, but it does change what has to be protected - and backup is one of the layers where that difference matters most.

The problem is not that backups stop being created. The problem is that they stop being sufficient for recovery - or recovery becomes far more expensive and slow than anyone assumed.

What changes with virtualisation

In a physical environment each server is a separate entity. An agent on it backs up the file system or the database. When it fails, you restore that server.

In a virtual environment new layers appear:

the hypervisor manages multiple virtual machines on one host;
storage is shared - all VMs may sit on the same SAN or NAS;
network configuration, policies, hypervisor settings - this is also data that needs to be restored;
VM snapshots are not backups, even though they look similar.

An agent inside a virtual machine captures none of this. It sees only the file system of its own OS - the same as before. But now losing a host or the storage means losing several machines at once.
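
To make the "several machines at once" point concrete, here is a minimal sketch that groups VMs by the host and the datastore they depend on. The inventory, the VM names and the function name are all made up for illustration; a real list would come from whatever management tool is in use.

```python
from collections import defaultdict

# Hypothetical inventory: VM name -> (host, datastore). All names are invented.
INVENTORY = {
    "vm-mail": ("host-a", "san-1"),
    "vm-db": ("host-a", "san-1"),
    "vm-web": ("host-b", "san-1"),
    "vm-files": ("host-b", "nas-2"),
}


def blast_radius(inventory):
    """Group VMs by the host and by the datastore they depend on."""
    by_host = defaultdict(list)
    by_store = defaultdict(list)
    for vm, (host, store) in sorted(inventory.items()):
        by_host[host].append(vm)
        by_store[store].append(vm)
    return dict(by_host), dict(by_store)


by_host, by_store = blast_radius(INVENTORY)
for store, vms in by_store.items():
    print(f"losing {store} takes down {len(vms)} VM(s): {', '.join(vms)}")
```

In this invented example a single storage failure removes three of the four machines, which an agent-per-server view of backup would never show.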

The difference between "we do backups" and "we can restore"

These are two different questions, and the answer to the first does not answer the second.

"We do backups" means data is copied somewhere on a schedule. That is necessary but not sufficient.

"We can restore" means that for a specific failure scenario - a host goes down, storage is lost, ransomware encrypts the entire data centre - we know exactly:

what needs to be restored and in what order;
how long it will take (RTO - Recovery Time Objective);
how much data we will lose (RPO - Recovery Point Objective);
who has the access and skills to perform the restoration.
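
The first three questions can be roughed out as arithmetic before anyone touches a restore console. The sketch below is only an illustration under stated assumptions: the backup interval, the RPO and RTO targets, the restore throughput, the VM sizes and the dependencies are all hypothetical numbers, and the estimate assumes VMs are restored one after another. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical figures, for illustration only.
BACKUP_INTERVAL_H = 24          # one backup per night
RPO_H = 4                       # at most 4 hours of data may be lost
RTO_H = 4                       # at most 4 hours of downtime is acceptable
RESTORE_THROUGHPUT_GB_H = 200   # assumed restore speed, must be measured in practice

# VM -> (size in GB, VMs that must be running before this one)
VMS = {
    "vm-dc": (100, []),
    "vm-db": (1200, ["vm-dc"]),
    "vm-app": (300, ["vm-db"]),
}


def worst_case_data_loss_h(interval_h):
    """A failure just before the next backup loses a full interval of data."""
    return interval_h


def restore_order(vms):
    """Order VMs so that every dependency is restored before its dependants."""
    graph = {name: set(deps) for name, (_, deps) in vms.items()}
    return list(TopologicalSorter(graph).static_order())


def sequential_restore_time_h(vms, throughput_gb_h):
    """Estimate total restore time if VMs are restored one after another."""
    return sum(size for size, _ in vms.values()) / throughput_gb_h


loss = worst_case_data_loss_h(BACKUP_INTERVAL_H)
duration = sequential_restore_time_h(VMS, RESTORE_THROUGHPUT_GB_H)
print("restore order:", " -> ".join(restore_order(VMS)))
print(f"worst-case data loss {loss} h, RPO met: {loss <= RPO_H}")
print(f"estimated restore {duration} h, RTO met: {duration <= RTO_H}")
```

With these made-up numbers neither target is met: a nightly backup cannot give a 4-hour RPO, and 1600 GB at 200 GB per hour takes 8 hours. The estimate is only as good as the throughput figure, which is one more reason to measure it in a real test restore.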

If these parameters are not revisited after the move to a virtual environment, a company often discovers in practice that recovery takes days rather than hours - because nobody has ever tried doing it from scratch.

Common traps

Snapshots instead of backups. A VM snapshot is fast and convenient, but it lives on the same storage as the original. If storage is lost, the snapshots go with it.

Agent-based backup without consistency guarantees. If the database or application has not been brought to a backup-ready state at the moment the copy is taken, the backup may be inconsistent - and the system may fail to start after restoration.

No backup of hypervisor configuration. When a host is lost, it is not enough to restore the VMs. The entire environment needs to be reconstructed: networks, policies, cluster settings. That is also data.

Restoration has never been tested. A backup that has never been tested has an unknown probability of not working. This is not an exaggeration.

What to check

A practical list of questions for a manager or IT director:

Does our backup cover hypervisor and storage configuration, not just VM data?
What happens when an entire host is lost - is there a plan and how long does restoration take?
When was the last time a full restore from backup was performed in a test environment?
Are backups stored separately from primary storage - physically or logically?
Who specifically will perform the restoration, and do they know the procedure?
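
Two of these questions, storage separation and the date of the last test restore, are easy to check mechanically once the answers are written down. A minimal sketch, assuming a hand-maintained record per system; the field names, the sample data and the 180-day threshold are hypothetical, not a recommendation.

```python
from datetime import date

# Hypothetical records; in practice they would be maintained by the IT team.
SYSTEMS = [
    {"name": "vm-db", "primary_store": "san-1",
     "backup_store": "san-1", "last_restore_test": None},
    {"name": "vm-mail", "primary_store": "san-1",
     "backup_store": "offsite-tape", "last_restore_test": date(2011, 11, 1)},
]

MAX_TEST_AGE_DAYS = 180  # assumed threshold, pick one that fits the business


def findings(system, today):
    """Return a list of problems found for one system's backup arrangement."""
    issues = []
    if system["backup_store"] == system["primary_store"]:
        issues.append("backup shares storage with the original")
    tested = system["last_restore_test"]
    if tested is None:
        issues.append("restore never tested")
    elif (today - tested).days > MAX_TEST_AGE_DAYS:
        issues.append("restore test is stale")
    return issues


today = date(2012, 3, 9)
for system in SYSTEMS:
    problems = findings(system, today)
    print(system["name"], "-", "; ".join(problems) if problems else "ok")
```

The remaining questions, who performs the restoration and whether a plan for losing a host exists, are about people and procedure and cannot be answered by a script.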

If even one of these questions has no clear answer, backup exists on paper rather than as a working procedure. Virtualisation did not create this problem, but it made it sharper.
