Disaster recovery: Five things to consider if you’re relying on virtual replication
Fri 24 Oct 2014
Jules Taplin breaks down the five key points to observe when you’re relying on replication to recover from a data disaster
In the virtualised world, an attractive option for disaster recovery is replication. Your virtual machines are replicated to another virtual environment, so should you need them, you just start them up and they work.
The benefits are that recovery is just a click of a button away: your replicated VMs only need to be booted. Recovery time is therefore minutes, not hours, because the data is already in a usable format. You can achieve good recovery point objectives (RPOs), and it’s a scalable solution. Replication is increasingly being used where synchronous mirroring is not viable because of distance. Sounds great? Well, it’s supposed to, but let’s delve a little deeper and uncover the limitations of virtual replication.
You’re at risk if your live system is corrupt. Virtual replication offers a good solution provided your live system is working, but it only deals with a certain class of failure. If you think about what you really need to protect your IT system against, it’s often the risks you haven’t thought about. Consider what happens if your database is corrupt, your file system has been encrypted, or your architectures have been deleted from within. When you try to fail over, your replica will have the same problems, because the faults in your live system have been replicated along with everything else. What do you do then?
Reliability is compromised. Because your replicated VMs aren’t booted and tested regularly, virtual replication isn’t as reliable as other disaster recovery methods. Unless every replication is tested, you can’t be sure that your failover will work. The limitations outlined in the first point further undermine the certainty that your replicated system will work. Do you know when the ‘last known good’ copy of your data was taken?
Recovery times aren’t guaranteed. Because of the main limitation of virtual replication, a lack of reliability, recovery times cannot be guaranteed. If your replica system isn’t working, your only option is to recover from previous backups or replications and then test each recovery once it has been executed. That takes time: while you search for the best replication to recover from, the recovery of your IT system is prolonged and the return to full productivity is delayed.
Recovery is managed by you. In the event of an IT outage, you will be the one performing the recovery. One of the most stressful situations for your business, one that can mean its survival or failure, becomes the responsibility of one person. If you don’t experience regular IT failures, you won’t be rehearsed in the recovery process. Practice and testing have the biggest influence on how quickly your IT systems can be recovered, but with replication, the whole recovery process is not tested and rehearsed regularly enough. This leaves you at risk of a prolonged recovery while you work through the process on your own.
Cost. Finally, the cost of replication can be high. Compared with, say, modern Pre-recovery methods, which offer similar benefits and should be weighed up alongside it, virtual replication tends to lose its competitive advantage.
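To make the ‘last known good’ question concrete, here is a minimal Python sketch of how you might work out how much data you stand to lose when only tested copies can be trusted. It is an illustration under stated assumptions, not any product’s API: the `RestorePoint` record, its `boot_tested` and `passed` fields, and the sample timestamps are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class RestorePoint:
    """Hypothetical record of one replicated copy."""
    taken_at: datetime
    boot_tested: bool  # assumption: was this copy ever booted and checked?
    passed: bool       # assumption: did that check succeed?


def last_known_good(points: List[RestorePoint], before: datetime) -> Optional[RestorePoint]:
    """Return the newest copy taken at or before `before` that was tested and passed."""
    candidates = [
        p for p in points
        if p.boot_tested and p.passed and p.taken_at <= before
    ]
    return max(candidates, key=lambda p: p.taken_at, default=None)


def data_loss_window(points: List[RestorePoint], failure_time: datetime) -> Optional[timedelta]:
    """Time between the last known good copy and the failure, or None if no copy qualifies."""
    good = last_known_good(points, failure_time)
    return None if good is None else failure_time - good.taken_at


# Illustrative data only: two recent but untested replicas and one older tested copy.
failure = datetime(2014, 10, 24, 12, 0)
points = [
    RestorePoint(datetime(2014, 10, 24, 11, 45), boot_tested=False, passed=False),
    RestorePoint(datetime(2014, 10, 24, 11, 30), boot_tested=False, passed=False),
    RestorePoint(datetime(2014, 10, 23, 23, 0), boot_tested=True, passed=True),
]
window = data_loss_window(points, failure)
print(window)
```

In this made-up example the newest replica is only 15 minutes old, but the newest copy that has actually been proven to work is 13 hours old. The gap between those two numbers is the uncertainty that untested replication leaves you with.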
So if you’re thinking that replicating your virtual machines to separate data centres will solve all your IT availability problems, then you may need to think again. Take the time to consider what you want to protect yourself against, and you might find you need to look a bit further if certainty and reliability are important to you.
A company that experienced an internal threat found that some of its architectures had been deleted from the inside. It went through a successful recovery process, only for further deletions to follow. With virtual replication, the challenge would be working through historic copies to work out how to retain the most up-to-date data while ensuring you’re not recovering an affected replication. Without regular testing of each replication, it takes time to identify the best copy to recover. As it was, with Pre-recovery, this client had a safe copy already recovered to a separate virtual machine, so they had 100% certainty of recovery within just a few minutes, and at a cheaper price too.
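The cost of working back through untested historic copies can be sketched in a few lines of Python. This is a rough illustration only: the function names, the idea of labelling each copy by its age in hours, and the 20 minutes assumed per boot-and-check are all hypothetical figures chosen for the example, not measurements from the case above.

```python
from typing import Callable, List, Optional, Tuple


def find_clean_copy(
    copy_ages_newest_first: List[int],
    is_clean: Callable[[int], bool],
) -> Tuple[Optional[int], int]:
    """Test copies from newest to oldest until one is clean.

    Returns (age of the first clean copy or None, number of tests run).
    """
    for tests_run, age in enumerate(copy_ages_newest_first, start=1):
        if is_clean(age):
            return age, tests_run
    return None, len(copy_ages_newest_first)


def estimated_search_minutes(tests_run: int, minutes_per_test: int) -> int:
    """Total time spent booting and checking copies before recovery can even start."""
    return tests_run * minutes_per_test


# Illustrative scenario: copies taken 1, 2, 3, 4, 6 and 12 hours before the outage,
# with the malicious deletions having started 5 hours before it.
copy_ages = [1, 2, 3, 4, 6, 12]
found_age, tests_run = find_clean_copy(copy_ages, lambda age: age > 5)
minutes = estimated_search_minutes(tests_run, minutes_per_test=20)
print(found_age, tests_run, minutes)
```

Under these assumptions, four affected copies have to be booted and rejected before the fifth turns out to be clean, so 100 minutes pass before the real recovery begins. A copy that has already been recovered and tested removes that search entirely.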