7 things businesses get wrong about disaster recovery
Thu 23 Jun 2016
By Joseph George, Vice President, Product Management, Sungard Availability Services
Major data losses can cripple a business. According to the Disaster Recovery Preparedness Council, 73 percent of companies worldwide are failing in terms of disaster readiness. Reported losses from a disaster ranged from a few thousand dollars to millions of dollars, with nearly 20 percent indicating losses of more than $50,000. One recent outage could cost a major SaaS provider upwards of $20 million. An Aberdeen and TechTarget survey found that only six percent of medium-sized companies recover long-term from major data loss. While you can’t attribute those failures solely to data loss, it is striking that the other 94 percent never recover long-term from such a loss.
One reason is that too few businesses take a proactive approach to disaster recovery (DR). Too many simply sign up for DR and assume they’re protected. That couldn’t be further from the truth.
There are a number of misconceptions about DR, and only those who truly understand the elements of a successful recovery can get back up and running without missing a step.
Here are seven things businesses typically get wrong about DR, along with suggestions on how to address them.
1. They think recovery is simply about data and systems
This leads to an all-or-nothing approach in which organizations treat all applications as equally important. In reality, some apps are more important than others. Every minute that your customer-facing transaction processing application is down could cost you thousands of dollars. Your back-end HR applications, by contrast, could potentially be down for a week without major business impact.
Make your recovery more resilient and optimize your recovery dollars by tiering your applications in terms of business impact and setting recovery options based on where they fall in the spectrum. For example, high-availability applications may be architected to fail over to another live instance with all data replicated. At the other end of the spectrum, backups that restore data within a few days might be good enough for less business-critical applications.
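As a rough illustration of tiering, the sketch below sorts applications into recovery options by estimated downtime cost and tolerable downtime. The `App` fields, the dollar and hour thresholds, the middle tier, and the sample applications are all assumptions made up for this example; real tiers should come from your own business impact analysis.

```python
from dataclasses import dataclass


@dataclass
class App:
    name: str
    downtime_cost_per_hour: float    # estimated business cost, in dollars (assumed)
    tolerable_downtime_hours: float  # how long the business can run without it (assumed)


def recovery_tier(app: App) -> str:
    """Map an application to a recovery option based on business impact.

    The thresholds are illustrative assumptions, not recommendations.
    """
    if app.tolerable_downtime_hours < 1 or app.downtime_cost_per_hour >= 10_000:
        return "tier 1: fail over to a live replicated instance"
    if app.tolerable_downtime_hours <= 24:
        return "tier 2: restore from recent replicas or snapshots"
    return "tier 3: restore from backups within a few days"


# Hypothetical applications and figures.
apps = [
    App("customer transaction processing", 60_000, 0.25),
    App("internal reporting", 500, 12),
    App("back-end HR", 50, 168),
]

tiers = {app.name: recovery_tier(app) for app in apps}
for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

The point of writing the rule down, even this crudely, is that it forces an explicit decision per application instead of treating everything as equally critical.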
2. They don’t understand application dependencies
Tiering applications is a solid first step, but what many organizations may not know is how those applications map to their underlying systems and infrastructure. Without knowing that, it’s difficult to understand which systems and components you need to recover to bring a specific application back up.
Some large companies can’t even say for sure how many business applications they use, let alone how they map to the system. As a business evolves and IT becomes more critical, the applications and mappings become much more complex. If you don’t use the right tools to discover and map applications and their dependencies, you’ll have a false sense of security about your readiness for a disaster.
One company tested its DR and was able to recover 99 percent of its environment successfully. All its servers were restored. Then users went to log on and couldn’t. The problem was that the company had recovered everything except its Active Directory, which the applications needed in order to function. If you don’t have dependencies mapped, you may not be able to recover your business applications effectively.
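To make the idea of a dependency map concrete, here is a minimal sketch that lists everything one application needs and orders it so that dependencies are recovered first. The application names and the `dependencies` map are hypothetical; in practice the map would come from a discovery tool rather than being written by hand. It uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key depends on the items in its set.
dependencies = {
    "order portal": {"web server", "order database", "active directory"},
    "web server": {"network"},
    "order database": {"storage", "network"},
    "active directory": {"network"},
    "storage": set(),
    "network": set(),
}


def recovery_plan(app: str, deps: dict[str, set[str]]) -> list[str]:
    """Return everything `app` needs, ordered so dependencies come first."""
    # Walk the map to collect the app and all of its transitive dependencies.
    needed: set[str] = set()
    stack = [app]
    while stack:
        item = stack.pop()
        if item not in needed:
            needed.add(item)
            stack.extend(deps.get(item, set()))
    # Order only the needed items; static_order() yields dependencies first.
    subgraph = {item: deps.get(item, set()) & needed for item in needed}
    return list(TopologicalSorter(subgraph).static_order())


plan = recovery_plan("order portal", dependencies)
print(plan)
```

With a map like this, a component such as the directory service shows up in the plan for every application that relies on it, so it is much harder to overlook.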
3. They don’t protect data properly
The importance of backing up data is highlighted by the case of a financial institution that diligently backed up its data on tapes. When they went to recover that data, however, they discovered half their tapes were blank. There was an issue with their backup system and data wasn’t being added to the tapes. Luckily this was discovered during a test and not an actual disaster, but the lesson is clear.
We see similar situations all the time. For businesses that manage their own data protection and backups, backup issues are, in our experience, the cause of recovery failures 40 percent of the time. We can all identify with the importance of protecting data. Think about it – how much data on your laptop would you lose if it crashed right now? When’s the last time you backed it up?
Similarly, how much data could your business afford to lose in the event of a disaster? None at all? 10 minutes’ worth? Two days? The answers will determine what technology you need. Going back to the first point, you need to identify your business-critical applications. For critical information like customer transactions, losing even a small amount of data is not an option. Also, identify where your backups are actually being stored and whether they’re safe. If your backups sit in a data center right across the street, that is sufficient in the event of a hardware failure, but not in the case of a hurricane.
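The question of how much data you can afford to lose (often called the recovery point objective) reduces to simple arithmetic: anything written after the last good backup is gone. The sketch below works through one case; the timestamps, the nightly schedule, and the tolerances are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def data_loss_window(last_good_backup: datetime, failure_time: datetime) -> timedelta:
    """Data written after the last good backup is what a restore would lose."""
    return failure_time - last_good_backup


def meets_tolerance(last_good_backup: datetime, failure_time: datetime,
                    tolerable_loss: timedelta) -> bool:
    """True if the data lost stays within what the business says it can afford."""
    return data_loss_window(last_good_backup, failure_time) <= tolerable_loss


# Hypothetical scenario: nightly backup at 01:00, failure at 14:00 the same day.
failure = datetime(2016, 6, 23, 14, 0)
nightly_backup = datetime(2016, 6, 23, 1, 0)

window = data_loss_window(nightly_backup, failure)
print(window)  # 13:00:00

# An application that can tolerate two days of loss is fine with nightly backups.
ok_for_hr = meets_tolerance(nightly_backup, failure, timedelta(days=2))
# One that can tolerate only 10 minutes is not, and needs something like replication.
ok_for_transactions = meets_tolerance(nightly_backup, failure, timedelta(minutes=10))
print(ok_for_hr, ok_for_transactions)  # True False
```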
4. They think their backups alone will protect them
If your laptop is fully backed up and you lose it, you can get a new one, restore your data and be back up and running in a couple of hours (did I mention, assuming your data was fully and recently backed up?). But across an enterprise, it’s a different story – there are different kinds of compute infrastructure. In a major disaster, even if you have done a good job protecting all of your data, you also need the infrastructure and procedures to put it all back together.
Some companies plan to recover to their test and development infrastructure in case of a disaster. On the surface, that seems to make sense. But how old is that equipment? Has it been updated to keep up with changes in your production? Would your production environment run effectively on test equipment post-recovery?
5. They forget about people, process, and governance
Many companies see DR as a technology issue – run these procedures and you’ll always get the same outcome. But that’s not always true.
Do you have your process defined? Does the team responsible for the recovery have the right experience? Is recovery almost second nature to them or is it something they’ll be doing for the first time?
Everyone needs to know their role. That’s the only way a recovery will be successful. Beyond having the right people and experience, you’ll need people who are going to be available in the wake of a disaster. During hurricanes, for example, you can have the data, procedures, experience, and all the other ingredients, but if your employees are not available or cannot get access to the recovery data center, none of that matters.
6. They don’t test
The only way to know for sure that you’re DR-ready is to test. Yet it’s something most businesses ignore. You may have the right components and the right people with the right experience, but if your last test was three years ago, you might not be ready now.
There are two aspects to testing. The first is frequency. Some businesses feel like they’re protected because they have the equipment and they’re confident they can figure it out when a disaster strikes. At that point, it’s too late. The gaps and issues that a test would have revealed can’t be fixed after a disaster strikes. Organizations that are ready for a disaster typically test twice a year, and more frequently for mission-critical applications.
The second aspect is that your system is always changing. The rate of change has increased, especially over the last few years, so your recovery processes have to keep up with those changes and be tested regularly. Your DR readiness is only as good as your last test.
But don’t cheat. Too often, testing is driven by compliance and may be preceded by a lot of preparation, shortcuts, or even reduced scope in order to pass the test. The test is successful and everyone is happy, but will you really know you are ready to face a disaster if you treated the test differently?
7. They don’t understand the different types of risk
In the past, “disaster” typically meant a hurricane or fire. Today there are infrastructure failures, security breaches, malware, and other threats that are no different in terms of potential to cause data loss and application downtime. This means that disasters can occur at any time in a variety of ways.
If hackers access your system and data, that’s bad enough, but if they start propagating malware or viruses and deleting or corrupting your data, that’s as bad as, if not worse than, a hurricane. You have no warning, and sometimes you might not even know about it until it’s too late.
In that scenario, is your data safe offsite somewhere? Is it protected from a network perspective, so hackers can’t access the DR copy of your data? Do you have multiple copies so that you can access an uncorrupted copy from before the breach? The nature of disasters has changed, and familiarizing yourself with the first six points will help you prepare for whatever crosses your path.
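Keeping multiple copies only helps if you can pick the right one. As a minimal sketch, the function below chooses the newest backup taken before a breach began. The function name, the copy timestamps, and the breach time are hypothetical; note that a breach often starts well before it is detected, so `breach_time` should be the earliest time corruption is suspected, not the time it was noticed.

```python
from datetime import datetime
from typing import Optional


def latest_copy_before(copies: list[datetime], breach_time: datetime) -> Optional[datetime]:
    """Return the newest backup taken strictly before the breach, if any."""
    candidates = [taken_at for taken_at in copies if taken_at < breach_time]
    return max(candidates) if candidates else None


# Hypothetical nightly copies and a breach that began on the morning of June 20.
copies = [
    datetime(2016, 6, 18, 1, 0),
    datetime(2016, 6, 19, 1, 0),
    datetime(2016, 6, 20, 1, 0),
    datetime(2016, 6, 21, 1, 0),  # taken after the breach, possibly corrupted
]
breach_began = datetime(2016, 6, 20, 9, 0)

restore_point = latest_copy_before(copies, breach_began)
print(restore_point)
```

If the function returns `None`, every retained copy postdates the breach, which is exactly the situation a longer retention period is meant to prevent.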
Are you ready for a disaster?
Take a close look at your DR plan. Have you defined your most critical applications and how they map to your infrastructure? Do you have the right backups in place for the right data, and the procedures for recovering it? Are your people, process, and governance all experienced, available, and ready to act? Do you test frequently to uncover any weaknesses in your plan? Are you prepared for any kind of disaster, whether a storm, a hack, or equipment failure?
If not, now is the time to get started.