The need for a holistic approach to data centre fire suppression
Mon 20 Nov 2017
Preventing a fire in a data centre, and understanding how to deal with the situation if it does happen, is one of the most critical aspects of data centre operation.
Studies into why data centres fail have revealed that the number one cause is, unsurprisingly, IT-related disasters, followed by power problems. Coming in at number three is fire.
Yet compared with the time and money invested in the first two issues, fire suppression gets little consideration. That is concerning: once IT is excluded and only facilities issues are taken into account, fire events cause the greatest amount of downtime, with an average recovery time of 25 hours after a blaze.
Beyond fires themselves, maintenance of the fire system brings considerable challenges of its own. Power turned off by mistake by the fire detection system, or gas accidentally released during maintenance, can each cause dramatic problems for the data centre.
Taking ownership of fire suppression
According to Barry Elliott, founder of independent data centre consultancy Capitoline, the problem that is encountered time and again is the failure of management to tie together all the cause and effect issues related to fire and facilities management. The majority of data centre builders and owners do not go nearly far enough in terms of their knowledge and appreciation of fire suppression systems.
‘Most data centre builders and owners buy a smoke detection and fire suppression system from a supplier and think that’s it,’ says Elliott. A good way to assess whether a data centre is well managed in terms of fire suppression policies is to ask those in charge the exact next step after smoke is detected.
The usual reply is simply to defer to the fire company, which is presumed to have a sensible policy. Too often, though, data centre managers don’t know what that policy is.
According to Elliott, the reality is that few operators understand the actual fire cause and effect method they have in place in their data centre. For instance, following the detection of smoke, do they know if the power and HVAC systems are turned off, and what signals are sent to the building management and fire alarm system?
After the event, there are further questions that managers often cannot answer. It is crucial they understand how to control the ventilation system in order to purge the room of gas and smoke following a fire. Beyond these basic safety issues, managers need to know what to do in the aftermath in order to return to full operation.
This includes understanding the reset methodology for all equipment that has been turned off, including how to reset the fire detection system. An apparently basic piece of information that could later save lives is knowing who orders replacement gas, and how details of the event are fed into the site incident management system.
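One way to make these aftermath questions concrete is to write the recovery steps down with a named owner for each. The following is a minimal sketch only: the steps are taken from the points above, while the `RecoveryStep` class, the `unowned_steps` helper and the owner names are hypothetical and would be replaced by whatever a real site uses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecoveryStep:
    """One post-fire recovery action and who is responsible for it."""
    description: str
    owner: Optional[str] = None  # None means nobody has been assigned yet


# Illustrative plan only: the owners here are invented placeholders.
RECOVERY_PLAN = [
    RecoveryStep("Purge the room of gas and smoke via the ventilation system",
                 owner="facilities duty manager"),
    RecoveryStep("Reset all equipment that was turned off",
                 owner="facilities duty manager"),
    RecoveryStep("Reset the fire detection system",
                 owner="fire system contractor"),
    RecoveryStep("Order replacement suppression gas"),
    RecoveryStep("Record the event in the site incident management system"),
]


def unowned_steps(plan: List[RecoveryStep]) -> List[str]:
    """Return the descriptions of steps that have no named owner."""
    return [step.description for step in plan if not step.owner]


if __name__ == "__main__":
    for description in unowned_steps(RECOVERY_PLAN):
        print(f"No owner assigned: {description}")
```

Run against this example plan, the check flags the gas order and the incident record as unowned, which is exactly the kind of gap Elliott describes.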
The basics of data centre fire engineering
The core of fire engineering in a data centre is risk assessment. This assessment sets out the right location for smoke detectors and gas nozzles, and the correct selection of the fire suppressant for the environment.
More fundamental still is ensuring that the building can withstand the overpressure event of a gas release. It is often the case that enclosed hot and cold aisle structures are incorporated with little consideration of how the fire suppression gas will actually get inside them.
After the risk assessment has been completed, the ‘cause and effect’ algorithm comes into play. This algorithm is based on the earlier questions that data centre managers should be able to answer. In effect, it defines the relationship between the detection of smoke and the power, heating, ventilation and air conditioning systems.
Finally, there should be an effective recovery plan so that fire incidents can be resolved in a few hours, as opposed to the previously quoted 25 hours. Getting these basics right can be worth a great deal of money, since any downtime carries significant financial implications.
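A cause and effect algorithm of this kind is often easiest to reason about as a lookup from each detection event to the actions it triggers. The sketch below is illustrative only and is not a recommended policy: the cause names, the `Effect` class, the `effects_for` function and every action listed are assumptions made for the example, and a real matrix would come from the site's own risk assessment and fire system supplier.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Effect:
    """A single documented response: which system, and what it does."""
    system: str
    action: str


# Hypothetical matrix: causes, systems and actions are invented for illustration.
CAUSE_AND_EFFECT: Dict[str, List[Effect]] = {
    "smoke_single_detector": [
        Effect("fire_alarm_system", "raise pre-alarm"),
        Effect("building_management_system", "log event and notify operators"),
    ],
    "smoke_two_detectors": [
        Effect("fire_alarm_system", "raise full alarm"),
        Effect("hvac", "shut down"),
        Effect("power", "shut down affected room"),
        Effect("suppression", "release gas"),
        Effect("building_management_system", "log event and notify operators"),
    ],
}


def effects_for(cause: str) -> List[Effect]:
    """Return the documented effects for a cause, or fail loudly if none exist."""
    try:
        return CAUSE_AND_EFFECT[cause]
    except KeyError:
        raise ValueError(f"No documented response for cause: {cause!r}") from None


if __name__ == "__main__":
    for effect in effects_for("smoke_two_detectors"):
        print(f"{effect.system}: {effect.action}")
```

Raising an error for an undocumented cause is a deliberate choice here: it mirrors the article's point that managers should be able to state the exact next step after smoke is detected, rather than assume someone else has a policy.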
A sensible fire suppression policy then, argues Elliott, should be a holistic one. It is the responsibility of data centre management to tie together all aspects of fire safety engineering and move away from an attitude of reliance on a few simple smoke detectors.
This means, of course, increased levels of involvement from the experts. Only through better products and better understanding will data centre operators be able to improve their fire suppression policies, and the right people to provide those things are those who have made a specialism out of stopping fires in data centres.