Combining IoT, gaming technology and cloud for effective real-time data centre monitoring
Fri 30 Sep 2016

Dr. Stu Redshaw, CTO at critical facilities software and sensors company EkkoSense, proposes a winning combination for improving real-time monitoring and risk management in the data centre…
We have witnessed significant improvements in data centre risk management over the last few years. Data centres today are critical IT and business assets, and interruptions to service are increasingly hard to tolerate, particularly with most centres now functioning as the heart of the extended enterprise. Downtime quite simply costs too much, so what can operators actually do to reduce their risk exposure?
While the average utilisation of cooling equipment remains very low, thermal problems such as IT equipment overheating are still the root cause of around 29% of unplanned data centre outages. This is unsustainable from both an economic and an environmental perspective. Perhaps the best way for data centre managers to manage this risk is to make sure that the right levels of expensive cold air are applied exactly where they need to be – keeping critical IT systems at the right temperature – and not anywhere else.
There are lots of methods available for monitoring thermal efficiency and performance – and in turn reducing risk to our data centres. Until recently, though, the industry relied primarily on historical data reports to manage thermal risk levels – an approach that requires in-depth post-event analysis and fails to address issues as they emerge.
By the time a risk or inefficiency from over- or under-cooling is finally identified, it’s fair to say that data centre teams are probably acting far too late – particularly given the time taken to generate and analyse these reports. With luck, the only real casualty of over- or under-cooling is increased energy costs. However, things become much more serious if the thermal issue has caused a full outage – perhaps even immobilising a critical part of your business.
It’s obvious that the quicker we receive the information, the faster we can make those adjustments. Speed of access to this information has to be key to a solution. With the technology available today, why are we settling for anything other than real-time reporting?
Combining three key technologies for real-time responsiveness
By combining Internet of Things (IoT) expertise, the latest 3D visualisation gaming technology and cloud-level responsiveness, real-time monitoring is now possible, and early deployments are already up and running with impressive business benefits. Should we now be considering real-time monitoring as the de facto early warning standard for thermal optimisation?
A more structured, end-to-end and proactive approach to the cooling challenge seems like a constructive development: one that works directly to reduce thermal risk in real-time, while at the same time helping organisations to unlock significant energy savings and release further data centre capacity.
The latest approaches I’ve seen in action use IoT technology that is very quick to install, provides immediate readings and can be fitted to any cooling unit in minutes, allowing operators to monitor thermal instabilities right across the data centre.
It’s perhaps surprising how long our industry has tolerated simplistic tick-box data centre performance monitoring, particularly as we’re increasingly tasked with optimising the performance of some of the most business-critical data resources in the world. That’s why I’m increasingly of the view that the creation of real-time 3D representations of how data centres are performing in terms of hot and cold spots should now be part of our industry best practice.
This kind of true 3D approach enables data centre managers to quickly recognise any potential problem areas – and then drill down, to an accuracy of a square metre, to highlight specific instances of over- or under-cooling.
These potentially business-critical issues can now be identified quickly and easily, enabling remedial action to be taken in near real-time. Having access to this level of insight will not only reduce data centre thermal risk significantly but also deliver business benefits in terms of better control of energy bills and increased capacity.
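As a rough illustration of what drilling down to individual square metres could look like in software, here is a minimal Python sketch. It is not taken from any particular product: the function names, the grid-cell layout and the temperature limits are all assumptions made for the example (the 18–27 °C band is the commonly cited ASHRAE recommended inlet range, used here only as a placeholder).

```python
from typing import Dict, Tuple

# Hypothetical thresholds, not from the article: the 18-27 °C band is
# used purely as an illustrative "acceptable" range.
COLD_LIMIT_C = 18.0
HOT_LIMIT_C = 27.0

# (x, y) index of a one-square-metre cell on the data centre floor (assumed layout).
Cell = Tuple[int, int]


def classify_cell(temp_c: float) -> str:
    """Label a single cell reading as under-cooled, over-cooled or ok."""
    if temp_c > HOT_LIMIT_C:
        return "under-cooled"  # hot spot: not enough cold air reaching this cell
    if temp_c < COLD_LIMIT_C:
        return "over-cooled"   # cold spot: cooling energy is being wasted here
    return "ok"


def find_problem_cells(readings: Dict[Cell, float]) -> Dict[Cell, str]:
    """Return only the cells whose latest reading falls outside the band."""
    problems: Dict[Cell, str] = {}
    for cell, temp_c in readings.items():
        status = classify_cell(temp_c)
        if status != "ok":
            problems[cell] = status
    return problems


if __name__ == "__main__":
    # Made-up readings for three cells, in degrees Celsius.
    latest = {(0, 0): 22.5, (0, 1): 29.1, (1, 0): 16.0}
    for cell, status in find_problem_cells(latest).items():
        print(cell, status)
```

In a live deployment the readings would be refreshed continuously from the sensors rather than hard-coded, and the acceptable band would be set per site.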
Turning complex real-time data into actionable information
The introduction of 3D software and an IoT sensor-based strategy to manage the real-time state of a data centre gives operators a holistic view of both the physical and the thermal dynamics of their environment. For the first time, data centre operators and owners can visualise the cold and hot spots in their own critical infrastructure in true real-time 3D.
This innovative approach clearly shows where organisations are under-cooling or over-cooling right now – not yesterday, last week or last month. Being able to identify hot and cold spots in real-time also means that you can move quickly to resolve issues – not just reducing thermal risk, but also ensuring a direct bottom-line benefit through energy cost savings for the business.
This kind of real-time thermal visualisation is set to provide a step-change improvement in data centre risk management. I’ve been talking about this approach in the wider public domain for a few months now, and what’s clear is that there is real enthusiasm for the ability to visualise data centre risk in real-time 3D – in complete contrast to traditional monitoring approaches. Early deployments of this 3D real-time model across a number of consultancy engagements have already delivered impressive results. One leading European data centre alone has achieved a 32% reduction in cooling energy! Watch this space.