The Stack Archive

Turning Big Data into Big Knowledge

Mon 7 Dec 2015

Jeff Aaron, PernixData’s Vice President of marketing, looks at turning Big Data into Big Knowledge with infrastructure analytics…

In the past, data centre management was fairly straightforward, as storage was directly mapped to individual applications.

This all changed with virtualisation.

Now, VM admins use one set of tools to manage compute, while storage admins use others to manage the underlying infrastructure. These tools provide visibility into different aspects of the data centre, but they offer little correlation or cohesiveness – especially in heterogeneous hardware environments.  As a result, strategic data centre planning has been replaced by reactive decision-making, wasting time and money.

For the first time ever, there is a solution.  You can now collect, analyse and use enormous amounts of infrastructure data to make intelligent storage design decisions that are hardware agnostic – a concept known as Infrastructure Analytics.

With Infrastructure Analytics, you can make data-driven decisions as you design, deploy and operate storage. This enables you to constantly meet the changing needs of the business, and to adopt a proactive and strategic approach to the data centre. The result is lower CAPEX due to better investment decisions, and improved OPEX due to faster troubleshooting, fewer outages, and improved IT operations.

Beyond VM monitoring and reporting

There are a myriad of monitoring and troubleshooting tools in the market, the majority of which fall into two buckets.

The first category is focused on anomaly detection. These tools require you to input thresholds for metrics (e.g. latency) that generate alerts when crossed.  While these tools provide a breadth of information, they lack much depth in any specific area.
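As a rough illustration of this first category, the sketch below applies a fixed, user-supplied latency threshold to a handful of samples. The `Sample` type, the VM names and the 10 ms limit are hypothetical and not taken from any particular product; note that the alert says which sample crossed the line but nothing about why.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    vm: str            # hypothetical VM name
    latency_ms: float  # observed I/O latency in milliseconds


def threshold_alerts(samples, limit_ms):
    """Return the samples whose latency exceeds a fixed, user-supplied limit."""
    return [s for s in samples if s.latency_ms > limit_ms]


# Made-up sample data for illustration only.
samples = [Sample("vm-a", 2.1), Sample("vm-b", 14.8), Sample("vm-a", 3.0)]
alerts = threshold_alerts(samples, limit_ms=10.0)
```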

The second category is focused on reactive steps, such as migrating VMs to alleviate hot spots. These tools usually rely on third-party information, such as that obtained from VMware vCenter, which can make them hard to configure. Moreover, they seldom educate the end user on what the actual problem might be. Instead, they move VMs around in the hope of fixing the problem, often just hiding it.

These existing tools are largely reactive, lean heavily on third-party data, and seldom show the end user what is going on in real time. We are in 2015 – the era of Big Data. We can do so much better!

Beyond Storage Deployment and Troubleshooting

Similarly, existing storage management solutions don’t cut it in today’s dynamic data centre environment.  While they provide interesting insight into the behaviour of a specific product, there are a few key things to remember.

For starters, they are element management systems, providing visibility into only a small portion of the overall infrastructure. For example, EMC’s Unisphere product provides visibility into VMAX environments, EMC’s Symmetrix Management Console is used for Symmetrix platforms, InfoSight is used for Nimble arrays, etc. There is no visibility into other products, and little correlation between products, limiting their usefulness in heterogeneous environments.

In addition, these element management systems are limited in scope. They are good for deployment and configuration, but once the array is up and running they do little more than monitor performance, i.e. raise alarms when established thresholds are crossed. This is very reactive and has limited value in the grand scheme of storage management and operations.

Finally, storage management systems are not context aware. Once storage I/O leaves a host, it loses its intelligence, so storage arrays have little insight into the VMs and their specific workload characteristics.

Key requirements for Infrastructure Analytics

To proactively design and manage storage, you need to collect the right data, in the right location, and apply the right intelligence to make quick and meaningful conclusions.  To that end, below are key requirements of an infrastructure analytics solution.

Location is everything. Data needs to be collected inside the hypervisor, which is the only spot where all traffic between VMs and storage can be viewed. In other words, the hypervisor knows exactly which I/O belongs to which virtual machine, and it has insight into how storage systems handle that I/O. Being inside the hypervisor also enables the management solution to be hardware agnostic, working with any storage array.

Turn Big Data into Big Knowledge. A true infrastructure analytics solution must be able to collect large volumes of data and correlate it with information from third-party tools (e.g. vCenter). More intelligence can be added simply by scaling out compute resources, e.g. adding more server media to collect more data.

Descriptive, predictive & prescriptive analytics. The first step is to know exactly what is going on in your network, and to display this data in an easy-to-understand fashion. Then, a true infrastructure analytics tool can take this data and predict future trends. With the proper recommendations engine, this data can also be used to make actual design recommendations, whereby your storage infrastructure can be constantly tuned for maximum performance and value.
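To make the step from descriptive to predictive analytics concrete, here is a minimal sketch that fits a straight line to a short series of observations and extrapolates it. It is an assumption-laden illustration, not how any specific product forecasts: the function names, the ordinary least-squares fit and the sample figures are all hypothetical, and a real tool would likely use a richer model.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(values, steps_ahead):
    """Extrapolate the fitted line steps_ahead intervals past the last observation."""
    slope, intercept = linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


# Made-up daily capacity readings in TB, for illustration only.
used_tb = [10.0, 12.0, 14.0, 16.0]
projected_tb = forecast(used_tb, steps_ahead=2)
```

A prescriptive layer would then compare a projection like `projected_tb` against available capacity or a performance target and suggest a design change before the limit is reached.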

Optimised user experience.  Most management tools don’t suffer from a lack of data.  Instead, they drown users in information.  For these tools to be useful, they must present data in a clear and concise way, with accurate recommendations on next steps.

Unleash the Power

With infrastructure analytics, you get detailed visibility into constantly changing workload behaviour, including the amount of data being used, read/write mix, block sizes, and more.
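A minimal sketch of what such a per-VM workload summary might look like, assuming each I/O can be tagged with the VM that issued it (as is possible inside the hypervisor). The `IoRecord` type, its field names and the sample records are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class IoRecord:
    vm: str          # hypothetical VM name that issued the I/O
    op: str          # "read" or "write"
    size_bytes: int  # block size of the request


def fingerprint(records):
    """Summarise read/write mix, average block size and data volume per VM."""
    stats = defaultdict(lambda: {"reads": 0, "writes": 0, "bytes": 0})
    for r in records:
        s = stats[r.vm]
        s["reads" if r.op == "read" else "writes"] += 1
        s["bytes"] += r.size_bytes
    summary = {}
    for vm, s in stats.items():
        ops = s["reads"] + s["writes"]
        summary[vm] = {
            "read_fraction": s["reads"] / ops,
            "avg_block_bytes": s["bytes"] / ops,
            "total_bytes": s["bytes"],
        }
    return summary


# Made-up I/O records for illustration only.
records = [
    IoRecord("vm-a", "read", 4096),
    IoRecord("vm-a", "read", 4096),
    IoRecord("vm-a", "write", 8192),
    IoRecord("vm-b", "write", 65536),
]
profile = fingerprint(records)
```

Tracked over time, summaries like `profile` give the kind of baseline against which trend analysis can be run.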

This is extremely powerful, enabling strategic functions such as maximising the performance of existing workloads; optimising storage for future application rollouts and upgrades; avoiding storage overprovisioning costs by accurately sizing storage for performance; streamlining troubleshooting by detecting VM performance problems in real time; avoiding VM performance issues with proactive design recommendations; and establishing a fingerprint of your system for accurate trend analysis.

Say goodbye to tactical reactive tools and hello to a holistic learning platform for strategic data centre management. With infrastructure analytics, you can design, deploy, operate and optimise data centres like never before.

