How the cloud enables IoT data to become the beating heart of your business

Mon 9 Sep 2019 | Simon Field

To cope with the influx of IoT data, organisations should turn to reactive and responsive big data infrastructures powered by the cloud, writes Snowflake’s Simon Field.

The very notion of the data lake has evolved significantly in the last decade, driven largely by innovations in cloud technologies. A decade ago, Hadoop was king of the nascent ‘Big Data’ trend, and the challenges and complexities of deriving value from huge data sets were not well understood. Fortunately, the cloud has since produced better, faster and more secure ways to make data-driven decisions, and data lakes have been integral to them.

However, a growing challenge has emerged: the amount of data organisations must now cope with. In particular, the surge in popularity of IoT is fast outgrowing the ability of organisations to accurately analyse all the data being gathered. And all signs show that the growth in data from IoT devices and sensors entering data lakes is not abating. IDC estimates that there will be 41.6 billion connected IoT devices, or “things,” generating 79.4 zettabytes (ZB) of data in 2025.
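To put the IDC figures in perspective, the short calculation below works out the average data volume per device. It is only a back-of-the-envelope sketch: it assumes 1 ZB = 10^21 bytes and a 365-day year, and a simple average hides large differences between device types.

```python
# Figures quoted from the IDC estimate in the text.
DEVICES = 41.6e9        # connected IoT devices in 2025
TOTAL_BYTES = 79.4e21   # 79.4 ZB, assuming 1 ZB = 10**21 bytes


def average_bytes_per_device(total_bytes: float, devices: float) -> float:
    """Return the mean number of bytes generated per device."""
    return total_bytes / devices


per_device = average_bytes_per_device(TOTAL_BYTES, DEVICES)
per_device_tb = per_device / 1e12               # terabytes per device per year
per_device_gb_per_day = per_device / 1e9 / 365  # gigabytes per device per day

print(f"{per_device_tb:.2f} TB per device per year")
print(f"{per_device_gb_per_day:.2f} GB per device per day")
```

On these assumptions the average comes to roughly 1.9 TB per device over the year, or a little over 5 GB per device per day.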

Analysing data is central to IoT technologies. Without the ability to accurately analyse IoT data, organisations will be unable to harness the true potential of IoT technologies and discover new opportunities for transformative business growth. Organisations must find the time to understand and analyse the data to build more efficient technology models to cope with these modern data demands.

The data conundrum

IoT is still relatively new, and the data it produces will require a deeper understanding from organisations. IoT data comes from the sensors within devices and is continuous in nature, resulting in a constant stream of real-time data, which arrives faster thanks to advances in edge computing. Everything from smart city technologies to wearables will multiply the number of data touch points, driving even larger data volumes. This is only set to increase as IoT technologies become more readily available.

To cope with this influx of data, organisations will need to adopt reactive and responsive big data infrastructures, powered by the cloud. Organisations can no longer rely on legacy, on-premises big data platforms, which are ill-equipped to cope with vast volumes of data and huge variability in workload demands.

Additionally, data lake initiatives have historically failed, in part, because of immense complexity, a lack of security and governance, and a proliferation of data silos. Even when they don’t fail completely, there are additional trade-offs in storage capacity, scaling and expense.

The business case for using cloud-native big data platforms falls in line with organisations’ wider digital transformation strategies, of which the increase in IoT technologies also forms a large part. By capitalising on this, business intelligence (BI) and IT teams will benefit from increased business agility, improved customer experiences and the discovery of new business opportunities.

The cloud is also helping data gathering become more cost-effective, both in terms of storage and compute. From a business perspective this is of course ideal, but a key challenge that remains is the resource and time required to analyse these vast troves of data. Businesses are still transitioning to an IoT world, and BI and IT teams will therefore need an adjustment period to define IoT data and put processes in place to streamline data capture and analysis.

Data lake or data warehouse?

When considering cloud-powered data platforms more broadly, and how to equip organisations to make better use of IoT data, both data lakes and data warehouses can be used. While these terms are often grouped together, each has different qualities that can better suit individual business demands.

Data lakes are better for organisations that prefer to freely accumulate raw data from a variety of sources, without specifically defining which data should be ingested. Data lakes can be a useful storage solution for these models, as patterns can be triggered as new models are injected.

For example, if a piece of equipment is about to fail, the data lake can potentially identify and pinpoint the exact location from raw data, as opposed to saying there is a general model fault. For those who have more of a focus on what they want to see from data and more defined business goals, data warehouses are valuable in this respect.
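The equipment-failure example can be illustrated with a small sketch that works directly on raw readings and reports which device, at which location, looks abnormal. Everything here is hypothetical: the device names, locations, vibration values and the three-standard-deviation threshold are assumptions for illustration, not a description of any particular platform.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical raw readings as they might land in a data lake:
# (device_id, location, vibration in mm/s), oldest first.
RAW_READINGS = [
    ("pump-01", "plant-a/line-1", 2.1), ("pump-02", "plant-b/line-3", 2.0),
    ("pump-01", "plant-a/line-1", 2.0), ("pump-02", "plant-b/line-3", 2.1),
    ("pump-01", "plant-a/line-1", 2.2), ("pump-02", "plant-b/line-3", 1.9),
    ("pump-01", "plant-a/line-1", 2.1), ("pump-02", "plant-b/line-3", 2.0),
    ("pump-01", "plant-a/line-1", 2.0), ("pump-02", "plant-b/line-3", 2.1),
    ("pump-01", "plant-a/line-1", 2.1), ("pump-02", "plant-b/line-3", 4.8),
]


def find_suspect_devices(readings, threshold=3.0, min_history=5):
    """Flag devices whose latest reading deviates strongly from their own baseline."""
    history = defaultdict(list)
    location = {}
    for device_id, loc, value in readings:
        history[device_id].append(value)
        location[device_id] = loc

    suspects = []
    for device_id, values in history.items():
        *baseline, latest = values
        if len(baseline) < min_history:
            continue  # not enough history to judge this device
        spread = stdev(baseline)
        if spread == 0:
            continue  # a perfectly flat baseline gives no usable z-score
        z_score = (latest - mean(baseline)) / spread
        if abs(z_score) > threshold:
            suspects.append((device_id, location[device_id], round(z_score, 1)))
    return suspects


suspects = find_suspect_devices(RAW_READINGS)
for device_id, loc, z_score in suspects:
    print(f"{device_id} at {loc} looks abnormal (z-score {z_score})")
```

Because the raw, per-device readings are retained, the check can name the specific pump and its location rather than reporting only that something in the fleet is wrong.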

But a trend becoming more popular is using a cloud data warehouse as the data lake or even data ‘ocean’. Depending on the data warehouse, the benefits could extend to ingesting all of the raw data in a single location, bypassing intermediate technologies, whilst achieving low-latency relational analytics, and obtaining virtually unlimited, multi-workgroup concurrency scaling.

With IoT data being ingested so rapidly, these cloud-powered platforms can ingest and analyse datasets in near real time, yielding immediate insights and enabling action. This is helping to alleviate the pressure on BI and IT teams of gathering this data themselves.

Automation efficiency

With your data platform chosen and IoT data actively being collected in real time, BI and IT teams will benefit from adding a layer of automation through machine learning models.

As more and more data continues to enter data lakes or data warehouses, machine learning models are able to sift through data for complex patterns and derive valuable insights for the organisation based on their business needs. These models can be monitored, tested and modified to better meet business objectives as patterns in the data change over time through positive feedback and enhancements to upstream processes and applications.

On top of this, many analytics are batch driven and create reports on a wider scale, giving you updates on the entire device network. In the future, automation will help reports become more specific, telling the analyst automatically what within the report requires their attention.
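As a rough illustration of a report that tells the analyst what needs attention, the sketch below builds a network-wide batch summary and attaches a short list of flagged devices. The `DeviceSummary` fields, the sample values and both thresholds are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class DeviceSummary:
    """Hypothetical per-device totals produced by a batch job."""
    device_id: str
    messages: int
    errors: int
    hours_since_last_message: float


def build_report(summaries, max_error_rate=0.05, max_silence_hours=24.0):
    """Summarise the whole network and list devices that need a closer look."""
    attention = []
    for s in summaries:
        reasons = []
        error_rate = s.errors / s.messages if s.messages else 0.0
        if error_rate > max_error_rate:
            reasons.append(f"error rate {error_rate:.0%}")
        if s.hours_since_last_message > max_silence_hours:
            reasons.append(f"silent for {s.hours_since_last_message:.0f}h")
        if reasons:
            attention.append((s.device_id, reasons))
    return {
        "devices": len(summaries),
        "messages": sum(s.messages for s in summaries),
        "attention": attention,
    }


report = build_report([
    DeviceSummary("sensor-a", messages=1000, errors=10, hours_since_last_message=0.5),
    DeviceSummary("sensor-b", messages=800, errors=120, hours_since_last_message=1.0),
    DeviceSummary("sensor-c", messages=0, errors=0, hours_since_last_message=72.0),
])

print(f"{report['devices']} devices, {report['messages']} messages")
for device_id, reasons in report["attention"]:
    print(f"Needs attention: {device_id} ({', '.join(reasons)})")
```

The network-wide totals still appear, but the analyst is pointed straight at the two devices that breach a threshold instead of having to scan every row.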

The beauty of these automated processes is that they reduce manual labour on time-intensive tasks, such as gathering data, freeing up teams to focus on actually analysing pertinent data and to respond faster than previously possible. With the fast-paced innovation of IoT, automation is creating a balance in which organisations are now able to stay ahead of the curve and keep pace with the plethora of data entering the system.

The popularity of IoT will only increase data volumes, requiring businesses to invest the time to accurately monitor and analyse trends to get the most out of their IoT infrastructure. Failure to do so will leave organisations with a large vacuum of untapped data and no clear view of IoT operations. For organisations, the long-held mantra of never wanting to throw away data will be replaced by the idea that all data will be analysed and actioned. By combining automation with your cloud data lakes or data warehouses, IoT data will become the beating heart of your organisation.
