The Modern Data Warehouse Has Arrived
Tue 27 Apr 2021 | Paul Watson-Gover
Modern data warehouses are helping organisations create highly effective strategies that optimise tool selection and data migration and foster coordination
Until relatively recently, building a data warehouse meant making an almost binary choice between business agility and data quality, because for many organisations, it was impractical to achieve both. The problem was rooted in the huge expense required to establish an on-premises data warehouse, and the result was that few companies could afford the investment required.
The challenges didn’t end there: organisations also faced a range of implementation issues. For example, IT teams had to plan how much compute and storage capacity their data warehouse would require. If those calculations were wrong, they bought either too little hardware, leaving capacity insufficient, or too much, wasting significant sums on unused memory or other costly resources.
This scenario will no doubt be familiar to anyone with experience of data warehouse design. Unfortunately, the scale of the challenge facing data warehouse development projects is only set to increase. IDC estimates that by 2025 data creation will reach a massive 163 zettabytes, ten times the amount of data produced in 2016.
The net result of these challenges and trends is that data warehouse development needs to modernise. In doing so, organisations should consider a range of objectives, from realigning with current business goals and provisioning data for existing and future business cases, to leveraging new platforms and data-driven tools. Modernisation strategies should also adopt current data management best practices to develop data warehouse teams and skills more effectively. Key to any modernisation plan is balancing the competing needs of scale, speed, functionality and agility.
Modern problems require modern solutions
Central to the process is automation, as it delivers the flexibility, control and performance required for data warehouse design, development and administration. With an automation-led approach guiding development, data warehouses leave behind outdated methodology and practices. In doing so, it becomes possible to address the shortcomings of traditional approaches where productivity, flexibility, reuse and adherence to standards are much more restricted.
Adding data warehouse automation software can deliver wider benefits and value much faster than traditional hand-coding or using native tools without automation. In fact, it can simplify development to minimise both effort and risk in data integration and infrastructure projects. This allows companies to focus their effort and resources on providing analytic value to the business.
To cope with growing data volumes, the key infrastructure enabler is the adoption of a cloud data warehouse: a database hosted online by a public cloud provider. It has the functionality of an on-premises database but is managed by a third party and can be accessed remotely, and its memory and compute power can be scaled up or down on demand.
This is in complete contrast to legacy approaches in that cloud-based services have empowered organisations to pay only for the resources they use. As such, data warehouses built on cloud infrastructure increasingly form the basis of modernisation efforts in businesses committed to data-driven excellence.
Cloud data warehousing brings additional benefits to analytical data infrastructures, from agility and cost effectiveness to scalability and performance. And specifically, building a cloud-based data warehouse powered by automation tools enables end users to design or prototype new analytic components without having to spend large sums of money on infrastructure. This fast-tracks new projects and increases development and operation capabilities.
In practical terms, choosing a cloud data warehouse solution should begin with a cost analysis to estimate how much money it could save. For instance, different cloud providers have different pricing regimes, with more established names, such as AWS, Microsoft Azure and Google Cloud, renting out nodes and clusters so every user has a defined section of the server. This makes pricing predictable and constant, but can be a disadvantage in that these shared servers sometimes require maintenance and planned downtime.
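A first-pass cost analysis of this kind can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: it compares a fixed, always-on cluster with a pay-per-use model in which compute is billed only while queries run and storage is billed separately. Every name, rate and usage figure is hypothetical and does not reflect any provider's actual pricing.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximate average number of hours in a month


@dataclass
class MonthlyUsage:
    """Hypothetical usage profile; every figure here is illustrative."""
    busy_hours: float  # hours per month the warehouse actually runs queries
    storage_tb: float  # terabytes of data held


def provisioned_cost(node_hourly_rate: float, nodes: int) -> float:
    """Fixed cluster: every node is billed for every hour, busy or idle."""
    return node_hourly_rate * nodes * HOURS_PER_MONTH


def elastic_cost(usage: MonthlyUsage, compute_hourly_rate: float,
                 storage_rate_per_tb: float) -> float:
    """Pay-per-use: compute billed only while running, storage billed separately."""
    compute = usage.busy_hours * compute_hourly_rate
    storage = usage.storage_tb * storage_rate_per_tb
    return compute + storage


def break_even_busy_hours(fixed_cost: float, storage_tb: float,
                          compute_hourly_rate: float,
                          storage_rate_per_tb: float) -> float:
    """Busy hours per month at which the two models cost the same."""
    return (fixed_cost - storage_tb * storage_rate_per_tb) / compute_hourly_rate


if __name__ == "__main__":
    # Assumed rates and usage, chosen purely for illustration.
    usage = MonthlyUsage(busy_hours=200, storage_tb=10)
    fixed = provisioned_cost(node_hourly_rate=1.0, nodes=2)
    elastic = elastic_cost(usage, compute_hourly_rate=3.0, storage_rate_per_tb=25.0)
    print(f"Provisioned: {fixed:.2f}  Elastic: {elastic:.2f}  "
          f"Saving: {fixed - elastic:.2f}")
```

The break-even figure is the useful output here: below that number of busy hours the pay-per-use model is cheaper under these assumptions, and above it the fixed cluster wins. A real analysis would substitute measured workload data and the provider's published rates.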
Each cloud provider also has its own suite of supporting tools for functions such as data management, visualisation and predictive analytics, so these choices should be evaluated when deciding which provider to use.
Other modern cloud-based data warehouse platforms, such as Snowflake, provide elastic compute functionality, allowing users to easily adjust where and how resources are being used and so keep costs under control. The ultimate goal of using these services is to make the design, development, deployment and operation of data warehouses quicker and cheaper, so teams can deliver projects in much tighter timescales than previously possible.
Modern data warehouses are helping organisations to create highly effective strategies that not only optimise the selection of tools and data migration processes, but build effective coordination between teams and stakeholders.
By understanding the technology and infrastructure options now readily available, organisations can have a transformational effect on their ability to manage the influx of big data, automate manual processes and maximise their return on investment.