Stephane Estevez: Is the cloud safe enough to outsource HPC projects?
Thu 3 Apr 2014

It is possible to outsource a high-performance computing project, but does it make sense for yours? Stephane Estevez, senior product marketing manager at Quantum, investigates
Let’s be pragmatic: almost everything can be outsourced, even high-performance computing (HPC) projects. But the cost of meeting specific requirements can be astronomical. It is possible to have a highly secure, outsourced private cloud infrastructure with a speed-of-light connection, but the cost or complexity may not always make sense compared with a “do it yourself” approach.
An organisation may outsource or use the cloud because it lacks resources, time, knowledge or budget, but the cloud is not a magic bullet. It can, however, help businesses that require flexibility and a good balance of cost, security and flexibility; ultimately it is about stability.
Let’s consider an example using HPC projects in the life sciences industry. Wouldn’t it be good to find a cloud storage solution that can retain raw data forever? Scientists often need to keep raw data for future investigation. It would also be invaluable for them to run multiple experiments at the same time, without having to invest in a large computing infrastructure designed just to handle peaks of activity (and sit idle the rest of the time).
Cloud services can help by reducing the cost of storage to almost €0.01 per GB per month and by providing flexible computing capacity when needed. This also implies a cultural change, as users no longer have to worry so much about managing their data. Scientists and researchers often sit on a treasure trove of existing data, which they can now exploit thanks to cloud adoption.
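To put that rate in context, here is a quick back-of-the-envelope calculation. Only the per-gigabyte price comes from the figure above; the 500 TB archive size is a hypothetical example, not a figure from any real project.

```python
# Back-of-the-envelope cloud storage cost at the rate quoted above.
PRICE_PER_GB_MONTH = 0.01          # euros per GB per month (from the article)
archive_tb = 500                   # hypothetical raw-data archive size
archive_gb = archive_tb * 1000     # decimal terabytes, as storage pricing uses

monthly_cost = archive_gb * PRICE_PER_GB_MONTH
print(f"{archive_tb} TB ≈ €{monthly_cost:,.0f}/month, €{monthly_cost * 12:,.0f}/year")
# -> 500 TB ≈ €5,000/month, €60,000/year
```

At that order of magnitude, keeping raw data indefinitely becomes an operating expense a research budget can plan for, rather than a capital purchase of storage hardware.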
But is it secure? It depends; if you’re dealing with highly confidential or sensitive information, it may not make sense to outsource it. Another issue is that the cost of meeting security requirements is becoming too large for some sectors. But securing the cloud is no different from securing any other outsourcing contract, nor is the cloud less secure than any other IT infrastructure. It’s often about finding the right balance between realistic expectations and costs, as well as managing the cultural change for IT teams and the fear of losing control and visibility.
What about performance? Big data infrastructures (not just analytics software such as Hadoop, but the underlying infrastructure as well) now require new types of technology; in some cases we are hitting the limits of current storage solutions. High-density drives and RAID are limited when dealing with projects spanning hundreds of petabytes; those technologies were not designed to provide a cost-effective, reliable storage infrastructure at that scale. High-density disks mean lower per-drive reliability, and when they are combined with RAID this becomes an issue, particularly for rebuild times: the bigger the drive, the longer the array runs degraded while it rebuilds.
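A rough sketch shows why rebuild time is the pressure point. The drive capacities and the 100 MB/s sustained rebuild rate below are assumed round numbers for illustration, not benchmarks of any particular array.

```python
# Illustrative RAID rebuild-window estimate with assumed figures.
REBUILD_RATE_MBS = 100  # assumed sustained rebuild throughput per drive

for capacity_tb in (1, 4, 8):
    capacity_mb = capacity_tb * 1_000_000
    hours = capacity_mb / REBUILD_RATE_MBS / 3600
    print(f"{capacity_tb} TB drive -> ~{hours:.0f} h rebuild at {REBUILD_RATE_MBS} MB/s")
# 1 TB -> ~3 h, 4 TB -> ~11 h, 8 TB -> ~22 h of degraded, exposed operation
```

The denser the drives, the longer the RAID group sits exposed to a second failure during each rebuild, which is exactly where traditional parity schemes start to strain at petabyte scale.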
New storage approaches (such as fountain code algorithms for data protection) are now available and deployed in the cloud. They allow companies running big data projects to benefit from a cost-effective storage system that keeps raw data “forever” with sufficient reliability (using fountain code algorithms instead of RAID). Combined with cloud “elasticity” (using computing power only when you need it), you can now store raw data in the cloud and run multiple experimental tests at the same time, boosting research when required.
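The article does not detail any specific implementation, but the basic idea behind fountain codes can be sketched in a few lines: split the data into k source blocks, emit a rateless stream of random XOR combinations of those blocks, and recover the original with a peeling decoder once enough symbols have arrived. The following is a minimal, illustrative LT-style fountain code, written as a toy example rather than the production algorithm any vendor ships.

```python
import random

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def split_blocks(data, k):
    """Pad the input and split it into k equal-sized source blocks."""
    block_size = -(-len(data) // k)            # ceiling division
    data = data.ljust(block_size * k, b"\0")
    return [data[i * block_size:(i + 1) * block_size] for i in range(k)]

def soliton_degree(k, rng):
    """Ideal soliton distribution: P(1)=1/k, P(d)=1/(d(d-1)) for d>=2."""
    weights = [1.0 / k] + [1.0 / (d * (d - 1)) for d in range(2, k + 1)]
    return rng.choices(range(1, k + 1), weights=weights)[0]

def fountain_encoder(blocks, rng):
    """Rateless encoder: an endless stream of (block indices, XOR payload)."""
    k = len(blocks)
    while True:
        idx = rng.sample(range(k), soliton_degree(k, rng))
        payload = blocks[idx[0]]
        for i in idx[1:]:
            payload = xor_bytes(payload, blocks[i])
        yield idx, payload

def decode(symbols, k):
    """Peeling decoder: resolve degree-1 symbols, substitute back, repeat."""
    recovered = {}                             # block index -> payload
    buffered = []                              # unresolved [index set, payload]
    for idx, payload in symbols:
        buffered.append([set(idx), payload])
        changed = True
        while changed:
            changed = False
            for sym in buffered:
                # XOR out every source block we already know
                for i in [i for i in sym[0] if i in recovered]:
                    sym[1] = xor_bytes(sym[1], recovered[i])
                    sym[0].discard(i)
                # a degree-1 symbol reveals one source block directly
                if len(sym[0]) == 1:
                    recovered[sym[0].pop()] = sym[1]
                    changed = True
            buffered = [s for s in buffered if s[0]]
        if len(recovered) == k:
            return b"".join(recovered[i] for i in range(k))

if __name__ == "__main__":
    rng = random.Random(42)
    original = b"raw instrument data kept for future re-analysis " * 40
    blocks = split_blocks(original, k=16)
    restored = decode(fountain_encoder(blocks, rng), k=16)
    assert restored[:len(original)] == original
    print("recovered", len(original), "bytes from a rateless stream of XOR symbols")
```

The point of the design is that any sufficiently large subset of symbols will do: in an object store that spreads symbols across many drives or nodes, losing a disk simply means losing some symbols, so there is no RAID-style full-array rebuild to wait through.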
Furthermore, backup solutions can offer disaster recovery in the cloud. A private cloud approach enables a choice of data centre location and helps to keep costs as low as possible, as data sent to the cloud are deduplicated. Providing companies and service providers with ‘next generation object storage’ is changing the economics of keeping data forever.