Bloomberg on Kubernetes, Chaos Engineering and Open-Source

Mon 28 Jan 2019 | Mikolaj Pawlikowski

New development practices in Bloomberg’s software department are landing big, as software engineer Mikolaj Pawlikowski explains to John Bensalhia

Kubernetes is headline news in Bloomberg’s software department. The Wall Street powerhouse started investigating Kubernetes about two-and-a-half years ago, after forging a tool to manage the deployment of 6,000 instances of Apache Solr – an open source enterprise search platform – across around 1,000 servers. Bloomberg then started writing containerization software and orchestration software to support it.

Mikolaj Pawlikowski, a software engineer at Bloomberg, is responsible for building a microservices platform based on Kubernetes.

“Our microservices platform was born out of the observation that a lot of how our code was deployed and updated could be automated once for many services,” explains Mikolaj. “At the same time, we are operating at a scale where anything that wasn’t fully automated was costly.”

Bloomberg needed a platform that would take care of the repetitive aspects of building, deploying, scaling, and recovery of a variety of critical services.

“We decided to build it on top of Kubernetes, which was still pretty experimental at the time (three years ago),” says Mikolaj. “Since then, the speed at which Kubernetes is being improved, its momentum of adoption within the industry and the amazing community that has formed around it have reassured us that this was the right decision.”

Organised Chaos

The platform operates much like other Function-as-a-Service offerings: it handles every step between committing code to a git repository and running it in a production-grade setting.

“There is a twist though, as we go a step further. In order to increase confidence in your new deployment, the platform allows you to run multiple versions of a service in parallel, compare the outputs and confirm that your change works as expected, and even backtest a brand new version on historic data for that service. Once you’re happy with your change, you can just switch that new version to be the one running in production.”

“This allows our developers to spend more time doing what they love – solving challenging technical problems – and less time on the deployment, roll-back, and management of the lifecycle of their services and/or applications.”
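The parallel-run idea Mikolaj describes can be illustrated with a minimal Python sketch. It is not Bloomberg's platform code: the function names (`compare_versions`, `normalise_v1`, `normalise_v2`) and the sample data are hypothetical, chosen only to show how a candidate version might be backtested against recorded inputs before it is switched into production.

```python
from typing import Callable, Iterable, List, Tuple


def compare_versions(
    current: Callable[[str], str],
    candidate: Callable[[str], str],
    historic_inputs: Iterable[str],
) -> List[Tuple[str, str, str]]:
    """Run both versions on the same inputs and collect every disagreement."""
    mismatches = []
    for item in historic_inputs:
        old_output = current(item)
        new_output = candidate(item)
        if old_output != new_output:
            mismatches.append((item, old_output, new_output))
    return mismatches


# Hypothetical service versions: v2 additionally strips whitespace.
def normalise_v1(ticker: str) -> str:
    return ticker.upper()


def normalise_v2(ticker: str) -> str:
    return ticker.strip().upper()


# Hypothetical "historic" requests replayed against both versions.
historic = ["ibm", "MSFT", " aapl "]
mismatches = compare_versions(normalise_v1, normalise_v2, historic)

for item, old_output, new_output in mismatches:
    print(f"input={item!r}: v1={old_output!r}, v2={new_output!r}")
```

An empty list of mismatches suggests the change is behaviour-preserving on that data; a non-empty one shows exactly which inputs the new version handles differently, so the developer can judge whether each difference is intended.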

Another notable development at Bloomberg is the introduction of “chaos engineering”, which is slowly, but steadily, moving into the mainstream.

“As developers and decision-makers get increasingly comfortable with ‘controlled’ chaos, tooling is becoming available and processes are emerging,” says Mikolaj.

“There are now conferences on the topic, such as Chaos Conf, meetups in many cities, and individuals from all kinds of companies sharing their experiences.” 

Chaos Engineering: The discipline of experimenting on a software system in production in order to build confidence in the system’s capability to withstand turbulent and unexpected conditions

“Personally, I strongly believe that readily-available, high-quality, open source projects are one of the best ways to popularise a new approach like chaos engineering. A testament to that is our open source project for chaos testing of Kubernetes clusters: PowerfulSeal. It has had a very warm reception from the community and we’ve received feedback from users who have been using it to introduce their teams to the concept of chaos engineering.”

The main benefit of chaos engineering is that it allows you to catch problems that are often difficult to see without releasing into the wild. PowerfulSeal, inspired by Netflix’s Chaos Monkey, allows engineers to “break things on purpose” and observe any issues caused by the introduction of various failure modes.

“By no means is it a substitute for other ways of testing, like unit or integration tests, but rather complements them in order to find additional, often more complex, errors,” says Mikolaj.

“Another important element is that shifting the mindset of developers from ‘failure as a possibility’ to ‘failure as a necessity’ (i.e., a ‘when’, rather than an ‘if’) will prove beneficial to the overall quality of their software, especially with regard to crucial error-handling logic.”
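To make “break things on purpose” concrete, here is a toy Python simulation of a chaos experiment. It does not use PowerfulSeal or the Kubernetes API; the `Replica` and `Service` classes and the `survives_one_failure` helper are hypothetical stand-ins, assuming a service whose client fails over between replicas. The experiment checks the steady state, kills one replica at random, then checks whether requests are still served.

```python
import random
from typing import List


class Replica:
    """A stand-in for one running instance of a service."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive = True

    def handle(self, request: str) -> str:
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Service:
    """Routes a request to the first replica that responds."""

    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas

    def call(self, request: str) -> str:
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ConnectionError:
                continue  # error-handling path: fail over to the next replica
        raise RuntimeError("no healthy replicas")


def kill_random_replica(service: Service, rng: random.Random) -> Replica:
    """Inject the failure: take down one randomly chosen live replica."""
    victim = rng.choice([r for r in service.replicas if r.alive])
    victim.alive = False
    return victim


def survives_one_failure(service: Service, rng: random.Random) -> bool:
    """Verify steady state, inject a failure, then verify again."""
    service.call("baseline")  # must succeed before anything is broken
    kill_random_replica(service, rng)
    try:
        service.call("after-failure")
        return True
    except RuntimeError:
        return False


rng = random.Random(42)  # seeded so the experiment is repeatable
resilient = Service([Replica("a"), Replica("b"), Replica("c")])
fragile = Service([Replica("solo")])

print("three replicas survive:", survives_one_failure(resilient, rng))
print("single replica survives:", survives_one_failure(fragile, rng))
```

The single-replica service fails the experiment, which is the kind of weakness that unit tests of the individual components would not reveal.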

Looking ahead to serverless computing trends for 2019, Mikolaj would like to see more open source alternatives emerge from experiments and mature into production-grade software.

“By following open standards, these will allow for easier adoption inside of the enterprise, without needing to worry about vendor lock-in or data ownership issues, among other things.”

“Other areas in which I’m very excited to see progress include monitoring and debugging, security, and performance.”

Join me at DevOps Live.

Mikolaj is presenting at DevOps Live, taking place at ExCeL London on March 12-13. DevOps Live and its colocated events attract over 20,000 IT industry professionals.

Experts featured:

Mikolaj Pawlikowski

Software Engineer Project Lead
Bloomberg LP

