AWS suffers three major outages in a single month
Written by Nicole Cappella Tue 4 Jan 2022

On Wednesday, 22 December, Amazon Web Services (AWS) experienced a power outage that affected services for enterprise clients such as Slack, Epic Games, and Imgur. Amazon blamed the interruption of service on a power issue in an East Coast data centre that affected one Availability Zone in the region.
The outage occurred in the early morning hours, and the majority of services were restored by early afternoon. According to the AWS status page, “By 2:30 PM PST, we recovered the vast majority of EC2 instances and EBS volumes. However, some of the affected EC2 instances and EBS volumes were running on hardware that has been affected by the loss of power and is not recoverable.”
Service interruptions in the U.S. East Coast Availability Zones have a noticeable, widespread ripple effect, as many large enterprises use AWS data centre services in those zones. During the three December outages, customers may have experienced problems with critical work-from-home services, like Slack; streaming through Netflix and Hulu; and gaming through PlayStation, Nintendo and Xbox.
One of the most critical interruptions, though, may have been the one on 7 December, which affected Amazon’s internal applications and services. The year 2021 saw a record-breaking Black Friday followed by an increase in capacity at the company, including the hiring of 150,000 seasonal workers, and a 50% increase in ports of entry to the U.S.
During the exceptionally busy holiday season, Amazon employees were unable to access the company’s internal warehousing and delivery applications because of the AWS outage. According to the company’s analysis of the event, an automated capacity scaling event “triggered an unexpected behaviour” inside the network. This resulted in a surge in activity that overwhelmed the connection between the internal network and the main AWS network, increasing latency and causing even more congestion.
The second outage, just a week later, involved two West Coast Availability Zones and affected major gaming platforms as well as Twitch, the live streaming platform owned by Amazon. Ed Skoudis, president of the SANS Technology Institute, noted that the interdependence of cloud infrastructure means that an outage can impact users across different industries, geographical regions, and applications.
“A single glitch in a high-profile provider will have huge implications on countless organizations of all sizes, in often very unexpected ways.”
“Service interruptions are vast and impact thousands of companies and millions of users. We are putting more eggs into fewer and fewer baskets. More eggs get broken that way.”