Amazon’s cloud unit back to normal after global outage

Greg Bensinger, Shubham Kalia and Deborah Mary Sophia

AWS from Amazon competes with Google’s and Microsoft’s cloud services.

Amazon.com says its cloud service has returned to normal operations after an internet outage that caused global turmoil among thousands of sites, including some of the web’s most popular apps like Snapchat and Reddit.

In an update, the company said the underlying cause of the outage had been fixed, while noting there were still connectivity issues on some AWS services.

Amazon added that some AWS services had a backlog of messages that would take a few hours to process.

The disruption knocked workers from London to Tokyo offline and stopped others from carrying out everyday tasks like paying hairdressers or changing their airline tickets.

It was the largest internet disruption since last year’s CrowdStrike malfunction hobbled technology systems in hospitals, banks and airports, highlighting the vulnerability of the world’s interconnected technologies. It was at least the third time in five years that AWS’s northern Virginia cluster, known as US-EAST-1, contributed to a major internet meltdown.

AWS says its technical problems have been fixed. (AP PHOTO)

Amazon did not address a request for more clarity about why that particular data centre keeps being affected, instead pointing to an online statement that said the matter had been “fully mitigated.”

The problems stemmed from a fault in what is known as the Domain Name System, or DNS, which prevented applications from finding the correct address for AWS’s DynamoDB API, a cloud database relied upon to store user information and other critical data.

Earlier, AWS said the root cause of the outage was an underlying subsystem that monitors the health of its network load balancers, which distribute traffic across several servers.

The issue, AWS said, originated from within the “EC2 internal network.”

EC2 refers to Amazon’s Elastic Compute Cloud service, which provides on-demand cloud capacity within AWS. Businesses use EC2 to run virtual servers to develop, launch and host applications.

The outage originated at a northern Virginia data centre. (AP PHOTO)

Amazon later said “all AWS services returned to normal operations. Some services such as AWS Config, Redshift, and Connect continue to have a backlog of messages.”

AWS provides computing power, data storage and other digital services to companies, governments and individuals and is the world’s largest cloud provider, followed by Microsoft’s Azure and Alphabet’s Google Cloud.

Disruptions to its servers can cause outages across websites and platforms – ranging from food delivery apps to gaming platforms and airline systems – that rely on its cloud infrastructure. AWS said on its status page that Monday’s outage originated at its US-EAST-1 location in northern Virginia, its oldest and largest for web services. The site suffered outages in 2021 and 2020.

According to documentation on the AWS website, the US-EAST-1 site is often the default region for many AWS services.

The problem highlights how interconnected everyday digital services have become and their reliance on a small number of global cloud providers, with one glitch wreaking havoc on business and day-to-day life, experts and academics said.

“This outage once again highlights the dependency we have on relatively fragile infrastructures,” said Jake Moore, global cybersecurity adviser at European cybersecurity firm ESET.

Snapchat users were hit by the huge outage. (Paul Braven/AAP PHOTOS)

Snapchat’s latest count on Downdetector was more than 7500 reports, below the peak of more than 22,000 but still above the 4000 outage instances at around 10pm on Monday AEDT.

Amazon’s own services, including its shopping website, Prime Video and Alexa, were also hit, although Downdetector last showed a decrease in severity.

Fortnite, Clash Royale and Clash of Clans were among the gaming platforms affected. Uber rival Lyft was also knocked down in the United States.

In Britain, Lloyds Bank, Bank of Scotland and telecom service providers Vodafone and BT were all hit, according to Downdetector’s UK website, as was the website of HMRC, the UK tax, payments and customs authority.

Reuters