On June 8th, 2021, a significant portion of the internet went dark, affecting high-traffic websites such as Reddit, Twitch, Spotify, The Guardian, and even the UK government’s official site. The root cause was not a malicious attack, but a software bug within the infrastructure of Fastly, a major content delivery network (CDN) that helps websites deliver content to users more quickly.
The incident highlighted the interconnectedness of modern web infrastructure, where a single point of failure can have widespread consequences. In a public statement, Fastly detailed the precise sequence of events that led to the global disruption.
The Trigger: A Dormant Bug Awakened
According to a detailed explanation by Fastly’s Senior Vice President of Engineering and Infrastructure, Nick Rockwell, the issue originated in a software deployment that began on May 12th. This deployment introduced a latent bug that could be triggered only by a specific customer configuration under specific circumstances, so it remained dormant within Fastly’s systems for weeks. The bug was finally triggered on June 8th when a single customer pushed a valid configuration change to their service. This legitimate action exposed the underlying flaw and set off a cascading failure across Fastly’s global network, causing 85% of the network to return errors.
Timeline of a Global Disruption
The outage began abruptly and was largely contained within about an hour. According to Fastly’s published timeline, the global disruption began at 09:47 UTC, and the company’s monitoring detected it one minute later, at 09:48 UTC. By 10:27 UTC, engineers had identified the customer configuration that triggered the bug, and they then disabled it. Services began to recover, and by 11:00 UTC, most of the affected websites were back online. Fastly began deploying a permanent software fix for the bug later that day, at 17:25 UTC. The company apologized for the incident’s impact and said it would conduct a post-mortem of its processes to prevent future occurrences.
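To put the timeline in perspective, the elapsed time to each milestone can be worked out from the timestamps quoted above. The short Python sketch below is only an illustration: the timestamps come from this article, while the milestone labels and the `minutes_between` helper are names invented here, not anything published by Fastly.

```python
from datetime import datetime

# Timestamps (UTC, 8 June 2021) as quoted in the timeline above.
# The labels are our own shorthand, not Fastly's terminology.
INCIDENT_START = "09:47"
MILESTONES = {
    "cause identified": "10:27",
    "most sites back online": "11:00",
    "permanent fix": "17:25",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end, both HH:MM on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


for label, timestamp in MILESTONES.items():
    elapsed = minutes_between(INCIDENT_START, timestamp)
    hours, minutes = divmod(elapsed, 60)
    print(f"{label}: +{hours}h {minutes:02d}m after {INCIDENT_START} UTC")
```

Measured from 09:47 UTC, the cause was identified after 40 minutes, most sites were back after 1 hour 13 minutes, and the permanent fix followed 7 hours 38 minutes after the incident started.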