On June 8, 2021, numerous high-traffic websites around the world suddenly became inaccessible. The disruption was traced to an outage at the content delivery network (CDN) provider Fastly. The incident highlighted how a single point of failure in critical internet infrastructure can have cascading effects worldwide.
The Dormant Bug and Its Trigger
The root cause of the global outage was a software bug introduced in a Fastly software deployment on May 12, 2021. The bug remained latent and undiscovered for nearly four weeks. It was not triggered until June 8, when a single, unnamed Fastly customer made a specific change to their settings. The configuration change was valid, but it triggered the dormant bug and caused failures across most of Fastly's network.
Widespread Impact and Rapid Recovery
Once triggered, the bug caused 85% of Fastly’s network to return errors. This disrupted service for major online destinations, including The New York Times, The Guardian, Reddit, and Twitch, among many others. Fastly’s engineering team detected the problem within one minute of the outage beginning. The company reported that 95% of its network was operating normally again within 49 minutes. Following the event, Fastly issued an apology and began a detailed post-mortem to analyze the failure.