What caused Facebook's 6-hour outage?

In October 2021, a routine maintenance command accidentally withdrew Facebook's BGP routes, making all Facebook, Instagram, and WhatsApp servers unreachable from the internet. The outage lasted over 6 hours because engineers could not remotely access systems to fix the issue and had to physically enter data centers.

How can I check if a website is down?

Use services like DownDetector, IsItDownRightNow, or the 'Down for Everyone or Just Me' website. You can also try accessing the site from a different network (like switching from Wi-Fi to mobile data) to determine if the issue is on your end or the website's.

What is the most common cause of website outages?

The most common causes are traffic spikes that exceed server capacity, configuration errors during deployments, DDoS attacks, DNS failures, and expired TLS certificates. Many major outages are caused by human error during routine maintenance rather than external attacks.

Why Websites Go Down - Lessons from Facebook, AWS, and Google Outages

Even Famous Services Have Gone Down

"The server crashed," "The site went down" - you hear these phrases in the news all the time, but what's actually happening? Even the world's largest services can't avoid outages. And the causes are often surprisingly mundane.

Notable Service Outage Incidents

Facebook's 6-Hour Outage (October 2021)

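Facebook, Instagram, and WhatsApp went completely offline for approximately 6 hours. The cause was a configuration error that ended with Facebook's BGP routes being withdrawn. According to Facebook's own account, a command issued during routine maintenance unintentionally disconnected its backbone network, and its DNS servers then stopped announcing their BGP routes to the rest of the internet.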

As a result, Facebook's network "vanished" from the internet, and DNS name resolution also stopped working. Furthermore, since all internal tools were also hosted on Facebook's network, engineers lost the very means to access and fix the problem. Ultimately, they had to physically travel to the data center and manually restore the servers.
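To see why withdrawn routes make a network "vanish," it can help to model a routing table as a simple lookup from address prefixes to the network that announces them. The following is a toy sketch only, not how real BGP routers work: the class, the origin names, and the prefixes (taken from address ranges reserved for documentation) are all hypothetical.

```python
import ipaddress


class RoutingTable:
    """Toy model of the routes one network has learned from its neighbours."""

    def __init__(self):
        # Maps a network prefix to the name of the network announcing it.
        self.routes = {}

    def announce(self, prefix, origin):
        self.routes[ipaddress.ip_network(prefix)] = origin

    def withdraw(self, prefix):
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None  # no route: the address cannot be reached at all
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", "example-dns")     # hypothetical prefix for DNS servers
table.announce("198.51.100.0/24", "example-web")  # hypothetical prefix for web servers

before = table.lookup("192.0.2.53")

# Withdrawing every prefix leaves the servers running but unreachable.
table.withdraw("192.0.2.0/24")
table.withdraw("198.51.100.0/24")
after = table.lookup("192.0.2.53")

print(before, after)
```

Before the withdrawal the lookup finds a route to the DNS server; afterwards it returns None, even though nothing happened to the server itself. That is the situation the rest of the internet was in during the outage.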

AWS Major Outage (February 2017)

Amazon Web Services' S3 (storage service) went down for approximately 4 hours in its US-EAST-1 region, affecting numerous services including Netflix, Slack, and Trello. The cause was a mistyped command: an engineer debugging the S3 billing system entered one of the inputs incorrectly and removed far more servers than intended.

The ironic aspect of this outage was that AWS's own status page depended on S3, so at first AWS could not update it with outage information.
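One common safeguard against this kind of typo is to have the tool itself refuse any removal that would leave too few servers running. The sketch below illustrates the idea only; the function name, parameters, and numbers are hypothetical and are not AWS's actual tooling.

```python
def check_removal(active, requested, min_active):
    """Return the number of servers left after removal, or raise if unsafe.

    active:     servers currently running
    requested:  servers the operator asked to remove
    min_active: fewest servers the service needs to keep working
    """
    if requested < 0:
        raise ValueError("requested must not be negative")
    remaining = active - requested
    if remaining < min_active:
        raise ValueError(
            f"refusing to remove {requested} of {active} servers: "
            f"only {remaining} would remain, minimum is {min_active}"
        )
    return remaining
```

With 100 servers and a minimum of 80, removing 5 is allowed, but a slip of the finger that turns 5 into 50 is rejected before anything is shut down.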

Cloudflare Outage (June 2022)

Cloudflare, a CDN service used by approximately 20% of the world's websites, experienced an outage affecting numerous services including Discord, Shopify, and Fitbit. The cause was a network configuration change that took 19 of its data centers offline; because those were among its busiest locations, the impact was felt worldwide.

Google's 47-Minute Total Service Outage (December 2020)

Nearly all Google services - Gmail, YouTube, Google Drive, Google Maps, and more - went down for approximately 47 minutes. The cause was a storage quota problem in Google's central authentication system, which left it unable to verify logins. Every service requiring login was affected.

Main Reasons Websites Go Down

Traffic spikes (overload): Access surges from popular ticket sales, sale launches, or breaking news exceed the server's processing capacity
Configuration errors (human error): Outages caused by engineer mistakes are extremely common. Both the Facebook BGP incident and the AWS S3 incident were human errors
Software bugs: Bugs shipped in an update go unnoticed in testing and only surface once the code is running in production
DDoS attacks: Attacks that intentionally flood servers with massive amounts of traffic to bring them down
DNS failures: When DNS breaks, users can't access the site even though the server itself is functioning normally
Certificate expiration: When a site's TLS certificate (the one used for HTTPS) expires, browsers display warnings and block access
Physical failures: Data center power outages, cooling system failures, undersea cable cuts
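Of the causes above, certificate expiration is one of the easiest to catch in advance, because the expiry date is written in the certificate itself. The following is a minimal sketch using Python's standard ssl module; the function names and the port default are assumptions made for illustration, and a real monitoring setup would add alerting and error handling.

```python
import socket
import ssl
import time


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect to host and return its certificate's 'notAfter' string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Return days left before expiry; negative means already expired.

    not_after uses the format returned by getpeercert(),
    for example "Jan 31 00:00:00 2030 GMT".
    """
    if now is None:
        now = time.time()
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now) / 86400  # 86400 seconds in a day
```

Calling days_until_expiry(fetch_not_after("example.com")) on a schedule and warning when the result drops below, say, 14 days would flag a certificate before browsers start rejecting it.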

How to Check If a Site Is "Down"

When you can't access a site, there are ways to determine whether it's your connection or the site's problem.

Down Detector: A site that aggregates outage reports from users worldwide in real time
isitdown.site: A site that checks whether a specified URL is accessible from various locations around the world
IP Check-san: First verify that your own internet connection is working. If your IP address is displayed, your connection is fine
Try from a different device or network: Try accessing via your smartphone's mobile data connection
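The same checks can be scripted. The sketch below separates a DNS failure from a server that cannot be reached, which are the two cases described earlier. It is an illustration under assumptions: the function name and the three result labels are made up for this example, and a successful TCP connection does not prove the site is serving pages correctly.

```python
import socket


def diagnose(host, port=443, timeout=5.0,
             resolve=socket.getaddrinfo, connect=socket.create_connection):
    """Classify a site as 'dns_failure', 'unreachable', or 'reachable'.

    resolve and connect can be swapped out for testing; by default they
    are the standard library's name lookup and TCP connect functions.
    """
    # Step 1: can the name be turned into an address at all?
    try:
        resolve(host, port)
    except socket.gaierror:
        return "dns_failure"

    # Step 2: does anything answer at that address?
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:  # covers refused connections and timeouts
        return "unreachable"
    conn.close()
    return "reachable"
```

Running diagnose("example.com") on both your home Wi-Fi and your mobile connection gives a quick comparison: if the result differs between the two, the problem is more likely on your side than on the site's.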

Summary

Even the world's largest services can go down for hours due to a single configuration mistake. The causes of website outages range from traffic spikes and human error to DDoS attacks and DNS failures. Next time you can't access a site, first check your connection on IP Check-san, then check Down Detector for site-side outage information.

