Here’s why Facebook, WhatsApp, and Instagram all went down for nearly six hours

Although the outage has to do with DNS, the underlying issue can be traced back to something called Border Gateway Protocol.

Facebook’s family of products, including Instagram, WhatsApp, and Messenger, started coming back online early Tuesday morning (IST) after being down for nearly six hours in an unprecedented outage affecting billions of users.

The outage, which started around 9 pm IST on Monday, affected not only Facebook’s own products and users but also websites and apps that rely on Facebook services such as ads and authentication (Login with Facebook).

According to outage-tracking site Downdetector, this was the largest outage the company has seen, with over 14 million problem reports from all over the globe.

What caused the outage?

Facebook is yet to publish a detailed post on what went wrong, but in a short blog post the company said that the root cause of the outage was a faulty configuration change on the backbone routers that coordinate network traffic between Facebook data centers. “This disruption to network traffic had a cascading effect on the way our data centers communicate, bringing our services to a halt,” the company said. It added that there is no evidence that user data was compromised as a result of the downtime.

In a more detailed account of what went wrong, CDN provider Cloudflare explained that the outage can be traced back to an issue with the Border Gateway Protocol (BGP), a mechanism for exchanging routing information between different networks on the internet.

“The Internet is literally a network of networks, and it’s bound together by BGP. BGP allows one network (say Facebook) to advertise its presence to other networks that form the Internet,” Cloudflare explained. “Without BGP, the Internet routers wouldn’t know what to do, and the Internet wouldn’t work.”

Facebook services went down because the company’s network stopped advertising its presence, leaving ISPs and other networks unable to find a route to Facebook, Cloudflare wrote.
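To make the idea of advertising and withdrawing routes concrete, here is a minimal sketch in Python. It is a toy model, not how real BGP routers are implemented: the `RoutingTable` class, the prefixes (taken from address ranges reserved for documentation), and the AS numbers are all assumptions made for illustration. It shows only that once a prefix is withdrawn, a lookup for an address inside it finds no route.

```python
import ipaddress


class RoutingTable:
    """Toy view of the routes one router has learned from its BGP peers."""

    def __init__(self):
        # Maps an announced prefix to the AS number that originated it.
        self.routes = {}

    def announce(self, prefix, origin_as):
        """Record that a network is advertising reachability for a prefix."""
        self.routes[ipaddress.ip_network(prefix)] = origin_as

    def withdraw(self, prefix):
        """Forget a prefix that is no longer being advertised."""
        self.routes.pop(ipaddress.ip_network(prefix), None)

    def lookup(self, address):
        """Return the origin AS of the most specific matching prefix, or None."""
        addr = ipaddress.ip_address(address)
        matches = [net for net in self.routes if addr in net]
        if not matches:
            return None
        best = max(matches, key=lambda net: net.prefixlen)
        return self.routes[best]


table = RoutingTable()
table.announce("192.0.2.0/24", 64500)      # hypothetical network A
table.announce("198.51.100.0/24", 64501)   # hypothetical network B

before = table.lookup("192.0.2.53")        # a route exists
table.withdraw("192.0.2.0/24")             # network A stops advertising
after = table.lookup("192.0.2.53")         # no route any more

print(before, after)
```

Nothing about the withdrawn network’s own servers changes in this model. They may be running normally, but the rest of the internet no longer knows a path to them.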

Well, did DNS have anything to do with this outage?

DNS, the Domain Name System, plays a role in many internet outages, and this one is no different. The system converts human-readable addresses such as facebook.com into the machine-readable IP addresses where these websites actually live. DNS resolvers fetch the IP address from the domain’s name servers, which are typically hosted by the entity that owns the domain.

In this specific case, Facebook withdrew the BGP routes (because of the faulty configuration) that covered the IP addresses of its DNS name servers. As a consequence, DNS resolvers around the globe could no longer resolve Facebook domain names, and anyone trying to access the site had no way of finding it, Cloudflare said.
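The dependency between the two systems can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real resolver: the domain, the name server addresses, the zone data, and the `resolve` function are all hypothetical, and reachability is reduced to a set of addresses that currently have a route.

```python
class ResolutionError(Exception):
    """Raised when no name server for the zone can be reached."""


# Hypothetical data: which name servers answer for a zone, and what they hold.
NAME_SERVERS = {"example.com": ["192.0.2.53", "192.0.2.54"]}
ZONE_DATA = {"www.example.com": "203.0.113.10"}


def resolve(name, zone, reachable_addresses):
    """Look up a name by asking the zone's name servers in turn.

    A name server can only answer if the resolver has a route to it,
    which this sketch models as membership in reachable_addresses.
    """
    for ns_ip in NAME_SERVERS[zone]:
        if ns_ip in reachable_addresses:
            return ZONE_DATA[name]
    raise ResolutionError(f"no reachable name server for {zone}")


# Normal operation: routes to the name servers are being advertised.
routed = {"192.0.2.53", "192.0.2.54"}
answer = resolve("www.example.com", "example.com", routed)

# After the routes are withdrawn, the same lookup fails, even though
# the record itself still exists on the name servers.
try:
    resolve("www.example.com", "example.com", set())
    failed = False
except ResolutionError:
    failed = True

print(answer, failed)
```

The record for the name never disappears in this sketch. The lookup fails only because the servers holding it cannot be reached, which is why the outage looked like a DNS problem from the outside.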

“In simpler terms, sometime this morning Facebook took away the map telling the world’s computers how to find its various online properties. As a result, when one types Facebook.com into a web browser, the browser has no idea where to find Facebook.com, and so returns an error page.” – Doug Madory, director of internet analysis at Kentik

“But that’s not all. Now human behavior and application logic kicks in and causes another exponential effect. A tsunami of additional DNS traffic follows,” Cloudflare further wrote. “This happened in part because apps won’t accept an error for an answer and start retrying, sometimes aggressively, and in part because end-users also won’t take an error for an answer and start reloading the pages, or killing and relaunching their apps, sometimes also aggressively.” As DNS resolvers started getting overwhelmed, Facebook’s failure started causing unintended side-effects to the rest of the internet.
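A rough way to see why retries pile up is to count how many attempts a single failing client makes in one minute under two policies. The Python sketch below is an illustration with assumed numbers (a one-second retry interval, a doubling backoff capped at 60 seconds), not a measurement of what any real app did during the outage. Random jitter, which real clients usually add to a backoff, is left out to keep the result deterministic.

```python
import itertools


def fixed_delays(interval):
    """Retry forever with the same pause between attempts."""
    return itertools.repeat(interval)


def backoff_delays(base, factor=2.0, cap=60.0):
    """Retry with a pause that grows after every failure, up to a cap."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def attempts_in_window(delays, window):
    """Count the attempts a failing client makes within `window` seconds."""
    elapsed = 0.0
    attempts = 1  # the first attempt happens at time zero
    for delay in delays:
        elapsed += delay
        if elapsed > window:
            break
        attempts += 1
    return attempts


aggressive = attempts_in_window(fixed_delays(1.0), window=60.0)
polite = attempts_in_window(backoff_delays(1.0), window=60.0)

print(aggressive, polite)  # prints "61 6"
```

Multiplied across billions of devices, the gap between the two policies is the difference between a manageable load and the flood of extra DNS queries Cloudflare described.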

Here is a more technical explanation of what went wrong and here’s one in layman terms.

What took Facebook so long to fix the issue?

Outages that bring down large swathes of the internet are uncommon, but they still happen. In July, an outage at content delivery network (CDN) provider Akamai affected popular sites like Amazon, Airbnb, Swiggy, Microsoft, Paytm, and Times of India. In June, Fastly, another CDN, took a hit that affected Reddit, Spotify, Shopify, The New York Times, and BBC, among others. But in both these cases the outage lasted for about an hour, unlike the Facebook outage, which took the company’s engineers nearly six hours to fix.

Notably, the outage cut off Facebook employees from internal communication tools and physical access to building sites, severely hindering the resolution process.

Renewed calls for the breakup of Facebook

“Maybe one billionaire with a penchant for destroying democracies shouldn’t be allowed to own so much of the internet and maybe that’s why antitrust laws exist that officials who do not take lobbyist money from said billionaire-owned interests should enforce,” US Congresswoman Alexandria Ocasio-Cortez said on Instagram.

“London-based internet monitoring firm Netblocks noted that Facebook’s plans to merge its platforms — announced in 2019 — had raised concerns about the risks of such a move. While such centralization “gives the company a unified view of users’ internet usage habits,” it also makes the services vulnerable to single points of failure, Netblocks said.” — AP News

Funnily, the outage, which affected over 3 billion users, came on the same day that Facebook asked a federal judge to dismiss an antitrust complaint by the Federal Trade Commission, arguing that it faces vigorous competition from other services.

Twitter has a field day

Meanwhile, Twitter had a hell of a day as it was the only major social media platform still functioning.

MediaNama’s mission is to help build a digital ecosystem which is open, fair, global and competitive.

MediaNama is the premier source of information and analysis on Technology Policy in India. More about MediaNama, and contact information, here.

© 2008-2021 Mixed Bag Media Pvt. Ltd. Developed By PixelVJ
