Facebook explains what caused its widespread outage
Yesterday, Monday, at around 5 PM, Facebook and its family of apps, including Instagram and WhatsApp, became completely inaccessible for hours, cutting off people around the world who rely on these apps for their day-to-day communications. The outage, which lasted more than five hours, tested billions of users' dependence on the company like never before. More than 3.5 billion people around the world use Facebook, Instagram, Messenger, and WhatsApp to communicate and do business. Many also use Facebook to sign in to other apps and services.
Facebook apologized for the outage on Twitter and asked users to be calm. By then, users had turned to Twitter to lament and poke fun at their inability to use the apps. The hashtag #FacebookDown started trending alongside hashtags for WhatsApp and Instagram. Eventually, Facebook managed to restore the platforms.
The company said the underlying cause of the outage affected not only users but also many of the internal tools and systems it uses in its day-to-day operations, which complicated attempts to quickly diagnose and resolve the problem. Santosh Janardhan, Facebook’s VP of Engineering and Infrastructure, further explained that configuration changes on the backbone routers that coordinate network traffic between the company’s data centers caused issues that interrupted this communication. The disruption to network traffic had a cascading effect on the way the data centers communicate, bringing the platforms’ services to a halt.
“Our services are now back online and we’re actively working to fully return them to regular operations. We want to make clear that there was no malicious activity behind this outage — its root cause was a faulty configuration change on our end. We also have no evidence that user data was compromised as a result of this downtime,” said Santosh Janardhan.
Janardhan assured users that the company was working to understand more about what happened so that it can continue to make its infrastructure more resilient and prevent outages like this one, which affected billions of people and businesses, from occurring again.