The CrowdStrike IT outage is pretty grim but let’s hope it’s not as bad as the infamous Dyn DDoS attack, Facebook mega-crash, and Rogers network disaster

This one looks as bad as they come, and there have been some truly massive IT problems in recent years.

With millions of people around the world affected by the apparent bug in CrowdStrike’s Falcon software, which brought down IT systems in media, hospitals, and airports, there’s a good chance it will go down in the annals of tech history as the worst outage ever, putting it ahead of the likes of the 2016 Dyn DDoS cyberattack, Facebook’s server woes in 2021, and Canada’s biggest IT failure ever in 2022.

While it’s not fully clear exactly how or why millions of PCs went into an endless BSOD (Blue Screen of Death) cycle overnight, the culprit appears to be a bug in a security update for Falcon, a piece of software developed by CrowdStrike. Somewhat ironically, Falcon is designed to prevent malware and other cyberattacks, and the outage is a salient reminder that the modern world is almost entirely reliant on client computers, servers, and the Internet.

So much so that malicious actions or simple mistakes can lead to enormous IT problems, affecting millions. The worst cases that came to my mind when reading about today’s global outage are ones that affected countless folks in many countries. One of the most notorious was the Dyn DDoS attack in 2016.

Dyn is a DNS provider, a company that manages servers which translate requests for a particular domain name into an IP address (DNS stands for Domain Name System). DDoS stands for Distributed Denial-of-Service and, in this instance, it was caused by a reported tens of millions of IoT (Internet-of-Things) devices, such as printers and security cameras, all infected with the Mirai malware and all requesting domain name look-ups.
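To make the mechanism a little more concrete, here is a minimal toy sketch in Python of how a flood of look-ups can crowd out real users. It is not how Dyn’s systems or the Mirai botnet actually worked: the `ToyResolver` class, its `capacity` limit, the `mix` and `legit_success_rate` helpers, and the records (which use reserved documentation IP addresses) are all hypothetical, made up purely for illustration.

```python
# Hypothetical records using documentation-only IP ranges (RFC 5737).
RECORDS = {"example.com": "192.0.2.10", "example.org": "198.51.100.7"}


class ToyResolver:
    """Toy DNS server: answers at most `capacity` queries per tick."""

    def __init__(self, records, capacity):
        self.records = records
        self.capacity = capacity

    def serve_tick(self, queries):
        """Answer queries in arrival order; anything over capacity is dropped.

        `queries` is a list of (client, domain) pairs.
        Returns (answers, dropped), where answers are (client, domain, ip).
        """
        accepted = queries[: self.capacity]
        dropped = queries[self.capacity:]
        answers = [(c, d, self.records.get(d)) for c, d in accepted]
        return answers, dropped


def mix(legit, flood):
    """Spread the legitimate queries evenly through the flood traffic."""
    if not flood or not legit:
        return list(legit) + list(flood)
    step = len(flood) // len(legit)
    mixed = []
    for i, query in enumerate(legit):
        mixed.extend(flood[i * step:(i + 1) * step])
        mixed.append(query)
    mixed.extend(flood[len(legit) * step:])  # any leftover flood queries
    return mixed


def legit_success_rate(resolver, legit, flood):
    """Fraction of legitimate queries that get an answer in one tick."""
    answers, _ = resolver.serve_tick(mix(legit, flood))
    legit_clients = {client for client, _ in legit}
    answered = sum(
        1 for client, _, ip in answers
        if client in legit_clients and ip is not None
    )
    return answered / len(legit)


resolver = ToyResolver(RECORDS, capacity=100)
legit = [(f"user{i}", "example.com") for i in range(10)]
flood = [(f"bot{i}", "example.com") for i in range(990)]

print(legit_success_rate(resolver, legit, []))     # 1.0
print(legit_success_rate(resolver, legit, flood))  # 0.1
```

In this toy setup, the server copes easily with ten real users, but once 990 bot requests are mixed in, only one of those ten gets an answer. The server can’t tell a bot’s look-up from a person’s, so it simply runs out of capacity.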

Such was the volume of requests and the complexity of the attack that Dyn’s services were brought down for an entire day, resulting in thousands of Internet-based services and platforms being shut down throughout the United States and Europe. Several hacking groups claimed responsibility and, in 2020, one individual pleaded guilty to taking part in the attack, but the question of who was behind the whole thing has never been fully resolved.

But it’s not always a malicious event that takes down a major IT system. For example, in 2021, Facebook and all its subsidiaries went offline for about six hours, across the entire globe. This was a problem caused entirely by Meta itself. While you might think it just meant people couldn’t share photos or send messages, it had a huge impact in developing countries, and even unrelated systems, such as Google’s services, slowed to a crawl because of it.

It was a similar situation in 2022 for Rogers Communications, a telecommunications company that provides Internet and mobile services in Canada. Millions of web users were left without access for a day but, far more seriously, emergency services, banks, and payment systems that used the Rogers network were also left stranded. The cause? A maintenance upgrade that went awry.

I don’t know whether the CrowdStrike Falcon problem will go on to top these events in terms of the number of people affected and the cost to various economies, but something like it will almost certainly happen again at some point in the future. People make mistakes (to err is human, after all), but global IT outages are timely reminders that every critical system should always have redundancy and a recovery plan.
