The CrowdStrike Outage: A Wake-Up Call for Cybersecurity

by Jan Broucinek

On July 19, 2024, cybersecurity software company CrowdStrike made history: it was responsible for the world’s most significant global computer outage. It’s as bad as it sounds, with millions of computers the company had sworn to protect suddenly stuck in a “blue screen of death” reboot loop. The effects hit some of the world’s most critical organizations, including airlines, federal agencies, hospitals, banks, emergency services, and more.

What happened during this outage, and what should we do to keep it from happening again? That’s what I’ll be digging into here. 

 

What Is CrowdStrike, and Why Was Their Outage So Widespread?

CrowdStrike, a publicly traded cybersecurity firm based in Austin, Texas, makes Endpoint Detection and Response (EDR) software. EDR is a critical part of most cybersecurity plans because it monitors all the devices on your systems—all the endpoints, as they are called. A potent tool, CrowdStrike watches for suspicious activity on your network and isolates it before the effects can spread. 

The CrowdStrike outage was so widespread because the product is popular. CrowdStrike controls about 15 percent of the market for endpoint protection tools, second only to Microsoft, which holds about 40 percent. The effect was massive because large companies tend to be its heaviest users. A whopping 298 companies in the Fortune 500 were using CrowdStrike when the outage hit. Even worse, the outage affected 75 percent of the top healthcare organizations and banks.

 

Key CrowdStrike Features: 

Organizations using CrowdStrike get best-in-class features, such as: 

  • Threat Detection: Using advanced AI and machine learning to identify potential threats in real time.
  • Incident Response: Providing tools to investigate and respond to security incidents quickly.
  • Behavioral Analysis: Monitoring the behavior of applications and processes to detect suspicious activities.
  • Cloud-Native Operation: Working through the cloud, allowing for rapid updates and scalability. This is a significant selling point because it keeps the product continuously updated and secured.

However, that same ability to rapidly push updates is what made the CrowdStrike outage possible. Let’s discuss that.

 

What Happened During the CrowdStrike Outage? 

On Friday, July 19, 2024, at 04:09 UTC, as part of regular operations, CrowdStrike released a content configuration update for the Windows sensor, which gathers telemetry on possible novel threat techniques. Microsoft had issued an update to their systems, which required CrowdStrike to issue new updates, too. CrowdStrike implemented its new changes and pushed the latest update to every device on the systems of the 3500+ companies it serves. 

Unfortunately, CrowdStrike’s Rapid Response Content update portion had an undetected error—one small file that didn’t agree with the systems reading it. This fault wasn’t discovered until it was pushed out to millions of machines. Those machines started crashing, looping in the dreaded “blue screen of death” that users couldn’t escape. 

According to Microsoft’s telemetry, 8.5 million Windows workstations and servers were affected. However, Microsoft has since said that this number only counts systems that had telemetry turned on. The actual number of affected machines was, most likely, far higher than initially reported.

Systems in scope include Windows hosts that received the update, were running sensor version 7.11 or above, and were online between Friday, July 19, 2024, 04:09 UTC and Friday, July 19, 2024, 05:27 UTC.
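If you keep a device inventory, those criteria translate into a simple filter. Here’s a minimal Python sketch of that check. The `Host` record and its field names are hypothetical (they are not from any CrowdStrike tool), and the sketch approximates “received the update” as “was online at some point during the window,” which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Window and minimum sensor version as stated in the article.
WINDOW_START = datetime(2024, 7, 19, 4, 9, tzinfo=timezone.utc)
WINDOW_END = datetime(2024, 7, 19, 5, 27, tzinfo=timezone.utc)
MIN_SENSOR_VERSION = (7, 11)


@dataclass
class Host:
    # Hypothetical inventory record; field names are illustrative only.
    name: str
    os: str
    sensor_version: str      # e.g. "7.11" or "7.15.18513"
    online_from: datetime    # start of the host's online period (UTC)
    online_until: datetime   # end of the host's online period (UTC)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def in_scope(host: Host) -> bool:
    """Return True if the host matches the article's scope criteria."""
    if host.os.lower() != "windows":
        return False
    # Compare only major.minor against the 7.11 threshold.
    if parse_version(host.sensor_version)[:2] < MIN_SENSOR_VERSION:
        return False
    # Assumption: any overlap between the online period and the window counts.
    return host.online_from <= WINDOW_END and host.online_until >= WINDOW_START
```

Running `in_scope` over every record in your inventory gives a first-pass list of machines to check. Treat it as a starting point, not a definitive answer.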

One in four Fortune 500 companies experienced a service disruption due to Friday’s global IT outages and likely lost a combined $5.4 billion, according to a new report from cyber insurer Parametrix.  

The Atlantic magazine best summarized the situation when it said, “Crucial systems across the world collapsed on Friday, triggered by one mistake in a single company.”

 

How Bad Did the CrowdStrike Outage Get?

Here are just a few highlights: 

  • New York City initially estimated that 90,000 systems were affected, but the number ballooned to more than 300,000. Unfortunately, the old axiom “once bitten, twice shy” is coming into play. New York’s CTO said an automated fix recently introduced by CrowdStrike could speed up the remediation. But he doesn’t want to take software solutions from the embattled tech company at “face value.”
  • The travel industry was hit hard, as airlines and airports in Germany, France, the Netherlands, the UK, the US, Australia, China, Japan, India, Singapore, and Taiwan had major issues. Some railways, such as Union Pacific, were also affected for a time. Delta Air Lines struggled with the issue for nearly a week, with massive flight cancellations.
  • Consider this recent report from Bank Info Security: The weighted average of losses varies by industry, ranging from $6 million for a manufacturing firm to $143 million for an airline, Parametrix Solutions said. Total losses for healthcare firms will reach an estimated $1.9 billion, and for banking firms, $1.1 billion. “Companies in these sectors take 57% of the loss but account for only 20% of Fortune 500 revenues due to the uneven impact of the event on business sectors,” the company said.
  • Warren Buffett and Berkshire Hathaway’s top insurance executive warned this year about the potential for massive losses from cyber insurance policies. The CrowdStrike-caused global IT outage will be a crucial test for cyber insurance underwriters. Fitch Ratings expressed confidence that losses will not exceed $10 billion. 

 

The CrowdStrike Outage: What Should Regulators Do Now?

CrowdStrike has suffered a major reputational hit, and by association, so have all the companies that were victims of the outage. Suffice it to say that many people are asking: “How can our systems be so vulnerable to a single point of failure?”

 

Questions at the Top:

Transportation Secretary Pete Buttigieg commented at a news conference Friday in East Los Angeles: “A lot of people around the country and the world are shocked to discover that a single issue with a single piece of software can have that many knock-on implications. So … that’ll be a question that goes to the design of our systems for the long term.” 

Stay tuned. There will be more coming on this, including hearings on Capitol Hill. 

 

CrowdStrike Response:

While government investigations will undoubtedly ensue, CrowdStrike has stated that it will do better, conduct more thorough testing for its updates, and release them on a staggered basis. It’s a start, and regulators may soon require this from tech companies. 
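To make “staggered” concrete, here’s a rough Python sketch of how a ring-based rollout could work. This is an illustration only, not CrowdStrike’s actual process: the ring names, the percentages, the crash-rate threshold, and the `crash_rate_for_ring` health check are all assumptions made for the example.

```python
import hashlib

# Hypothetical rollout rings: (name, cumulative fraction of hosts covered).
RINGS = [("canary", 0.01), ("early", 0.10), ("broad", 0.50), ("everyone", 1.00)]


def host_bucket(host_id: str) -> float:
    """Map a host ID to a stable, roughly uniform value between 0 and 1."""
    digest = hashlib.sha256(host_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def ring_for(host_id: str) -> str:
    """Assign each host to exactly one ring based on its hash bucket."""
    bucket = host_bucket(host_id)
    for name, cumulative in RINGS:
        if bucket < cumulative:
            return name
    return RINGS[-1][0]  # fallback for a bucket that rounds up to 1.0


def staged_rollout(host_ids, crash_rate_for_ring, max_crash_rate=0.001):
    """Release ring by ring and stop as soon as a ring looks unhealthy.

    crash_rate_for_ring is a caller-supplied function (hypothetical health
    signal) taking (ring_name, hosts) and returning a crash rate from 0 to 1.
    Returns (hosts_updated, name_of_ring_that_halted_or_None).
    """
    updated = []
    for name, _ in RINGS:
        ring_hosts = [h for h in host_ids if ring_for(h) == name]
        updated.extend(ring_hosts)  # "push" the update to this ring
        if crash_rate_for_ring(name, ring_hosts) > max_crash_rate:
            return updated, name    # halt before the next, larger ring
    return updated, None
```

The point of the design is that a bad update only reaches the small first ring before the health check stops the release, so the damage is limited to a small slice of machines instead of all of them at once.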

 

The European Commission:

Microsoft noted that part of the security problem traces back to the European Commission. Under a 2009 agreement, the company must give third-party security app developers the same level of access to its Windows OS that Microsoft itself gets. This is great for app development and competition. Unfortunately, it also makes it impossible for Microsoft to wall off its systems when a third-party provider has a security emergency. This policy is likely to be re-examined.

 

What Can You Do to Protect Your Company from a CrowdStrike-Like Outage?

There’s still much to do before this kind of outage is a thing of the past. However, there are some things you can do to minimize the chances of this happening to your company. 

First, ask for cyber risk insurance that covers business losses from “dependent systems.” Not every cyber risk policy covers this, and it often takes specific riders to get it. For a great overview of this subject, check out the recent blog we published from our cyber risk insurance partner, EA Risk Partners.

Next, ask outage-related questions when you purchase software that will run on your systems. Ask vendors how their products are tested and, more importantly, how their updates are released. Will they be pushed through automatically? Is there a way to quarantine the tool if there’s any problem? How are help tickets handled? Do they stagger releases to avoid large-scale mistakes?

Finally, if you haven’t done a security review of all your tools in a while, it’s a great time to schedule one. You can examine your tools for vulnerabilities, opportunities to back up, and compatibility with each other.

 

Interested in Hardening Your Cybersecurity? Integris Can Help.

Our CISSP-certified vCISOs can work with your company on a retainer basis so you can harden your cybersecurity infrastructure. Contact us today for a free consultation. 

Jan Broucinek is a Security Operations Manager at Integris. With over 30 years of experience in the IT space, Jan maintains the health and safety of our clients via proactive monitoring and updating.
