CrowdStrike: Logic error caused blue screen of death on computers

The global computer outage was caused by a logic error, security firm CrowdStrike said in an analysis.

A global computer outage that halted systems worldwide has been traced back to a logic error in a routine update from cybersecurity firm CrowdStrike. The company released a configuration update for its Falcon security software, which triggered the notorious blue screen of death (BSOD) on affected Windows systems.

CrowdStrike’s Falcon platform is known for its real-time threat detection capabilities. The update in question, intended to enhance the detection of malicious activities, instead caused a system crash. According to CrowdStrike’s analysis, “The configuration update caused a logic error that resulted in a system crash and blue screen of death (BSOD) on affected systems.”

Typically, CrowdStrike issues several updates daily to adapt to new cyber threats. This specific update was designed to detect new, malicious named pipes used by malware. Named pipes are communication channels used in Windows for interprocess communication. However, the update inadvertently included a logic error, crashing the operating system.
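To make the idea of a named pipe more concrete: on Windows, a local named pipe is addressed with a path of the form \\.\pipe\<name>. The sketch below only illustrates what matching pipe names against a list could look like. The function names and the blocklist are hypothetical, and nothing here reflects how the Falcon sensor actually evaluates named pipes.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Prefix that Windows uses for local named pipes: \\.\pipe\<name>
constexpr std::string_view kPipePrefix = R"(\\.\pipe\)";

// Returns the pipe name if `path` starts with the local named-pipe prefix,
// or std::nullopt if it is some other kind of path.
// The returned view points into the caller's string.
std::optional<std::string_view> pipe_name(std::string_view path) {
    if (path.size() <= kPipePrefix.size() ||
        path.substr(0, kPipePrefix.size()) != kPipePrefix) {
        return std::nullopt;
    }
    return path.substr(kPipePrefix.size());
}

// Hypothetical check: is this pipe's name on a caller-supplied blocklist?
// Exact, case-sensitive matching is an assumption made for brevity.
bool is_flagged_pipe(std::string_view path,
                     const std::vector<std::string>& blocklist) {
    const auto name = pipe_name(path);
    if (!name) {
        return false;  // not a named pipe at all
    }
    return std::any_of(blocklist.begin(), blocklist.end(),
                       [&](const std::string& entry) {
                           return std::string_view(entry) == *name;
                       });
}
```

A real detection engine would rely on far richer rules than an exact-match list; the point is only that a named pipe is identified by a name that software can inspect.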

The update’s logic error affected systems running Falcon sensor for Windows version 7.11 and above that were online between 04:09 and 05:27 UTC on July 19, 2024. The configuration files, referred to as “Channel Files,” are integral to Falcon’s behavioral protection mechanisms. These files reside in the C:\Windows\System32\drivers\CrowdStrike\ directory and have names starting with “C-”. The problematic file, Channel File 291, was designed to target malicious named pipes but instead caused a system crash.
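The two criteria above, the file location and the exposure window, can be expressed as simple checks. This is a minimal sketch rather than CrowdStrike tooling: the function names are hypothetical, the path comparison is case-sensitive for brevity (Windows paths are not), and treating both ends of the time window as inclusive is an assumption.

```cpp
#include <string_view>

// Directory named in CrowdStrike's analysis.
constexpr std::string_view kChannelDir =
    R"(C:\Windows\System32\drivers\CrowdStrike\)";

// Returns true if `path` is a file directly inside the CrowdStrike driver
// directory whose name begins with "C-".
bool looks_like_channel_file(std::string_view path) {
    if (path.size() <= kChannelDir.size() ||
        path.substr(0, kChannelDir.size()) != kChannelDir) {
        return false;
    }
    const std::string_view name = path.substr(kChannelDir.size());
    if (name.find('\\') != std::string_view::npos) {
        return false;  // file sits in a subdirectory, not the directory itself
    }
    return name.substr(0, 2) == "C-";
}

// Returns true if a UTC time of day on July 19, 2024 falls between
// 04:09 and 05:27 (both ends inclusive, which is an assumption).
bool in_exposure_window(int hour_utc, int minute_utc) {
    const int minutes = hour_utc * 60 + minute_utc;
    return minutes >= 4 * 60 + 9 && minutes <= 5 * 60 + 27;
}
```

For example, a machine that was online at 04:30 UTC falls inside the window, while one that first came online at 06:00 UTC does not.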

CrowdStrike said it quickly identified the issue and resolved it by updating the content of Channel File 291 to fix the logic error. Even so, the outage has had a significant impact, disrupting operations across various sectors, including airlines, hospitals, and businesses.

Zach Vorhies, a former Google employee, offered his own account of the technical details behind the failure on social media. He explained, “It was a NULL pointer from the memory unsafe C++ language.” Vorhies elaborated that in C++, address 0x0 is used to signify an absence of value, and that when the program attempted to access this address, the system crashed. He added: “Programmers in C++ are supposed to check for this when they pass objects around by ‘checking for null’.”
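The kind of “checking for null” Vorhies describes can be illustrated with a short C++ sketch. It is a generic example, not CrowdStrike’s code: the PipeRule type and the pattern_length function are hypothetical names introduced only for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical type standing in for any object passed around by pointer.
struct PipeRule {
    std::string pattern;
};

// Returns the length of the rule's pattern, or std::nullopt when no rule
// was supplied. The explicit nullptr check is the guard in question:
// without it, rule->pattern would dereference a null pointer.
std::optional<std::size_t> pattern_length(const PipeRule* rule) {
    if (rule == nullptr) {
        return std::nullopt;  // absence of a value, handled explicitly
    }
    return rule->pattern.size();
}
```

Calling pattern_length(nullptr) returns an empty optional instead of crashing, which is the behavior the check is meant to guarantee.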

Many users and organizations expressed frustration over how the faulty update passed quality control. Vorhies pointed out that modern tools could prevent such issues by automatically checking for these kinds of errors. He suggested that CrowdStrike might consider moving from C++ to a safer programming language like Rust, which does not have these null pointer issues.

Recovery may take a while, as the disruption has affected businesses and services worldwide. Holidaymakers face travel delays, while scammers are exploiting the crisis with phishing attempts targeting small businesses. Millions of computers need to be fixed individually, which prolongs the recovery. Some experts have compared the outage to the pandemic, saying it highlights the need for improved resilience and contingency planning.

Posted by Alex Ivanovs

Alex is the lead editor at Stack Diary and covers stories on tech, artificial intelligence, security, privacy and web development. He previously worked as a lead contributor to the Huffington Post's Code column.