VMware Carbon Black causing BSOD crashes on Windows
Windows servers and workstations at dozens of organizations started to crash earlier today because of an issue caused by certain versions of VMware’s Carbon Black endpoint security solution.
According to some reports, systems at more than 50 organizations started to display the dreaded blue screen of death (BSOD) a little after 15:00 (GMT+1) today.
Faulty AV rule set
The root of the problem is a ruleset deployed today to Carbon Black Cloud Sensor 184.108.40.2069 – 220.127.116.118 that causes devices to crash and show a blue screen at startup, denying access to them.
Microsoft Windows operating systems impacted by the issue are Windows 10 x64, Server 2012 R2 x64, Server 2016 x64, and Server 2019 x64.
On systems impacted by the issue, the stop code may identify the error as “PFN_LIST_CORRUPT.”
Tim Geschwindt, an incident responder for S-RM Cyber, told BleepingComputer that from 15:30 (GMT+1), clients began complaining that their servers and workstations were crashing and suspected Carbon Black to be at fault.
After investigating, the researcher determined that all clients running Carbon Black sensor 18.104.22.1683 were affected. “They couldn’t boot into any of their devices at all. Complete no go,” Geschwindt said.
One admin said that they had about “500 endpoints BSOD across our estate from approx 15:15 UK time.”
It appears that there is a conflict between Carbon Black and AV signature pack 22.214.171.124.
VMware explains in a knowledge base article today that “an updated Threat Research ruleset was rolled out to Prod01, Prod02, ProdEU, ProdSYD, and ProdNRT after internal testing showed no signs of issues.”
An investigation is ongoing and the troublesome ruleset is being rolled back, which is expected to eliminate the problem.
As a temporary workaround, VMware recommends putting sensors into Bypass mode via Carbon Black Cloud Console. This enables affected devices to boot successfully so the faulty ruleset can be removed.
VMware is advising clients experiencing this issue to open a support case and include the following info: Org_Key, Device Name(s), Device ID(s), and Operating System(s).
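For admins with many affected endpoints, the details VMware asks for can be gathered into a single block of text before opening the case. The following Python snippet is only a minimal sketch of that idea: the `AffectedDevice` class, the `build_support_case` function, and all sample values are hypothetical, and it does not call any VMware or Carbon Black API.

```python
from dataclasses import dataclass


@dataclass
class AffectedDevice:
    """One crashed endpoint (hypothetical helper type, not a VMware structure)."""
    name: str
    device_id: str
    operating_system: str


def build_support_case(org_key: str, devices: list[AffectedDevice]) -> str:
    """Format the four items VMware requests as plain text for a support case."""
    lines = [f"Org_Key: {org_key}", "Affected devices:"]
    for device in devices:
        lines.append(
            f"- Device Name: {device.name} | "
            f"Device ID: {device.device_id} | "
            f"Operating System: {device.operating_system}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = [AffectedDevice("WS-001", "12345", "Windows 10 x64")]
    print(build_support_case("EXAMPLEKEY", sample))
```

The device list would have to be filled in by hand or from an existing inventory export, since the affected machines themselves cannot boot.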
Update [August 23rd, 17:50]: VMware provided the following statement to BleepingComputer shortly after this article was published:
“VMware Carbon Black is aware of an issue affecting a limited number of customer endpoints, where certain older sensor versions were impacted by an update of our behavioral preventative capabilities. The issue has been identified and corrected, and VMware Carbon Black is working with impacted customers.”
Update [August 24th, 12:02]: VMware updated its knowledge base article with steps to check whether the correct ruleset is present on endpoints not caught in a boot loop.