After CrowdStrike Outage, Companies and Governments Reassess Risks of Using Cloud

Illustration by The Epoch Times, Shutterstock, Getty Images

As companies and government agencies around the world scramble to restore their computer systems following last week’s global outage from a faulty software update, questions are being raised about whether proper protocols for updates were followed.

Simultaneously, technology analysts are raising concerns about the extent of the United States’ increasing dependence on an oligopoly of cloud computing firms.

An antivirus software update issued on July 19 by CrowdStrike, one of the largest cybersecurity companies, caused an estimated 8.5 million Windows-based computers to crash, taking down essential operations at airports, hospitals, 911 centers, police departments, trains, jails, municipal services, and corporate operations.

The company has issued multiple apologies since the event and pledged to resolve the issues, many of which cannot be fixed through system-wide updates and instead require fixes on individual computers.

CrowdStrike Chief Security Officer Shawn Henry stated in a LinkedIn post: “On Friday we failed you, and for that I’m deeply sorry.

“The confidence we built in drips over the years was lost in buckets within hours, and it was a gut punch. But this pales in comparison to the pain we’ve caused our customers and our partners.”

Cybersecurity experts have raised questions about whether CrowdStrike may have circumvented best-practice procedures when it circulated the July 19 update.

“The cautionary tale, to me, is the basics—for patches, updates, and on critical business systems, take the 10 minutes to test them,” Robert Thomas, owner of cybersecurity company 180A Consulting and a former Defense Department staffer, told The Epoch Times.

“You take one minute and you download the patch; you take another minute, you install the patch on a test system; one more minute, you reboot the system, and then you run tests against your business-critical software applications.”

The Center for Internet Security (CIS) and the National Institute of Standards and Technology (NIST) have created standard protocols for how software updates should be conducted. Had they been followed, Mr. Thomas said, the flaws in the update would have become apparent before it was circulated to users.

“Software updates, by best practice/protocol, should go through numerous stages of testing prior to touching a customer,” Tom Marsland, training and project manager of Cloud Range and author of “Unveiling the NIST Risk Management Framework,” told The Epoch Times.

“This would include automated unit testing on the code, security reviews, and testing inside of the CrowdStrike team [and] only once those actions are completed should a patch be rolled out to customers,” Mr. Marsland said.

In addition, updates should be rolled out initially to a smaller group of customers and then expanded, rather than sent out broadly all at once, he said.

“In the case of the CrowdStrike update on Friday, it does not appear those practices were followed,” Mr. Marsland said.

In its post-incident review published on July 24, CrowdStrike stated, “Due to a bug in the Content Validator, one of the two [updates] passed validation despite containing problematic content data.”
(Top) A screen informs travelers that train information is not available because of the global technical outage, at a subway station in New York City on July 19, 2024. (Bottom) People walk past flight information screens during the outage at Chicago O'Hare International Airport on July 19, 2024. Companies worldwide were affected by an outage from a faulty software update issued by CrowdStrike. Adam Gray/Getty Images

The ‘Cascading Effects’ of the Faulty Update

According to an assessment by the CIS, the effects of the faulty update became apparent just after midnight Eastern time on July 19, when computers operating on Microsoft’s Windows software that implemented updates from CrowdStrike’s Falcon security software went down.

The update circulated for about an hour and a half until the flaw was discovered and the update was “reverted,” according to the CIS.

“CrowdStrike has since issued a workaround that requires manual remediation for each affected device,” the CIS stated.

CrowdStrike quickly assured customers that the outage was not a cybersecurity attack.

“They’re saying that this isn’t a cybersecurity attack, but it had the same net result as a cybersecurity attack,” Rex Lee, a security adviser to companies and governments, told NTD News, an Epoch Times affiliate. “We’re talking about government agencies, we’re talking about Fortune 500 businesses, airlines ... the cascading effects of this are unbelievable.

“If you look at the critical infrastructure that’s being affected, this is actually going to cause harm and people may be dying as a result of this, because first responders are being affected, hospitals are being affected. We won’t know the total damage from all this, but it’s going to go down in history as the largest mistake and/or outage in the history of the internet.”

The shift by companies and government agencies to cloud computing has been rapid and continues to accelerate.

Global spending on cloud services is expected to grow by more than 20 percent in 2024, to a total of $678.8 billion, up from $563.6 billion in 2023, according to a November 2023 forecast from Gartner, Inc., a tech analytics firm.

“Cloud has become essentially indispensable,” Sid Nag, vice president analyst at Gartner, stated in the report.

But last week’s outage has highlighted how vulnerable companies and society have become, given the extent to which cloud computing services are controlled by a small number of providers.

Staff work in the server farm in the 1,450-square-meter main room of the CERN Data Centre in Meyrin, Switzerland, on April 19, 2017. Dean Mouhtaropoulos/Getty Images

A January report by Stephan von Watzdorf, a cybersecurity expert at Swiss Re, a global insurance company, highlighted the vulnerabilities of cloud services being concentrated essentially in three companies.

“A decade ago, businesses were uncertain whether the expansion of cloud computing by tech giants like Google, Microsoft, and Amazon was just a passing trend or a lasting shift,” Mr. von Watzdorf stated in the report. “Today, companies worldwide have embraced the cloud in droves, recognising it as a vital component of successful digital transformation.

“However, the concentration of services with three dominant providers has created new risks, which are relevant to the re/insurance industry.

“If the cloud services go down, the accumulation risk falls on the re/insurers offering commercial cyber insurance products.”

Societal and National Security Risks

Government agencies are also assessing the risks of cloud computing and tech consolidation.

On the day of the outage, a senior White House official stated that “the White House has been convening agencies to assess impacts to the U.S. government’s operations and entities around the country.”

Amid the rush to shift operations onto the cloud, the CrowdStrike outage will likely spur users to reassess the extent of their dependence on one or a few service providers, and their ability to weather errors by providers.

“We’re reaching the point where over-centralization makes us less ‘healable,’ and less resilient,” Mr. Thomas said. “We’re losing our resiliency as a nation.”

After the CrowdStrike outage, companies and governments are now seeing the risks, as well as the benefits.

“There are absolutely societal and national security risks from putting all of your eggs in a single vendor basket, and I think those were clearly indicated in the past 72 hours when we grounded most flights nationwide,” Mr. Marsland said.

“The benefits of the cloud versus the risks is something each organization must answer for themselves,” Mr. Marsland said. “For organizations seeking a broader customer base, the benefits absolutely do outweigh the risks—but these organizations can afford to hire experts in cloud security.”

On a personal level, individuals who store their data in the cloud also face risks.

According to a 2023 report by the Information Security Office at the University of Texas, those risks include security risks, privacy risks, and reliability risks.
Travelers wait for their delayed flight, in the aftermath of the CrowdStrike outage, by the check-in counter at Los Angeles International Airport on July 23, 2024. Mario Tama/Getty Images

Security risks include personal data being exposed “through a security breach or incompetence on the part of the cloud service provider,” the report states, as well as the sharing of personal information with other businesses, government agencies, or employees of the cloud service provider, and malware or phishing attacks that could gain access to users’ information.

The privacy policies of cloud service vendors “all reveal that the vendor, regardless of any claims to the contrary or use of encryption, has the ability to decrypt and access any stored data whenever they deem it necessary,” the report states.

For companies now seeking to get their computer systems back online, new risks have emerged from hackers looking to seize the opportunity the outage opened for them.

“While remediating affected systems, organizations should be aware that the CIS has detected numerous phishing campaigns and spoofed domains set up by threat actors in an attempt to socially engineer and compromise organizations affected by the outage,” the CIS stated.

CrowdStrike reportedly represents about 15 percent of the cybersecurity market, catering to larger organizations and second only to Microsoft, which has an approximate 40 percent market share, according to Gartner. CrowdStrike’s share price has tumbled by more than 25 percent since the outage.

Speculation has focused on the firm’s ability to weather the current crisis, retain customers, and continue to grow its business. But beyond that, CrowdStrike may be facing substantial bills from its clientele.

Law firms throughout the country have already announced investigations into damages from the outage, a likely prelude to class-action lawsuits.

CrowdStrike lost one-fifth of its stock value in the wake of the disaster. The firm promised on July 24 to reform the way it issues critical content updates.

Specifically, the company said it is planning to implement a “staggered deployment strategy” for future updates, first sending them out to just a handful of machines before a global rollout. This method is known in the industry as a “canary deployment.”
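
For readers curious what a staggered rollout looks like in practice, the following is a minimal sketch of the general canary idea, not CrowdStrike's actual system. The function name `staged_rollout`, the stage sizes (1 percent, 10 percent, then everyone), and the `deploy` callback are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    deploy: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 1.0),  # assumed stage sizes
    max_failure_rate: float = 0.0,  # assumed: any failure halts the rollout
) -> List[str]:
    """Deploy to progressively larger groups and stop at the first unhealthy stage.

    `deploy` is a hypothetical callback that installs the update on one host
    and returns True if the host is still healthy afterwards.
    Returns the list of hosts that received the update.
    """
    updated: List[str] = []
    done = 0  # number of hosts already covered by earlier stages
    for fraction in stage_fractions:
        # Cumulative number of hosts that should be covered after this stage.
        target = max(1, int(len(hosts) * fraction))
        batch = hosts[done:target]
        if not batch:
            continue
        failures = 0
        for host in batch:
            healthy = deploy(host)
            updated.append(host)
            if not healthy:
                failures += 1
        done = target
        # Halt before widening the rollout if this stage looks unhealthy.
        if failures / len(batch) > max_failure_rate:
            break
    return updated
```

With 200 hosts and an update that crashes every machine, this sketch would stop after the first two canary hosts rather than reaching all 200.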

CrowdStrike will also “enhance existing error handling in the Content Interpreter,” which is part of the Falcon Sensor.

CrowdStrike also promised to use humans to test its Rapid Response Content, add extra validation checks to the content validator, and give customers the option to decide when and where these updates are deployed.
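
The two safeguards described here, stricter validation before release and more forgiving error handling on the machine itself, can be illustrated with a short sketch. Everything in it is hypothetical: the `ContentUpdate` record, the four-field schema, and the `validate` and `interpret` functions are stand-ins invented for illustration, not CrowdStrike's content format or code.

```python
from dataclasses import dataclass
from typing import List, Optional

EXPECTED_FIELDS = 4  # hypothetical schema size, chosen only for this example


@dataclass
class ContentUpdate:
    name: str
    fields: List[str]


def validate(update: ContentUpdate) -> List[str]:
    """Return a list of problems; an empty list means the update may ship."""
    problems: List[str] = []
    if len(update.fields) != EXPECTED_FIELDS:
        problems.append(
            f"expected {EXPECTED_FIELDS} fields, got {len(update.fields)}"
        )
    for i, value in enumerate(update.fields):
        if not value.strip():
            problems.append(f"field {i} is empty")
    return problems


def interpret(update: ContentUpdate, index: int) -> Optional[str]:
    """Read one field defensively: an out-of-range index yields None, not a crash."""
    if 0 <= index < len(update.fields):
        return update.fields[index]
    return None
```

The design point is that the two checks back each other up: if a malformed update slips past the validator, the reader on the receiving end still declines to act on data that is not there.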
