The incident on July 19, 2024, brought a significant part of the global economy to a standstill. For several hours, many sectors were unable to operate or recover from the outage, underscoring the interdependence of digital systems and the risks associated with third parties. The disruption, classified as a black swan event (an unpredictable occurrence with significant consequences that seems predictable in hindsight), emphasized the need for all companies to strengthen their business continuity plans to better withstand future IT disruptions.
The incident sparked a flurry of hypotheses, with some believing it to be a cyberattack targeting system availability. However, this hypothesis was quickly dispelled by a statement from the service provider, which explained that an update to its security software had caused Microsoft Windows computers to crash with the Blue Screen of Death and fail to restart. In this article, we’ll examine the impacts of the incident and identify key strategies for ensuring robust and sustainable organizational resilience.
Immediate impact
The outage affected an estimated 8.5 million Windows devices worldwide and caused estimated financial losses of over US$5.4 billion in the US alone. Various sectors were affected, including financial services, healthcare, stock markets, transportation, education, private businesses and several government entities. IT teams were mobilized because automatic troubleshooting and rollback were not possible: a manual workaround had to be applied directly on users’ computers.
Impacts over the following days:
- Shutdown of operations, customer complaints and overloading of call centers and customer service departments
- Long wait times for on-site IT support
- Triggering of business continuity solutions and manual workarounds
- Unusually heavy demands on IT team resources, long working hours, fatigue and stress
The importance of business continuity
As IT infrastructures become ever more complex and cyberattacks ever more sophisticated, organizations need to be proactive about organizational resilience in the face of a major incident or crisis. Preparedness in crisis management, business continuity and IT recovery makes it possible to:
- Minimize losses and damage to the organization and its employees in the event of a crisis
- Avoid legal problems, as non-compliance can lead to fines and lawsuits
- Ensure that the organization achieves its minimum business continuity objective (MBCO), thereby protecting its reputation and standing
Drawing on best practices from different industries as well as international standards such as ISO 22301, we recommend the following steps to build greater resilience and better prepare for managing various disruptions:
- Understand the organization’s continuity needs: Become familiar with the mapping of processes and activities and carry out a risk analysis and a business impact analysis (BIA). Evaluate maximum tolerable downtimes (MTD) and define recovery time objectives (RTO).
- Define continuity strategies: Compare options to determine the most suitable continuity solutions. Establish a business continuity policy for the organization.
- Deploy continuity solutions: Define and draft detailed incident response, crisis management, business continuity and IT recovery plans (redundant systems, failover and rollback capability), create adaptable scripts and establish clear communication channels.
- Test and train: Develop awareness campaigns, training courses and exercises for employees.
- Maintain and improve the continuity mechanism: Administer the business continuity management system (BCMS), and carry out continuous improvement and periodic reviews.
- Assess service providers and third parties: Thoroughly evaluate the crisis management, business continuity and IT security capabilities of critical partners.
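To make the first step more concrete, the sketch below shows one way the results of a business impact analysis could be checked: it flags any activity whose recovery time objective (RTO) exceeds its maximum tolerable downtime (MTD) and orders activities for recovery by shortest MTD. This is a minimal illustration only. The `Activity` class, the function names and all figures are hypothetical assumptions, not part of ISO 22301 or of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One business activity from a BIA (hypothetical structure)."""
    name: str
    mtd_hours: float  # maximum tolerable downtime
    rto_hours: float  # recovery time objective


def rto_violations(activities):
    """Return activities whose RTO does not fit within their MTD."""
    return [a for a in activities if a.rto_hours > a.mtd_hours]


def recovery_order(activities):
    """Order activities so that those with the shortest MTD come first."""
    return sorted(activities, key=lambda a: a.mtd_hours)


# Illustrative figures only, not taken from any real organization.
activities = [
    Activity("payroll", mtd_hours=72, rto_hours=48),
    Activity("customer call center", mtd_hours=4, rto_hours=6),
    Activity("online payments", mtd_hours=2, rto_hours=1),
]

for a in rto_violations(activities):
    print(f"RTO exceeds MTD: {a.name} ({a.rto_hours}h > {a.mtd_hours}h)")

print("Recovery order:", [a.name for a in recovery_order(activities)])
```

In this example, the call center would be flagged because its planned recovery time is longer than the downtime the organization can tolerate, which signals that either the continuity solution or the objective needs to be revisited.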
It is essential for organizations to be adequately prepared to minimize interruptions and maintain quality service for their clients. This preparation also contributes to building a more sustainable and resilient society, capable of withstanding increasingly frequent crises. By adopting these measures, you’re contributing not only to the stability of your organization, but also to the creation of a more robust global environment.
Arnaud Mangematin, Director, Consulting Services
Gaby Abou-Haidar, Resilience Consultant
Serge El-Hage, Security Consultant