In a world where it seems that all types of events and incidents are daily occurrences, the idea of managing risk is one that all of us are familiar with. That said, different people have different ideas as to what it means to “manage risk”.
Some see the practice of managing risk as merely purchasing insurance to lessen the financial impact of a loss. Others look to identify risks within a given operation and bring awareness to leadership. Then there are those who have a robust program in place that categorizes risks, controls, and ownership, and that maintains disciplined practices to identify, measure, and mitigate the risks that exist within the operational environment.
The first two approaches (simply purchasing insurance, or identifying risk components and assigning ownership of them) are not enough.
It is crucial in this day and age that we identify, understand and thoroughly prepare our organizations and stakeholder groups to deal with Critical Risks.
Critical risks are not Black Swans. They are more like Gray Rhinos: highly probable, highly impactful, and all too often neglected when it comes to identifying and planning for them.
“Gray rhinos are not random surprises, but occur after a series of warnings and visible evidence.”
– Michele Wucker.
So, if critical risk falls into the category of a Gray Rhino, the question becomes: what is Critical Risk Management? Is this the new hot thing to replace Operational Risk Management or Enterprise Risk Management? Let’s explore and find out.
Defining Critical Risk Management (CRM)
Critical Risk Management is the practice of managing risk for events that can cause grave damage to an organization and result in serious outcomes such as fatalities or widespread outages. These events are the “Show Stoppers” within your organization: if they become reality, there is a high potential that the organization will not survive them.
Critical Risk Management is not meant to replace other programs that manage risk within your organization. It is an additional component of a robust Enterprise Risk Management (ERM) program, one that makes sure these Show Stoppers are understood, that proper defenses are in place, and that the organization understands the conditions that must be met before interacting with the risk(s).
Traditional Viewpoints vs. New Viewpoints
Traditional definition of risk:
A probability or threat of damage, injury, liability, loss, or any other negative occurrence that is caused by external or internal vulnerabilities, and that may be avoided through preemptive action.
– Business Dictionary
Traditional way to calculate risk:
Impact x Likelihood = Inherent Risk
Inherent Risk – Controls = Residual Risk
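The two traditional formulas can be sketched in a few lines of Python. This is only an illustration: the 1-to-5 scoring scale and the idea of expressing controls as a single point reduction are assumptions, since the formulas themselves do not prescribe any numeric scale.

```python
# Illustrative sketch of the traditional calculation.
# Assumption: impact and likelihood are scored 1 (very low) to 5 (very high),
# and controls are expressed as points subtracted from the inherent score.

def inherent_risk(impact: int, likelihood: int) -> int:
    """Impact x Likelihood = Inherent Risk."""
    for name, value in (("impact", impact), ("likelihood", likelihood)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return impact * likelihood


def residual_risk(inherent: int, control_reduction: int) -> int:
    """Inherent Risk - Controls = Residual Risk (floored at zero)."""
    if control_reduction < 0:
        raise ValueError("control_reduction must not be negative")
    return max(inherent - control_reduction, 0)


# Hypothetical inputs: high impact, very low likelihood, controls worth 2 points.
inherent = inherent_risk(impact=5, likelihood=1)
residual = residual_risk(inherent, control_reduction=2)
print(f"Inherent: {inherent}, Residual: {residual}")
```

Note how a single multiplication hides everything behind the two inputs: the score says nothing about which conditions produced it.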
The traditional way is a good starting point and is an accurate way to measure financial risk. However, for Critical Risk, this method makes it difficult to truly understand the variables, conditions, and capacity that the organization must have to successfully interact with the risk.
For example, think of assessing risk for driving to Disney World on a family vacation. We think of traffic patterns, the driver, and the road conditions. Truly understanding variables, conditions, and capacity means that we would also evaluate weather, tire tread, and the mechanical state and safety features of the vehicle. Finally, we would want to understand the response time and availability of roadside assistance, ambulances, and medical facilities along the way. This way, the entire ecosystem, with its variables and conditions, is accounted for, giving you the most comprehensive understanding of the risk and its possible impacts.
New definition of risk:
The degree to which the Organization and/or the Operational Actor faces operational uncertainty.
— Erick Anez
New definition of risk management:
The identification and management of the Organization’s Capacity in order to ensure proper interactions with risk(s) while operating in complex and adaptive environments.
— Erick Anez
New way to calculate risk:
Asset/System + Hazard/Threat + Human Component + Likelihood = Inherent Risk
Inherent Risk (+ / -) Pathway(s) (+ / -) Controls = Residual Risk
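One way to read the new formulas is as an additive score with signed adjustments. The sketch below is an interpretation, not the author's scoring method: the class names, the field names, the numeric scores, and the convention that a positive delta raises risk while a negative delta lowers it are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Adjustment:
    """A pathway or control with a signed effect on risk (hypothetical model)."""
    name: str
    delta: float  # positive raises risk, negative lowers it


@dataclass
class CriticalRiskAssessment:
    # Assumption: each component is scored on the same numeric scale.
    asset_system: float
    hazard_threat: float
    human_component: float
    likelihood: float
    pathways: List[Adjustment] = field(default_factory=list)
    controls: List[Adjustment] = field(default_factory=list)

    def inherent_risk(self) -> float:
        """Asset/System + Hazard/Threat + Human Component + Likelihood."""
        return (self.asset_system + self.hazard_threat
                + self.human_component + self.likelihood)

    def residual_risk(self) -> float:
        """Inherent Risk (+/-) Pathways (+/-) Controls, floored at zero."""
        adjustment = sum(a.delta for a in self.pathways + self.controls)
        return max(self.inherent_risk() + adjustment, 0.0)


# Hypothetical scores for a single aircraft type.
assessment = CriticalRiskAssessment(
    asset_system=3, hazard_threat=4, human_component=3, likelihood=1,
    pathways=[Adjustment("Maintenance cycles", 1.0),
              Adjustment("Flight cycle", 1.5)],
    controls=[Adjustment("Pilot training and checks", -2.0),
              Adjustment("Verified maintenance schedule", -1.5)],
)
print(assessment.inherent_risk(), assessment.residual_risk())
```

Keeping each pathway and control as its own named entry means the final number can always be traced back to the conditions that produced it.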
Let’s make all of this real with an example.
Here is a high-level Critical Risk Assessment for airline accidents. Airline X is looking to assess the risk of an airline accident. Here are some facts about Airline X:
- Fleet of 200 Aircraft (5 different types: Boeing 747, 737-88, MD 80, MD 88, MD 90)
- Crew of 2,500 Pilots
- Average of 1,250 flights per day
- Operates out of 10 U.S. Airports
Do we have enough information to assess the risk of an airline accident? Under the traditional way to measure risk, we would run the following assessment:
Inherent Risk: Medium-Low
- Impact = High
- Impact is high due to the likelihood of a large number of fatalities, and the reputational damage stemming from the accident may be severe enough to cause operations to cease.
- Likelihood = Very Low
- According to the NTSB, Bureau of Transportation Statistics, there were 140 plane accidents during 2012-2016 and only 1.4% of those (2 accidents) resulted in fatalities.
Controls:
- Scheduled and Routine Maintenance
- Checklists, Procedures, Safety Training (Crew)
- Airport Controls; Bird Strike Prevention
- Weather monitoring/reporting
- Runway monitoring software
- Evasive maneuvers
- Emergency Landing Techniques
- Reinforced windshields on aircraft
Residual Risk: Low
New way to calculate risk:
Asset: Boeing 747
- Age of Aircraft
- Mechanical History
- Aircraft type recalls, mechanical issues, previous incidents
- Maintenance History
System: Integrated Avionics
- Operating system known issues
- Operating system performance – Historical Performance
- Recalls, bugs, industry performance/issues
Hazards/Threats:
- Mechanical Failure
- Weather (Wind, Thunderstorms, etc.)
- Lightning Strike
- Sabotage
- Terrorism
- Electrical Fire
- Maintenance Negligence
- Aircraft Design & Manufacturing Defects
- Airline Corporate Negligence
- Air Traffic Control Negligence
- Runway Issues
- Object/Animal Strike
- Collision with Other Aircraft
- Pilot/Crew Member Intentional Crash
Human Component:
- Experience and Reliability of Pilots
- Experience and Reliability of Maintenance Crews
- Experience and Reliability of Airport Operations & Air Traffic Operations
- Passengers
- Ground Crew
Inherent Risk: Rate the inherent risk on a Risk Level Scale. Feel free to modify the levels to match your program.
Pathways: What are the ways that the organization interacts with the risk(s)?
- Maintenance Visits/Cycles
- Flights – Entire Cycle from boarding through landing at arriving destination
Controls (in addition to those mentioned under traditional view):
- Pilot Training, Health & Background Checks
- TSA Security Controls
- Safety Controls of Operating System
- Safety Controls of Air Traffic Control systems
- Fleet Maintenance Scheduled (Verified to match best practices & FAA requirements)
- Aircraft Safety Components
Residual Risk: Inherent Risk (+/-) Total of Pathways & Controls. Each pathway and control would have to be assessed and scored, remembering that more information is also necessary.
Rate the residual risk on a Risk Level Scale as well. Feel free to modify the levels to match your program.
Bear in mind that an assessment similar to the above would have to be performed for each type of aircraft in order to identify differences in the Assets and Controls.
These assessments will show Airline X the true measurement of risk per aircraft type. These can be aggregated to create an overall assessment of the risk.
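The text does not say how the per-type assessments should be aggregated, so the following is only one possible sketch. It assumes each aircraft type ends up with a numeric residual risk score and combines them as a fleet-size-weighted average, while also reporting the worst-scoring type so that a single high-risk type is not averaged away. The fleet counts and scores are hypothetical; only the aircraft type names and the fleet total of 200 come from the example.

```python
from typing import Dict, Tuple

# Hypothetical per-type results: (aircraft in fleet, residual risk score).
fleet: Dict[str, Tuple[int, float]] = {
    "Boeing 747": (20, 9.0),
    "737-88": (90, 6.0),
    "MD 80": (40, 8.0),
    "MD 88": (30, 7.0),
    "MD 90": (20, 7.0),
}


def aggregate(per_type: Dict[str, Tuple[int, float]]) -> Tuple[float, str]:
    """Return (fleet-weighted average score, name of the worst-scoring type)."""
    total_aircraft = sum(count for count, _ in per_type.values())
    if total_aircraft == 0:
        raise ValueError("fleet must contain at least one aircraft")
    weighted = sum(count * score for count, score in per_type.values())
    worst_type = max(per_type, key=lambda name: per_type[name][1])
    return weighted / total_aircraft, worst_type


overall, worst = aggregate(fleet)
print(f"Overall residual risk: {overall:.2f}; highest-risk type: {worst}")
```

Other choices, such as weighting by flights per day instead of fleet size, would be equally defensible; the point is that the aggregate stays traceable to the per-type assessments.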
Why this new way of calculating Critical Risk is so essential for your Crisis Readiness
Knowing the whole picture of your operating environment and all of its components will better place your organization to ensure safe and reliable performance in the complex and adaptive environments in which we operate today.
Sadly, if you take the time to research the reasons behind airline accidents, you will find that most of them (55%) are classified as “Human Error”. It is worth noting that no person, system, or organization is perfect. If we design systems that count on the Operational Actor (Human Component) to be “Perfect”, we have in fact designed flawed systems.
Systems must be robust enough to ensure safe and compliant usage, but nimble enough to ebb and flow with the types and severity of hazards and threats. By labeling 55% of these incidents as Human Error, we fail to learn where we can increase defenses in order for these events to not repeat themselves. We can either blame the operator or we can learn from the event; we cannot do both.
Choosing to learn from the event, adapt and evolve is the Crisis Ready® way.
Crisis Ready® organizations hold themselves accountable and as such, choose to learn from events that lead to both successes and failures. Which type of organization are you today? What type of organization do you aspire to be tomorrow?
It is our responsibility as Crisis Management professionals to “Hold up the Mirror” to our organizational leaders and ask if they like the organization’s reflection. It is by taking this approach that we can get from point A to point B and ensure readiness, preparedness, accountability, and true resilience.