Quantitative Risk Management with FAIR — Communicating Risk

Now that we’ve been through the calculations and arrived at a number for our risk exposure, let’s discuss how to communicate it.

Remember, you’re no longer in the land of “qualitative risk management”, where all you’re asked to do is position a risk in a 4×4 matrix. You have numbers and a way to deal with the uncertainty of the loss event, so your communication should reflect that.

Generally, you’ll want to focus mostly on “most likely” values, while also alluding to what you perceive as the maximum, and rounding the numbers makes them more legible too. So in the previous post, instead of writing “£481,453.07” of Loss Magnitude, we could and should just communicate “£480,000”.

Translating the Loss Event Frequency into a readable format also helps. What I mean by this is, instead of saying “the most likely realisation of this risk can happen 0.5 times a year”, you should say “I believe this risk can materialise from once every 10 years, to a maximum of once a year in its current state. The most likely materialisation would be once every 2 years, according to the assessment”.
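
As a small illustration of the kind of translation I mean, here’s a sketch in Python. The helper names (round_for_comms, frequency_in_words) are my own hypothetical inventions, not part of any FAIR tooling:

```python
import math

def round_for_comms(amount: float, sig_figs: int = 2) -> str:
    """Round a loss figure for communication, e.g. 481453.07 -> '£480,000'."""
    if amount == 0:
        return "£0"
    magnitude = math.floor(math.log10(abs(amount)))
    factor = 10 ** (magnitude - sig_figs + 1)
    return f"£{round(amount / factor) * factor:,}"

def frequency_in_words(per_year: float) -> str:
    """Turn an annualised frequency into plain language, e.g. 0.5 -> 'once every 2 years'."""
    if per_year >= 1:
        return f"about {per_year:g} time(s) a year"
    return f"once every {round(1 / per_year)} years"

print(round_for_comms(481_453.07))  # £480,000
print(frequency_in_words(0.5))      # once every 2 years
print(frequency_in_words(0.1))      # once every 10 years
```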

Preparing the 3 stories

FAIR is predicated on a minimum, most likely and maximum value for most of its calculations precisely because this allows the analyst to consider multiple ways a risk can materialise and the business impact it could have.
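
As a quick refresher on how those three values typically drive the numbers, here’s a minimal Monte Carlo sketch in Python. It reuses estimates from this series (a Loss Event Frequency between 0.1 and 1 per year with 0.5 most likely, a most likely Loss Magnitude of £480,000 and a £1M worst case); the £50,000 minimum and the use of a triangular distribution in place of the beta-PERT curves FAIR tooling usually prefers are simplifying assumptions:

```python
import numpy as np

# Min / most likely / max estimates; the £50,000 minimum is assumed for illustration.
LEF_MIN, LEF_MODE, LEF_MAX = 0.1, 0.5, 1.0            # loss events per year
LM_MIN, LM_MODE, LM_MAX = 50_000, 480_000, 1_000_000  # £ per loss event

rng = np.random.default_rng(seed=1)
N = 100_000  # simulation runs

# Triangular sampling as a simple stand-in for beta-PERT.
lef = rng.triangular(LEF_MIN, LEF_MODE, LEF_MAX, size=N)
lm = rng.triangular(LM_MIN, LM_MODE, LM_MAX, size=N)

annualised_loss = lef * lm  # FAIR: Risk = Loss Event Frequency x Loss Magnitude

for pct in (10, 50, 90):
    print(f"P{pct}: £{np.percentile(annualised_loss, pct):,.0f}")
```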

I propose presenting a very simple write-up of what’s included in each of the 3 levels, so senior management can understand the basis for the calculations. In our Ransomware scenario, I’d word it as follows:

Minimum value — this is a scenario where we only have one asset infected and all the recovery processes run smoothly. However, as we have similar controls across the IT estate and most processes haven’t been tested, it’s unlikely it would be this smooth.

Most likely — in this scenario, most of our other IT devices of the same category also get infected, and we have a more realistic recovery and response at this wider scale, given that we don’t have experience or playbooks for this type of incident.

Maximum — in this scenario, the worst case is considered, which mainly translates into our inability to recover our Collaboration drives from our untested backups and the need to manually rebuild from whatever data we can get our hands on. There’s continued impact for about a month, and recovery never reaches the same level of operational insight we had prior to the incident.

This sends a strong signal that you, as the analyst, have made an effort not just to portray “doomsday scenarios” where the world falls apart, but that you’ve considered a range of different levels of impact, taking into consideration the available data.

Using qualifiers

With Open FAIR, there are 2 types of qualifiers, used to portray additional considerations that aren’t typically reflected in the numeric data.

These are fragile qualifiers and unstable qualifiers.

Fragile qualifiers represent “conditions where Loss Event Frequency is low in spite of a high Threat Event Frequency, but only because a single preventative control exists. In other words, the level of risk is fragile based on a single point of failure” (The Open Group, 2014).

In our scenario, because we rely on a traditional Anti-virus product while new vulnerabilities and variants keep being found in use by cyber criminals, we can assert that our level of risk is fragile.

Unstable qualifiers are “used to represent conditions where Loss Event Frequency is low solely because Threat Event Frequency is low. In other words, no preventative controls exist to manage the frequency of loss events” (The Open Group, 2014).

In our scenario, as we don’t have appropriately filtered Internet access, we can say that the level of risk is also unstable.

These qualifiers provide further context to the decision-maker: if we were to look only at Loss Event Frequency across many of our risks, particularly when those frequencies are low, stakeholders could be lulled into a state of complacency or a false sense of security.
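
To make that concrete, here’s a hedged sketch of how the two definitions above could be encoded alongside an assessment. The RiskQualifiers class and the 0.5 “low” threshold are illustrative assumptions of mine, not Open FAIR artefacts:

```python
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class RiskQualifiers:
    tef_per_year: float          # Threat Event Frequency
    lef_per_year: float          # Loss Event Frequency
    preventative_controls: int   # independent preventative controls in place

    LOW: ClassVar[float] = 0.5   # illustrative 'low' threshold; set per organisation

    @property
    def fragile(self) -> bool:
        """LEF is low despite a high TEF, held down by a single control."""
        return (self.lef_per_year < self.LOW
                and self.tef_per_year >= self.LOW
                and self.preventative_controls == 1)

    @property
    def unstable(self) -> bool:
        """LEF is low only because TEF is low; no preventative controls exist."""
        return (self.lef_per_year < self.LOW
                and self.tef_per_year < self.LOW
                and self.preventative_controls == 0)

# e.g. frequent threat events, one Anti-virus control keeping LEF down:
print(RiskQualifiers(tef_per_year=2.0, lef_per_year=0.2,
                     preventative_controls=1).fragile)  # True
```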

Living in a Qualitative culture and communicating accordingly

Just because your organisation is “stuck” in a qualitative risk management approach, it doesn’t mean that you, as the risk analyst, need to feel stuck there too. FAIR is about improving the strength, repeatability and thoroughness of your own assessments.

If you decide to adopt the FAIR methodology to bring structure to your risk assessments, once you’ve finished them you can always go find the 3×3 or 4×4 matrices your organisation may use and plot your assessment onto them. Just make sure that the scale of loss magnitudes, in particular, is approved by management (i.e. a ‘Red risk’ could start at an impact of £1M or £100M, and it should be someone in a business capacity setting that risk appetite).
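
If you do that translation repeatedly, a tiny mapping helper keeps it honest. In this sketch the band boundaries are placeholders; as noted above, the real thresholds must come from management’s approved risk appetite:

```python
# Assumed boundaries between the four bands of a 4x4 matrix -- placeholders
# to be replaced with management-approved thresholds.
IMPACT_BOUNDS = [100_000, 1_000_000, 10_000_000]  # £
FREQUENCY_BOUNDS = [0.1, 0.5, 1.0]                # loss events per year

def band(value: float, bounds: list[float]) -> int:
    """Return a 1-4 band index for a value given three boundaries."""
    return 1 + sum(value >= b for b in bounds)

def plot_on_matrix(lef: float, loss_magnitude: float) -> tuple[int, int]:
    """Map a quantitative (LEF, LM) pair onto 4x4 matrix coordinates."""
    return band(lef, FREQUENCY_BOUNDS), band(loss_magnitude, IMPACT_BOUNDS)

# Our most likely values: LEF 0.5/year, Loss Magnitude £480,000
print(plot_on_matrix(0.5, 480_000))  # -> (3, 2)
```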

Capacity and Tolerance for Loss

“Capacity for loss is an objective measure, whereas tolerance for loss is subjective” (The Open Group, 2014).

In the words of the great and wise Rocky Balboa, capacity for loss is “about how hard you can get hit, and keep moving forward”. So it’s about how much financial impact the organisation can absorb while still staying out of bankruptcy and administration. This typically relates to the existence of capital reserves or the availability of supplies, so that, in the event of financial impact or operational failure, the organisation can cope with those events.

Tolerance for Loss is more subjective, as it’s not so much about what’s financially possible as about what senior or executive management feels comfortable with. For instance, you could have the financial capacity to withstand the £1M loss event from a Ransomware attack (as in our scenario), but executive management may be more risk averse to such a loss and decide to invest in improving the control environment so that the worst case scenario can be reduced to a more palatable financial impact.
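
The comparison itself is trivial to write down, which is rather the point: capacity is a number you can look up, tolerance is a number management has to choose. In this sketch the £1M worst case comes from our scenario, while the reserve and tolerance figures are assumptions:

```python
worst_case_loss = 1_000_000   # maximum Loss Magnitude from the assessment
capital_reserve = 5_000_000   # assumed: objective capacity for loss
loss_tolerance = 500_000      # assumed: what management is comfortable absorbing

within_capacity = worst_case_loss <= capital_reserve   # objective test
within_tolerance = worst_case_loss <= loss_tolerance   # subjective, set by management

print(f"Within capacity: {within_capacity}; within tolerance: {within_tolerance}")
# -> Within capacity: True; within tolerance: False => invest in controls
```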

Developing business cases for Risk treatment

Risk management is pursued in order to inform decision-making at the appropriate levels within an organisation.

Following a similar structure and approach (whether quantitative or qualitative, though the former is less prone to analyst bias) allows us to assert which risk issues are most critical to the organisation. That gives us a framework we can use to prioritise risk treatment: it lets us identify the risks we won’t worry about, which should then undergo formal acceptance by a senior stakeholder, and it informs the budget for how much to spend on addressing a particular risk by mitigating it wholly or partially. If a risk is deemed too great, another viable option is to avoid it, which typically entails “stop doing the risky thing” and so often isn’t an acceptable business practice. In our Ransomware scenario, risk avoidance would mean “stop providing customer service”, which isn’t a business-oriented response or practice.
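
A simple way to frame that budget conversation is to compare the annualised exposure before and after a proposed control against the control’s cost. In the sketch below, the current figures are our scenario’s most likely values, while the residual figures and the £60,000 control cost are illustrative assumptions:

```python
def annualised_exposure(lef_per_year: float, loss_magnitude: float) -> float:
    """FAIR: risk = Loss Event Frequency x Loss Magnitude."""
    return lef_per_year * loss_magnitude

current = annualised_exposure(0.5, 480_000)   # most likely values from our scenario
residual = annualised_exposure(0.1, 300_000)  # assumed effect of, e.g., tested backups
control_cost = 60_000                         # assumed annual cost of the treatment

benefit = current - residual
print(f"Expected annual risk reduction: £{benefit:,.0f} vs cost £{control_cost:,.0f}")
print("Worth taking to management" if benefit > control_cost else "Hard to justify")
```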

Conclusion

In this article series, we started from a hypothetical Ransomware scenario and a set of assumptions, and worked our way through determining Loss Event Frequency and Loss Magnitude for that scenario, taking into consideration both credible industry reports and an assessment of our control environment for dealing with that particular threat.

Knowing our business expenses allowed us to look at different impact levels and the financial implications of operational downtime and recovery efforts, so we could make some assertions about the minimum, most likely and maximum impact we could expect from a Ransomware attack materialising, particularly with reference to the 6 types of loss from FAIR (Productivity, Response, Replacement, Fines and Judgements, Competitive Advantage and Reputation).

In this last post, we had a look at how to communicate these numbers and, hopefully, how to build a case for improving your control environment or otherwise reducing the risks your organisation is exposed to.

Hope you’ve enjoyed it!