David Skegg is a chartered Fellow and life member of the Australian Institute of Health and Safety and former lecturer at the Central Queensland University Accident Forensics Laboratory. He’ll share his thoughts about mapping systems resilience at a REBOK lunchtime webinar on Tuesday 15 December 2020. Register here.
What is systems resilience and how is it different to reliability and redundancy?
Resilience is the ability of a system to give you a desired outcome in abnormal circumstances. This is different to redundancy and reliability. A reliable system will give you a consistent output within the parameters of its design. A resilient system goes beyond this by providing an acceptable output even in circumstances that weren’t considered in the design.
Redundancy makes a system more reliable by duplicating system components. For example, the Boeing 747 has four engines even though it can fly on one. This means it has one chance in a billion of critical systems failure. It doesn’t necessarily make it a resilient system, though.
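The effect of duplication on reliability can be illustrated with a little arithmetic. The sketch below is a minimal illustration, not a real aircraft calculation: the function name `prob_system_fails` is hypothetical, the 0.01 failure probability is a made-up figure, and it assumes each component fails independently of the others. That independence assumption is exactly what abnormal circumstances tend to break, which is one reason redundancy alone does not amount to resilience.

```python
import math


def prob_system_fails(p_fail: float, n: int, k_required: int = 1) -> float:
    """Probability that fewer than k_required of n components are working.

    Assumes every component fails independently with probability p_fail
    (an illustrative assumption, not data from any real system).
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n < 1 or not 1 <= k_required <= n:
        raise ValueError("need n >= 1 and 1 <= k_required <= n")

    p_work = 1.0 - p_fail
    # Sum the binomial probabilities of exactly i components working,
    # for every i that is too small to keep the system running.
    return sum(
        math.comb(n, i) * p_work**i * p_fail ** (n - i)
        for i in range(k_required)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-component failure probability
    for n in (1, 2, 4):
        print(f"{n} component(s), 1 required: {prob_system_fails(p, n):.2e}")
```

With one working component required, the failure probability falls from `p` to `p**n` as components are duplicated. A single event that disables every component at once sits outside this calculation entirely.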
Why is systems resilience important for engineers?
Engineers need to consider what they want their systems to do in abnormal circumstances. Many designs, particularly those for electrical and mechanical systems, include a failsafe mechanism. This causes the system to shut down in unexpected conditions which could cause an accident. But this isn’t resilience. Resilient systems continue to work safely when things don’t go as predicted. This includes situations where people do unpredictable or dangerous things.
Can you give an example of a resilient system?
One current example is the ability of small businesses such as coffee shops to pivot from their usual practices and continue to trade under COVID-19 restrictions.
Another example is the process medical practitioners go through when deciding to prescribe medication to patients. Medical practitioners base their systems and thinking on the probability that the medication will have the desired effect on the patient’s condition. However, they allow for variables such as dosages and patients’ physiology, and try to make sure the treatment will still have an acceptable result if these variations come into play. On the other hand, medical equipment is usually designed for reliability rather than resilience.
How can we get better at designing resilient systems?
In a perfect world, systems would never fail, but our world is not perfect. Just think of the number of systems that need to work together and absorb variations for a mobile phone conversation to succeed.
Nassim Nicholas Taleb wrote about a concept called the ‘Black Swan’. This is an event that you can’t forecast, which has catastrophic outcomes. If your system is sufficiently resilient, Black Swans can’t occur. This is because the system is capable of absorbing all of the variations of all of its parts. Erik Hollnagel’s work in functional resonance analysis is also useful. It allows for complexity, because we live in a very complex system.
Who do you think would benefit from your webinar?
My webinar will focus on ways to measure systems resilience. The secret is measuring over specific time intervals. The method I’ll discuss is establishing a grid of functional system components. Managers will benefit, as it will help them understand how best to allocate resources to make sure their systems can cope with the shock of the unexpected. It will also help them think about how much resilience they want to build into their systems and why. Engineers will also benefit, as it will help them understand how to design and monitor resilient systems.