Tuesday, January 14, 2014

Unknown Unknowns – Being Safe While Dealing with Uncertainty

In the early 1980s, following major industrial accidents such as Three Mile Island, Charles Perrow proposed an interesting theory called “Normal Accident Theory.” The idea behind Normal Accident Theory is that, because of advances in technology and increasingly complex social structures, more and more people are operating in complex environments. This complexity creates situations where it is difficult, if not impossible, to predict all the ways a system can fail, and if we can’t predict all the ways the system can fail, then we can’t necessarily prevent all of the failures. Therefore, the theory proposes that we should essentially get used to major accidents happening (hence the name “Normal Accident Theory”) and use each accident to learn things about the system that are virtually impossible to learn through other means.

The theory is an interesting one, to say the least, but we here at SCM are not so pessimistic as to say that accidents should be considered “normal.” (And we should say that a number of very interesting and useful theories of organizational risk management have come as a response to Normal Accident Theory, such as High Reliability Organization Theory and Resilience Engineering.) However, Normal Accident Theory does a good job of highlighting an important concept that many safety professionals don’t directly address – uncertainty.

We would all intuitively admit that none of us knows everything there is to know about the operations that we oversee. It’s not necessarily that we can’t know everything (although that’s sometimes the case); often we just don’t have the time or resources. This is the source of our uncertainty. Our lives, our organizations, and our worlds are far too complex. We sometimes think we understand everything, but most of the time this is a gross oversimplification. Often even knowing all the constituent parts within a system is not good enough. You must also understand how each component relates to the other components, and how each of those relationships in turn affects the rest of the system.

Can uncertainty lead to risk? Well, according to the International Organization for Standardization (ISO) it absolutely does: ISO defines risk as the “effect of uncertainty on objectives.” Even if you don’t agree with this definition (many don’t), you do have to admit that failure to address uncertainty can lead to increased risk. For example, one of the organizational issues in the NASA Columbia disaster was how the organization dealt with anomalies, such as foam breaking off the external tank and striking the orbiter, which was the direct cause of the accident. These anomalies were sources of significant uncertainty, and the failure to account for that uncertainty led to increased unmitigated risk, ultimately leading to disaster.

Many organizations we have looked at have highly developed safety management systems built upon a solid foundation of prevention, which is extremely important. However, a big flaw in many of these management systems is that they fail to account for uncertainty. Their prevention programs account for all the things they know about, but they don’t think about the unknown unknowns, the things they don’t know about.

The problem, of course, becomes one of self-delusion: an organization may point to its prevention programs and say that it is operating safely, when in fact the unknown unknowns are pushing the organization dangerously close to the edge of a cliff and a major incident is just around the corner.

So how do we deal with uncertainty in our organizations? We must build resilience into our management systems. Organizations and people have a tendency to push boundaries, so we must ensure that we don’t let them push too far, and we must always build in extra capacity to adapt to uncertainty. Some ideas for engineering resilience into our management systems include:
  • Ensuring robust reporting systems are in place to identify weak signals, which may be the only signals we get of impending danger. 
  • Training employees not only how to handle normal operations, but also how to handle surprises in the workplace (which includes, but is more than, emergency response training). Your employees will naturally adapt to circumstances, but maybe not in the way you would like them to unless you give them the tools to do so.
  • Ensuring that prevention through design and management of change processes include an allowance for uncertainty. Design in safety factors or buffers, and ensure that operators understand why those safety factors or buffers are in place and the consequences of violating them.
  • Considering uncertainty in your risk assessment processes. When assessing an operation, don’t only consider the things that you know are hazards. Are there things you don’t know or aren’t sure of? How do they, or how could they, influence the risks you may face?

There are more ideas that you can incorporate, and we certainly want to avoid going around being the proverbial Chicken Little, but we also need to understand that there is always a level of uncertainty in our operations. If we build resilience into our systems, we may be able to deal with the risks from that uncertainty.
