Thursday, September 1, 2016

Where Does the Failure Come From?

This is a rather basic question, but rather profound if you think about it. Where does the failure come from? For the safety professional, identifying the sources of failure in organizations, the causes of accidents, is crucial. The answer to this question guides where we spend our time, what problems we look for and what solutions we choose.

To illustrate this, from our perspective, there are two basic viewpoints on the answer to the question of where failure comes from. For the sake of simplicity, we will call them the Broken Parts and the Functioning System perspectives. Each perspective has its own assumptions as to what causes accidents, what causes safety, where the problems in organizations are and what the safety professional should do about them. We will discuss each perspective in turn.

The Failure Comes from Broken Parts

The first perspective, the perspective held by many in the safety profession, is that when an organization experiences failure, such as an accident, the failure happened because something broke or failed. Everything was working fine, but then something or someone screwed up and that caused the failure. So, for example, an employee is injured when she puts her arm into a piece of equipment and is partially pulled in. The failure, from the perspective of the Broken Parts camp, comes from erratic employees. The employee should not have put her arm in the machine to begin with. The broken part in this case is the employee, usually her decision-making process. If it weren’t for her decision, the accident would not have happened.

What causes accidents and what causes safety?

From the Broken Parts perspective, accidents are caused when something in the organization deviates from the intended design. This could be from equipment that fails, but most likely is from a person deviating from the rules or procedures put in place.

Conversely then, safety is created when everything and everyone follows the plan. It is only when we deviate that accidents happen. The organization itself is basically safe. It is only through deviation that unsafety creeps in.

Where do we look?

Following from the above, when an accident happens, all we have to do is look for those parts of our organization that deviated from the intended design. This is the so-called “root cause” or “root causes”.

If we want to prevent accidents, all we need to do is prevent deviations and variability in performance. We want things to be standard and uniform. Only when things begin to vary in their performance do we have problems.

What do we fix?

Again, it follows from the above that our job is to find those parts that are broken or in the process of breaking and either fix them or replace them. Usually the issue is one of human behavior, because people are typically less reliable than machines. So we need to fix the individual’s behavior through the usual behavior interventions, up to and including termination. Often we don’t go that far though, so we just implement more controls in the form of rules, policies, procedures, observations, audits, etc. If we decide to get more “enlightened” then we look for opportunities to engineer the human out of the system entirely through automation.

The Failure Comes from the Functioning System

The second perspective, one not held by many in the safety profession so far, is that failure results from how the system normally functions. Organizations are always working in a fluid, imperfect, resource-constrained world, which forces them to balance competing goals. This process of balancing is remarkably successful…until it’s not. Essentially, failure and success have the same causes. Using the example from above, the employee put her arm in the machine not because of a disregard for safety rules, but because there was no other way available to her to do the task. The organization simply couldn’t shut the machine down for that task without cutting its production almost in half. Further, this task was routinely done, multiple times a day, around the clock, every day without incident. So the day her arm was pulled into the machine was a day like any other…except that on this day the things that normally happened came together in abnormal ways to create the accident.

What causes accidents and what causes safety?

In the Functioning System perspective, accidents are an unintended consequence of normal performance variability. Put another way, accidents are an outcome that was designed into your system (as David Woods says, systems work as designed, but not as intended). People have to make trade-offs in order to function in an imperfect, complex and resource-constrained world. In such an environment, deviations and variability in performance are normal and often required in order to get the job done. This does not make them right, but on days when you have accidents and on days when you have none, you will have both deviations and variability.

Safety is created in organizations not when we force them to meet an unrealistic standard, but when we help facilitate successful performance. By helping people make better trade-offs and smarter adaptations, and by designing systems that work with people rather than constrain them, we create expertise and safety.

Where do we look?

Following from the above, when an accident happens it provides us an opportunity to see how our system produced an outcome that we didn’t expect or intend and change that system. Essentially we are looking for how the system normally functions and why that functioning led to this negative outcome. This will tell us where the opportunities for improvement are.

Because success and failure have the same causes, it makes no sense to wait for an accident to happen before we look. In the Functioning System perspective, similar things happen on both the days you do and the days you do not have accidents. So you can learn just as much on days you have no accidents as you can on days you do have accidents. Looking for those parts of your system where work becomes difficult, where people have to overcome obstacles in order to get work done, will help you find where risk is creeping into your system and where you can make work easier to get done.

What do we fix?

Finally, in the Functioning System perspective, focusing on any one individual to improve things in the future makes no sense. People don’t fail like machines do, so blame seems a bit nonsensical in this light. But we obviously want fewer accidents and better performance in the future. So we look for ways to make it easier for people to accomplish goals, to make sure that the proper resources are readily available (from the perspective of the worker, not your perspective) and to streamline workflows in a way that makes success possible in varying conditions (i.e., resilient). We are more interested in facilitating performance than in constraining it, and in harnessing the ability of people to adapt to their circumstances to achieve success. What this looks like will obviously vary depending upon the context of the work.

What’s Your Perspective?

In safety we often do not reflect on our worldview or mental models and how those can guide us down a path to where certain problems and solutions seem obvious and others seem crazy. We think it’s probably a good idea to take a step back every now and again to identify what your perspective is and ask whether it’s leading you in the direction you’re happy with. Many times, these worldviews are constructed in such a way that it’s very hard to identify the flaws in them from the inside. It’s only when we step outside ourselves, often with the help of others, that we see them. But we think it’s well worth it.

Clearly from the above, we think that when it comes to the question of where failures come from, the Functioning System perspective is better, although it is currently not the most popular one. What do you think? Where does the failure come from?


  1. The failure comes from not managing the balance of competing goals...the failure is risk management.

    If the goal is to make money, then there is no failure if one dies obtaining this goal...if the goal is fast production...then there is no failure in pressure...if the goal is safety and then production takes its place...then there is a failure. If the goal is equal safety and production...then both will lose under the competing goals of success, and it is a failure.

  2. Hi Ron,

    Reading your blog I see those two perspectives coming from two all too familiar different mindsets.

    More than three decades ago, after serving as a production manager in a chemical plant for some twelve years, I was appointed as the new Safety Officer of that particular plant.
    As a line manager I attempted in those twelve years to change the mindset of top management, who clearly had what you call a ‘broken part mindset’, and what I call the ‘Heinrich mindset’, without success. The only thing I could do was to use in my division what you call the ‘Functioning System mindset’ and what I call the ‘Bird mindset’.

    Since it was customary in those days that the Safety Officer ‘investigated’ all serious accidents that occurred in the plant, I saw an opportunity when a serious accident happened just after I had taken on the Safety ‘responsibility’. For the investigation of that accident I used not the domino theory of Bird, but MORT of Bill Johnson. MORT is unique in that it lets you use both perspectives: the broken part AND the Functioning System. Using MORT, I proved that the System didn’t function AND that the broken parts were not only the behavior of the worker but even more the behavior of management, and most of all the ‘oversight and risk’ mindset of top management.

    Do I have to say that it was the last time I was able to use MORT?

    Johan Roels

