Wednesday, April 29, 2015

“There Are Lies…” – The Problem With Accident Cause Statistics

In 1931, Herbert William Heinrich released his seminal book “Industrial Accident Prevention – A Scientific Approach”. For safety professionals this book and its subsequent revisions have provided a foundation for many of the models and assumptions we put into practice. For example, the idea of direct versus indirect costs of accidents came from Heinrich. The triangle showing a relationship between the number of near-miss events and serious accidents? That’s Heinrich too. He’s the one that gave us the pyramid.

One theory of Heinrich’s that has been particularly influential is his 88-10-2 theory. Heinrich allegedly looked at thousands of accidents and determined that in 88% of cases the cause was “unsafe acts”, 10% were attributable to “unsafe conditions”, and 2% were “acts of God”. This provided what seemed like definitive proof of where the main problem in safety lies – people. Other statistics related to the human contribution to accidents have come forth from other sources, such as 80%, 96%, and even 100%. The story seems pretty consistent. People are the main issue leading to our accidents.

These findings lead to some obvious conclusions. For example, when there’s an accident the safe money is on the fact that the cause is a person. So look there first. Also, we should spend our time trying to fix the people, since they are clearly the broken part that is leading to most accidents in our organizations. And you can see that the safety profession has taken these conclusions to heart, with most accident investigations identifying human error as the cause and much effort spent in the safety profession on trying to fix people through behavior modification (of one sort or another).

In 1906, in Chapters From My Autobiography, Mark Twain popularized the saying, which he attributed to Benjamin Disraeli, that “there are three kinds of lies: lies, damned lies, and statistics”. The point he was making was that people tend to be fascinated by numbers. They sound authoritative. They resolve ambiguity. In the midst of speculative arguments, someone will bring in the cold, hard truth of statistics and no one can argue against it. The problem is that we have a tendency to believe these numbers without question.

Take the accident cause statistics cited above. They sound pretty authoritative, don’t they? Especially since they all seem to point in a certain direction (i.e., people being a problem to control). But just because you can quantify data (give it a number) doesn’t mean that the data is objective. What the statistics above do not tell us is anything about the myriad assumptions that form their foundation.

Consider, for example, that in each case we’re only told that “unsafe acts” or “human error” are the causes; we aren’t told how those terms are defined. This is not trivial, because if there is no agreed-upon definition of these terms then the statistics may be nothing more than someone’s attempt to turn a best guess into a number. And, indeed, this is what we find. There is no real operational definition of “unsafe acts” or “human error”. Often the only time everyone agrees on whether an action was “safe” or “unsafe” is after an accident has happened, which isn’t really helpful for a profession designed to prevent accidents.

A further assumption that underlies the accident cause statistics is the idea that human behavior can be separated from its environment in a meaningful way. For us to have separate categories for unsafe actions and unsafe conditions, that means that one cannot affect the other (i.e., an action cannot affect a condition and a condition cannot affect an action), or else the water gets muddy and counting becomes impossible. And, indeed, we do find that this assumption does not hold up to scrutiny. Trying to understand a person’s behavior without understanding the context they are in is like trying to understand hand clapping by only studying one hand. People’s behavior affects and is affected by their conditions in profound, often unconscious, ways. So, in a sense, this means the only accurate statement we can make is that all accidents are caused by a unique combination of human-environment interaction. (Of course this statement is not very useful, but facts are not determined by their convenience.)

The final problem with these statistics is not based on an assumption made by those who put forth the statistics. Rather this is an assumption made by people who take these statistics to heart in order to justify modifying people’s behavior. The assumption is that if we determine that a lot of, most of, or all accidents are caused by human behavior (using some definition) that this means that we need to fix the behavior. In a sense, we are equating the “cause” with the “problem to be fixed.” This isn’t necessarily true (and speaks to the problems with our relentless pursuit of “cause” at the expense of learning).

If it is true that behavior is influenced by other things, what if the behavior we’re seeing is only the result of something else? To use a linear analogy, we see the accident as the effect, and we find the cause to be an unsafe act. But wouldn’t the act then also have to have a cause? (Of course, this logic is problematic, because we could trace it all the way back to creation, which speaks to the problems with linear thinking and the idea of root cause.) People are hard to change. The organization has only cursory control over them, and safety professionals often have even less control. But the organization has almost total control over the context of work (both physical and social), which has a dramatic effect on behavior.

The bottom line is that saying that behavior is the cause of most accidents is a spurious claim at best, and even if it were true, it is not really that interesting. It is time for safety professionals to abandon this idea that we need to isolate the “root cause” of most accidents. Accidents rarely happen because of one thing. Instead accidents result when normal things come together in abnormal ways. If this is true, our focus should be on how these normal things are coming together, how they relate to one another, rather than on the things themselves. To put this in behavior terms, rather than saying that the accident was caused by an unsafe act, we should instead focus on understanding why the behavior made sense to the person in that moment. Doing this will move safety past the false security of statistics, toward real progress.

Wednesday, April 22, 2015

Whatever Happened To Personal Responsibility?

We were having a conversation with a colleague recently about enforcement policies within organizations. The colleague was asking about our opinion regarding “3 strikes” policies (or any variation thereof), where an employee is progressively disciplined for a violation and after the third violation he/she is automatically terminated. These are similar to zero tolerance policies, where there are certain “golden rules” that if an employee were to violate them they would always be terminated without question. In our view these policies are misguided, because they discourage thinking and looking at the context of the violation. In general, almost any policy that discourages thinking is bad. Those policies that discourage thinking and lead to irrevocable negative consequences for people are particularly bad.

Many times in this blog we have discussed the need to take the emphasis off employee behavior as a cause of accidents (for example, here, here, here, and here). The idea that human error (which includes violations in our book) causes accidents and that if we can only get people to follow the plan then everything will be ok is problematic, at best, and foolish, at worst. The response we get is often favorable, but there are some who are highly critical of our line of reasoning. Often these folks point out that we’re removing personal responsibility from the equation, that we’re letting people off the hook unjustly. We are criticized as being unrealistic and perhaps just a bit too idealistic.

Here’s the thing though – even though the discussions surrounding the new view of human error and human performance often end up revolving around ideas of justice and personal responsibility, these are really side issues. Sure they’re important, but we aren’t lawyers. We are safety professionals. So justice is less important to us than making sure that we take steps to make our organizations safer. And if our viewpoint is correct, if many of the issues we find in our organizations are not because of bad apples, but due to poorly designed work environments and processes, that means punishing people not only doesn’t fix the problem, it may make things worse! Our position is that if someone violates a rule or makes a mistake this is almost always just a symptom of a bigger problem in the organization. Often the direct issues lie in competing goals, poorly written procedures, or inadequate management. Indirect contributing issues often include a misplaced, unquestioning trust in our ability to plan every aspect of a job.

A big problem we find is that people often don’t consider other alternative explanations for what they find. They accept their first explanation without testing it (what you look for is what you find). Consider what is often a counter-example that is brought to our attention – the case of the repeat violator. Often people point out that when you have one person who routinely violates the rules, and they are the only person that we have problems with, this automatically invalidates our thought process. But couldn’t an alternative explanation be that the only issue with this employee is that they are terrible at hiding their violations, whereas the rest of the workforce is not? Did we investigate that possibility or did we just assume that the problem is localized to that one “bad apple”? Often the truth is more with the latter than the former.

Don’t get us wrong - we are not saying that people are perfect, and we are also not saying that there is never a time where it would be appropriate to punish an individual. Clearly there are times when punishment, including termination, is appropriate. However, this should only be considered when alternative explanations have been considered and dealt with. Furthermore, we firmly believe that these alternative explanations should be considered not by people removed from the work environment, but by people who are familiar with the complexity of how normal work takes place in your organization. Essentially, what we’re saying is that punishment shouldn’t be decided by mindless adherence to rules. Rather, people should be judged by a jury of their peers (meaning people who see the world from a similar perspective). We expect no less in society, but for some reason we’re ok with dictatorial and arbitrary exercises of discipline in our organizations. This needs to stop.

Our position is not that personal responsibility should be abandoned. Rather it should be expanded. We should recognize that our role is not to punish violators. We are not police officers. Our job is safety, so what we are doing should help the people in our organizations be safe. We need to expand our view following a case of human error/violation from looking only at the individual to looking at the work process and the organization. How are these contributing to the problems we found? Are the problems localized or rampant throughout the organization? How would we know if they were? What can we do to make the rules easier to follow? What do the co-workers of the individual think of the violation?

These are the kinds of questions we should be asking following a rule violation, not simply “how many strikes does this make?”

Wednesday, April 15, 2015

The Map Is Not The Territory – Thinking Differently About Job Safety Planning

During a recent assessment we were doing at a fairly large and complex organization, we were provided a document from the organization that showed a listing of the risks that the organization was aware of and actively managing. One of the listings related to health and safety risks to employees and stated that the source of the risk included the organization not designing a safe workspace and not ensuring that employees followed safety procedures. This makes sense on its face and is a laudable goal overall (especially the part about design). A lot of safety professionals and managers have the mindset that if only we could get employees to follow the safety rules, procedures, plans, etc. that we have laid out for them, we would achieve the results we want.

In 1931, Alfred Korzybski delivered a paper to the American Association for the Advancement of Science at a conference in New Orleans. In the paper Korzybski made the point that “the map is not the territory.” Maps are just abstractions, or models, of reality. They are a snapshot with only a finite amount of detail. As the saying goes, all models are wrong, but some are useful. In the same way, by reducing the detail of the territory to something that is easy to read, as a map needs to be, a map trades accuracy for usability. And even if we were somehow able to create a map with perfect accuracy, a true snapshot, our world is constantly changing. So after a time, our perfect map will become inaccurate.

What are job safety plans (by this we include procedures, rules, job plans, job hazard analyses/job safety analyses, critical task analyses, permit systems, etc.), if not “maps” that direct an employee on how to do a job “safely”? All of these are supposed to be tools to help the employee navigate the complexity of the job in a way that helps them avoid injury. But, if we apply Korzybski’s dictum, in the same way that the map is not the territory, our job safety plans are not the job. They will always be abstractions or models of the job to be done. We will always shed some detail to enhance the readability and usability of the plan, and, even if we didn’t shed the detail, jobs change over time, so our plans quickly become inaccurate, if they ever were accurate at all. Like the best cartographers, the best we can hope for is “accurate enough.”

The problem is that in the safety profession we have somehow taken our plan as gospel and equated it with safety – if we follow the plan we are safe; if we deviate, we are unsafe. We have built our management systems on the principle that the secret to safety is better planning. If we have a failure (e.g., an accident) it’s because either someone deviated from our plan or our planners didn’t do a good enough job (notice the underlying tone of human failure and blame that is woven into that narrative!).

But this all assumes that our plan will encompass every situation that we face, and that is impossible! The old military adage “no plan survives contact with the enemy” is something most have heard of, but in the safety profession we have ignored its lesson. By assuming that if we have a failure it has to be a problem with those executing the plan or those developing the plan, we have ignored an obvious third option – the problem could be our faith in plans.

Now we are not arguing that we should dispense with all forms of job safety planning. The point is that the idea that we can plan for everything is utterly false. If that’s true then deviations from our plan should not surprise us. That doesn’t mean that they are acceptable deviations, but that also doesn’t mean they are necessarily bad. A deviation could just be people having to adapt the plan to reality, the same way someone would have to adapt a map to the territory they are navigating.

So how should we approach planning if plans are always inaccurate? As Dwight Eisenhower said, “In preparing for battle, I have always found that plans are useless, but planning is indispensable.” Think about that quote for a second – plans are useless, planning is indispensable. Pretty profound, right? What Eisenhower meant is that we should see plans for what they really are – the outcome of a planning process. And it’s the planning process that is useful, not the plan itself. Why? Because the planning process involves people doing what they do best – thinking through their world, solving problems, and collaborating. Conversely, plans lead people to operate at their worst – mindlessly conducting tasks.

In line with this philosophy, here are some tips for enhancing your planning processes:
  • Whenever possible involve people with diverse opinions. You want a healthy mix of technical expertise and real-world experience. At a minimum, always involve people who actually do the work.
  • Identify signals that people doing the work can use to quickly identify that the reality of the work is different than the plan, and give them tools necessary to adapt to those situations (including, but not limited to “stop work” authority).
  • Identify any critical steps in the process (i.e., steps that if done incorrectly would lead to irrevocable and unacceptable harm) and analyze those closely to build in defenses that help operators complete these tasks successfully (e.g., removing distractions, ensuring adequate resources, etc.).
  • Consider implementing debriefing sessions for the task, so you can learn from normal, successful operations. What worked? What didn’t? What surprised you? Etc.

Wednesday, April 1, 2015

“It Can’t Happen Here” – Distancing through Differencing

Recently, while doing research for a project, we happened upon a news article following a train derailment in Spain where a train hit a curve too fast and derailed. In the article an expert in US rail explains why such a train derailment is unlikely to happen in the US. For example, he explains that, particularly in the Northeast of the country, there are a lot of curves and tunnels. This, the expert points out, makes it harder for a train engineer to become “complacent or bored.” Additionally, US trains have an auto-stop feature, which should stop the train if a train engineer misses a signal. So train riders in the US have nothing to fear.

Less than 5 months later, a train operating in the Northeast part of the country derailed, killing four and injuring 67. The cause? An engineer fell asleep and didn’t slow down in time for a curve.

What gives? We thought trains in the US were supposed to be safe.

The problem is not really with the expert’s judgment. Everything he said is essentially true (to a degree). The problem is with a common reaction many people have to accidents. Cook and Woods call this “distancing through differencing.” Essentially, it’s the tendency for people, following a major accident, to look for reasons why the same accident could not happen here. They do this by highlighting how details specific to the accident are not present in their environment. Essentially, they distance themselves from the accident by highlighting how different they are.

Others have called this “playing the ‘it can’t happen here’ bingo card”. John Downer described this phenomenon related to the Fukushima nuclear disaster – the tendency for those conducting risk assessments for the nuclear industry to “disown” the fact that such a disaster calls into question their methods.

This tendency to separate ourselves from the accidents and disasters we hear about is not really surprising. If we admit that the bad accident that happened over there can happen here, it has several unsettling implications. Our world may be more uncertain and scarier than we hope it is. We might have to admit that our own implicit and explicit assessments of our safety are not accurate. We might have to change the way we think and act as a result (oh the horror!).

But this need to believe we’re safe and secure, that all the problems are over there, with those other people, may be making us less safe. We get into a false sense of security because we start to think that the differences are what makes us safe. Take the example of the rail expert above. He listed reasons why, in his opinion, the US rail industry was safer than the one in Spain. Essentially, he posited a theory about rail safety – that more curves, tunnels, and automated stopping mechanisms were what was missing in the Spain accident and therefore must be what’s keeping the US safe. So, instead of looking at the accident in Spain and trying to figure out how we are similar here in the US (e.g. we both have engineers who are humans, subject to significant boredom when doing rote tasks), instead of going out and asking our train engineers if something like that could happen here, and then making adjustments to reduce risk, we decided it couldn’t happen. And then we did nothing.

But we were wrong. It not only could happen, it did happen.

This is a consistent problem we see in the safety profession – believing that something is keeping us safe with very little evidence to back up that assumption. Think of all the things you do as a safety professional. Where is the evidence that what you’re doing actually will achieve the intended results? What if all the things that you think are making you and your workers safe actually aren’t having any effect at all? What if, as Peter Senge says, bad things are happening not in spite of what we’re doing, but because of what we’re doing?

If we play our “it can’t happen here” bingo card when accidents happen in other places, we limit learning. It is time for the safety profession to wage war on anything that gets in between us, the organizations we serve, and learning. This idea that we have it all figured out, that we know how our organizations are operating and what they need to do to be safe, needs to go the way of the dodo. This means we need to take advantage of any opportunity we have to learn about how systems and people operate. Here are some recommendations to get you started on the learning process:
  • The next time a major accident happens have a discussion with your employees about what happened. Have them identify the similarities and what it would take for a similar accident to happen at your organization.
  • Look at your organization and identify any policies, procedures, or processes that may discourage learning. For example, any policy that ties incentives to accident numbers, or any requirement that puts blame on workers for an accident. Get rid of them. If you’re scared to do this, just try it on an experimental basis. Get rid of them for a few months and see what happens.
  • Most days in your organization go by without an accident. Why? We all have our theories about what’s keeping us safe, but we don’t know for sure until we get out and identify how work is actually happening. Go out and observe normal work. Look at all the things that could go wrong and right. What’s causing those things to happen or not happen? What is really keeping you and your workers safe?