In March of 2005 the BP Texas City Refinery experienced one of the worst industrial disasters in the history of the United States, in which 15 people were killed and about 180 were injured. The accident sent shockwaves through the occupational safety and health world, primarily because refineries such as the Texas City Refinery appeared to have such a great safety record (at least in terms of the standard way people define safety – incident statistics). Just as with other such accidents, we all sought to understand how such a horrible tragedy could occur.
Well, at a very basic level the Texas City Refinery explosion was caused by a violation of procedures. One of the biggest precipitating causal factors was that the operators starting up the equipment after a maintenance shutdown manually controlled the flow of materials into the process unit to keep the level inside the tower (the one that eventually overflowed) at 9 feet, whereas the procedures for that unit required the operators to control flow with an automated system that would keep the level in the tower at 6.5 feet. This violation of procedures led to a cascade of events that culminated in the tower filling to the top (about 158 feet), followed by a release and subsequent explosion.
A pretty cut-and-dried case of a violation. Intentional unsafe behavior led to the accident. Name, blame, shame, and retrain.
Of course, it’s never that simple. If we take a step back and recognize that these operators were likely not stupid and that they likely didn’t want to die or get anyone else killed, we can see a much clearer (and more interesting) picture of what went on here.
So first – why did the operators knowingly choose to violate a procedure? Well, let’s first identify two criteria that are necessary and sufficient for someone to knowingly violate a procedure or rule. First, the person has to believe that no significant punishment will come to them. This punishment can be formal, such as discipline, or informal, such as the procedure violation leading to an accident. So the operators in this case believed that if they violated the procedure there was no significant risk of being disciplined and no significant risk of an accident happening. Did they not understand that overflowing the tower would be a significant risk? Of course they did. But we have to understand that there’s a large difference between 9 feet and 158 feet. What could be the harm in raising the level a few feet?
Second, to choose to violate a procedure the person has to believe that the violation will lead to some positive outcome. What was the potential positive outcome here? The operators frequently violated the procedure because doing so allowed them to avoid the risk of damaging sensitive equipment in the process and to start the overall process more efficiently. Why would they do this? Because the procedures didn’t match their reality and because they were rewarded for efficiency and production.
So what’s the real picture we see here? We see the operators responding to the pressures of the organization, poorly written procedures, an abstract and poorly defined concept of the risks they face, and a poorly designed and maintained system. The interesting thing is that despite these poor conditions our workers usually find ways to achieve success.
Obviously, addressing the deeper system issues includes looking at the goals of the organization, which is very important in this case. But how can we help the operators on the ground make better decisions in these contexts in a meaningful way? The lesson from Texas City is that operators are not robots that you can program to act in a certain way. Humans take in information from multiple sources and balance that information in order to come up with actions that make the most sense to them in the moment. This process is extremely important and is very effective most of the time (people usually don’t fail, they usually succeed). Our job is to help our employees make better decisions in those moments in two key ways:
Make sure employees have a clear picture of reality and the risks they are facing. In Texas City the operators were denied that picture through a combination of poor maintenance (e.g. high level alarms failing) and poor design decisions (e.g. the level indicators on the tower did not read levels above 9 feet). If the operators in Texas City had been able to know exactly how much product was in the tower, they might have been able to stop the cascade of events from happening. But not having any idea how much product was actually in the tower created a situation where the operators’ mental model did not match reality, leading to further mistakes. Giving employees a better picture of the risks involved in an operation, in a meaningful way, gives them the tools they need to make better decisions in those moments when they need to choose between production and safety. If employees perceive the risks as high enough because we’ve provided them with a clear and accurate picture of the situation, then they are far more likely to make the safe decision.
Make the system forgiving of mistakes and errors. A simple question – why was the tower even designed to allow an inflow of product that could fill the whole tower? Couldn’t they have designed the system so that, similar to your bathtub, at a certain level the product flows into an overflow tank?
So the next time you see a procedure violation, don’t let your first reaction be to blame the employee. What is the context in which the violation occurred? What are the competing goals and realities that the employees had to face? Was the broken rule or violated procedure consistent with the reality of the work to be done? How were the risks of a violation made obvious to them or hidden from them? Answering these questions can go a long way toward not only making your employees safe, but also maximizing human performance in your organization.