P.E., CFSE, Global Functional Safety Consultant, aeSolutions
Have we really learned anything from previous accidents?
There have been many well-publicized process industry accidents over the last several decades, and articles and books have been written about many of these events. However, most of those books essentially reviewed what happened on the day of the particular accident. As interesting as that information might be, simply publicizing it has not helped industry significantly reduce the number of accidents.
Engineers are trained in a classical style of thinking that emphasizes Newtonian cause and effect. If there was an effect (i.e., an accident), there must have been a cause. We then hunt for the broken part, or the individual who made an error, and place blame. The more serious the event, the more blame there must be. Accidents, at least in the process industry, are much too complex to place blame in such a manner; in fact, doing so could be considered unethical. We need to understand why people were doing what they were doing at the time (a.k.a. local rationality), how things evolved over time, and how such “normalization of deviance” became the new normal. The accident in Bhopal, India in 1984 is a classic example; the event was more than five years in the making.
What process safety management and Jenga have in common
Think of the 14 parts and subparts of the OSHA Process Safety Management regulation as a Jenga tower/game with 14 levels. How many people and plants truly believe they have all the pieces in place, or that they are all 100% effective? Perhaps your facility is more like a tower with a quarter of the pieces missing. What’s deceptive is that such a tower is still standing. Everyone then naturally assumes they must be OK. Yet even a child would realize the tower is not as strong or as resilient as before. Might we be able to measure, visualize, or judge the resiliency of the tower and determine if it were getting close to toppling?
Process plants obviously aren’t as simple as a Jenga tower. Organizations are complex, messy, and usually have conflicting goals. For example, NASA’s motto was actually “Faster, Better, Cheaper”. As wonderful as that may initially sound, can any organization actually be all three? How much can you sacrifice one goal to achieve the other two? Such decisions are rarely clear. Jens Rasmussen wrote that complex systems are bounded by three constraints: economic, workload, and safety. Imagine them as slightly overlapping circles. Beyond the economic boundary, the system cannot sustain itself financially. Beyond the workload boundary, the people or technology cannot perform the tasks they are supposed to. Beyond the safety boundary, systems will functionally fail.
The goal is to remain within the space bounded by all three. The three areas may overlap a great deal, leaving plenty of slack (i.e., a resilient organization), or they may overlap very little, so that even a small change can trigger a major breakdown (i.e., a brittle organization). Keep in mind that the circles, and your location within them, move over time. Organizations are never static; they drift. And as Sidney Dekker wrote, they may drift into failure.
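The boundary model above can be made concrete with a small numerical sketch. Everything here is an illustrative assumption, not part of Rasmussen's work: each constraint is given a made-up normalized margin (1.0 means far from the boundary, 0.0 means sitting on it), and overall slack is taken to be the tightest margin.

```python
# A minimal sketch of the three-boundary idea. The margin values and
# the "brittle" threshold are invented for illustration only.

def slack(margins):
    """Overall slack is limited by the tightest constraint."""
    return min(margins.values())

def classify(margins, brittle_below=0.2):
    """Label the operating point by its remaining slack."""
    s = slack(margins)
    if s <= 0.0:
        return "outside a boundary"
    if s < brittle_below:
        return "brittle"
    return "resilient"

# Hypothetical snapshot: economically healthy, safe, but overloaded.
snapshot = {"economic": 0.6, "workload": 0.15, "safety": 0.5}
print(classify(snapshot))  # the workload margin dominates -> "brittle"
```

The design point is that a single healthy-looking number (e.g., profit) says little; the minimum over all three margins is what determines how close the organization is to a boundary.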
How control theory could help
Control theory could be used to help determine whether a facility is inside the ‘sweet spot’. It could even be possible to visualize this, show the values changing over time, and provide it as a tool to management. A partial list of items to track could include the following:
- Safety factors: # of safety loops in bypass, # of safety loops past proof test intervals, % of actual failure rates greater than assumed rates, % of demand rates on safety functions higher than assumed, # of near hits, # of toxic and combustible gas releases, etc.
- Workload factors: Staff vacancies, maintenance backlog, % of hazard assessment recommendations not closed out, known reliability problems, not meeting projected equipment efficiencies, not meeting projected production quotas, etc.
- Economic factors: Meeting budget or not, making a profit or not, etc.
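A minimal sketch of tracking such factors over time might look like the following. The indicator names, values, and the drift rule (flag any indicator whose latest reading is worse than the average of its recent history) are all illustrative assumptions, not an established method.

```python
# Hypothetical drift monitor: flag indicators that are trending worse.
# Here "higher = worse" is assumed for every indicator.
from statistics import mean

def drifting(history, window=3):
    """Return indicators whose latest value exceeds the average of
    the preceding `window` readings."""
    flagged = []
    for name, series in history.items():
        if len(series) <= window:
            continue  # not enough history to judge a trend
        baseline = mean(series[-window - 1:-1])
        if series[-1] > baseline:
            flagged.append(name)
    return flagged

# Made-up monthly readings for two of the factors listed above.
history = {
    "loops_in_bypass": [1, 1, 2, 4],          # safety factor
    "maintenance_backlog": [40, 42, 41, 41],  # workload factor
}
print(drifting(history))  # -> ['loops_in_bypass']
```

Even a crude rule like this turns the factors above from a one-time audit snapshot into a trend that management can watch, which is the point of the control-theory framing.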
If the above factors are not monitored regularly, a one-time compliance audit could at least yield similar results at that particular point in time. However, the ‘tipping point’ of the Jenga tower cannot be clearly predicted. Just as it took doctors years of research to determine healthy vs. unhealthy levels of cholesterol and blood sugar, it will either take years to determine such factors in the process industry, or management will simply have to decide what percentage of missing PSM Jenga pieces they are willing to tolerate.
Additional ways of preventing accidents
Diversity of thought and experience. High-reliability organizations (e.g., aircraft carriers, nuclear power plants) consist of people and groups who have the authority, credibility, and courage to stand up and say “no”. Managers should avoid surrounding themselves with ‘yes’ men, and bad news must be able to travel up through an organization. Utilizing trusted outside consultants for some important decisions would be one way of imparting diversity.
Utilize Professional Engineers. Not all engineers are licensed; the “industrial exemption” still remains. Licensed engineers are held to a higher standard of professionalism and ethics. When there is a catastrophic accident, there are often calls to change regulations and demand that Professional Engineers (PEs) be involved in, or at least oversee, all such future work. While PEs are obviously not infallible, their involvement would add a helpful level of diversity of thought and experience.
Involve outsiders in management of change activities. Effective management of change is required by regulations worldwide. Most accidents were not the result of a single, simple change. Bhopal was a classic example of many changes, made over many years, by many different people, all operating with the best of intentions. Outsiders may provide a fresh perspective on changes: they may think very differently about a proposed change than insiders do, and they can help insiders calibrate what is “normal” in the rest of industry.
Utilize outsiders for assessments. While a one-time assessment can’t reveal everything about a complex system/organization (and the potential result of all the possible interactions), it can help an organization realize how off-center it might actually be.
For the full paper and listing of recommendations and conclusions: