Showing posts with label TMI. Show all posts

Tuesday, April 26, 2016

A Professor's Essay on Nuclear Safety Culture

Prof. Najmedin Meshkati recently published an article* that reviews how the Chernobyl and Fukushima disasters demonstrated the essential need for a strong safety culture (SC) in the nuclear industry.  The article is summarized below.

He begins by reminding us the root cause of the Chernobyl accident was a deficient SC, a problem that affected not only the Chernobyl plant but also permeated the entire Soviet nuclear ecosystem. 

Fukushima is characterized as an anthropogenic accident, i.e., caused by human action or inaction.  He contrasts the fate of TEPCO’s Fukushima Daiichi plant with the Tohoku Electric Power Company’s Onagawa plant.  Onagawa was closer to the earthquake epicenter than Fukushima and faced a taller tsunami but shut down safely and with limited damage.  The author concludes Tohoku had a stronger SC than TEPCO.  We reviewed Meshkati’s earlier paper comparing TEPCO and Tohoku on March 19, 2014.

He also mentions the 1961 SL-1 reactor accident** and the 1979 TMI accident.  Both presented the opportunity for SC lessons learned but they were obviously not taken to heart by all industry participants.

The author concludes with a cautionary note to newly expanding nuclear countries: human factors and SC are critical success factors “and operators’ individual mindfulness and improvisation potential need to be nurtured and cultivated by the organizations that operate such systems; and regulatory regimes should envision, encourage, and enforce them.”

Our Perspective

There is nothing new here.  The article reads like a reasonably well-researched paper prepared for a college senior seminar, with multiple linked references.***  Meshkati does have the advantage of having been “on the ground” at both Chernobyl and Fukushima but that experience does not inform this article beyond adding a bit of color to his description of the Chernobyl sarcophagus (a “temple of eternal doom”).  Overall, the article does not provide new information or insights for Safetymatters readers who have examined the accidents in any level of detail.

What’s interesting is the platform on which the article appeared.  The WorldPost is produced by The Huffington Post, a politically liberal news and opinion website, and the Berggruen Institute, a political and social think tank.  We would not have expected the HuffPost to be associated with an article that exhibits any faint pro-nuclear flavor, even one as vanilla as this.

We don’t celebrate the anniversaries of Chernobyl and Fukushima but we should certainly remember the events, especially when we see the nuclear industry hubris meter trending toward the red zone.


*  N. Meshkati, “Chernobyl’s 30th Anniversary (and Fukushima’s 5th): A Tale of Preventable Nuclear Accidents and the Vital Role of Safety Culture,” The WorldPost (April 22, 2016).

**  Stationary Low-Power Reactor Number One (SL-1) was a U.S. Army prototype small power reactor.  A Jan. 3, 1961 accident killed its three operators.

***  I looked at all the links but didn’t see anything new for the “must read” list.  However, you might quickly check them out if you are interested in these significant historical events.

Monday, November 3, 2014

A Life In Error by James Reason



Most of us associate psychologist James Reason with the “Swiss Cheese Model” of defense in depth or possibly the notion of a “just culture.”  But his career has been about more than two ideas; he has literally spent his professional life studying errors, their causes and contexts.  A Life In Error* is an academic memoir, recounting his study of errors starting with the individual and ending up with the organization (the “system”) including its safety culture (SC).  This post summarizes relevant portions of the book and provides our perspective.  It is going to read like a subtitled movie on fast-forward but there are a lot of particulars packed in this short (124 pgs.) book. 

Slips and Mistakes 

People make plans and take action; consequences follow.  Errors occur when the intended goals are not achieved.  The plan may be adequate but the execution faulty because of slips (absent-mindedness) or trips (clumsy actions).  A plan that was inadequate to begin with is a mistake, which is usually more subtle than a slip and may go undetected for long periods of time if no obviously bad consequences occur. (pp. 10-12)  A mistake is a creation of higher-level mental activity than a slip.  Both slips and mistakes can take “strong but wrong” forms, where schemata** that were effective in prior situations are selected even though they are not appropriate in the current situation.

Absent-minded slips can occur from misapplied competence where a planned routine is sidetracked into an unplanned one.  Such diversions can occur, for instance, when one’s attention is unexpectedly diverted just as one reaches a decision point and multiple schemata are both available and actively vying to be chosen. (pp. 21-25)  Reason’s recipe for absent-minded errors is one part cognitive under-specification, e.g., insufficient knowledge, and one part the existence of an inappropriate response primed by prior, recent use and the situational conditions. (p. 49) 

Planning Biases 

The planning activity is subject to multiple biases.  An individual planner’s database may be incomplete or shaped by past experiences rather than future uncertainties, with greater emphasis on past successes than failures.  Planners can underestimate the influence of chance, overweight data that is emotionally charged, be overly influenced by their theories, misinterpret sample data or miss covariations, suffer hindsight bias or be overconfident.***  Once a plan is prepared, planners may focus only on confirmatory data and are usually resistant to changing the plan.  Planning in a group is subject to “groupthink” problems including overconfidence, rationalization, self-censorship and an illusion of unanimity.  (pp. 56-62)

Errors and Violations 

Violations are deliberate acts to break rules or procedures, although bad outcomes are not generally intended.  Violations arise from various motivational factors including the organizational culture.  Types of violations include corner-cutting to avoid clumsy procedures, necessary violations to get the job done because the procedures are unworkable, adjustments to satisfy conflicting goals and one-off actions (such as turning off a safety system) when faced with exceptional circumstances.  Violators perform a type of cost:benefit analysis biased by the fact that benefits are likely immediate while costs, if they occur, are usually somewhere in the future.  In Reason’s view, the proper course for the organization is to increase the perceived benefits of compliance, not to increase the costs (penalties) for violations.  (There is a hint of the “just culture” here.) 

Organizational Accidents 

Major accidents (TMI, Chernobyl, Challenger) have three common characteristics: contributing factors that were latent in the system, multiple levels of defense, and an unforeseen combination of latent factors and active failures (errors and/or violations) that defeated the defenses.  This is the well-known Swiss Cheese Model with the active failures opening short-lived holes and latent failures creating longer-lasting but unperceived holes.

Organizational accidents are low frequency, high severity events with causes that may date back years.  In contrast, individual accidents are more frequent but have limited consequences; they arise from slips, trips and lapses.  This is why organizations can have a good industrial accident record while they are on the road to a large-scale disaster, e.g., BP at Texas City. 

Organizational Culture 

Certain industries, including nuclear power, have defenses-in-depth distributed throughout the system but are vulnerable to something that is equally widespread.  According to Reason, “The most likely candidate is safety culture.  It can affect all elements in a system for good or ill.” (p. 81)  An inadequate SC can undermine the Swiss Cheese Model: there will be more active failures at the “sharp end”; more latent conditions created and sustained by management actions and policies, e.g., poor maintenance, inadequate equipment or downgrading training; and the organization will be reluctant to deal proactively with known problems. (pp. 82-83)
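To put rough numbers on why a system-wide factor matters so much, here is a back-of-the-envelope sketch.  It is our own illustration, not Reason's model: the layer probabilities are hypothetical, the layers are assumed to fail independently, and a weak culture is crudely represented by a single multiplier (the made-up `factor` argument) that widens every hole at once.

```python
from math import prod


def breach_probability(hole_probs):
    """Chance that one challenge passes every layer, assuming layers fail independently."""
    return prod(hole_probs)


def degrade(hole_probs, factor):
    """Widen every hole by the same factor (capped at 1.0) to mimic a weak culture."""
    return [min(1.0, p * factor) for p in hole_probs]


# Hypothetical per-layer chances that a defense has a hole when challenged.
layers = [0.01, 0.05, 0.02, 0.10]

healthy = breach_probability(layers)
weak = breach_probability(degrade(layers, 3))

print(f"healthy culture: {healthy:.2e}")
print(f"weak culture:    {weak:.2e}")
print(f"ratio:           {weak / healthy:.0f}x")
```

Under these assumptions, tripling the size of each hole in four layers raises the chance of a complete breach by a factor of 3 to the 4th power, or 81.  The specific numbers mean nothing; the point is that a factor touching every layer compounds rather than adds.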

Reason describes a “cluster of organizational pathologies” that make an adverse system event more likely: blaming sharp-end operators, denying the existence of systemic inadequacies, and a narrow pursuit of production and financial goals.  He goes on to list some of the drivers of blame and denial.  The list includes: accepting human error as the root cause of an event; the hindsight bias; evaluating prior decisions based on their outcomes; shooting the messenger; belief in a bad apple but not a bad barrel (the system); failure to learn; a climate of silence; workarounds that compensate for systemic inadequacies; and normalization of deviance.  (pp. 86-92)  Whew! 

Our Perspective 

Reason teaches us that the essence of understanding errors is nuance.  At one end of the spectrum, some errors are totally under the purview of the individual; at the other end, they reside in the realm of the system.  The biases and issues described by Reason are familiar to Safetymatters readers and echo in the work of Dekker, Hollnagel, Kahneman and others.  We have been pounding the drum for a long time on the inadequacies of safety analyses that ignore systemic issues and corrective actions that are limited to individuals (e.g., more training and oversight, improved procedures and clearer expectations).

The book is densely packed with the work of a career.  One could easily use the contents to develop a SC assessment or self-assessment.  We did not report on the chapters covering research into absent-mindedness, Freud and medical errors (Reason’s current interest) but they are certainly worth reading.

Reason says this book is his valedictory: “I have nothing new to say and I’m well past my prime.” (p. 122)  We hope not.


*  J. Reason, A Life In Error: From Little Slips to Big Disasters (Burlington, VT: Ashgate, 2013).

**  Knowledge structures in long-term memory. (p. 24)

***  This will ring familiar to readers of Daniel Kahneman.  See our Dec. 18, 2013 post on Kahneman’s Thinking, Fast and Slow.

Tuesday, January 21, 2014

Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents by Blandford and May

This publication appeared on a nuclear safety online discussion board.*  It is a high-level review of significant commercial nuclear industry incidents and the subsequent development and implementation of related lessons learned.  This post summarizes and evaluates the document then focuses on its treatment of nuclear safety culture (SC). 

The authors cover Three Mile Island (1979), Chernobyl (1986), Le Blayais [France] plant flooding (1999), Davis-Besse (2002), U.S. Northeast Blackout (2003) and Fukushima-Daiichi (2011).  There is a summary of each incident followed by the major lessons learned, usually gleaned from official reports on the incident. 

Some lessons learned led to significant changes in the nuclear industry; other lessons learned were incompletely implemented or simply ignored.  In the first category, the creation of INPO (Institute of Nuclear Power Operations) after TMI was a major change.**  On the other hand, lessons learned from Chernobyl were incompletely implemented, e.g., WANO (World Association of Nuclear Operators, a putative “global INPO”) was created but it has no real authority over operators.  Fukushima lessons learned have focused on design, communication, accident response and regulatory deficiencies; implementation of any changes remains a work in progress.

The authors echo some concerns we have raised elsewhere on this blog.  For example, they note “the likelihood of a rare external event at some site at some time over the lifetime of a reactor is relatively high.” (p. 16)  And “the industry should look at a much higher probability of problems than is implied in the “once in a thousand years” viewpoint.” (p. 26)  Such cautions are consistent with Taleb's and Dédale's warnings that we have discussed here and here.
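The arithmetic behind the first quote is easy to sketch.  The figures below are our assumptions for illustration, not numbers from the report: a “once in a thousand years” event is treated as a 1-in-1,000 chance per site per year, sites and years are treated as independent, and the 40-year lifetime and 100-site fleet are hypothetical round numbers.

```python
def prob_at_least_one(annual_prob, site_count, years):
    """Chance of one or more events across all site-years, assuming independence."""
    site_years = site_count * years
    return 1.0 - (1.0 - annual_prob) ** site_years


# Assumed inputs: 1-in-1,000 annual chance, 40-year plant life, 100-site fleet.
single_site = prob_at_least_one(1 / 1000, site_count=1, years=40)
fleet = prob_at_least_one(1 / 1000, site_count=100, years=40)

print(f"one site, 40 years:  {single_site:.1%}")
print(f"100 sites, 40 years: {fleet:.1%}")
```

With these inputs a single site faces roughly a 4 percent chance over its lifetime, while the chance that some site in the fleet is hit comes out at about 98 percent.  Real external hazards are neither uniform nor independent across sites, so treat this as a sense of scale only.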

The authors also say “Lessons can also be learned from successes.” (p. 3)  We agree.  That's why our recommendation that managers conduct periodic in-depth analyses of plant decisions includes decisions that had good outcomes, in addition to those with poor outcomes.

Arguably the most interesting item in the report is a table that shows deaths attributable to different types of electricity generation.  Death rates range from 161 (per TWh) for coal to 0.04 for nuclear.  Data comes from multiple sources and we made no effort to verify the analysis.***

On Safety Culture

The authors say “. . . a culture of safety must be adopted by all operating entities. For this to occur, the tangible benefits of a safety culture must become clear to operators.” (p. 2, repeated on p. 25)  And “The nuclear power industry has from the start been aware of the need for a strong and continued emphasis on the safety culture, . . .” (p. 24)  That's it for the direct mention of SC.

Such treatment is inexcusably short shrift for SC.  There were obvious, major SC issues at many of the plants the authors discuss.  At Chernobyl, the culture permitted, among other things, testing that violated the station's own safety procedures.  At Davis-Besse, the culture prioritized production over safety—a fact the authors note without acknowledging its SC significance.  The combination of TEPCO's management culture which simply ignored inconvenient facts and their regulator's “see no evil” culture helped turn a significant plant event at Fukushima into an abject disaster.

Our Perspective


It's not clear who the intended audience is for this document.  It was written by two professors under the aegis of the American Academy of Arts and Sciences, an organization that, among other things, “provides authoritative and nonpartisan policy advice to decision-makers in government, academia, and the private sector.”****  While it is a nice little history paper, I can't see it moving the dial in any public policy discussion.  The scholarship in this article is minimal; it presents scant analysis and no new insights.  Its international public policy suggestions are shallow and do not adequately recognize disparate, even oppositional, national interests.  Perhaps you could give it to non-nuclear folks who express interest in the unfavorable events that have occurred in the nuclear industry. 


*  E.D. Blandford and M.M. May, “Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents” (Cambridge, MA: American Academy of Arts and Sciences, 2012).  Thanks to Madalina Tronea for publicizing this article on the LinkedIn Nuclear Safety group discussion board.  Dr. Tronea is the group's founder/moderator.

**  This publication is a valentine for INPO and, to a lesser extent, the U.S. nuclear navy.  INPO is hailed as “extraordinarily effective” (p. 12) and “a well-balanced combination of transparency and privacy; . . .” (p. 25)

***  It is the only content that demonstrates original analysis by the authors.

****  American Academy of Arts and Sciences website (retrieved Jan. 20, 2014).

Thursday, August 29, 2013

Normal Accidents by Charles Perrow

This book*, originally published in 1984, is a regular reference for authors writing about complex socio-technical systems.**  Perrow's model for classifying such systems is intuitively appealing; it appears to reflect the reality of complexity without forcing the reader to digest a deliberately abstruse academic construct.  We will briefly describe the model then spend most of our space discussing our problems with Perrow's inferences and assertions, focusing on nuclear power.  

The Model

The model is a 2x2 matrix with axes of coupling and interactions.  Not surprisingly, it is called the Interaction/Coupling (I/C) chart.

“Coupling” refers to the amount of slack, buffer or give between two items in a system.  Loosely coupled systems can accommodate shocks, failures and pressures without destabilizing.  Tightly coupled systems have a higher risk of disastrous failure because their processes are more time-dependent, with invariant sequences and a single way of achieving the production goal, and have little slack. (pp. 89-94)

“Interactions” may be linear or complex.  Linear interactions are between a system component and one or more other components that immediately precede or follow it in the production sequence.  These interactions are familiar and, if something unplanned occurs, the results are easily visible.  Complex interactions are between a system component and one or more other components outside the normal production sequence.  If unfamiliar, unplanned or unexpected sequences occur, the results may not be visible or immediately comprehensible. (pp. 77-78)

Nuclear plants have the tightest coupling and most complex interactions of the two dozen systems Perrow shows on the I/C chart, a population that included chemical plants, space missions and nuclear weapons accidents. (p. 97)

Perrow on Nuclear Power

Let's get one thing out of the way immediately: Normal Accidents is an anti-nuke screed.  Perrow started the book in 1979 and it was published in 1984.  He was motivated to write the book by the TMI accident and it obviously colored his forecast for the industry.  He reviews the TMI accident in detail, then describes nuclear industry characteristics and incidents at other plants, all of which paint an unfavorable portrait of the industry.  He concludes: “We have not had more serious accidents of the scope of Three Mile Island simply because we have not given them enough time to appear.” (p. 60, emphasis added)  While he is concerned with design, construction and operating problems, his primary fear is “the potential for unexpected interactions of small failures in that system that makes it prone to the system accident.” (p. 61)   

Why has his prediction of such serious accidents not come to pass, at least in the U.S.?

Our Perspective on Normal Accidents

We have several issues with this book and the author's “analysis.”

Nuclear is not as complex as Perrow asserts 


There is no question that the U.S. nuclear industry grew quickly, with upsized plants and utilities specifying custom design combinations (in other words, limited standardization).  The utilities were focused on meeting significant load growth forecasts and saw nuclear baseload capacity as an efficient way to produce electric power.  However, actually operating a large nuclear plant was probably more complex than the utilities realized.  But not any more.  Learning curve effects, more detailed procedures and improved analytic methods are a few of the factors that led to a greater knowledge basis for plant decision making.  The serious operational issues at the “problem plants” (circa 1997) forced operators to confront the reality that identifying and permanently resolving plant problems was necessary for survival.  This era also saw the beginning of industry consolidation, with major operators applying best methods throughout their fleets.  All of these changes have led to our view that nuclear plants are certainly complicated but no longer complex and haven't been for some time.    

This is a good place to point out that Perrow's designation of nuclear plants as the most complex and tightest coupled systems he evaluated has no basis in any real science.  In his own words, “The placement of systems [on the interaction/coupling chart] is based entirely on subjective judgments on my part; at present there is no reliable way to measure these two variables, interaction and coupling.” (p. 96)

System failures with incomprehensible consequences are not the primary problem in the nuclear industry

The 1986 Chernobyl disaster was arguably a system failure: poor plant design, personnel non-compliance with rules and a deficient safety culture.  It was a serious accident but not a catastrophe.*** 

But other significant industry events have not arisen from interactions deep within the system; they have come from negligence, hubris, incompetence or selective ignorance.  For example, Fukushima was overwhelmed by a tsunami that was known to be possible but was ignored by the owners.  At Davis-Besse, personnel ignored increasingly stronger signals of a nascent problem but managers argued that in-depth investigation could wait until the next outage (production trumps safety) and the NRC agreed (with no solid justification).  

Important system dynamics are ignored 


Perrow has some recognition of what a system is and how threats can arise within it: “. . . it is the way the parts fit together, interact, that is important.  The dangerous accidents lie in the system, not in the components.” (p. 351)  However, he is/was focused on interactions and couplings as they currently exist.  But a socio-technical system is constantly changing (evolving, learning) in response to internal and external stimuli.  Internal stimuli include management decisions and the reactions to performance feedback signals; external stimuli include environmental demands, constraints, threats and opportunities.  Complacency and normalization of deviance can seep in but systems can also bolster their defenses and become more robust and resilient.****  It would be a stretch to say that nuclear power has always learned from its mistakes (especially if they occur at someone else's plant) but steps have been taken to make operations less complex. 

My own bias is that Perrow doesn't really appreciate the technical side of a socio-technical system.  He recounts incidents in great detail but not at great depth, and is often recounting the work of others.  Although he claims the book is about technology (the socio side, aka culture, is never mentioned), the fact remains that he is not an engineer or physicist; he is a sociologist.

Conclusion

Notwithstanding all my carping, this is a significant book.  It is highly readable.  Perrow's discussion of accidents, incidents and issues in various contexts, including petrochemical plants, air transport, marine shipping and space exploration, is fascinating reading.  His interaction/coupling chart is a useful mental model to help grasp relative system complexity although one must be careful about over-inferring from such a simple representation.

There are some useful suggestions, e.g., establishing an anonymous reporting system, similar to the one used in the air transport industry, for nuclear near-misses. (p. 169)  There is a good discussion of decentralization vs centralization in nuclear plant organizations. (pp. 334-5)  But he says that neither is best all the time, which he considers a contradiction.  The possibility of contingency management, i.e., using a decentralized approach for normal times and tightening up during challenging conditions, is regarded as infeasible.

Ultimately, he includes nuclear power with “systems that are hopeless and should be abandoned because the inevitable risks outweigh any reasonable benefits . . .” (p. 304)*****  As further support for this conclusion, he reviews three different ways of evaluating the world: absolute, bounded and social rationality.  Absolute rationality is the province of experts; bounded rationality recognizes resource and cognitive limitations in the search for solutions.  But Perrow favors social rationality (which we might unkindly call crowdsourced opinions) because it is the most democratic and, not coincidentally, he can cite a study that shows an industry's “dread risk” is highly correlated with its position on the I/C chart. (p. 326)  In other words, if lots of people are fearful of nuclear power, no matter how unreasonable those fears are, that is further evidence to shut it down.

The 1999 edition of Normal Accidents has an Afterword that updates the original version.  Perrow continues to condemn nuclear power but without much new data.  Much of his disapprobation is directed at the petrochemical industry.  He highlights writers who have advanced his ideas and also presents his (dis)agreements with high reliability theory and Vaughan's interpretation of the Challenger accident.

You don't need this book in your library but you do need to be aware that it is a foundation stone for the work of many other authors.

 

*  C. Perrow, Normal Accidents: Living with High-Risk Technologies (Princeton Univ. Press, Princeton, NJ: 1999).

**  For example, see Erik Hollnagel, The ETTO Principle: Efficiency-Thoroughness Trade-Off (reviewed here); Woods, Dekker et al, Behind Human Error (reviewed here); and Weick and Sutcliffe, Managing the Unexpected: Resilient Performance in an Age of Uncertainty (reviewed here).  It's ironic that Perrow set out to write a readable book without references to the “sacred texts” (p. 11) but it appears Normal Accidents has become one.

***  Perrow's criteria for catastrophe appear to be: “kill many people, irradiate others, and poison some acres of land.” (p. 348)  While any death is a tragedy, reputable Chernobyl studies report fewer than 100 deaths from radiation and project 4,000 radiation-induced cancers in a population of 600,000 people who were exposed.  The same population is expected to suffer 100,000 cancer deaths from all other causes.  Approximately 40,000 square miles of land was significantly contaminated.  Data from Chernobyl Forum, "Chernobyl's Legacy: Health, Environmental and Socio-Economic Impacts" 2nd rev. ed.  Retrieved Aug. 27, 2013.  Wikipedia, “Chernobyl disaster.”  Retrieved Aug. 27, 2013.

In his 1999 Afterword to Normal Accidents, Perrow mentions Chernobyl in passing and his comments suggest he does not consider it a catastrophe but believes it could have been one had the wind blown the radioactive materials over the city of Kiev.

****  A truly complex system can drift into failure (Dekker) or experience incidents from performance excursions outside the safety boundaries (Hollnagel).

*****  It's not just nuclear power; Perrow also supports unilateral nuclear disarmament. (p. 347)