Showing posts with label Chernobyl. Show all posts

Tuesday, June 20, 2017

Learning About Nuclear Safety Culture from the Web, Maybe

We’ve come across some Internet content (one website, one article) that purports to inform the reader about nuclear safety culture (NSC).  This post reviews the content and provides our perspective on its value.

NSC Website

It appears the title of this site is “Nuclear Safety Culture”* and the primary target is journalists who want an introduction to NSC concepts, history and issues.  It is a product of a group of European entities.  It is a professional-looking site that covers four major topics; we’ll summarize them in some detail to show their wide scope and shallow depth.

Nuclear Safety Culture covers five sub-topics:

History traces the shift in attitudes toward and protection from ionizing radiation as the possible consequences became better known, but the story ends in the 1950s.  Key actions describes the roles of internal and external stakeholders during routine operations and emergency situations.  The focus is on power production although medicine, industrial uses and weapons are also mentioned.  Definition of NSC starts with INSAG (esp. INSAG-4), then adds INPO’s directive to emphasize safety over competing goals, and a familiar list of attributes from the Nuclear Safety Journal.  As usual, there is nothing in the attributes about executive compensation or the importance of a systems view.  IAEA safety principles are self-explanatory.  Key scientific concepts covers the units of radiation for dose, intake and exposure.  Some values are shown for typical activities but only one legal limit, for US airport X-rays, is included.**  There is no information in this sub-topic on how much radiation a person can tolerate or the regulatory limits for industrial exposure.

From Events to Accidents has two sub-topics:

From events to accidents describes the 7-level International Nuclear Event Scale (from a minor anomaly to major accident) but the scale itself is not shown.  This is a major omission.  Defence in depth discusses this important concept but provides only one example, the levels of physical protection between a fuel rod in a reactor and the environment outside the containment.

Controversies has two sub-topics:

Strengths and Weaknesses discusses some of the nuclear industry’s issues and characteristics: industry transparency is a double-edged sword, where increased information on events may be used to criticize a plant owner; general radiation protection standards for the industry; uncertainties surrounding the health effects of low radiation doses; the usual nuclear waste issues; technology evolution through generations of reactors; stress tests for European reactors; supply chain realities where a problem anywhere is used against the entire industry; the political climate, focusing on Germany and France; and energy economics that have diminished nuclear’s competitiveness.  Overall, this is a hodgepodge of topics and a B- discussion.  The human factor provides a brief discussion of the “blame culture” and the need for a systemic view, followed by summaries of the Korean and French document falsification events.

Stories summarizes three events: the Brazilian theft of a radioactive source, Chernobyl and Fukushima.  They are all reported in an overly dramatic style although the basic facts are probably correct.

The authors describe what they call the “safety culture breach” for each event.  The problem is they commingle overarching cultural issues, e.g., TEPCO’s overconfident management, with far more specific failures, e.g., violations of safety and security rules, and consequences of weak NSC, e.g., plant design inadequacies.  It makes one wonder if the author(s) of this section have a clear notion of what NSC is.

It isn’t apparent how helpful this site will be for newbie journalists; it is certainly not a complete “toolkit.”  Some topics are presented in an over-simplified manner and others are missing key figures.  In terms of examples, the site emphasizes major accidents (the ultimate trailing indicators) and ignores the small events, normalization of deviance, organizational drift and other dynamics that make up the bulk of daily life in an organization.  Overall, the toolkit looks a bit like a rush job or unedited committee work, e.g., the section on the major accidents is satisfactory but others are incomplete.  Importantly (or perhaps thankfully) the authors offer no original observations or insights with respect to NSC.  It’s worrisome that what the site creators call NSC is often just the safety practices that evolved as the hazards of radiation became better known.

NSC Article

There is an article on NSC in the online version of Power magazine.  We are not publishing a link to the article because it isn’t very good; it looks more like a high schooler’s Internet-sourced term paper than a thoughtful reference or essay on NSC.

However, like the stopped clock that shows the correct time twice per day, there can be a worthwhile nugget in such an article.  After summarizing a research paper that correlated plants’ performance indicators with assessments of their NSC attributes (which paper we reviewed on Oct. 5, 2014), the author says “There are no established thresholds for determining whether a safety culture is “healthy” or “unhealthy.””  That’s correct.  After NSC assessors consolidate their interviews, focus groups, observations, surveys and document reviews, they always identify some improvement opportunities but the usual overall grade is “pass.”***  There’s no point score, meter or gauge.  Perhaps there should be.

Our Perspective

Don’t waste your time with pap.  Go to primary sources; an excellent starting point is the survey of NSC literature performed by a U.S. National Laboratory (which we reviewed on Feb. 10, 2013).  Click on our References label to get other possibilities and follow folks who actually know something about NSC, like Safetymatters.

*  Nuclear Safety Culture was developed as part of the NUSHARE project under the aegis of the European Nuclear Education Network.  Retrieved June 19, 2017.

**  The airport X-ray limit happens to be the same as the amount of radiation emitted by an ordinary banana.

***  A violation of the Safety Conscious Work Environment (SCWE) regulations is quite different.  There it’s zero tolerance and if there’s a credible complaint about actual retaliation for raising a safety issue, the licensee is in deep doo-doo until they convince the regulator they have made the necessary adjustments in the work environment.

Tuesday, April 26, 2016

A Professor's Essay on Nuclear Safety Culture

Prof. Najmedin Meshkati recently published an article* that reviews how the Chernobyl and Fukushima disasters demonstrated the essential need for a strong safety culture (SC) in the nuclear industry.  The article is summarized below.

He begins by reminding us the root cause of the Chernobyl accident was a deficient SC, a problem that affected not only the Chernobyl plant but also permeated the entire Soviet nuclear ecosystem. 

Fukushima is characterized as an anthropogenic accident, i.e., caused by human action or inaction.  He contrasts the fate of TEPCO’s Fukushima Daiichi plant with the Tohoku Electric Power Company’s Onagawa plant.  Onagawa was closer to the earthquake epicenter than Fukushima and faced a taller tsunami but shut down safely and with limited damage.  The author concludes Tohoku had a stronger SC than TEPCO.  We reviewed Meshkati’s earlier paper comparing TEPCO and Tohoku on March 19, 2014.

He also mentions the 1961 SL-1 reactor accident** and the 1979 TMI accident.  Both presented the opportunity for SC lessons learned but they were obviously not taken to heart by all industry participants.

The author concludes with a cautionary note to newly expanding nuclear countries: human factors and SC are critical success factors “and operators’ individual mindfulness and improvisation potential need to be nurtured and cultivated by the organizations that operate such systems; and regulatory regimes should envision, encourage, and enforce them.”

Our Perspective

There is nothing new here.  The article reads like a reasonably well-researched paper prepared for a college senior seminar, with multiple linked references.***  Meshkati does have the advantage of having been “on the ground” at both Chernobyl and Fukushima but that experience does not inform this article beyond adding a bit of color to his description of the Chernobyl sarcophagus (a “temple of eternal doom”).  Overall, the article does not provide new information or insights for Safetymatters readers who have examined the accidents in any level of detail.

What’s interesting is the platform on which the article appeared.  The WorldPost is produced by The Huffington Post, a politically liberal news and opinion website, and the Berggruen Institute, a political and social think tank.  We would not have expected the HuffPost to be associated with an article that exhibits any faint pro-nuclear flavor, even one as vanilla as this.

We don’t celebrate the anniversaries of Chernobyl and Fukushima but we should certainly remember the events, especially when we see the nuclear industry hubris meter trending toward the red zone.

*  N. Meshkati, “Chernobyl’s 30th Anniversary (and Fukushima’s 5th): A Tale of Preventable Nuclear Accidents and the Vital Role of Safety Culture,” The WorldPost (April 22, 2016).

**  Stationary Low-Power Reactor Number One (SL-1) was a U.S. Army prototype small power reactor.  A Jan. 3, 1961 accident killed its three operators.

***  I looked at all the links but didn’t see anything new for the “must read” list.  However, you might quickly check them out if you are interested in these significant historical events.

Friday, October 2, 2015

Training Materials for Teaching NRC Personnel about Safety Culture

This is a companion piece to our Aug. 24, 2015 post on how the NRC effectively regulates licensee safety culture (SC) in the absence of any formal SC regulations.  This post summarizes a set of NRC slides* for training inspectors on SC basics and how to integrate SC information and observations into inspection reports.

The slides begin with an overview of SC, material you’ve seen countless times.  It includes the Chernobyl and Davis-Besse events, the Schein tri-level model and a timeline of SC-related activities at the NRC.

The bulk of the presentation shows how SC is related to and incorporated in the Reactor Oversight Process (ROP).  The starting point is the NRC SC Policy Statement, followed by the Common Language Initiative** which defined 10 SC traits.  The traits are connected to the ROP using 23 SC aspects.  Aspects are “the important characteristics of safety culture which are observable to the NRC staff during inspection and assessment of licensee performance.” (p. 13)  Each SC aspect is associated with one of the ROP’s 3 cross-cutting areas: Human Performance (14 aspects), Problem Identification and Resolution (6 aspects) and Safety Conscious Work Environment (3 aspects).  During supplemental and reactive inspections there are an additional 12 SC aspects to be considered.  Each aspect has associated artifacts that indicate the aspect’s presence or absence.  SC aspects can contribute to a cross-cutting theme or, in more serious cases, a substantive cross-cutting issue (SCCI).***
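The aspect arithmetic is easy to lose track of, so here is a minimal sketch that tallies it.  The counts are the ones quoted from the slides; the variable and function names are our own and purely illustrative.

```python
# Aspect counts per ROP cross-cutting area, as quoted from the NRC slides.
BASELINE_ASPECTS = {
    "Human Performance": 14,
    "Problem Identification and Resolution": 6,
    "Safety Conscious Work Environment": 3,
}

# Additional aspects considered only during supplemental and reactive inspections.
SUPPLEMENTAL_ASPECTS = 12


def aspect_count(supplemental: bool = False) -> int:
    """Return how many SC aspects are in play for a given inspection type."""
    total = sum(BASELINE_ASPECTS.values())
    if supplemental:
        total += SUPPLEMENTAL_ASPECTS
    return total


print(aspect_count())                   # routine inspection
print(aspect_count(supplemental=True))  # supplemental or reactive inspection
```

The two printed totals are 23 and 35, matching the figures in the slides.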

The integration of SC findings into inspection reports is covered in NRC Inspection Manual Chapter 0612 and NINE different NRC Inspection Procedures (IPs). (p. 30)  In practice, the logic chain between a SC aspect and an inspection report is the reverse of the description in the preceding paragraph.  The creation of an inspection report starts with a finding followed by a search for a related SC cross-cutting aspect.  Each finding has one most significant cause and the inspectors should “find the aspect that describes licensee performance that would have prevented or precluded the performance deficiency represented by that cause.” (p. 33)

Our Perspective

This is important stuff.  When NRC inspectors are huddled in their bunker evaluating their data and observations after reviewing your documentation, crawling around your plant and talking with your people, the information in these slides provides the road map for their determination of how one or more alleged SC deficiencies contributed to a performance problem which resulted in an inspection finding.

Think of the SC aspects as pegs on which the inspectors can hang their observations to beef up their theory of why a problem occurred. Under routine conditions, there are 23 pegs; under more stringent inspections, there are 35 pegs.  That’s a lot of pegs and none of them is trivial which means your organization’s response may consume sizable resources.

We’ll finish with a more cheery thought:  If you get to the point where the NRC is going to conduct an independent assessment of your SC, their team will follow the guidance in IP 95003.  But don’t worry about their competence: “IP 95003 inspection teams will receive "just-in-time" training before performing the inspection.” (p. 43)

Bottom line: If it looks like controlling oversight behavior and quacks like a bureaucrat, then it probably is de facto regulation.

*  NRC Training Slides, “Safety Culture Reactor Oversight Process Training” (July 10, 2015).  ADAMS ML15191A253.  The slides include other material, e.g., a summary of the conditions under which the NRC can “request” a licensee to perform a SC assessment, a set of case studies and sample test questions for trainees.

**  The Common Language Initiative led to NUREG-2165, “Safety Culture Common Language” which was published in early 2014 and we reviewed on April 6, 2014.

***  There are some complicated decision rules for determining when a problem is a substantive cross-cutting issue and these are worth reviewing on pp. 27-28.

Monday, November 3, 2014

A Life In Error by James Reason

Most of us associate psychologist James Reason with the “Swiss Cheese Model” of defense in depth or possibly the notion of a “just culture.”  But his career has been more than two ideas; he has literally spent his professional life studying errors, their causes and contexts.  A Life In Error* is an academic memoir, recounting his study of errors starting with the individual and ending up with the organization (the “system”) including its safety culture (SC).  This post summarizes relevant portions of the book and provides our perspective.  It is going to read like a sub-titled movie on fast-forward but there are a lot of particulars packed in this short (124 pp.) book.

Slips and Mistakes 

People make plans and take action; consequences follow.  Errors occur when the intended goals are not achieved.  The plan may be adequate but the execution faulty because of slips (absent-mindedness) or trips (clumsy actions).  A plan that was inadequate to begin with is a mistake, which is usually more subtle than a slip and may go undetected for long periods of time if no obviously bad consequences occur. (pp. 10-12)  A mistake is a creation of higher-level mental activity than a slip.  Both slips and mistakes can take “strong but wrong” forms, where schemata** that were effective in prior situations are selected even though they are not appropriate in the current situation.

Absent-minded slips can occur from misapplied competence where a planned routine is sidetracked into an unplanned one.  Such diversions can occur, for instance, when one’s attention is unexpectedly diverted just as one reaches a decision point and multiple schemata are both available and actively vying to be chosen. (pp. 21-25)  Reason’s recipe for absent-minded errors is one part cognitive under-specification, e.g., insufficient knowledge, and one part the existence of an inappropriate response primed by prior, recent use and the situational conditions. (p. 49)

Planning Biases 

The planning activity is subject to multiple biases.  An individual planner’s database may be incomplete or shaped by past experiences rather than future uncertainties, with greater emphasis on past successes than failures.  Planners can underestimate the influence of chance, overweight data that is emotionally charged, be overly influenced by their theories, misinterpret sample data or miss covariations, suffer hindsight bias or be overconfident.***  Once a plan is prepared, planners may focus only on confirmatory data and are usually resistant to changing the plan.  Planning in a group is subject to “groupthink” problems including overconfidence, rationalization, self-censorship and an illusion of unanimity.  (pp. 56-62)

Errors and Violations 

Violations are deliberate acts to break rules or procedures, although bad outcomes are not generally intended.  Violations arise from various motivational factors including the organizational culture.  Types of violations include corner-cutting to avoid clumsy procedures, necessary violations to get the job done because the procedures are unworkable, adjustments to satisfy conflicting goals and one-off actions (such as turning off a safety system) when faced with exceptional circumstances.  Violators perform a type of cost-benefit analysis biased by the fact that benefits are likely immediate while costs, if they occur, are usually somewhere in the future.  In Reason’s view, the proper course for the organization is to increase the perceived benefits of compliance, not to increase the costs (penalties) for violations.  (There is a hint of the “just culture” here.)

Organizational Accidents 

Major accidents (TMI, Chernobyl, Challenger) have three common characteristics: contributing factors that were latent in the system, multiple levels of defense, and an unforeseen combination of latent factors and active failures (errors and/or violations) that defeated the defenses.  This is the well-known Swiss Cheese Model with the active failures opening short-lived holes and latent failures creating longer-lasting but unperceived holes.
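To make the “holes lining up” idea concrete, here is a toy calculation.  It is our own illustration, not something from Reason’s book: the per-layer probabilities are invented, and treating the layers as independent is a simplifying assumption.

```python
from math import prod


def breach_probability(hole_probs):
    """Chance a hazard passes every defensive layer.

    Assumes (for illustration only) that each layer's hole is open
    independently of the others.
    """
    return prod(hole_probs)


# Hypothetical odds that each of three layers has an open hole at a given
# moment when only short-lived active failures are present.
active_only = [0.01, 0.01, 0.01]

# Same defenses, but a latent condition keeps the middle layer's hole open
# half the time without anyone noticing.
with_latent = [0.01, 0.5, 0.01]

print(breach_probability(active_only))
print(breach_probability(with_latent))
```

In this made-up case a single long-lived latent hole raises the chance of a breach fifty-fold, which is the intuition behind the model.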

Organizational accidents are low frequency, high severity events with causes that may date back years.  In contrast, individual accidents are more frequent but have limited consequences; they arise from slips, trips and lapses.  This is why organizations can have a good industrial accident record while they are on the road to a large-scale disaster, e.g., BP at Texas City. 

Organizational Culture 

Certain industries, including nuclear power, have defenses-in-depth distributed throughout the system but are vulnerable to something that is equally widespread.  According to Reason, “The most likely candidate is safety culture.  It can affect all elements in a system for good or ill.” (p. 81)  An inadequate SC can undermine the Swiss Cheese Model: there will be more active failures at the “sharp end”; more latent conditions created and sustained by management actions and policies, e.g., poor maintenance, inadequate equipment or downgrading training; and the organization will be reluctant to deal proactively with known problems. (pp. 82-83)

Reason describes a “cluster of organizational pathologies” that make an adverse system event more likely: blaming sharp-end operators, denying the existence of systemic inadequacies, and a narrow pursuit of production and financial goals.  He goes on to list some of the drivers of blame and denial.  The list includes: accepting human error as the root cause of an event; the hindsight bias; evaluating prior decisions based on their outcomes; shooting the messenger; belief in a bad apple but not a bad barrel (the system); failure to learn; a climate of silence; workarounds that compensate for systemic inadequacies; and normalization of deviance.  (pp. 86-92)  Whew!

Our Perspective 

Reason teaches us that the essence of understanding errors is nuance.  At one end of the spectrum, some errors are totally under the purview of the individual; at the other end they reside in the realm of the system.  The biases and issues described by Reason are familiar to Safetymatters readers and echo in the work of Dekker, Hollnagel, Kahneman and others.  We have been pounding the drum for a long time on the inadequacies of safety analyses that ignore systemic issues and corrective actions that are limited to individuals (e.g., more training and oversight, improved procedures and clearer expectations).

The book is densely packed with the work of a career.  One could easily use the contents to develop a SC assessment or self-assessment.  We did not report on the chapters covering research into absent-mindedness, Freud and medical errors (Reason’s current interest) but they are certainly worth reading.

Reason says this book is his valedictory: “I have nothing new to say and I’m well past my prime.” (p. 122)  We hope not.

*  J. Reason, A Life In Error: From Little Slips to Big Disasters (Burlington, VT: Ashgate, 2013).

**  Knowledge structures in long-term memory. (p. 24)

***  This will ring familiar to readers of Daniel Kahneman.  See our Dec. 18, 2013 post on Kahneman’s Thinking, Fast and Slow.

Tuesday, January 21, 2014

Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents by Blandford and May

This publication appeared on a nuclear safety online discussion board.*  It is a high-level review of significant commercial nuclear industry incidents and the subsequent development and implementation of related lessons learned.  This post summarizes and evaluates the document then focuses on its treatment of nuclear safety culture (SC). 

The authors cover Three Mile Island (1979), Chernobyl (1986), Le Blayais [France] plant flooding (1999), Davis-Besse (2002), U.S. Northeast Blackout (2003) and Fukushima-Daiichi (2011).  There is a summary of each incident followed by the major lessons learned, usually gleaned from official reports on the incident. 

Some lessons learned led to significant changes in the nuclear industry; other lessons learned were incompletely implemented or simply ignored.  In the first category, the creation of INPO (Institute of Nuclear Power Operations) after TMI was a major change.**  On the other hand, lessons learned from Chernobyl were incompletely implemented, e.g., WANO (World Association of Nuclear Operators, a putative “global INPO”) was created but it has no real authority over operators.  Fukushima lessons learned have focused on design, communication, accident response and regulatory deficiencies; implementation of any changes remains a work in progress.

The authors echo some concerns we have raised elsewhere on this blog.  For example, they note “the likelihood of a rare external event at some site at some time over the lifetime of a reactor is relatively high.” (p. 16)  And “the industry should look at a much higher probability of problems than is implied in the “once in a thousand years” viewpoint.” (p. 26)  Such cautions are consistent with Taleb's and Dédale's warnings that we have discussed in earlier posts.
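A back-of-the-envelope sketch shows why a “once in a thousand years” event is less reassuring than it sounds.  The fleet size and lifetime below are our own hypothetical numbers, not the authors', and the calculation assumes every reactor-year carries the same independent 1-in-1,000 chance.

```python
def prob_at_least_one(annual_prob: float, reactors: int, years: int) -> float:
    """Probability of at least one event across all reactor-years.

    Assumes identical, independent annual odds for every reactor, which is
    a simplification made only for this illustration.
    """
    reactor_years = reactors * years
    return 1.0 - (1.0 - annual_prob) ** reactor_years


# "Once in a thousand years" read as a 1-in-1,000 chance per reactor-year.
single = prob_at_least_one(1 / 1000, reactors=1, years=40)

# Hypothetical fleet: 100 reactors, each operating for 40 years.
fleet = prob_at_least_one(1 / 1000, reactors=100, years=40)

print(f"one reactor: {single:.1%}, fleet: {fleet:.1%}")
```

Under these assumptions a single reactor faces roughly a 4 percent chance over its life, while the chance that some reactor in the fleet is hit comes out at about 98 percent.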

The authors also say “Lessons can also be learned from successes.” (p. 3)  We agree.  That's why our recommendation that managers conduct periodic in-depth analyses of plant decisions includes decisions that had good outcomes, in addition to those with poor outcomes.

Arguably the most interesting item in the report is a table that shows deaths attributable to different types of electricity generation.  Death rates range from 161 (per TWh) for coal to 0.04 for nuclear.  Data comes from multiple sources and we made no effort to verify the analysis.***

On Safety Culture

The authors say “. . . a culture of safety must be adopted by all operating entities. For this to occur, the tangible benefits of a safety culture must become clear to operators.” (p. 2, repeated on p. 25)  And “The nuclear power industry has from the start been aware of the need for a strong and continued emphasis on the safety culture, . . .” (p. 24)  That's it for the direct mention of SC.

Such treatment is inexcusably short shrift for SC.  There were obvious, major SC issues at many of the plants the authors discuss.  At Chernobyl, the culture permitted, among other things, testing that violated the station's own safety procedures.  At Davis-Besse, the culture prioritized production over safety—a fact the authors note without acknowledging its SC significance.  The combination of TEPCO's management culture which simply ignored inconvenient facts and their regulator's “see no evil” culture helped turn a significant plant event at Fukushima into an abject disaster.

Our Perspective

It's not clear who the intended audience is for this document.  It was written by two professors under the aegis of the American Academy of Arts and Sciences, an organization that, among other things, “provides authoritative and nonpartisan policy advice to decision-makers in government, academia, and the private sector.”****  While it is a nice little history paper, I can't see it moving the dial in any public policy discussion.  The scholarship in this article is minimal; it presents scant analysis and no new insights.  Its international public policy suggestions are shallow and do not adequately recognize disparate, even oppositional, national interests.  Perhaps you could give it to non-nuclear folks who express interest in the unfavorable events that have occurred in the nuclear industry. 

*  E.D. Blandford and M.M. May, “Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents” (Cambridge, MA: American Academy of Arts and Sciences, 2012).  Thanks to Madalina Tronea for publicizing this article on the LinkedIn Nuclear Safety group discussion board.  Dr. Tronea is the group's founder/moderator.

**  This publication is a valentine for INPO and, to a lesser extent, the U.S. nuclear navy.  INPO is hailed as “extraordinarily effective” (p. 12) and “a well-balanced combination of transparency and privacy; . . .” (p. 25)

***  It is the only content that demonstrates original analysis by the authors.

****  American Academy of Arts and Sciences website (retrieved Jan. 20, 2014).

Thursday, August 29, 2013

Normal Accidents by Charles Perrow

This book*, originally published in 1984, is a regular reference for authors writing about complex socio-technical systems.**  Perrow's model for classifying such systems is intuitively appealing; it appears to reflect the reality of complexity without forcing the reader to digest a deliberately abstruse academic construct.  We will briefly describe the model then spend most of our space discussing our problems with Perrow's inferences and assertions, focusing on nuclear power.  

The Model

The model is a 2x2 matrix with axes of coupling and interactions.  Not surprisingly, it is called the Interaction/Coupling (I/C) chart.

“Coupling” refers to the amount of slack, buffer or give between two items in a system.  Loosely coupled systems can accommodate shocks, failures and pressures without destabilizing.  Tightly coupled systems have a higher risk of disastrous failure because their processes are more time-dependent, with invariant sequences and a single way of achieving the production goal, and have little slack. (pp. 89-94)

“Interactions” may be linear or complex.  Linear interactions are between a system component and one or more other components that immediately precede or follow it in the production sequence.  These interactions are familiar and, if something unplanned occurs, the results are easily visible.  Complex interactions are between a system component and one or more other components outside the normal production sequence.  If unfamiliar, unplanned or unexpected sequences occur, the results may not be visible or immediately comprehensible. (pp. 77-78)

Nuclear plants have the tightest coupling and most complex interactions of the two dozen systems Perrow shows on the I/C chart, a population that included chemical plants, space missions and nuclear weapons accidents. (p. 97)
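For readers who think in data structures, the I/C chart can be sketched as two two-valued axes.  The class and function names below are ours, invented for illustration; only the placement of nuclear plants comes from Perrow.

```python
from enum import Enum
from itertools import product


class Coupling(Enum):
    LOOSE = "loose"
    TIGHT = "tight"


class Interactions(Enum):
    LINEAR = "linear"
    COMPLEX = "complex"


def quadrant(coupling: Coupling, interactions: Interactions) -> str:
    """Name the I/C chart quadrant for a system with these two properties."""
    return f"{coupling.value} coupling / {interactions.value} interactions"


# The four cells of the 2x2 matrix.
ALL_QUADRANTS = [quadrant(c, i) for c, i in product(Coupling, Interactions)]

# Perrow places nuclear plants in the tight, complex corner.
nuclear = quadrant(Coupling.TIGHT, Interactions.COMPLEX)
print(nuclear)
```

A real chart positions systems along continuous axes rather than in four boxes, so treat this as a coarse simplification.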

Perrow on Nuclear Power

Let's get one thing out of the way immediately: Normal Accidents is an anti-nuke screed.  Perrow started the book in 1979 and it was published in 1984.  He was motivated to write the book by the TMI accident and it obviously colored his forecast for the industry.  He reviews the TMI accident in detail, then describes nuclear industry characteristics and incidents at other plants, all of which paint an unfavorable portrait of the industry.  He concludes: “We have not had more serious accidents of the scope of Three Mile Island simply because we have not given them enough time to appear.” (p. 60, emphasis added)  While he is concerned with design, construction and operating problems, his primary fear is “the potential for unexpected interactions of small failures in that system that makes it prone to the system accident.” (p. 61)   

Why has his prediction of such serious accidents not come to pass, at least in the U.S.?

Our Perspective on Normal Accidents

We have several issues with this book and the author's “analysis.”

Nuclear is not as complex as Perrow asserts 

There is no question that the U.S. nuclear industry grew quickly, with upsized plants and utilities specifying custom design combinations (in other words, limited standardization).  The utilities were focused on meeting significant load growth forecasts and saw nuclear baseload capacity as an efficient way to produce electric power.  However, actually operating a large nuclear plant was probably more complex than the utilities realized.  But not any more.  Learning curve effects, more detailed procedures and improved analytic methods are a few of the factors that led to a greater knowledge basis for plant decision making.  The serious operational issues at the “problem plants” (circa 1997) forced operators to confront the reality that identifying and permanently resolving plant problems was necessary for survival.  This era also saw the beginning of industry consolidation, with major operators applying best methods throughout their fleets.  All of these changes have led to our view that nuclear plants are certainly complicated but no longer complex and haven't been for some time.    

This is a good place to point out that Perrow's designation of nuclear plants as the most complex and tightest coupled systems he evaluated has no basis in any real science.  In his own words, “The placement of systems [on the interaction/coupling chart] is based entirely on subjective judgments on my part; at present there is no reliable way to measure these two variables, interaction and coupling.” (p. 96)

System failures with incomprehensible consequences are not the primary problem in the nuclear industry

The 1986 Chernobyl disaster was arguably a system failure: poor plant design, personnel non-compliance with rules and a deficient safety culture.  It was a serious accident but not a catastrophe.*** 

But other significant industry events have not arisen from interactions deep within the system; they have come from negligence, hubris, incompetence or selective ignorance.  For example, Fukushima was overwhelmed by a tsunami that was known to be possible but was ignored by the owners.  At Davis-Besse, personnel ignored increasingly strong signals of a nascent problem, managers argued that in-depth investigation could wait until the next outage (production trumps safety) and the NRC agreed (with no solid justification).

Important system dynamics are ignored 

Perrow has some recognition of what a system is and how threats can arise within it: “. . . it is the way the parts fit together, interact, that is important.  The dangerous accidents lie in the system, not in the components.” (p. 351)  However, he is/was focused on interactions and couplings as they currently exist.  But a socio-technical system is constantly changing (evolving, learning) in response to internal and external stimuli.  Internal stimuli include management decisions and the reactions to performance feedback signals; external stimuli include environmental demands, constraints, threats and opportunities.  Complacency and normalization of deviance can seep in but systems can also bolster their defenses and become more robust and resilient.****  It would be a stretch to say that nuclear power has always learned from its mistakes (especially if they occur at someone else's plant) but steps have been taken to make operations less complex. 

My own bias is Perrow doesn't really appreciate the technical side of a socio-technical system.  He recounts incidents in great detail, but not at great depth and is often recounting the work of others.  Although he claims the book is about technology (the socio side, aka culture, is never mentioned), the fact remains that he is not an engineer or physicist; he is a sociologist.


Notwithstanding all my carping, this is a significant book.  It is highly readable.  Perrow's discussion of accidents, incidents and issues in various contexts, including petrochemical plants, air transport, marine shipping and space exploration, is fascinating reading.  His interaction/coupling chart is a useful mental model to help grasp relative system complexity although one must be careful about over-inferring from such a simple representation.

There are some useful suggestions, e.g., establishing an anonymous reporting system, similar to the one used in the air transport industry, for nuclear near-misses. (p. 169)  There is a good discussion of decentralization vs centralization in nuclear plant organizations. (pp. 334-5)  But he says that neither is best all the time, which he considers a contradiction.  The possibility of contingency management, i.e., using a decentralized approach for normal times and tightening up during challenging conditions, is regarded as infeasible.

Ultimately, he includes nuclear power with “systems that are hopeless and should be abandoned because the inevitable risks outweigh any reasonable benefits . . .” (p. 304)*****  As further support for this conclusion, he reviews three different ways of evaluating the world: absolute, bounded and social rationality.  Absolute rationality is the province of experts; bounded rationality recognizes resource and cognitive limitations in the search for solutions.  But Perrow favors social rationality (which we might unkindly call crowdsourced opinions) because it is the most democratic and, not coincidentally, he can cite a study that shows an industry's “dread risk” is highly correlated with its position on the I/C chart. (p. 326)  In other words, if lots of people are fearful of nuclear power, no matter how unreasonable those fears are, that is further evidence to shut it down.

The 1999 edition of Normal Accidents has an Afterword that updates the original version.  Perrow continues to condemn nuclear power but without much new data.  Much of his disapprobation is directed at the petrochemical industry.  He highlights writers who have advanced his ideas and also presents his (dis)agreements with high reliability theory and Vaughan's interpretation of the Challenger accident.

You don't need this book in your library but you do need to be aware that it is a foundation stone for the work of many other authors.


*  C. Perrow, Normal Accidents: Living with High-Risk Technologies (Princeton Univ. Press, Princeton, NJ: 1999).

**  For example, see Erik Hollnagel, The ETTO Principle: Efficiency-Thoroughness Trade-Off (reviewed here); Woods, Dekker et al., Behind Human Error (reviewed here); and Weick and Sutcliffe, Managing the Unexpected: Resilient Performance in an Age of Uncertainty (reviewed here).  It's ironic that Perrow set out to write a readable book without references to the “sacred texts” (p. 11) but it appears Normal Accidents has become one.

***  Perrow's criteria for catastrophe appear to be: “kill many people, irradiate others, and poison some acres of land.” (p. 348)  While any death is a tragedy, reputable Chernobyl studies report fewer than 100 deaths from radiation and project 4,000 radiation-induced cancers in a population of 600,000 people who were exposed.  The same population is expected to suffer 100,000 cancer deaths from all other causes.  Approximately 40,000 square miles of land was significantly contaminated.  Data from Chernobyl Forum, "Chernobyl's Legacy: Health, Environmental and Socio-Economic Impacts" 2nd rev. ed.  Retrieved Aug. 27, 2013.  Wikipedia, “Chernobyl disaster.”  Retrieved Aug. 27, 2013.

In his 1999 Afterword to Normal Accidents, Perrow mentions Chernobyl in passing and his comments suggest he does not consider it a catastrophe but believes it could have been one had the wind blown the radioactive materials over the city of Kiev.

****  A truly complex system can drift into failure (Dekker) or experience incidents from performance excursions outside the safety boundaries (Hollnagel).

*****  It's not just nuclear power; Perrow also supports unilateral nuclear disarmament. (p. 347)

Thursday, December 20, 2012

The Logic of Failure by Dietrich Dörner

This book was mentioned in a nuclear safety discussion forum so we figured this is a good time to revisit Dörner's 1989 tome.* Below we provide a summary of the book followed by our assessment of how it fits into our interest in decision making and the use of simulations in training.

Dörner's work focuses on why people fail to make good decisions when faced with problems and challenges. In particular, he is interested in the psychological needs and coping mechanisms people exhibit. His primary research method is observing test subjects interact with simulation models of physical sub-worlds, e.g., a malfunctioning refrigeration unit, an African tribe of subsistence farmers and herdsmen, or a small English manufacturing city. He applies his lessons learned to real situations, e.g., the Chernobyl nuclear plant accident.

He proposes a multi-step process for improving decision making in complicated situations then describes each step in detail and the problems people can create for themselves while executing the step. These problems generally consist of tactics people adopt to preserve their sense of competence and control at the expense of successfully achieving overall objectives. Although the steps are discussed in series, he recognizes that, at any point, one may have to loop back through a previous step.

Goal setting

Goals should be concrete and specific to guide future steps. The relationships between and among goals should be specified, including dependencies, conflicts and relative importance. When people don't do this, they can become distracted by obvious or unimportant (although potentially achievable) goals, or peripheral issues they know how to address rather than important issues that should be resolved. Facing performance failure, they may attempt to turn failure into success with doublespeak or blame unseen forces.

Formulate models and gather information

Good decision-making requires an adequate mental model of the system being studied—the variables that comprise the system and the functional relationships among them, which may include positive and negative feedback loops. The model's level of detail should be sufficient to understand the interrelationships among the variables the decision maker wants to influence. Unsuccessful test subjects were inclined to use a “reductive hypothesis,” which unreasonably reduces the model to a single key variable, or overgeneralization.

Information gathered is almost always incomplete and the decision maker has to decide when he has enough to proceed. The more successful test subjects asked more questions and made fewer decisions (than the less successful subjects) in the early time periods of the sim.

Predict and extrapolate

Once a model is formulated, the decision maker must attempt to determine how the values of variables will change over time in response to his decisions or internal system dynamics. One problem is predicting that outputs will change in a linear fashion, even as the evidence grows for a non-linear, e.g., exponential function. An exponential variable may suddenly grow dramatically then equally suddenly reverse course when the limits on growth (resources) are reached. Internal time delays mean that the effects of a decision are not visible until some time in the future. Faced with poor results, unsuccessful test subjects implement or exhibit “massive countermeasures, ad hoc hypotheses that ignore the actual data, underestimations of growth processes, panic reactions, and ineffectual frenetic activity.” (p. 152) Successful subjects made an effort to understand the system's dynamics, kept notes (history) on system performance and tried to anticipate what would happen in the future.
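The linear-prediction trap described above can be illustrated with a small numerical sketch. This is not one of Dörner's simulations; the function names, the logistic growth rule and every parameter value are assumptions chosen only for illustration. The sketch also simplifies in one respect: the variable levels off at its resource limit rather than reversing course.

```python
def logistic_series(start, rate, capacity, steps):
    """Discrete logistic growth: roughly exponential early on,
    then flattening as the value approaches the resource limit."""
    values = [start]
    for _ in range(steps):
        x = values[-1]
        values.append(x + rate * x * (1 - x / capacity))
    return values


def linear_forecast(observed, horizon):
    """Extend the average change seen so far in a straight line."""
    slope = (observed[-1] - observed[0]) / (len(observed) - 1)
    return [observed[-1] + slope * k for k in range(1, horizon + 1)]


# Hypothetical system: all numbers here are illustrative assumptions.
actual = logistic_series(start=10.0, rate=0.5, capacity=1000.0, steps=20)

# The decision maker sees only the first five values and extrapolates linearly.
observed = actual[:5]
forecast = linear_forecast(observed, horizon=16)

# Compare the true value with the straight-line guess at a few later steps.
for step in (8, 12, 20):
    print(step, round(actual[step], 1), round(forecast[step - 5], 1))
```

With these assumed values, the straight-line forecast falls well short of the actual series during the growth phase, and it gives no hint that growth nearly stops once the limit is approached.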

Plan and execute actions, check results and adjust strategy

“The essence of planning is to think through the consequences of certain actions and see whether those actions will bring us closer to our desired goal.” (p. 153) Easier said than done in an environment of too many alternative courses of action and too little time. In rapidly evolving situations, it may be best to create rough plans and delegate as many implementing decisions as possible to subordinates. A major risk is thinking that planning has been so complete that the unexpected cannot occur. A related risk is the reflexive use of historically successful strategies. “As at Chernobyl, certain actions carried out frequently in the past, yielding only the positive consequences of time and effort saved and incurring no negative consequences, acquire the status of an (automatically applied) ritual and can contribute to catastrophe.” (p. 172)

In the sims, unsuccessful test subjects often exhibited “ballistic” behavior: they implemented decisions but paid no attention to, i.e., did not learn from, the results. Successful subjects watched for the effects of their decisions, made adjustments and learned from their mistakes.

Dörner identified several characteristics of people who tended to end up in a failure situation. They failed to formulate their goals, didn't recognize goal conflict or set priorities, and didn't correct their errors. (p. 185) Their ignorance of interrelationships among system variables and the longer-term repercussions of current decisions set the stage for ultimate failure.


Dörner's insights and models have informed our thinking about human decision-making behavior in demanding, complicated situations. His use and promotion of simulation models as learning tools was one starting point for Bob Cudlin's work in developing a nuclear management training simulation program. Like Dörner, we see simulation as a powerful tool to “observe and record the background of planning, decision making, and evaluation processes that are usually hidden.” (pp. 9-10)

However, this book does not cover the entire scope of our interests. Dörner is a psychologist interested in individuals; group behavior is beyond his range. He alludes to normalization of deviance but his references appear limited to the flouting of safety rules rather than a more pervasive process of slippage. More importantly, he does not address behavior that arises from the system itself, in particular adaptive behavior as an open system reacts to and interacts with its environment.

From our view, Dörner's suggestions may help the individual decision maker avoid common pitfalls and achieve locally optimum answers. On the downside, following Dörner's prescription might lead the decision maker to an unjustified confidence in his overall system management abilities. In a truly complex system, no one knows how the entire assemblage works. It's sobering to note that even in Dörner's closed,** relatively simple models many test subjects still had a hard time developing a reasonable mental model, and some failed completely.

This book is easy to read and Dörner's insights into the psychological traps that limit human decision making effectiveness remain useful.

* D. Dörner, The Logic of Failure: Recognizing and Avoiding Error in Complex Situations, trans. R. and R. Kimber (Reading, MA: Perseus Books, 1998). Originally published in German in 1989.

** One simulation model had an external input.

Friday, February 4, 2011

“I Hope For All Our Sakes This is Right”

On January 24, 2011 the NRC Commissioners met to review the proposed policy statement on nuclear safety culture developed by the NRC staff. This most recent effort was chartered by the Commission more than 3 years ago and represents the next step in the process to publish the proposed statement for public comment.

“25 years is long enough to build a policy statement…” for nuclear safety culture. This observation was made by Billie Garde* in her opening remarks to the Commissioners; her timeline refers to the Chernobyl and space shuttle Challenger accidents in 1986. She also emphasized that the need now was to focus on implementation of the policy statement. She maintained her position that a policy statement alone would not be sufficient and that regulation would be necessary to assure consistent and reliable implementation.

In that regard she lays claim to one of the more disconcerting observations made at the meeting, the gist of which can be summed up as, “I hope for all our sakes this is right…”

Here’s the video clip with the exchange between Garde and Commissioner Apostolakis.

We will be following up with additional posts with highlights from the Commission session.

*  Billie Garde is an attorney in Washington, D.C.  Her NRC website bio is here.