Friday, December 28, 2012

Uh-oh, Delays at Vogtle

This Wall Street Journal article* reports that the new Vogtle units may be in construction schedule trouble. The article notes that the new, modular construction techniques being employed were expected to save time and dollars but may be having the opposite effect. In addition, and somewhat incredibly, the independent monitor is citing design changes as another cause of delays. Thought that lesson had been learned a hundred times in the nuclear industry.

Then there is the inevitable finger pointing:

“The delays and cost pressures have created friction between the construction partners and utility companies that will serve as the plant's owners, escalating into a series of lawsuits totaling more than $900 million.”

The Vogtle situation also serves as a reminder that nuclear safety culture (NSC) applies to the construction phase, though to our recollection there was not a lot of talk about it during the NRC’s policy statement development process. The escalating schedule and cost pressures at Vogtle also remind us of how significant a factor such pressures can be in a “massive, complex, first-of-a-kind project” (to quote the Westinghouse spokesman). These conditions will challenge construction workers and managers who may not possess the same level of NSC experience or consciousness as nuclear operating organizations.


* R. Smith, “New Nuclear Plant Hits Some Snags,” Wall Street Journal online (Dec. 23, 2012).

Thursday, December 20, 2012

The Logic of Failure by Dietrich Dörner

This book was mentioned in a nuclear safety discussion forum so we figured this is a good time to revisit Dörner's 1989 tome.* Below we provide a summary of the book followed by our assessment of how it fits into our interest in decision making and the use of simulations in training.

Dörner's work focuses on why people fail to make good decisions when faced with problems and challenges. In particular, he is interested in the psychological needs and coping mechanisms people exhibit. His primary research method is observing test subjects interact with simulation models of physical sub-worlds, e.g., a malfunctioning refrigeration unit, an African tribe of subsistence farmers and herdsmen, or a small English manufacturing city. He applies his lessons learned to real situations, e.g., the Chernobyl nuclear plant accident.

He proposes a multi-step process for improving decision making in complicated situations then describes each step in detail and the problems people can create for themselves while executing the step. These problems generally consist of tactics people adopt to preserve their sense of competence and control at the expense of successfully achieving overall objectives. Although the steps are discussed in series, he recognizes that, at any point, one may have to loop back through a previous step.

Goal setting

Goals should be concrete and specific to guide future steps. The relationships between and among goals should be specified, including dependencies, conflicts and relative importance. When people don't do this, they can become distracted by obvious or unimportant (although potentially achievable) goals, or peripheral issues they know how to address rather than important issues that should be resolved. Facing performance failure, they may attempt to turn failure into success with doublespeak or blame unseen forces.

Formulate models and gather information

Good decision-making requires an adequate mental model of the system being studied—the variables that comprise the system and the functional relationships among them, which may include positive and negative feedback loops. The model's level of detail should be sufficient to understand the interrelationships among the variables the decision maker wants to influence. Unsuccessful test subjects were inclined to use a “reductive hypothesis,” which unreasonably reduces the model to a single key variable, or overgeneralization.

Information gathered is almost always incomplete and the decision maker has to decide when he has enough to proceed. The more successful test subjects asked more questions and made fewer decisions (than the less successful subjects) in the early time periods of the sim.

Predict and extrapolate

Once a model is formulated, the decision maker must attempt to determine how the values of variables will change over time in response to his decisions or internal system dynamics. One problem is predicting that outputs will change in a linear fashion, even as the evidence grows for a non-linear (e.g., exponential) function. An exponential variable may suddenly grow dramatically, then equally suddenly reverse course when the limits on growth (resources) are reached. Internal time delays mean that the effects of a decision are not visible until some time in the future. Faced with poor results, unsuccessful test subjects implemented or exhibited “massive countermeasures, ad hoc hypotheses that ignore the actual data, underestimations of growth processes, panic reactions, and ineffectual frenetic activity.” (p. 152) Successful subjects made an effort to understand the system's dynamics, kept notes (history) on system performance and tried to anticipate what would happen in the future.
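
To make the linear-versus-exponential trap concrete, here is a minimal sketch in Python. The starting value, growth rate and time horizon are our own assumed numbers, not figures from Dörner's simulations. The point is only that a straight line drawn through the first few observations of a compounding variable falls well short of where the variable actually ends up.

```python
# Hypothetical illustration: the start value, growth rate and horizon are
# assumed numbers, not figures from Dörner's simulations.

def exponential_series(start, rate, periods):
    """Return values that grow by a fixed percentage each period."""
    return [start * (1 + rate) ** t for t in range(periods + 1)]

def linear_forecast(observed, horizon):
    """Extend a straight line through the first and last observed points."""
    slope = (observed[-1] - observed[0]) / (len(observed) - 1)
    return observed[-1] + slope * horizon

actual = exponential_series(start=100.0, rate=0.25, periods=12)
early = actual[:4]                             # what has been seen so far (t = 0..3)
forecast = linear_forecast(early, horizon=9)   # straight-line guess for t = 12

print(f"linear guess for t=12: {forecast:.0f}")
print(f"actual value at t=12:  {actual[12]:.0f}")
print(f"actual / guess:        {actual[12] / forecast:.1f}x")
```

With these assumed numbers the straight-line guess captures only a fraction of the eventual value, which is the kind of underestimation of growth processes Dörner describes.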

Plan and execute actions, check results and adjust strategy

“The essence of planning is to think through the consequences of certain actions and see whether those actions will bring us closer to our desired goal.” (p. 153) Easier said than done in an environment of too many alternative courses of action and too little time. In rapidly evolving situations, it may be best to create rough plans and delegate as many implementing decisions as possible to subordinates. A major risk is thinking that planning has been so complete that the unexpected cannot occur. A related risk is the reflexive use of historically successful strategies. “As at Chernobyl, certain actions carried out frequently in the past, yielding only the positive consequences of time and effort saved and incurring no negative consequences, acquire the status of an (automatically applied) ritual and can contribute to catastrophe.” (p. 172)

In the sims, unsuccessful test subjects often exhibited “ballistic” behavior: they implemented decisions but paid no attention to, i.e., did not learn from, the results. Successful subjects watched for the effects of their decisions, made adjustments and learned from their mistakes.
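
Watching results is only half the battle when the system responds late. As a rough sketch of the internal time delays and “massive countermeasures” noted earlier, the toy model below tracks a deviation from target where each correction takes effect several periods after it is ordered. The delay, the two reaction strengths and the starting deviation are all assumptions we chose for illustration; nothing here is taken from Dörner's actual models.

```python
# Hypothetical model: the delay, gains and starting deviation are assumed
# values for illustration, not parameters from Dörner's simulations.

def run(gain, delay=3, start=10.0, steps=60):
    """Track a deviation from target when each correction takes effect late.

    Every period the decision maker orders a correction proportional to the
    deviation currently visible; it only acts `delay` periods later.
    """
    deviation = start
    pending = [0.0] * delay       # corrections ordered but not yet felt
    history = [deviation]
    for _ in range(steps):
        pending.append(-gain * deviation)   # react to what is visible now
        deviation += pending.pop(0)         # oldest correction finally lands
        history.append(deviation)
    return history

massive = run(gain=0.8)    # strong reaction to every reading
measured = run(gain=0.2)   # smaller adjustments, same information

worst_massive = max(abs(x) for x in massive)
late_measured = max(abs(x) for x in measured[-10:])
print(f"largest deviation with massive countermeasures: {worst_massive:.1f}")
print(f"largest late deviation with measured response:  {late_measured:.3f}")
```

Both simulated decision makers see the same readings. In this toy setup the one who reacts strongly to every reading keeps piling on corrections before the earlier ones have landed and ends up further from target than where he started, while the one making smaller adjustments settles down.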

Dörner identified several characteristics of people who tended to end up in a failure situation. They failed to formulate their goals, didn't recognize goal conflict or set priorities, and didn't correct their errors. (p. 185) Their ignorance of interrelationships among system variables and the longer-term repercussions of current decisions set the stage for ultimate failure.

Assessment

Dörner's insights and models have informed our thinking about human decision-making behavior in demanding, complicated situations. His use and promotion of simulation models as learning tools was one starting point for Bob Cudlin's work in developing a nuclear management training simulation program. Like Dörner, we see simulation as a powerful tool to “observe and record the background of planning, decision making, and evaluation processes that are usually hidden.” (pp. 9-10)

However, this book does not cover the entire scope of our interests. Dörner is a psychologist interested in individuals; group behavior is beyond his range. He alludes to normalization of deviance but his references appear limited to the flouting of safety rules rather than a more pervasive process of slippage. More importantly, he does not address behavior that arises from the system itself, in particular adaptive behavior as an open system reacts to and interacts with its environment.

From our view, Dörner's suggestions may help the individual decision maker avoid common pitfalls and achieve locally optimum answers. On the downside, following Dörner's prescription might lead the decision maker to an unjustified confidence in his overall system management abilities. In a truly complex system, no one knows how the entire assemblage works. It's sobering to note that even in Dörner's closed,** relatively simple models many test subjects still had a hard time developing a reasonable mental model, and some failed completely.

This book is easy to read and Dörner's insights into the psychological traps that limit human decision making effectiveness remain useful.


* D. Dörner, The Logic of Failure: Recognizing and Avoiding Error in Complex Situations, trans. R. and R. Kimber (Reading, MA: Perseus Books, 1998). Originally published in German in 1989.

** One simulation model had an external input.

Wednesday, December 12, 2012

“Overpursuit” of Goals

We return to a favorite subject, the impact of goals and incentives on safety culture and performance. Interestingly, this subject comes up in an essay by Oliver Burkeman, “The Power of Negative Thinking,”* which may seem unusual as most people think of goals and achievement of goals as the product of a positive approach. Traditional business thinking is to set hard, quantitative goals, the bigger the better. But the future is inherently uncertain while fixed goals are not. The counterintuitive argument suggests the most effective way to address future performance is to focus on worst case outcomes. Burkeman observes that “...rigid goals may encourage employees to cut ethical corners” and “Focusing on one goal at the expense of all other factors also can distort a corporate mission or an individual life…” and result in “...the ‘overpursuit’ of goals…” Case in point, yellow jerseys.

This raises some interesting points for nuclear safety. First we would remind our readers of Snowden’s Cynefin decision context framework, specifically his “complex” space which is indicative of where nuclear safety decisions reside. In this environment there are many interacting causes and effects, making it difficult or impossible to pursue specific goals along defined paths. Clearly an uncertain landscape. As Simon French argues: “Decision support will be more focused on exploring judgement and issues, and on developing broad strategies that are flexible enough to accommodate changes as the situation evolves.”** This would suggest the pursuit of specific, aspirational goals may be misguided or counterproductive.

Second, safety performance goals are hard to identify anyway. Is it the absence of bad outcomes? Or the maintenance of, say, a “strong” safety culture - whatever that is. One indication of the elusiveness of safety goals is their absence as targets in incentive programs. So there is probably little likelihood of overemphasizing safety performance as a goal. But is the same true for operational type goals such as capacity factor, refuel outage durations, and production costs? Can an overly strong focus on such short term goals, often associated with stretching performance, lead to overpursuit? What if large financial incentives are attached to the achievement of the goals?

The answer is not: “Safety is our highest priority”. More likely it is an approach that considers the complexity and uncertainty of nuclear operating space and the potential for hard goals to cut both ways. It might value how a management team prosecutes its responsibilities more than the outcome itself.


* O. Burkeman, “The Power of Negative Thinking,” Wall Street Journal online (Dec. 7, 2012).

** S. French, “Cynefin: repeatability, science and values,” Newsletter of the European Working Group “Multiple Criteria Decision Aiding,” series 3, no. 17 (Spring 2008) p. 2. We posted on Cynefin and French's paper here.

Wednesday, December 5, 2012

Drift Into Failure by Sidney Dekker

Sidney Dekker's Drift Into Failure* is a noteworthy effort to provide new insights into how accidents and other bad outcomes occur in large organizations. He begins by describing two competing world views, the essentially mechanical view of the world spawned by Newton and Descartes (among others), and a view based on complexity in socio-technical organizations and a systems approach. He shows how each world view biases the search for the “truth” behind how accidents and incidents occur.

Newtonian-Cartesian (N-C) Vision

Isaac Newton and René Descartes were leading thinkers during the dawn of the Age of Reason. Newton used the language of mathematics to describe the world while Descartes relied on the inner process of reason. Both believed there was a single reality that could be investigated, understood and explained through careful analysis and thought; complete knowledge was possible if investigators looked long and hard enough. The assumptions and rules that started with them, and were extended by others over time, have been passed on and most of us accept them, uncritically, as common sense, the most effective way to look at the world.

The N-C world is ruled by invariant cause-and-effect; it is, in fact, a machine. If something bad happens, then there was a unique cause or set of causes. Investigators search for these broken components, which could be physical or human. It is assumed that a clear line exists between the broken part(s) and the overall behavior of the system. The explicit assumption of determinism leads to an implicit assumption of time reversibility—because system performance can be predicted from time A if we know the starting conditions and the functional relationships of all components, then we can start from a later time B (the bad outcome) and work back to the true causes. (p. 84) Root cause analysis and criminal investigations are steeped in this world view.

In this view, decision makers are expected to be rational people who “make decisions by systematically and consciously weighing all possible outcomes along all relevant criteria.” (p. 3) Bad outcomes are caused by incompetent or, worse, corrupt decision makers. Fixes include more communications, training, procedures, supervision, exhortations to try harder and criminal charges.

Dekker credits Newton et al. for giving man the wherewithal to probe Nature's secrets and build amazing machines. However, the Newtonian-Cartesian vision is not the only way to view the world, especially the world of complex, socio-technical systems. For that a new model, with different concepts and operating principles, is required.

The Complex System

Characteristics

The sheer number of parts does not make a system complex, only complicated. A truly complex system is open (it interacts with its environment), has components that act locally and don't know the full effects of their actions, is constantly making decisions to maintain performance and adapt to changing circumstances, and has non-linear interactions (small events can cause large results) because of multipliers and feedback loops. Complexity is a result of the ever-changing relationships between components. (pp. 138-144)

Adding to the myriad information confronting a manager or observer, system performance is often optimized at the edge of chaos, where competitors are perpetually vying for relative advantage at an affordable cost.** The system is constantly balancing its efforts between exploration (which will definitely incur costs but may lead to new advantages) and exploitation (which reaps benefits of current advantages but will likely dissipate over time). (pp. 164-165)

The most important feature of a complex system is that it adapts to its environment over time in order to survive. And its environment is characterized by resource scarcity and competition. There is continuous pressure to maintain production and increase efficiency (and their visible artifacts: output, costs, profits, market share, etc.) and less visible outputs, e.g., safety, will receive less attention. After all, “Though safety is a (stated) priority, operational systems do not exist to be safe. They exist to provide a service or product . . . .” (p. 99) And the cumulative effect of multiple adaptive decisions can be an erosion of safety margins and a changed response of the entire system. Such responses may be beneficial or harmful; the harmful case is a drift into failure.

Drift by a complex system exhibits several characteristics. First, as mentioned above, it is driven by environmental factors. Second, drift occurs in small steps so changes can hardly be noticed, and are even applauded if they result in local performance improvement; “. . . successful outcomes keep giving the impression that risk is under control” (p. 106) as a series of small decisions whittles away at safety margins. Third, these complex systems contain unruly technology (think deepwater drilling) where uncertainties exist about how the technology may be ultimately deployed and how it may fail. Fourth, there is significant interaction with a key environmental player, the regulator, and regulatory capture can occur, resulting in toothless oversight.
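
The arithmetic of small steps is easy to underestimate, so here is a back-of-the-envelope sketch. Every number in it (the size of each step, the review threshold, the number of decisions) is our own assumption, chosen only to show the pattern; Dekker offers no such figures. Each decision is judged against the margin as it stood just before that decision, never against the original baseline.

```python
# Hypothetical numbers: the step size, review threshold and number of
# decisions are assumptions chosen only to illustrate the pattern.

def drift(initial_margin, step_fraction, decisions, review_threshold):
    """Apply a series of small reductions to a safety margin.

    Each decision is reviewed only against the margin as it stood just
    before the decision, not against the original baseline.
    """
    margin = initial_margin
    flagged = 0
    for _ in range(decisions):
        reduction = margin * step_fraction
        if reduction / margin > review_threshold:
            flagged += 1          # a local review would have objected
        margin -= reduction
    return margin, flagged

final_margin, flagged = drift(
    initial_margin=100.0, step_fraction=0.02, decisions=30, review_threshold=0.05
)
eroded = 1 - final_margin / 100.0
print(f"decisions flagged by local review: {flagged}")
print(f"cumulative erosion of margin:      {eroded:.0%}")
```

Under these assumptions no single decision trips the local review, yet a large share of the original margin is gone by the end of the series.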

“Drifting into failure is not so much about breakdowns or malfunctioning of components, as it is about an organization not adapting effectively to cope with the complexity of its own structure and environment.” (p. 121) Drift and occasionally accidents occur because of ordinary system functioning, normal people going about their regular activities making ordinary decisions “against a background of uncertain technology and imperfect information.” Accidents, like safety, can be viewed as an emergent system property, i.e., they are the result of system relationships but cannot be predicted by examining any particular system component.

Managers' roles

Managers should not try to transform complex organizations into merely complicated ones, even if it's possible. Complexity is necessary for long-term survival as it maximizes organizational adaptability. The question is how to manage in a complex system. One key is increasing the diversity of personnel in the organization. More diversity means less group think and more creativity and greater capacity for adaptation. In practice, this means validation of minority opinions and encouragement of dissent, reflecting on the small decisions as they are made, stopping to ponder why some technical feature or process is not working exactly as expected and creating slack to reduce the chances of small events snowballing into large failures. With proper guidance, organizations can drift their way to success.

Accountability

Amoral and criminal behavior certainly exist in large organizations but bad outcomes can also result from normal system functioning. That's why the search for culprits (bad actors or broken parts) may not always be appropriate or adequate. This is a point Dekker has explored before, in Just Culture (briefly reviewed here) where he suggests using accountability as a means to understand the system-based contributors to failure and resolve those contributors in a manner that will avoid recurrence.

Application to Nuclear Safety Culture

A commercial nuclear power plant or fleet is probably not a complete complex system. It interacts with environmental factors but in limited ways; it's certainly not directly exposed to the Wild West competition of, say, the cell phone industry. Group think and normalization of deviance*** are constant threats. The technology is reasonably well-understood but changes, e.g., uprates based on more software-intensive instrumentation and control, may be invisibly sanding away safety margin. Both the industry and the regulator would deny regulatory capture has occurred but an outside observer may think the relationship is a little too cozy. Overall, the fit is sufficiently good that students of safety culture should pay close attention to Dekker's observations.

In contrast, the Hanford Waste Treatment Plant (Vit Plant) is almost certainly a complex system and this book should be required reading for all managers in that program.

Conclusion

Drift Into Failure is not a quick read. Dekker spends a lot of time developing his theory, then circling back to further explain it or emphasize individual pieces. He reviews incidents (airplane crashes, a medical error resulting in patient death, software problems, public water supply contamination) and descriptions of organization evolution (NASA, international drug smuggling, “conflict minerals” in Africa, drilling for oil, terrorist tactics, Enron) to illustrate how his approach results in broader and arguably more meaningful insights than the reports of official investigations. Standing on the shoulders of others, especially Diane Vaughan, Dekker gives us a rich model for what might be called the “banality of normalization of deviance.” 


* S. Dekker, Drift Into Failure: From Hunting Broken Components to Understanding Complex Systems (Burlington, VT: Ashgate, 2011).

** See our Sept. 4, 2012 post on Cynefin for another description of how the decisions an organization faces can suddenly slip from the Simple space to the Chaotic space.

*** We have posted many times about normalization of deviance, the corrosive organizational process by which yesterday's “unacceptable” becomes today's “good enough.”

Thursday, November 29, 2012

The Mouse Runs Up the Clock (at Massey Energy)

We are all familiar with the old nursery rhyme: “Hickory, dickory, dock, the mouse ran up the clock.” This may be an apt description for the rising waters of federal criminal prosecution in the Massey coal mine explosion investigation. As reported in the Nov. 28, 2012 Wall Street Journal,* the former president of one of the Massey operating units unrelated to the Upper Big Branch mine has agreed to plead guilty to felony conspiracy charges including directing employees to violate safety laws. The former president is cooperating with prosecutors (in other words, look out above) and as noted in the Journal article, “The expanded probe ‘strongly suggests’ prosecutors are ‘looking at top management’…” Earlier this year, a former superintendent at the Upper Big Branch mine pleaded guilty to conspiracy charges.

Federal prosecutors allege that safety rules were routinely violated to maximize profits. As stated in the Criminal Information against the former president, “Mine safety and health laws were routinely violated at the White Buck Mines and at other coal mines owned by Massey, in part because of a belief that consistently following those laws would decrease coal production.” (Criminal Information, p. 4)** The Information goes on to state: “Furthermore, the issuance of citations and orders by MSHA [Mine Safety and Health Administration], particularly certain kinds of serious citations and orders, moved the affected mine closer to being classified as a mine with a pattern or potential pattern of violations. That classification would have resulted in increased scrutiny of the affected mine by MSHA…” (Criminal Information, p. 5) Thus it is alleged that not only production priorities - the core objective of many businesses - but even the potential for increased scrutiny by a regulatory authority was sufficient to form the basis for a conspiracy.

Every day managers and executives in high risk businesses make decisions to sustain and/or improve production and to minimize the exposure of the operation to higher levels of regulatory scrutiny. The vast majority of those decisions are legitimate and don’t compromise safety or inhibit regulatory functions. Extreme examples that do violate safety and legal requirements, such as the Massey case, are easy to spot. But one might begin to wonder what exactly is the boundary separating legitimate pursuit of these objectives from decisions or actions that might (later) be interpreted as having the intent to compromise safety or regulation? How important is perception to drawing the boundary - where the context can frame a decision or action in markedly different colors? Suppose in the Massey situation, the former president, instead of providing advance warnings and (apparently) explicitly tolerating safety violations, had limited the funding of safety activities, or just squeezed total budgets? Same or different?


* K. Maher, “Mine-Safety Probe Expands,” Wall Street Journal online (Nov. 28, 2012). May be available only to subscribers.

** U.S. District Court Southern District of West Virginia, “Criminal Information for Conspiracy to Defraud the United States: United States of America v. David C. Hughart” (Nov. 28, 2012).


Tuesday, November 20, 2012

BP/Deepwater Horizon: Upping the Stakes

Anyone who thought safety culture and safety decision making were institutional artifacts, or mostly a matter of regulatory enforcement, might want to take a close look at what is happening on the BP/Deepwater Horizon front these days. Three BP employees have been criminally indicted - and two of those indictments bear directly on safety in operational decisions. The indictment of the well-site leaders, the most senior BP personnel on the platform, accuses them of causing the deaths of 11 crewmen aboard the Deepwater Horizon rig in April 2010 through gross negligence, primarily by misinterpreting a crucial pressure test that should have alerted them that the well was in trouble.*

The crux of the matter relates to the interpretation of a pressure test to determine whether the well had been properly sealed prior to being temporarily abandoned. Apparently BP’s own investigation found that the men had misinterpreted the test results.

The indictment states, “The Well Site Leaders were responsible for...ensuring that well drilling operations were performed safely in light of the intrinsic danger and complexity of deepwater drilling.” (Indictment p.3)

The following specific actions are cited as constituting gross negligence: “...failed to phone engineers onshore to advise them ...that the well was not secure; failed to adequately account for the abnormal readings during the testing; accepted a nonsensical explanation for the abnormal readings, again without calling engineers onshore to consult…” (Indictment p.7)

The willingness of federal prosecutors to advance these charges should (and perhaps is intended to) send a chill down the spine of every manager in high risk industries. While gross negligence is a relatively high standard, and may or may not be provable in the BP case, the actions cited in the indictment may not sound all that extraordinary - failure to consult with onshore engineers, failure to account for “abnormal” readings, accepting a “nonsensical” explanation. Whether this amounts to “reckless” or willful disregard for a known risk is a matter for the legal system. As an article in the Wall Street Journal notes, “There were no federal rules about how to conduct such a test at the time. That has since changed; federal regulators finalized new drilling rules last week that spell out test procedures.”**

The indictment asserts that the men violated the “standard of care” applicable to the deepwater oil exploration industry. One might ponder what federal prosecutors think the “standard of care” is for the nuclear power generation industry.
 

Clearly the well site leaders made a serious misjudgment - one that turned out to have catastrophic consequences. But then consider the statement by the Assistant Attorney General, that the accident was caused by “BP’s culture of privileging profit over prudence.” (WSJ article)   Are there really a few simple, direct causes of this accident or is this an example of a highly complex system failure? Where does culpability for culture lie?  Stay tuned.


* U.S. District Court Eastern District of Louisiana, “Superseding Indictment for Involuntary Manslaughter, Seaman's Manslaughter and Clean Water Act: United States of America v. Robert Kaluza and Donald Vidrine,” Criminal No. 12-265.


** T. Fowler and R. Gold, “Engineers Deny Charges in BP Spill,” Wall Street Journal online (Nov. 18, 2012).



Thursday, November 1, 2012

Practice Makes Perfect

In this post we call attention to a recent article from The Wall Street Journal* that highlights an aspect of safety culture “learning” that may not be appreciated with approaches currently in vogue in the nuclear industry.  The gist of the article is that, just as practice is useful in mastering complex, physically challenging activities, it may also have value in honing the skills inherent in complex socio-technical issues.

“Research has established that fast, simple feedback is almost always more effective at shaping behavior than is a more comprehensive response well after the fact. Better to whisper ‘Please use a more formal tone with clients, Steven’ right away than to lecture Steven at length on the wherefores and whys the next morning.”

Our sense is that current efforts to instill safety culture norms and values tend toward after-the-fact lectures and “death by PowerPoint” approaches. As the article correctly points out, it is “shaping behavior” that should be the goal and is something that benefits from feedback, and “An explicit request can normalize the idea of ‘using’ rather than passively ‘taking’ feedback.”

It’s not a long article so we hope readers will just go ahead and click on the link below.

*  Lemov, D., “Practice Makes Perfect—And Not Just for Jocks and Musicians,” Wall Street Journal online (Oct. 26, 2012).