Friday, December 28, 2012

Uh-oh, Delays at Vogtle

This Wall Street Journal article* reports that the new Vogtle units may be in construction schedule trouble. The article notes that the new, modular construction techniques being employed were expected to save time and dollars but may be having the opposite effect. In addition, and somewhat incredibly, the independent monitor is citing design changes as another cause of delays. Thought that lesson had been learned a hundred times in the nuclear industry.

Then there is the inevitable finger pointing:

“The delays and cost pressures have created friction between the construction partners and utility companies that will serve as the plant's owners, escalating into a series of lawsuits totaling more than $900 million.”

The Vogtle situation also serves as a reminder that nuclear safety culture (NSC) applies to the construction phase although, to our recollection, there was not a lot of talk about it during the NRC’s policy statement development process. The escalating schedule and cost pressures at Vogtle also serve to remind us of how significant a factor such pressures can be in a “massive, complex, first-of-a-kind project” (to quote the Westinghouse spokesman). These situational conditions will challenge construction workers and management who may not possess the same level of NSC experience or consciousness as nuclear operating organizations.


* R. Smith, “New Nuclear Plant Hits Some Snags,” Wall Street Journal online (Dec. 23, 2012).

Thursday, December 20, 2012

The Logic of Failure by Dietrich Dörner

This book was mentioned in a nuclear safety discussion forum so we figured this is a good time to revisit Dörner's 1989 tome.* Below we provide a summary of the book followed by our assessment of how it fits into our interest in decision making and the use of simulations in training.

Dörner's work focuses on why people fail to make good decisions when faced with problems and challenges. In particular, he is interested in the psychological needs and coping mechanisms people exhibit. His primary research method is observing test subjects interact with simulation models of physical sub-worlds, e.g., a malfunctioning refrigeration unit, an African tribe of subsistence farmers and herdsmen, or a small English manufacturing city. He applies his lessons learned to real situations, e.g., the Chernobyl nuclear plant accident.

He proposes a multi-step process for improving decision making in complicated situations then describes each step in detail and the problems people can create for themselves while executing the step. These problems generally consist of tactics people adopt to preserve their sense of competence and control at the expense of successfully achieving overall objectives. Although the steps are discussed in series, he recognizes that, at any point, one may have to loop back through a previous step.

Goal setting

Goals should be concrete and specific to guide future steps. The relationships between and among goals should be specified, including dependencies, conflicts and relative importance. When people don't do this, they can become distracted by obvious or unimportant (although potentially achievable) goals, or by peripheral issues they know how to address rather than important issues that should be resolved. Facing performance failure, they may attempt to turn failure into success with doublespeak or blame unseen forces.

Formulate models and gather information

Good decision-making requires an adequate mental model of the system being studied—the variables that comprise the system and the functional relationships among them, which may include positive and negative feedback loops. The model's level of detail should be sufficient to understand the interrelationships among the variables the decision maker wants to influence. Unsuccessful test subjects were inclined to use a “reductive hypothesis,” which unreasonably reduces the model to a single key variable, or overgeneralization.

Information gathered is almost always incomplete and the decision maker has to decide when he has enough to proceed. The more successful test subjects asked more questions and made fewer decisions (than the less successful subjects) in the early time periods of the sim.

Predict and extrapolate

Once a model is formulated, the decision maker must attempt to determine how the values of variables will change over time in response to his decisions or internal system dynamics. One problem is predicting that outputs will change in a linear fashion, even as the evidence grows for a non-linear, e.g., exponential, function. An exponential variable may suddenly grow dramatically then equally suddenly reverse course when the limits on growth (resources) are reached. Internal time delays mean that the effects of a decision are not visible until some time in the future. Faced with poor results, unsuccessful test subjects implemented or exhibited “massive countermeasures, ad hoc hypotheses that ignore the actual data, underestimations of growth processes, panic reactions, and ineffectual frenetic activity.” (p. 152) Successful subjects made an effort to understand the system's dynamics, kept notes (history) on system performance and tried to anticipate what would happen in the future.
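To make the linear-versus-exponential trap concrete, here is a minimal sketch in Python. It is our own toy illustration, not one of Dörner's simulations; the function names, the 10% growth rate and the starting value of 100 are all assumptions chosen for the example. It compares a straight-line extrapolation based on the first observed step with the compounding growth that actually occurs.

```python
def exponential_series(start, rate, periods):
    """Values of a quantity that grows by a fixed fraction each period."""
    return [start * (1 + rate) ** t for t in range(periods + 1)]


def linear_forecast(series, periods):
    """Extend the series in a straight line using only its first step."""
    step = series[1] - series[0]
    return [series[0] + step * t for t in range(periods + 1)]


# Hypothetical numbers: start at 100 and grow 10% per period for 30 periods.
actual = exponential_series(start=100.0, rate=0.10, periods=30)
forecast = linear_forecast(actual, periods=30)

# The two agree at first, then the gap widens at an accelerating pace.
for t in (5, 10, 20, 30):
    gap = actual[t] - forecast[t]
    print(f"period {t:2d}: actual {actual[t]:8.1f}  "
          f"linear guess {forecast[t]:8.1f}  gap {gap:8.1f}")
```

The early periods look almost linear, which is why a decision maker who anchors on them can be badly surprised later. The sketch leaves out the resource limits and time delays mentioned above; adding them would only make the prediction problem harder.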

Plan and execute actions, check results and adjust strategy

“The essence of planning is to think through the consequences of certain actions and see whether those actions will bring us closer to our desired goal.” (p. 153) Easier said than done in an environment of too many alternative courses of action and too little time. In rapidly evolving situations, it may be best to create rough plans and delegate as many implementing decisions as possible to subordinates. A major risk is thinking that planning has been so complete that the unexpected cannot occur. A related risk is the reflexive use of historically successful strategies. “As at Chernobyl, certain actions carried out frequently in the past, yielding only the positive consequences of time and effort saved and incurring no negative consequences, acquire the status of an (automatically applied) ritual and can contribute to catastrophe.” (p. 172)

In the sims, unsuccessful test subjects often exhibited “ballistic” behavior: they implemented decisions but paid no attention to, i.e., did not learn from, the results. Successful subjects watched for the effects of their decisions, made adjustments and learned from their mistakes.

Dörner identified several characteristics of people who tended to end up in a failure situation. They failed to formulate their goals, didn't recognize goal conflict or set priorities, and didn't correct their errors. (p. 185) Their ignorance of interrelationships among system variables and the longer-term repercussions of current decisions set the stage for ultimate failure.

Assessment

Dörner's insights and models have informed our thinking about human decision-making behavior in demanding, complicated situations. His use and promotion of simulation models as learning tools was one starting point for Bob Cudlin's work in developing a nuclear management training simulation program. Like Dörner, we see simulation as a powerful tool to “observe and record the background of planning, decision making, and evaluation processes that are usually hidden.” (pp. 9-10)

However, this book does not cover the entire scope of our interests. Dörner is a psychologist interested in individuals; group behavior is beyond his range. He alludes to normalization of deviance but his references appear limited to the flouting of safety rules rather than a more pervasive process of slippage. More importantly, he does not address behavior that arises from the system itself, in particular adaptive behavior as an open system reacts to and interacts with its environment.

From our view, Dörner's suggestions may help the individual decision maker avoid common pitfalls and achieve locally optimum answers. On the downside, following Dörner's prescription might lead the decision maker to an unjustified confidence in his overall system management abilities. In a truly complex system, no one knows how the entire assemblage works. It's sobering to note that even in Dörner's closed,** relatively simple models many test subjects still had a hard time developing a reasonable mental model, and some failed completely.

This book is easy to read and Dörner's insights into the psychological traps that limit human decision making effectiveness remain useful.


* D. Dörner, The Logic of Failure: Recognizing and Avoiding Error in Complex Situations, trans. R. and R. Kimber (Reading, MA: Perseus Books, 1998). Originally published in German in 1989.

** One simulation model had an external input.

Wednesday, December 12, 2012

“Overpursuit” of Goals

We return to a favorite subject, the impact of goals and incentives on safety culture and performance. Interestingly, this subject comes up in an essay by Oliver Burkeman, “The Power of Negative Thinking,”* which may seem unusual as most people think of goals and achievement of goals as the product of a positive approach. Traditional business thinking is to set hard, quantitative goals, the bigger the better. But the future is inherently uncertain and goals generally are not. The counterintuitive argument suggests the most effective way to address future performance is to focus on worst case outcomes. Burkeman observes that “...rigid goals may encourage employees to cut ethical corners” and “Focusing on one goal at the expense of all other factors also can distort a corporate mission or an individual life…” and result in “...the ‘overpursuit’ of goals…” Case in point, yellow jerseys.

This raises some interesting points for nuclear safety. First we would remind our readers of Snowden’s Cynefin decision context framework, specifically his “complex” space which is indicative of where nuclear safety decisions reside. In this environment there are many interacting causes and effects, making it difficult or impossible to pursue specific goals along defined paths. Clearly an uncertain landscape. As Simon French argues: “Decision support will be more focused on exploring judgement and issues, and on developing broad strategies that are flexible enough to accommodate changes as the situation evolves.”** This would suggest the pursuit of specific, aspirational goals may be misguided or counterproductive.

Second, safety performance goals are hard to identify anyway. Is it the absence of bad outcomes? Or the maintenance of, say, a “strong” safety culture - whatever that is. One indication of the elusiveness of safety goals is their absence as targets in incentive programs. So there is probably little likelihood of overemphasizing safety performance as a goal. But is the same true for operational type goals such as capacity factor, refuel outage durations, and production costs? Can an overly strong focus on such short term goals, often associated with stretching performance, lead to overpursuit? What if large financial incentives are attached to the achievement of the goals?

The answer is not: “Safety is our highest priority”. More likely it is an approach that considers the complexity and uncertainty of nuclear operating space and the potential for hard goals to cut both ways. It might value how a management team prosecutes its responsibilities more than the outcome itself.


* O. Burkeman, “The Power of Negative Thinking,” Wall Street Journal online (Dec. 7, 2012).

** S. French, “Cynefin: repeatability, science and values,” Newsletter of the European Working Group “Multiple Criteria Decision Aiding,” series 3, no. 17 (Spring 2008) p. 2. We posted on Cynefin and French's paper here.

Wednesday, December 5, 2012

Drift Into Failure by Sidney Dekker

Sidney Dekker's Drift Into Failure* is a noteworthy effort to provide new insights into how accidents and other bad outcomes occur in large organizations. He begins by describing two competing world views, the essentially mechanical view of the world spawned by Newton and Descartes (among others), and a view based on complexity in socio-technical organizations and a systems approach. He shows how each world view biases the search for the “truth” behind how accidents and incidents occur.

Newtonian-Cartesian (N-C) Vision

Isaac Newton and René Descartes were leading thinkers during the dawn of the Age of Reason. Newton used the language of mathematics to describe the world while Descartes relied on the inner process of reason. Both believed there was a single reality that could be investigated, understood and explained through careful analysis and thought; complete knowledge was possible if investigators looked long and hard enough. The assumptions and rules that started with them, and were extended by others over time, have been passed on and most of us accept them, uncritically, as common sense, the most effective way to look at the world.

The N-C world is ruled by invariant cause-and-effect; it is, in fact, a machine. If something bad happens, then there was a unique cause or set of causes. Investigators search for these broken components, which could be physical or human. It is assumed that a clear line exists between the broken part(s) and the overall behavior of the system. The explicit assumption of determinism leads to an implicit assumption of time reversibility—because system performance can be predicted from time A if we know the starting conditions and the functional relationships of all components, then we can start from a later time B (the bad outcome) and work back to the true causes. (p. 84) Root cause analysis and criminal investigations are steeped in this world view.

In this view, decision makers are expected to be rational people who “make decisions by systematically and consciously weighing all possible outcomes along all relevant criteria.” (p. 3) Bad outcomes are caused by incompetent or, worse, corrupt decision makers. Fixes include more communications, training, procedures, supervision, exhortations to try harder and criminal charges.

Dekker credits Newton et al for giving man the wherewithal to probe Nature's secrets and build amazing machines. However, Newtonian-Cartesian vision is not the only way to view the world, especially the world of complex, socio-technical systems. For that a new model, with different concepts and operating principles, is required.

The Complex System

Characteristics

The sheer number of parts does not make a system complex, only complicated. A truly complex system is open (it interacts with its environment), has components that act locally and don't know the full effects of their actions, is constantly making decisions to maintain performance and adapt to changing circumstances, and has non-linear interactions (small events can cause large results) because of multipliers and feedback loops. Complexity is a result of the ever-changing relationships between components. (pp. 138-144)

Adding to the myriad information confronting a manager or observer, system performance is often optimized at the edge of chaos, where competitors are perpetually vying for relative advantage at an affordable cost.** The system is constantly balancing its efforts between exploration (which will definitely incur costs but may lead to new advantages) and exploitation (which reaps benefits of current advantages but will likely dissipate over time). (pp. 164-165)

The most important feature of a complex system is that it adapts to its environment over time in order to survive. And its environment is characterized by resource scarcity and competition. There is continuous pressure to maintain production and increase efficiency (and their visible artifacts: output, costs, profits, market share, etc.) and less visible outputs, e.g., safety, will receive less attention. After all, “Though safety is a (stated) priority, operational systems do not exist to be safe. They exist to provide a service or product . . . .” (p. 99) And the cumulative effect of multiple adaptive decisions can be an erosion of safety margins and a changed response of the entire system. Such responses may be beneficial or harmful; the harmful case is a drift into failure.

Drift by a complex system exhibits several characteristics. First, as mentioned above, it is driven by environmental factors. Second, drift occurs in small steps so changes can hardly be noticed, and may even be applauded if they result in local performance improvement; “. . . successful outcomes keep giving the impression that risk is under control” (p. 106) as a series of small decisions whittles away at safety margins. Third, these complex systems contain unruly technology (think deepwater drilling) where uncertainties exist about how the technology may be ultimately deployed and how it may fail. Fourth, there is significant interaction with a key environmental player, the regulator, and regulatory capture can occur, resulting in toothless oversight.
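The arithmetic behind "small steps" is worth a quick look. The following is a minimal Python sketch of our own, not a model from Dekker's book; the `drift` function, the starting margin of 100 units and the 2% trim per decision are hypothetical values picked only to show how individually minor reductions accumulate.

```python
def drift(margin, trim_fraction, decisions):
    """Return the margin after each of a series of small, similar trims."""
    history = [margin]
    for _ in range(decisions):
        # Each decision gives up a small fraction of whatever margin remains.
        margin -= margin * trim_fraction
        history.append(margin)
    return history


# Hypothetical case: 50 decisions, each trimming 2% of the remaining margin.
history = drift(margin=100.0, trim_fraction=0.02, decisions=50)

# The biggest single reduction versus the total lost over all decisions.
largest_single_step = max(a - b for a, b in zip(history, history[1:]))
total_erosion = history[0] - history[-1]

print(f"largest single step: {largest_single_step:.1f}")
print(f"total erosion:       {total_erosion:.1f}")
```

No single step in this toy case removes more than 2 units, so none of them looks alarming on its own, yet well over half of the original margin is gone by the end. Real organizations do not trim margin at a fixed rate, of course; the point is only that judging each step in isolation hides the cumulative effect.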

“Drifting into failure is not so much about breakdowns or malfunctioning of components, as it is about an organization not adapting effectively to cope with the complexity of its own structure and environment.” (p. 121) Drift and occasionally accidents occur because of ordinary system functioning, normal people going about their regular activities making ordinary decisions “against a background of uncertain technology and imperfect information.” Accidents, like safety, can be viewed as an emergent system property, i.e., they are the result of system relationships but cannot be predicted by examining any particular system component.

Managers' roles

Managers should not try to transform complex organizations into merely complicated ones, even if it's possible. Complexity is necessary for long-term survival as it maximizes organizational adaptability. The question is how to manage in a complex system. One key is increasing the diversity of personnel in the organization. More diversity means less group think and more creativity and greater capacity for adaptation. In practice, this means validation of minority opinions and encouragement of dissent, reflecting on the small decisions as they are made, stopping to ponder why some technical feature or process is not working exactly as expected and creating slack to reduce the chances of small events snowballing into large failures. With proper guidance, organizations can drift their way to success.

Accountability

Amoral and criminal behavior certainly exist in large organizations but bad outcomes can also result from normal system functioning. That's why the search for culprits (bad actors or broken parts) may not always be appropriate or adequate. This is a point Dekker has explored before, in Just Culture (briefly reviewed here) where he suggests using accountability as a means to understand the system-based contributors to failure and resolve those contributors in a manner that will avoid recurrence.

Application to Nuclear Safety Culture

A commercial nuclear power plant or fleet is probably not a complete complex system. It interacts with environmental factors but in limited ways; it's certainly not directly exposed to the Wild West competition of, say, the cell phone industry. Group think and normalization of deviance*** are constant threats. The technology is reasonably well-understood but changes, e.g., uprates based on more software-intensive instrumentation and control, may be invisibly sanding away safety margin. Both the industry and the regulator would deny regulatory capture has occurred but an outside observer may think the relationship is a little too cozy. Overall, the fit is sufficiently good that students of safety culture should pay close attention to Dekker's observations.

In contrast, the Hanford Waste Treatment Plant (Vit Plant) is almost certainly a complex system and this book should be required reading for all managers in that program.

Conclusion

Drift Into Failure is not a quick read. Dekker spends a lot of time developing his theory, then circling back to further explain it or emphasize individual pieces. He reviews incidents (airplane crashes, a medical error resulting in patient death, software problems, public water supply contamination) and descriptions of organization evolution (NASA, international drug smuggling, “conflict minerals” in Africa, drilling for oil, terrorist tactics, Enron) to illustrate how his approach results in broader and arguably more meaningful insights than the reports of official investigations. Standing on the shoulders of others, especially Diane Vaughan, Dekker gives us a rich model for what might be called the “banality of normalization of deviance.” 


* S. Dekker, Drift Into Failure: From Hunting Broken Components to Understanding Complex Systems (Burlington, VT: Ashgate, 2011).

** See our Sept. 4, 2012 post on Cynefin for another description of how the decisions an organization faces can suddenly slip from the Simple space to the Chaotic space.

*** We have posted many times about normalization of deviance, the corrosive organizational process by which yesterday's “unacceptable” becomes today's “good enough.”