Friday, March 14, 2014

Deficient Safety Culture at Metro-North Railroad

A new Federal Railroad Administration (FRA) report* excoriates the safety performance of the Metro-North Commuter Railroad, which serves New York, Connecticut and New Jersey.  The report highlights problems in the Metro-North safety culture (SC), calling it “poor”, “deficient” and “weak”.  Metro-North’s fundamental problem, which we have seen elsewhere, is putting production ahead of safety.  The report’s conclusion concisely describes the problem: “The findings of Operation Deep Dive demonstrate that Metro-North has emphasized on-time performance to the detriment of safe operations and adequate maintenance of its infrastructure. This led to a deficient safety culture that has manifested itself in increased risk and reduced safety on Metro-North.” (p. 4)

The proposed fixes are likewise familiar: “. . . senior leadership must prioritize safety above all else, and communicate and implement that priority throughout Metro-North. . . . submit to FRA a plan to improve the Safety Department’s mission and effectiveness. . . . [and] submit to FRA a plan to improve the training program.” (p. 4)**

Our Perspective 


This report is typical.  It’s not bad, but it’s incomplete and a bit misguided.

The directive for senior management to establish safety as the highest priority and implement that priority is good but incomplete.  There is no discussion of how safety is or should be appropriately considered in decision-making throughout the agency, from its day-to-day operations to strategic considerations.  More importantly, Metro-North’s recognition, reward and compensation practices (keys to shaping behavior at all organizational levels) are not even mentioned.

The Safety Department discussion is also incomplete and may lead to incorrect inferences.  The report says “Currently, no single department or office, including the Safety Department, proactively advocates for safety, and there is no effort to look for, identify, or take ownership of safety issues across the operating departments. An effective Safety Department working in close communication and collaboration with both management and employees is critical to building and maintaining a good safety culture on any railroad.” (p. 13)  A competent Safety Department is certainly necessary to create a hub for safety-related problems but is not sufficient.  In a strong SC, the “effort to look for, identify, or take ownership of safety issues” is everyone’s responsibility.  In addition, the authors don’t appear to appreciate that SC is part of a loop—the deficiencies described in the report certainly influence SC, but SC provides the context for the decision-making that currently prioritizes on-time performance over safety.

Metro-North training is fragmented across many departments and the associated records system is problematic.  The proposed fix focuses on better organization of the training effort.  There is no mention of the need for training content to address safety or SC.

Not included in the report (but likely related to it) is that Metro-North’s president retired last January.  His replacement says Metro-North is implementing “aggressive actions to affirm that safety is the most important factor in railroad operations.”***

We have often griped about SC assessments where the recommended corrective actions are limited to more training, closer oversight and selective punishment.  How did the FRA do?   


*  Federal Railroad Administration, “Operation Deep Dive Metro-North Commuter Railroad Safety Assessment” (Mar. 2014).  Retrieved Mar. 14, 2014.  The FRA is an agency in the U.S. Department of Transportation.

**  The report also includes a laundry list of negative findings and required/recommended corrective actions in several specific areas.

***  M. Flegenheimer, “Report Finds Punctuality Trumps Safety at Metro-North,” New York Times (Mar. 14, 2014).  Retrieved Mar. 14, 2014.

Thursday, March 13, 2014

Eliminate the Bad Before Attempting the Good

An article* in the McKinsey Quarterly suggests executives work at rooting out destructive behaviors before attempting to institute best practices.  The reason is simple: “research has found that negative interactions with bosses and coworkers [emphasis added] have five times more impact than positive ones.” (p. 81)  In other words, a relatively small amount of bad behavior can keep good behavior, i.e., improvements, from taking root.**  The authors describe methods for removing bad behavior and warning signs that such behavior exists.  This post focuses on their observations that might be useful for nuclear managers and their organizations.

Methods

Nip Bad Behavior in the Bud — Bosses and coworkers should establish zero tolerance for bad behavior, but feedback or criticism should be delivered while treating the target employee with respect.  This is not about creating a climate of fear; it’s about seeing and responding to a “broken window” before others are also broken.  We spoke a bit about the broken window theory here.

Put Mundane Improvements Before Inspirational Ones/Seek Adequacy Before Excellence — Start off with one or more meaningful objectives that the organization can achieve in the short term without transforming itself.  Recognize and reward positive behavior, then build on successes to introduce new values and strategies.  Because people are more than twice as likely to complain about bad customer service as to mention good customer service, management intervention should initially aim at getting the service level high enough to staunch complaints, then work on delighting customers.

Use Well-Respected Staff to Squelch Bad Behavior — Identify the real (as opposed to nominal or official) group leaders and opinion shapers, teach them what bad looks like and recruit them to model good behavior.  Sounds like co-opting (a legitimate management tool) to me.

Warning Signs

Fear of Responsibility — This can be exhibited by employees doing nothing rather than doing the right thing, or by their pervasive silence.  It is related to bystander behavior, which we posted on here.

Feelings of Injustice or Helplessness — Employees who believe they are getting a raw deal from their boss or employer may act out, in a bad way.  Employees who believe they cannot change anything may shirk responsibility.

Feelings of Anonymity — This basically means employees will do what they want because no one is watching.  This could lead to big problems in nuclear plants because they depend heavily on self-management and self-reporting of problems at all organizational levels.  Most of the time things work well but incidents, e.g., falsification of inspection reports or test results, do occur.

Our Perspective

The McKinsey Quarterly is a forum for McKinsey people and academics whose work has some practical application.  This article is not rocket science but sometimes a simple approach can help us appreciate basic lessons.  The key takeaway is that an overconfident new manager can sometimes reach too far, and end up accomplishing very little.  The thoughtful manager spends some time figuring out what’s wrong (the “bad” behavior) and develops a strategy for eliminating it, rather than simply paving over it with a “get better” program that ignores underlying, systemic issues.  Better to hit a few singles and get the bad juju out of the locker room before swinging for the fences.


*  H. Rao and R.I. Sutton, “Bad to great: The path to scaling up excellence,” McKinsey Quarterly, no. 1 (Feb. 2014), pp. 81-91.  Retrieved Mar. 13, 2014.

**  Even Machiavelli recognized the disproportionate impact of negative interactions.  “For injuries should be done all together so that being less tasted they will give less offence.  Benefits should be granted little by little so that they may be better enjoyed.”  The Prince, ch. VIII.

Tuesday, March 4, 2014

Declining Safety Culture at the Waste Isolation Pilot Plant?

DOE WIPP
Here’s another nuclear-related facility you may or may not know about: The Department of Energy’s (DOE) Waste Isolation Pilot Plant (WIPP) located near Carlsbad, NM.  WIPP’s mission is to safely dispose of defense-related transuranic radioactive waste.  “Transuranic” refers to man-made elements that are heavier than uranium; in DOE’s waste the most prominent of these elements is plutonium, but the waste also includes others, e.g., americium.*

Recently there have been two incidents at WIPP.  On Feb. 5, 2014 a truck hauling salt underground caught fire.  There was no radiation exposure associated with this incident.  But on Feb. 14, 2014 a radiation alert activated in the area where newly arrived waste was being stored.  Preliminary tests showed thirteen workers suffered some radiation exposure.


It will come as no surprise to folks associated with nuclear power plants that WIPP opponents have amped up after these incidents.  For our purposes, the most interesting quote comes from Don Hancock of the Southwest Research and Information Center: “I’d say the push for expansion is part of the declining safety culture that has resulted in the fire and the radiation release.”  Not surprisingly, WIPP management disputes that view.**


Our Perspective


So, are these incidents an early signal of a nascent safety culture (SC) problem?  After all, SC issues are hardly unknown at DOE facilities.  Or is the SC claim simply the musing of an opportunistic anti?  Who knows.  At this point, there is insufficient information available to say anything about WIPP’s SC.  However, we’ll keep an eye on this situation.  A bellwether event would be if the Defense Nuclear Facilities Safety Board decides to get involved.



*  See the WIPP and Environmental Protection Agency (EPA) websites for project information.  If the WIPP site is judged suitable, the underground storage area is expected to expand to 100 acres.

The EPA and the New Mexico Environmental Department have regulatory authority over WIPP.  The NRC has regulatory authority over the containers used to ship waste.  See National Research Council, “Improving the Characterization Program for Contact-Handled Transuranic Waste Bound for the Waste Isolation Pilot Plant” (Washington, DC: The National Academies Press, 2004), p. 27.


**  J. Clausing, “Nuclear dump leak raises questions about cleanup,” Las Vegas Review-Journal (Mar. 1, 2014).  Retrieved Mar. 3, 2014.

Wednesday, February 12, 2014

Left Brain, Right Stuff: How Leaders Make Winning Decisions by Phil Rosenzweig

In this new book* Rosenzweig extends the work of Kahneman and other scholars to consider real-world decisions.  He examines how the content and context of such decisions are significantly different from controlled experiments in a decision lab.  Note that Rosenzweig’s advice is generally aimed at senior executives, who typically have greater latitude in making decisions and greater responsibility for achieving results than lower-level professionals, but all managers can benefit from his insights.  This review summarizes the book and explores its lessons for nuclear operations and safety culture.

Real-World Decisions

Decision situations in the real world can be more “complex, consequential and laden with uncertainty” than those described in laboratory experiments. (p. 6)  A combination of rigorous analysis (left brain) and ambition (the right stuff—high confidence and a willingness to take significant risks) is necessary to achieve success. (pp. 16-18)  The executive needs to identify the important characteristics of the decision he is facing.  Specifically,

Can the outcome following the decision be influenced or controlled?

Some real-world decisions cannot be controlled, e.g., the price of Apple stock after you buy 100 shares.  In those situations the traditional advice to decision makers, viz., be rational, detached, analyze the evidence and watch out for biases, is appropriate. (p. 32)

But for many decisions, the executive (or his team) can influence outcomes through high (but not excessive) confidence, positive illusions, calculated risks and direct action.  The knowledgeable executive understands that individuals perceived as good executives exhibit a bias for action and “The essence of management is to exercise control and influence events.” (p. 39)  Therefore, “As a rule of thumb, it's better to err on the side of thinking we can get things done rather than assuming we cannot.  The upside is greater and the downside less.” (p. 43)

Think about your senior managers.  Do they under- or over-estimate their ability to influence future performance through their decisions?

Is the performance based on the decision(s) absolute or relative?

Absolute performance is described using some system of measurement, e.g., how many free throws you make in ten attempts or your batting average over a season.  It is not related to what anyone else does. 

But in competition, performance is relative to rivals.  Ten percent growth may not be sufficient if a rival grows fifty percent.**  In addition, payoffs for performance may be highly skewed: in the Olympics, there are three medals and the others get nothing; in many industries, the top two or three companies make money while the others struggle to survive; in the most extreme case, it’s winner-take-all and everyone else gets nothing.  It is essential to take risks to succeed in highly skewed competitive situations.

Absolute and relative performance may be connected.  In some cases, “a small improvement in absolute performance can make an outsize difference in relative performance, . . .” (p. 66)  For example, if a well-performing nuclear plant can pick up a couple percentage points of annual capacity factor (CF), it can make a visible move up the CF rankings thus securing bragging rights (and possibly bonuses) for its senior managers.
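To make the arithmetic concrete, here is a minimal sketch with hypothetical numbers (none of the plants or capacity factors below are real): in a tightly bunched peer group, a two-point absolute gain moves a plant several places in the relative ranking.

# Hypothetical capacity factors (percent) for a peer group of plants.
# All names and numbers are invented for illustration only.
peers = {"Plant A": 92.1, "Plant B": 91.4, "Plant C": 90.8, "Plant D": 90.2,
         "Plant E": 89.9, "Plant F": 89.6, "Our Plant": 89.3, "Plant G": 88.7}

def rank(plants, name):
    """1-based position of 'name' when plants are sorted by capacity factor."""
    ordered = sorted(plants, key=plants.get, reverse=True)
    return ordered.index(name) + 1

print(rank(peers, "Our Plant"))   # 7th of 8
peers["Our Plant"] += 2.0         # a two-point absolute improvement
print(rank(peers, "Our Plant"))   # now 3rd of 8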

For a larger example, remember when the electricity markets were deregulated and many utilities rushed to buy or build merchant plants?  Note how many have crawled back under the blanket of regulation, where they only have to demonstrate prudence (a type of absolute performance) to collect their guaranteed returns rather than compete with other sellers.  In addition, there is very little skew in the regulated performance curve; even mediocre plants earn enough to carry on their business.  Lack of direct competition also encourages sharing information, e.g., operating experience in the nuclear industry.  If competition is intense, sharing information is irresponsible and possibly dangerous to one’s competitive position. (p. 61)

Do your senior managers compare their performance to some absolute scale, to other members of your fleet (if you’re in one), to similar plants, to all plants, or to the company’s management compensation plan?

Will the decision be repeated with rapid feedback, or is it a one-off whose results will take a long time to appear?


Repetitive decisions, e.g., putting at golf, can benefit from deliberate practice, where performance feedback is used to adjust future decisions (action, feedback, adjustment, action).  This is related to the extensive training in the nuclear industry and the familiar do, check and adjust cycle ingrained in all nuclear workers.

However, most strategic decisions are unique or have consequences that will only manifest in the long term.  In such cases, one has to make the soundest decision possible and then take the best shot.

Executives Make Decisions in a Social Setting

Senior managers depend on others to implement decisions and achieve results.  Leadership (exaggerated confidence, emphasizing certain data and beliefs over others, consistency, fairness and trust) is indispensable to inspire subordinates and shape culture.  Quoting Jack Welch, “As a leader, your job is to steer and inspire.” (p. 146)  “Effective leadership . . . means being sincere to a higher purpose and may call for something less than complete transparency.” (p. 158)

How about your senior managers?  Do they tell the whole truth when they are trying to motivate the organization to achieve performance goals?  If not, how does that impact trust over the long term?  
    
The Role of Confidence and Overconfidence

There is a good discussion of the overuse of the term “overconfidence,” which has multiple meanings but whose meaning in a specific application is often undefined.  For example, overconfidence can refer to being too certain that our judgment is correct, believing we can perform better than warranted by the facts (absolute performance) or believing we can outperform others (relative performance). 

Rosenzweig conducted some internet research on overconfidence.  The most common use in the business press was to explain, after the fact, why something had gone wrong. (p. 85)  “When we charge people with overconfidence, we suggest that they contributed to their own demise.” (p. 87)  This sounds similar to the search for the “bad apple” after an incident occurs at a nuclear plant.

But confidence is required to achieve high performance.  “What's the best level of confidence?  An amount that inspires us to do our best, but not so much that we become complacent, or take success for granted, or otherwise neglect what it takes to achieve high performance.” (p. 95)

Other Useful Nuggets

There is a good extension of the discussion (introduced in Kahneman) of base rates and conditional probabilities including the full calculations from two of the conditional probability examples in Kahneman's Thinking, Fast and Slow (reviewed here).
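For readers who want to see what such a calculation looks like, here is a minimal sketch of Bayes' rule applied to the well-known cab problem Kahneman discusses (a witness who is 80% reliable identifies a hit-and-run cab as Blue, but only 15% of the city's cabs are Blue); the base rate pulls the answer far below the witness's stated reliability.

# Base rates and conditional probability: the cab problem.
p_blue, p_green = 0.15, 0.85           # base rates for the two cab companies
p_say_blue_if_blue = 0.80              # witness correctly identifies a Blue cab
p_say_blue_if_green = 0.20             # witness mistakenly calls a Green cab Blue

# Bayes' rule: P(cab is Blue | witness says Blue)
numerator = p_blue * p_say_blue_if_blue
posterior = numerator / (numerator + p_green * p_say_blue_if_green)
print(round(posterior, 2))             # 0.41, far lower than the intuitive 0.80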

The discussion on decision models notes that such models can be useful for overcoming common biases, analyzing large amounts of data and predicting elements of the future beyond our influence.  However, if we have direct influence, “Our task isn't to predict what will happen, but to make it happen.” (p. 189)

Other chapters cover decision making in a major corporate acquisition (focusing on bidding strategy) and in start-up businesses (focusing on a series of start-up decisions).

Our Perspective

Rosenzweig acknowledges that he is standing on the shoulders of Kahneman and other students of decision making.  But “An awareness of common errors and cognitive biases is only a start.” (p. 248)  The executive must consider the additional decision dimensions discussed above to properly frame his decision; in other words, he has to decide what he's deciding.

The direct applicability to nuclear safety culture may seem slight but we believe executives' values and beliefs, as expressed in the decisions they make over time, provide a powerful force on the shape and evolution of culture.  In other words, we choose to emphasize the transactional nature of leadership.  In contrast, Rosenzweig emphasizes its transformational nature: “At its core, however, leadership is not a series of discrete decisions, but calls for working through other people over long stretches of time.” (p. 164)  Effective leaders are good at both.

Of course, decision making and influence on culture is not the exclusive province of senior managers.  Think about your organization's middle managers—the department heads, program and project managers, and process owners.  How do they gauge their performance?  How open are they to new ideas and approaches?  How much confidence do they exhibit with respect to their own capabilities and the capabilities of those they influence? 

Bottom line, this is a useful book.  It's very readable, with many clear and engaging examples,  and has the scent of academic rigor and insight; I would not be surprised if it achieves commercial success.


*  P. Rosenzweig, Left Brain, Right Stuff: How Leaders Make Winning Decisions (New York: Public Affairs, 2014).

**  Referring to Lewis Carroll's Through the Looking Glass, this situation is sometimes called “Red Queen competition [which] means that a company can run faster but fall further behind at the same time.” (p. 57)

Tuesday, January 21, 2014

Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents by Blandford and May

This publication appeared on a nuclear safety online discussion board.*  It is a high-level review of significant commercial nuclear industry incidents and the subsequent development and implementation of related lessons learned.  This post summarizes and evaluates the document then focuses on its treatment of nuclear safety culture (SC). 

The authors cover Three Mile Island (1979), Chernobyl (1986), Le Blayais [France] plant flooding (1999), Davis-Besse (2002), U.S. Northeast Blackout (2003) and Fukushima-Daiichi (2011).  There is a summary of each incident followed by the major lessons learned, usually gleaned from official reports on the incident. 

Some lessons learned led to significant changes in the nuclear industry; others were incompletely implemented or simply ignored.  In the first category, the creation of INPO (Institute of Nuclear Power Operations) after TMI was a major change.**  On the other hand, lessons learned from Chernobyl were incompletely implemented, e.g., WANO (World Association of Nuclear Operators, a putative “global INPO”) was created but it has no real authority over operators.  Fukushima lessons learned have focused on design, communication, accident response and regulatory deficiencies; implementation of any changes remains a work in progress.

The authors echo some concerns we have raised elsewhere on this blog.  For example, they note “the likelihood of a rare external event at some site at some time over the lifetime of a reactor is relatively high.” (p. 16)  And “the industry should look at a much higher probability of problems than is implied in the ‘once in a thousand years’ viewpoint.” (p. 26)  Such cautions are consistent with Taleb's and Dédale's warnings that we have discussed here and here.

The authors also say “Lessons can also be learned from successes.” (p. 3)  We agree.  That's why our recommendation that managers conduct periodic in-depth analyses of plant decisions includes decisions that had good outcomes, in addition to those with poor outcomes.

Arguably the most interesting item in the report is a table that shows deaths attributable to different types of electricity generation.  Death rates range from 161 (per TWh) for coal to 0.04 for nuclear.  Data comes from multiple sources and we made no effort to verify the analysis.***

On Safety Culture

The authors say “. . . a culture of safety must be adopted by all operating entities. For this to occur, the tangible benefits of a safety culture must become clear to operators.” (p. 2, repeated on p. 25)  And “The nuclear power industry has from the start been aware of the need for a strong and continued emphasis on the safety culture, . . .” (p. 24)  That's it for the direct mention of SC.

Such treatment is inexcusably short shrift for SC.  There were obvious, major SC issues at many of the plants the authors discuss.  At Chernobyl, the culture permitted, among other things, testing that violated the station's own safety procedures.  At Davis-Besse, the culture prioritized production over safety—a fact the authors note without acknowledging its SC significance.  The combination of TEPCO's management culture which simply ignored inconvenient facts and their regulator's “see no evil” culture helped turn a significant plant event at Fukushima into an abject disaster.

Our Perspective


It's not clear who the intended audience is for this document.  It was written by two professors under the aegis of the American Academy of Arts and Sciences, an organization that, among other things, “provides authoritative and nonpartisan policy advice to decision-makers in government, academia, and the private sector.”****  While it is a nice little history paper, I can't see it moving the dial in any public policy discussion.  The scholarship in this article is minimal; it presents scant analysis and no new insights.  Its international public policy suggestions are shallow and do not adequately recognize disparate, even oppositional, national interests.  Perhaps you could give it to non-nuclear folks who express interest in the unfavorable events that have occurred in the nuclear industry. 


*  E.D. Blandford and M.M. May, “Lessons Learned from “Lessons Learned”: The Evolution of Nuclear Power Safety after Accidents and Near-Accidents” (Cambridge, MA: American Academy of Arts and Sciences, 2012).  Thanks to Madalina Tronea for publicizing this article on the LinkedIn Nuclear Safety group discussion board.  Dr. Tronea is the group's founder/moderator.

**  This publication is a valentine for INPO and, to a lesser extent, the U.S. nuclear navy.  INPO is hailed as “extraordinarily effective” (p. 12) and “a well-balanced combination of transparency and privacy; . . .” (p. 25)

***  It is the only content that demonstrates original analysis by the authors.

****  American Academy of Arts and Sciences website (retrieved Jan. 20, 2014).

Thursday, January 9, 2014

Safety Culture Training Labs

Not a SC Training Lab
This post highlights a paper* Carlo Rusconi presented at the American Nuclear Society meeting last November.  He proposes the use of “training labs” to develop improved safety culture (SC) through the use of team-building exercises, e.g., role play, and table-top simulations.  Team building increases (a) participants’ awareness of group dynamics, e.g., feedback loops, and how a group develops shared beliefs and (b) sensitivity to the viewpoints of others, viewpoints that may differ greatly based on individual experience and expectations.  The simulations pose evolving scenarios that participants must analyze and for which they must develop a team approach.  A key rationale for this type of training is “team interactions, if properly developed and trained, have the capacity to counter-balance individual errors.” (p. 2155)

Rusconi's recognition of goal conflict in organizations, the weakness of traditional methods (e.g., PRA) for anticipating human reactions to emergent issues, the need to recognize different perspectives on the same problem and the value of simulation in training are all familiar themes here at Safetymatters.

Our Perspective

Rusconi's work also reminds us how seldom new approaches for addressing SC concepts, issues, training and management appear in the nuclear industry.  Per Rusconi, “One of the most common causes of incidents and accidents in the industrial sector is the presence of hidden or clear conflicts in the organization. These conflicts can be horizontal, in departments or in working teams, or vertical, between managers and workers.” (p. 2156)  However, we see scant evidence of the willingness of the nuclear industry to acknowledge and address the influence of goal conflicts.

Rusconi focuses on training to help recognize and overcome conflicts.  This is good, but one needs to clearly identify how training would accomplish this and what its limitations are.  For example, if promotion is impacted by raising safety issues or advocating conservative responses, is training going to be an effective remedy?  The truth is there are some conflicts which are implicit (but very real) and hard to mitigate.  Such conflicts can arise from corporate goals, resource allocation policies and performance-based executive compensation schemes.  Some of these conflicts originate high in the organization and are not really amenable to training per se.

Both Rusconi's approach and our NuclearSafetySim tool attempt to stimulate discussion of conflicts and develop rules for resolving them.  Creating a measurable framework tied to the actual decisions made by the organization is critical to dealing with conflicts.  Part of this is creating measures for how well decisions embody SC, as done in NuclearSafetySim.

Perhaps this means the only real answer for high risk industries is to have agreement on standards for safety decisions.  This doesn’t mean some highly regimented PRA-type approach.  It is more of a peer type process incorporating scales for safety significance, decision quality, etc.  This should be the focus of the site safety review committees and third-party review teams.  And the process should look at samples of all decisions, not just those that result in a problem and wind up in the corrective action program (CAP).
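To illustrate the kind of screening such a peer process might do, here is a minimal sketch (the scales, ratings and flagging rule below are invented for illustration; they are not NuclearSafetySim's actual measures): each sampled decision gets simple ratings for safety significance and decision quality, and high-significance decisions with weak quality ratings are referred for committee review.

# Hypothetical decision screening on two 1-5 scales: safety significance (how much
# is at stake) and decision quality (how well the decision embodied safety culture).
# Dimensions, ratings and the flagging rule are invented for illustration.

def needs_review(safety_significance, decision_quality):
    """Flag high-significance decisions whose quality rating is weak."""
    return safety_significance >= 4 and decision_quality <= 3

sampled_decisions = [
    {"id": "D-101", "safety_significance": 5, "decision_quality": 2},
    {"id": "D-102", "safety_significance": 2, "decision_quality": 4},
    {"id": "D-103", "safety_significance": 4, "decision_quality": 3},
]

for d in sampled_decisions:
    if needs_review(d["safety_significance"], d["decision_quality"]):
        print(d["id"], "-> refer to safety review committee")   # flags D-101 and D-103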

Nuclear managers would probably be very reluctant to embrace this much transparency.  A benign view is they are simply too comfortable believing that the "right" people will do the "right" thing.  A less charitable view is their lack of interest in recognizing goal conflicts and other systemic issues is a way to effectively deny such issues exist.

Instead of interest in bigger-picture “Why?” questions we see continued introspective efforts to refine existing methods, e.g., cause analysis.  At its best, cause analysis and any resultant interventions can prevent the same problem from recurring.  At its worst, cause analysis looks for a bad component to redesign or a “bad apple” to blame, train, oversee and/or discipline.

We hate to start the new year wearing our cranky pants but Dr. Rusconi, ourselves and a cadre of other SC analysts are all advocating some of the same things.  Where is any industry support, dialogue, or interaction?  Are these ideas not robust?  Are there better alternatives?  It is difficult to understand the lack of engagement on big-picture questions by the industry and the regulator.


*  C. Rusconi, “Training labs: a way for improving Safety Culture,” Transactions of the American Nuclear Society, Vol. 109, Washington, D.C., Nov. 10–14, 2013, pp. 2155-57.  This paper reflects a continuation of Dr. Rusconi's earlier work which we posted on last June 26, 2013.

Wednesday, December 18, 2013

Thinking, Fast and Slow by Daniel Kahneman

Kahneman is a Nobel Prize winner in economics.  His focus is on personal decision making, especially the biases and heuristics used by the unconscious mind as it forms intuitive opinions.  Biases lead to regular (systematic) errors in decision making.  Kahneman and Amos Tversky developed prospect theory, a model of choice, that helps explain why real people make decisions that are different from those of the rational man of economics.

Kahneman is a psychologist so his work focuses on the individual; many of his observations are not immediately linkable to safety culture (a group characteristic).  But even in a nominal group setting, individuals are often very important.  Think about the lawyers, inspectors, consultants and corporate types who show up after a plant incident.  What kind of biases do they bring to the table when they are evaluating your organization's performance leading up to the incident?

The book* has five parts, described below.  Kahneman reports on his own research and then adds the work of many other scholars.  Many of the experiments appear quite simple but provide insights into unconscious and conscious decision making.  There is a lot of content so this is a high level summary, punctuated by explicative or simply humorous quotes.

Part 1 describes two methods we use to make decisions: System 1 and System 2.  System 1 is impulsive, intuitive, fast and often unconscious; System 2 is more analytic, cautious, slow and controlled. (p. 48)  We often defer to System 1 because of its ease of use; we simply don't have the time, energy or desire to pore over every decision facing us.  Lack of desire is another term for lazy.

System 1 often operates below consciousness, utilizing associative memory to link a current stimulus to ideas or concepts stored in memory. (p. 51)  System 1's impressions become beliefs when accepted by System 2 and a mental model of the world takes shape.  System 1 forms impressions of familiarity and rapid, precise intuitions then passes them on to System 2 to accept/reject. (pp. 58-62)

System 2 activities take effort and require attention, which is a finite resource.  If we exceed the attention budget or become distracted then System 2 will fail to obtain correct answers.  System 2 is also responsible for self-control of thoughts and behaviors, another drain on mental resources. (pp. 41-42)

Biases include a readiness to infer causality, even where none exists; a willingness to believe and confirm in the absence of solid evidence; succumbing to the halo effect where we project a coherent whole based on an initial impression; and problems caused by WYSIATI** including basing conclusions on limited evidence, overconfidence, framing effects where decisions differ depending on how information and questions are presented and base-rate neglect where we ignore widely-known data about a decision situation. (pp. 76-88)

Heuristics include substituting easier questions for the more difficult ones that have been asked, letting current mood affect answers on general happiness and allowing emotions to trump facts. (pp. 97-103) 

Part 2 explores decision heuristics in greater detail, with research and examples of how we think associatively, metaphorically and causally.  A major topic throughout this section is the errors people tend to make when handling questions that have a statistical dimension.  Such errors occur because statistics requires us to think of many things at once, which System 1 is not designed to do, and a lazy or busy System 2, which could handle this analysis, is prone to accept System 1's proposed answer.  Other errors occur because:

We make incorrect inferences from small samples and are prone to ascribe causality to chance events.  “We are far too willing to reject the belief that much of what we see in life is random.” (p. 117)  We are prone to attach “a causal interpretation to the inevitable fluctuations of a random process.” (p. 176)  “There is more luck in the outcomes of small samples.” (p. 194)

We fall for the anchoring effect, where we see a particular value for an unknown quantity (e.g., the asking price for a used car) before we develop our own value.  Even random anchors, which provide no relevant information, can influence decision making.

People search for relevant information when asked questions.  Information availability and ease of retrieval is a System 1 heuristic but only System 2 can judge the quality and relevance of retrieved content.  People are more strongly affected by ease of retrieval and go with their intuition when they are, for example, mentally busy or in a good mood. (p. 135)  However, “intuitive predictions tend to be overconfident and overly extreme.” (p. 192)

Unless we know the subject matter well, and have some statistical training, we have difficulty dealing with situations that require statistical reasoning.  One research finding “illustrates a basic limitation in the ability of our mind to deal with small risks: we either ignore them altogether or give them far too much weight—nothing in between.” (p. 143)  “There is one thing you can do when you have doubts about the quality of the evidence: let your judgments of probability stay close to the base rate.” (p. 153)  “. . . whenever the correlation between two scores is imperfect, there will be regression to the mean. . . . [a process that] has an explanation but does not have a cause.” (pp. 181-82)  (A small simulation of regression to the mean appears after this list.)

Finally, though the PC folks may not appreciate this, “neglecting valid stereotypes inevitably results in suboptimal judgments.” (p. 169)
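Here is the small regression-to-the-mean simulation promised above (the parameters are arbitrary): when a score is part stable skill and part luck, the top performers in one period score closer to the average in the next, with no cause beyond imperfect correlation.

# Regression to the mean without any causal story: observed score = stable skill + luck.
import random

random.seed(1)
skill = [random.gauss(0, 1) for _ in range(10000)]
period1 = [s + random.gauss(0, 1) for s in skill]   # observed scores, period 1
period2 = [s + random.gauss(0, 1) for s in skill]   # observed scores, period 2

# Take the top 10% of performers in period 1 and see how the same people did in period 2.
cutoff = sorted(period1, reverse=True)[len(period1) // 10]
top = [i for i, x in enumerate(period1) if x >= cutoff]

avg1 = sum(period1[i] for i in top) / len(top)
avg2 = sum(period2[i] for i in top) / len(top)
print(round(avg1, 2), round(avg2, 2))   # roughly 2.5 in period 1, roughly 1.2 in period 2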

Part 3 focuses on specific shortcomings of our thought processes: overconfidence, fed by the illusory certainty of hindsight, in what we think we know, and underappreciation of the role of chance in events.

“Subjective confidence in a judgment is not a reasoned evaluation of the probability that this judgment is correct.  Confidence is a feeling.” (p. 212)  Hindsight bias “leads observers to assess the quality of a decision not by whether the process was sound but by whether its outcome was good or bad. . . . a clear outcome bias.” (p. 203)  “. . . the optimistic bias may well be the most significant of the cognitive biases.” (p. 255)  “The optimistic style involves taking credit for success but little blame for failure.” (p. 263)

“The sense-making machinery of System 1 makes us see the world as more tidy, predictable, and coherent than it really is.” (p. 204)  “. . . reality emerges from the interactions of many different agents and forces, including blind luck, often producing large and unpredictable results.” (p. 220)  “An unbiased appreciation of uncertainty is a cornerstone of rationality—but it is not what people and organizations want. . . . Acting on pretended knowledge is often the preferred solution.” (p. 263)

And the best quote in the book: “Professional controversies bring out the worst in academics.” (p. 234)

Part 4 contrasts the rational people of economics with the more complex people of psychology, in other words, the Econs vs. the Humans.  Kahneman shows how prospect theory opened a door between the two disciplines and contributed to the start of the field of behavioral economics.

Economists adopted expected utility theory to prescribe how decisions should be made and describe how Econs make choices.  In contrast, prospect theory has three cognitive features: evaluation of choices is relative to a reference point, outcomes above that point are gains, below that point are losses; diminishing sensitivity to changes; and loss aversion, where losses loom larger than gains. (p. 282)  In practice, loss aversion leads to risk-averse choices when both gains and losses are possible, and diminishing sensitivity leads to risk taking when sure losses are compared to a possible larger loss.  “Decision makers tend to prefer the sure thing over the gamble (they are risk averse) when the outcomes are good.  They tend to reject the sure thing and accept the gamble (they are risk seeking) when both outcomes are negative.” (p. 368)

“The fundamental ideas of prospect theory are that reference points exist, and that losses loom larger than corresponding gains.” (p. 297)  “A reference point is sometimes the status quo, but it can also be a goal in the future; not achieving the goal is a loss, exceeding the goal is a gain.” (p. 303)  “Loss aversion is a powerful conservative force.” (p. 305)
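A short numerical sketch makes reference dependence, diminishing sensitivity and loss aversion concrete.  The parametric value function below is the form commonly used in the decision literature (it comes from Tversky and Kahneman's later work, not from this book), with conventional parameter values of roughly 0.88 for diminishing sensitivity and 2.25 for loss aversion.

# A common parametric form of the prospect theory value function.
ALPHA = 0.88    # diminishing sensitivity to larger gains and losses
LAMBDA = 2.25   # loss aversion: losses are weighted more than equal-sized gains

def value(x):
    """Subjective value of a gain (x > 0) or loss (x < 0) relative to the reference point."""
    return x ** ALPHA if x >= 0 else -LAMBDA * ((-x) ** ALPHA)

print(value(100), value(-100))        # a $100 loss looms about 2.25 times larger than a $100 gain
print(value(50) > 0.5 * value(100))   # True: risk averse in gains (a sure $50 beats a 50/50 shot at $100)
print(0.5 * value(-100) > value(-50)) # True: risk seeking in losses (a 50/50 shot at -$100 beats a sure -$50)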

When people do consider very rare events, e.g., a nuclear accident, they will almost certainly overweight the probability in their decision making.  “. . . people are almost completely insensitive to variations of risk among small probabilities.” (p. 316)  “. . . low-probability events are much more heavily weighted when described in terms of relative frequencies (how many) than when stated in more abstract terms of . . . “probability” (how likely).” (p. 329)  The framing of questions evokes emotions, e.g., “losses evokes stronger negative feelings than costs.” (p. 364)  But “[r]eframing is effortful and System 2 is normally lazy.” (p. 367)  As an exercise, think about how anti-nuclear activists and NEI would frame the same question about the probability and consequences of a major nuclear accident.

There are some things an organization can do to improve its decision making.  It can use local centers of over optimism (Sales dept.) and loss aversion (Finance dept.) to offset each other.  In addition, an organization's decision making practices can require the use of an outside view (i.e., a look at the probabilities of similar events in the larger world) and a formal risk policy to mitigate known decision biases. (p. 340)

Part 5 covers two different selves that exist in every human, the experiencing self and the remembering self.  The former lives through an experience and the latter creates a memory of it (for possible later recovery) using specific heuristics.  Our tendency to remember events as a sample or summary of actual experience is a factor that biases current and future decisions.  We end up favoring (fearing) a short period of intense joy (pain) over a long period of moderate happiness (pain). (p. 409) 

Our memory has evolved to represent past events in terms of peak pain/pleasure during the events and our feelings when the event is over.  Event duration does not impact our ultimate memory of an event.  For example, we choose future vacations based on our final evaluations of past vacations even if many of our experiences during the past vacations were poor. (p. 389)

In a possibly more significant area, the life satisfaction score you assign to yourself is based on a small sample of highly available ideas or memories. (p. 400)  Ponder that the next time you take or review responses from a safety culture survey.

Our Perspective

This is an important book.  Although not explicitly stated, the great explanatory themes of cause (mechanical), choice (intentional) and chance (statistical) run through it.  It is filled with nuggets that apply to the individual (psychological) and also the aggregate if the group shares similar beliefs.  Many System 1 characteristics, if unchecked and shared by a group, have cultural implications.*** 

We have discussed Kahneman's work before on this blog, e.g., his view that an organization is a factory for producing decisions and his suggestion to use a “premortem” as a partial antidote for overconfidence.  (A premortem is an exercise the group undertakes before committing to an important decision: Imagine being a year into the future; the decision's outcome is a disaster.  What happened?)  For more on these points, see our Nov. 4, 2011 post.

We have also discussed some of the topics he raises, e.g., the hindsight bias.  Hindsight is 20/20 and it supposedly shows what decision makers could (and should) have known and done instead of their actual decisions that led to an unfavorable outcome, incident, accident or worse.  We now know that when the past was the present, things may not have been so clear-cut.

Kahneman's observation that the ability to control attention predicts on-the-job performance (p. 37) is certainly consistent with our reports on the characteristics of high reliability organizations (HROs). 

“The premise of this book is that it is easier to recognize other people's mistakes than our own.” (p. 28)  Having observers at important, stressful decision making meetings is useful; they are less cognitively involved than the main actors and more likely to see any problems in the answers being proposed.

Critics' major knock on Kahneman's research is that it doesn't reflect real world conditions.  His model is “overly concerned with failures and driven by artificial experiments than by the study of real people doing things that matter.” (p. 235)  He takes this on by collaborating with a critic in an investigation of intuitive decision making, specifically seeking to answer: “When can you trust a self-confident professional who claims to have an intuition?” (p. 239)  The answer is when the expert acquired skill in a predictable environment, and had sufficient practice with immediate, high-quality feedback.  For example, anesthesiologists are in a good position to develop predictive expertise; on the other hand, psychotherapists are not, primarily because a lot of time and external events can pass between their prognosis for a patient and ultimate results.  However, “System 1 takes over in emergencies . . .” (p. 35)  Because people tend to do what they've been trained to do in emergencies, training leading to (correct) responses is vital.

Another problem is that most of Kahneman's research uses university students, both undergraduate and graduate, as subjects.  It's fair to say professionals have more training and life experience, and have probably made some hasty decisions they later regretted and (maybe) learned from.  On the other hand, we often see people who make sub-optimal, or just plain bad decisions even though they should know better.

There are lessons here for managers and other would-be culture shapers.  System 1's search for answers is mostly constrained to information consistent with existing beliefs (p. 103) which is an entry point for  culture.  We have seen how group members can have their internal biases influenced by the dominant culture.  But to the extent System 1 dominates employees' decision making, decision quality may suffer.

Not all appeals can be made to the rational man in System 2.  A customary, if tacit, assumption of managers is that they and their employees are rational and always operating consciously, so new experiences will lead to expected new values and beliefs, new decisions and improved safety culture.  But it may not be this straightforward.  System 1 may intervene, and managers should be alert to evidence of System 1 type thinking and adjust their interventions accordingly.  Kahneman suggests encouraging “a culture in which people look out for one another as they approach minefields.” (p. 418)

We should note Systems 1 and 2 are constructs and “do not really exist in the brain or anywhere else.” (p. 415)  System 1 is not Dr. Morbius' Id monster.****  System 1 can be trained to behave differently, but it is always ready to provide convenient answers for a lazy System 2.

The book is long, with small print, but the chapters are short so it's easy to invest 15-20 min. at a time.  One has to be on constant alert for useful nuggets that can pop up anywhere—which I guess promotes reader mindfulness.  It is better than Blink, which simply overwhelmed this reader with a cloudburst of data showing the informational value of thin slices and unintentionally over-promoted the value of intuition. (see pp. 235-36)  And it is much deeper than The Power of Habit, which we reviewed last February.

(Common sense is nothing more than a deposit of prejudices laid down by the mind before you reach eighteen.  Attributed to Albert Einstein)

*  D. Kahneman, Thinking, Fast and Slow (New York: Farrar, Straus and Giroux, 2011).

**  WYSIATI – What You See Is All There Is.  Information that is not retrieved from memory, or otherwise ignored, may as well not exist. (pp. 85-88)  WYSIATI means we base decisions on the limited information that we are able or willing to retrieve before a decision is due.  

***  A few of these characteristics are mentioned in this report, e.g., impressions morphing into beliefs, a bias to believe and confirm, and WYSIATI errors.  Others include links of cognitive ease to illusions of truth and reduced vigilance (complacency), and narrow framing where decision problems are isolated from one another. (p. 105)

****  Dr. Edward Morbius is a character in the 1956 sci-fi movie Forbidden Planet.

Monday, November 11, 2013

Engineering a Safer World: Systems Thinking Applied to Safety by Nancy Leveson

In this book* Leveson, an MIT professor, describes a comprehensive approach for designing and operating “safe” organizations based on systems theory.  The book presents the criticisms of traditional incident analysis methods, the principles of system dynamics, and essential safety-related organizational characteristics, including the role of culture, in one place; this review emphasizes those topics.  It should be noted the bulk of the book describes her accident causality model and how to apply it, including extensive case studies; this review does not fully address that material.

Part I
     
Part I sets the stage for a new safety paradigm.  Many contemporary socio-technical systems exhibit, among other characteristics, rapidly changing technology, increasing complexity and coupling, and pressures that put production ahead of safety. (pp. 3-6)   Traditional accident analysis techniques are no longer sufficient.  They too often focus on eliminating failures, esp. component failures or “human error,” instead of concentrating on eliminating hazards. (p. 10)  Some of Leveson's critique of traditional accident analysis echoes Dekker (esp. the shortcomings of Newtonian-Cartesian analysis, reviewed here).**   We devote space to Leveson's criticisms because she provides a legitimate perspective on techniques that comprise some of the nuclear industry's sacred cows.

Event-based models are simply inadequate.  There is subjectivity in selecting both the initiating event (the failure) and the causal chains backwards from it.  The root cause analysis often stops at the first root cause that is familiar, amenable to corrective action, difficult to get beyond (usually the human operator or other human role) or politically acceptable. (pp. 20-24)  Reason's Swiss cheese model is insufficient because of its assumption of direct, linear relationships between components. (pp. 17-19)  In addition, “event-based models are poor at representing systemic accident factors such as structural deficiencies in the organization, management decision making, and flaws in the safety culture of the company or industry.” (p. 28)

Probabilistic Risk Assessment (PRA) studies specified failure modes in ever greater detail but ignores systemic factors.  “Most accidents in well-designed systems involve two or more low-probability events occurring in the worst possible combination.  When people attempt to predict system risk, they explicitly or implicitly multiply events with low probability—assuming independence—and come out with impossibly small numbers, when, in fact, the events are dependent.  This dependence may be related to common systemic factors that do not appear in an event chain.  Machol calls this phenomenon the Titanic coincidence . . . The most dangerous result of using PRA arises from considering only immediate physical failures.” (pp. 34-35)  “. . . current [PRA] methods . . . are not appropriate for systems controlled by software and by humans making cognitively complex decisions, and there is no effective way to incorporate management or organizational factors, such as flaws in the safety culture, . . .” (p. 36) 
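A back-of-the-envelope illustration (the numbers here are ours, not Leveson's) shows why the independence assumption matters so much: a shared systemic factor can raise the computed joint frequency by orders of magnitude.

# Illustrative numbers only.  Two failures, each assessed at 1-in-1,000 per year,
# multiply out to one-in-a-million if treated as independent.  A common systemic
# factor (flooding, maintenance backlog, schedule pressure) can make the second
# failure far more likely once the first has occurred.
p_a = 1e-3                           # frequency of failure A, per year
p_b = 1e-3                           # frequency of failure B, per year
p_b_given_a = 0.1                    # assumed conditional probability with a common cause present

joint_independent = p_a * p_b        # 1e-06
joint_dependent = p_a * p_b_given_a  # 1e-04
print(joint_dependent / joint_independent)   # the dependent estimate is 100x higher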

The search for operator error (a fall guy who takes the heat off of system designers and managers) and hindsight bias also contribute to the inadequacy of current accident analysis approaches. (p. 38)  In contrast to looking for an individual's “bad” decision, Leveson says “the study of decision making cannot be separated from a simultaneous study of the social context, the value system in which it takes place, and the dynamic work process it is intended to control.” (p. 46) 

Leveson says “Systems are not static. . . . they tend to involve a migration to a state of increasing risk over time.” (p. 51)  Causes include adaptation in response to pressures and the effects of multiple independent decisions. (p. 52)  This is reminiscent of  Hollnagel's warning that cost pressure will eventually push production to the edge of the safety boundary.

When accidents or incidents occur, Leveson proposes that analysis should search for reasons (the Whys) rather than blame (usually defined as Who) and be based on systems theory. (pp. 55-56)  In a systems view, safety is an emergent property, i.e., system safety performance cannot be predicted by analyzing system components. (p. 64)  Some of the goals for a better model include analysis that goes beyond component failures and human errors, is more scientific and less subjective, includes the possibility of system design errors and dysfunctional system interactions, addresses software, focuses on mechanisms and factors that shape human behavior, examines processes and allows for multiple viewpoints in the incident analysis. (pp. 58-60) 

Part II

Part II describes Leveson's proposed accident causality model based on systems theory: STAMP (Systems-Theoretic Accident Model and Processes).  For our purposes we don't need to spend much space on this material.  “The model includes software, organizations, management, human decision-making, and migration of systems over time to states of heightened risk.”***   It attempts to achieve the goals listed at the end of Part I.

STAMP treats safety in a system as a control problem, not a reliability one.  Specifically, the overarching goal “is to control the behavior of the system by enforcing the safety constraints in its design and operation.” (p. 76)  Controls may be physical or social, including cultural.  There is a good discussion of the hierarchy of control in a complex system and the impact of possible system dynamics, e.g., time lags, feedback loops and changes in control structures. (pp. 80-87)  “The process leading up to an accident is described in STAMP in terms of an adaptive feedback function that fails to maintain safety as system performance changes over time to meet a complex set of goals and values.” (p. 90)
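As a toy illustration of such adaptive feedback (all parameters are invented; this is not Leveson's model), consider a system in which production pressure erodes safety margin every period while a weaker corrective response acts only on lagged information; the margin ratchets steadily toward the boundary.

# Toy sketch of migration toward higher risk under production pressure with a lagged,
# weaker safety response.  All parameters are invented for illustration.
margin = 10.0      # abstract units of safety margin
pressure = 0.4     # margin consumed each period by production shortcuts
lag = 3            # periods before degraded margin is noticed
response = 0.25    # margin restored per period once degradation is noticed

history = []
for t in range(25):
    history.append(round(margin, 2))
    margin -= pressure                           # production pressure acts every period
    if t >= lag and history[t - lag] < 8.0:      # corrective action keys off old information
        margin += response                       # and is weaker than the pressure
print(history[0], "->", history[-1])             # steady downward drift toward the boundary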

Leveson describes problems that can arise from an inaccurate mental model of a system or an inaccurate model displayed by a system.  There is a lengthy, detailed case study that uses STAMP to analyze a tragic incident, in this case a friendly fire accident where a U.S. Army helicopter was shot down by an Air Force plane over Iraq in 1994.

Part III

Part III describes in detail how STAMP can be applied.  There are many useful observations (e.g., problems with mode confusion on pp. 289-94) and detailed examples throughout this section.  Chapter 11 on using a STAMP-based accident analysis illustrates the claimed advantages of  STAMP over traditional accident analysis techniques. 

We will focus on chapter 13, “Managing Safety and the Safety Culture,” which covers the multiple dimensions of safety management, including safety culture.

Leveson's list of the components of effective safety management is mostly familiar: management commitment and leadership, safety policy, communication, strong safety culture, safety information system, continual learning, education and training. (p. 421)  Two new components need a bit of explanation: a safety control structure and controls on system migration toward higher risk.  The safety control structure assigns specific safety-related responsibilities to management, system designers and operators. (pp. 436-40)  Among the control structure's responsibilities, “the potential reasons for and types of migration toward higher risk need to be identified and controls instituted to prevent it.” (pp. 425-26)  Such an approach should be based on the organization's comprehensive hazards analysis.****

The safety culture discussion is also familiar. (pp. 426-33)  Leveson refers to the Schein model, discusses management's responsibility for establishing the values to be used in decision making, the need for open, non-judgmental communications, the freedom to raise safety questions without fear of reprisal and widespread trust.  In such a culture, Leveson says an early warning system for migration toward states of high risk can be established.  A section on Just Culture is taken directly from Dekker's work.  The risk of complacency, caused by inaccurate risk perception after a long history of success, is highlighted.

Although these management and safety culture contents are generally familiar, what's new is relating them to systems concepts such as control loops and feedback and taking a systems view of the safety control system.

Our Perspective
 

Overall, we like this book.  It is Leveson's magnum opus, 500+ pages of theory, rationale, explanation, examples and infomercial.  The emphasis on the need for a systems perspective and a search for Why accidents/incidents occur (as opposed to What happened or Who is at fault) is consistent with what we've been saying on this blog.  The book explains and supports many of the beliefs we have been promoting on Safetymatters: the shortcomings of traditional (but commonly used) methods of incident investigation; the central role of decision making; and how management commitment, financial and non-financial rewards, and a strong safety culture contribute to system safety performance.
 

However, there are only a few direct references to nuclear.  The examples in the book are mostly from aerospace, aviation, maritime activities and the military.  Establishing a safety control structure is probably easier to accomplish in a new aerospace project than in an existing nuclear organization with a long history (aka memory),  shifting external pressures, and deliberate incremental changes to hardware, software, policies, procedures and programs.  Leveson does mention John Carroll's (her MIT colleague) work at Millstone. (p. 428)  She praises nuclear LER reporting as a mechanism for sharing and learning across the industry. (pp. 406-7)  In our view, LERs should be helpful but they are short on looking at why incidents occur, i.e., most LER analysis does not look at incidents from a systems perspective.  TMI is used to illustrate specific system design/operation problems.
 

We don't agree with the pot shots Leveson takes at High Reliability Organization (HRO) theorists.  First, she accuses HRO of confusing reliability with safety, in other words, an unsafe system can function very reliably. (pp. 7, 12)  But I'm not aware of any HRO work that has been done in an organization that is patently unsafe.  HRO asserts that reliability follows from practices that recognize and contain emerging problems.  She takes another swipe at HRO when she says HRO suggests that, during crises, decision making migrates to frontline workers.  Leveson's problem with that is “the assumption that frontline workers will have the necessary knowledge and judgment to make decisions is not necessarily true.” (p. 44)  Her position may be correct in some cases but as we saw in our review of CAISO, when the system was veering off into new territory, no one had the necessary knowledge and it was up to the operators to cope as best they could.  Finally, she criticizes HRO advice for operators to be on the lookout for “weak signals.”  In her view, “Telling managers and operators to be “mindful of weak signals” simply creates a pretext for blame after a loss event occurs.” (p. 410)  I don't think it's pretext but it is challenging to maintain mindfulness and sense faint signals.  Overall, this appears to be academic posturing and feather fluffing.
 

We offer no opinion on the efficacy of using Leveson's STAMP approach.  She is quick to point out a very real problem in getting organizations to use STAMP: its lack of focus on finding someone/something to blame means it does not help identify subjects for discipline, lawsuits or criminal charges. (p. 86)
 

In Leveson's words, “The book is written for the sophisticated practitioner . . .” (p. xviii)  You don't need to run out and buy this book unless you have a deep interest in accident/incident analysis and/or are willing to invest the time required to determine exactly how STAMP might be applied in your organization.


*  N.G. Leveson, Engineering a Safer World: Systems Thinking Applied to Safety (The MIT Press, Cambridge, MA: 2011)  The link goes to a page where a free pdf version of the book can be downloaded; the pdf cannot be copied or printed.  All quotes in this post were retyped from the original text.


**  We're not saying Dekker or Hollnagel developed their analytic viewpoints ahead of Leveson; we simply reviewed their work earlier.  These authors are all aware of others' publications and contributions.  Leveson includes Dekker in her Acknowledgments and draws from Just Culture: Balancing Safety and Accountability in her text. 

***  Nancy Leveson informal bio page.


****  “A hazard is a system state or set of conditions that, together with a particular set of worst-case environmental conditions, will lead to an accident.” (p. 157)  The hazards analysis identifies all major hazards the system may confront.  Baseline safety requirements follow from the hazards analysis.  Responsibilities are assigned to the safety control structure for ensuring baseline requirements are not violated while allowing changes that do not raise risk.  The identification of system safety constraints allows the possibility of identifying leading indicators for a specific system. (pp. 337-38)