Thursday, January 15, 2015

Back to the Past at Millstone?



A recent article* in the Hartford Courant newspaper reported on a turbine-driven auxiliary feedwater (TDAFW) pump problem at Millstone 3 that took so long to resolve that the NRC issued a White finding to plant owner Dominion Resources.

The article included a quote from the Connecticut Nuclear Energy Advisory Council (NEAC) describing its unease over the pump problem.  We dug a little deeper on the NEAC, a state government entity that works with public agencies and plant operators to ensure public health and safety.  Its 2014 annual report** highlights the TDAFW pump problem and another significant event at Millstone, a loss of site power that caused a dual reactor trip.  NRC inspections following these two events resulted in one Severity Level III finding, the White finding previously mentioned, and two Green findings.  The events and NRC findings led the NEAC to express “great concern regarding the downward performance trend” to Dominion and request a formal response from Millstone management on any root cause that linked the performance problems.

In his response to the NEAC, the Millstone site VP said there was no root cause linking the events.  He also said two safety culture (SC) improvement areas had been identified, viz., problem identification and evaluation, and clarity around decision making, and that the site has implemented improvement actions to address those areas.  In the Courant article, a plant spokesman is quoted as saying "If it's not immediately obvious why it's not working, we put a team to work on it."

The article also referred to related behind-the-scenes NRC staff emails*** in which the time it took for Dominion to identify and address the TDAFW pump issue raised eyebrows at the NRC.

So what does the TDAFW pump event tell us about SC at Millstone?

Our Perspective

Is Millstone on the road to the bad old days, when SC was AWOL from the site?  We hope not.  And there is some evidence that suggests the TDAFW pump issue was an isolated problem exacerbated by a bit of bad luck (a vendor supplying the wrong part with the same part number as the correct part).

Positive data includes the following: Millstone 2 and 3 both had all green performance indicators on the 3QTR2014 NRC ROP and, more importantly, a mid-2014 baseline inspection of the Millstone CAP “concluded that Dominion was generally effective in identifying, evaluating, and resolving problems.”****  In addition, plant “staff expressed a willingness to use the corrective action program to identify plant issues and deficiencies and stated that they were willing to raise safety issues.” (p. 10)

Currently, M2 is subject to baseline inspection and M3 to baseline and a supplemental inspection because of the White finding.

To us, this doesn’t look like a plant on the road to SC hell although we agree with the NRC that the TDAFW pump problem took too long to evaluate and resolve.

We hope the Millstone organization learned more from the TDAFW pump problem than they displayed in their reply to the NRC.*****  In dealing with the regulator, Millstone naturally tried to bound the problem and their response: they pointed at the vendor for sending them the wrong part, implemented a TDAFW pump troubleshooting guide, revised a troubleshooting procedure, and produced and presented two case studies to applicable plant personnel. 

The site VP’s letter to NEAC suggests a broader application of the lessons learned.  We suggest the “trust but verify” principle for dealing with vendors be strengthened and that someone be assigned to read Constance Perin’s Shouldering Risks (see our Sept. 12, 2011 review) and report back on the ways factors such as accepted logics, organizational power relations and production pressure can prevent organizations from correctly perceiving problems that are right in front of them.


*  S. Singer, “Emails Show NRC's Concern Over How Millstone Nuclear Plant Reacted To Malfunction,” Hartford Courant (Jan. 12, 2015).

**  2014 Nuclear Energy Advisory Council (NEAC) Report (Dec. 11, 2014).  The Nov. 10, 2014 letter from Millstone site VP S.E. Scace to J.W. Sheehan (NEAC) is appended to NEAC’s 2014 annual report.

***  The Associated Press obtained the emails under a Freedom of Information Act request.  Most of the content relates to the evolution of technical issues but, as cited in the Courant article, there are mentions of Millstone’s slowness in dealing with the pump issue.  The emails are available at ADAMS ML14358A318 and ML14358A320.


*****  Dominion Nuclear Connecticut, Inc. Millstone Power Station Unit 3, Reply to a Notice of Violation (Nov. 19, 2014).  ADAMS ML14325A060.

Thursday, August 7, 2014

1995 ANS Safety Culture Conference: A Portal to the Past

In April 1995 the American Nuclear Society (ANS) sponsored a nuclear safety culture (SC) conference in Vienna.  This was a large undertaking, with over 80 presentations; the proceedings are almost 900 pages in length.*  Presenters included industry participants, regulators, academics and consultants.  1995 was early in the post-Soviet era and the new openness (and concerns about Soviet reactors) led to a large number of presenters from Russia, Ukraine and Eastern Europe.  This post presents some conference highlights on topics we emphasize on Safetymatters.

Decision Making

For us, decision making should be systemic, i.e., it should consider all relevant inputs and the myriad consequences a decision can produce.  The same rigor should be applied to all kinds of decisions—finance, design, operations, resource allocation, personnel, etc.  Safety should always have the highest priority, and decisions should accord safety its appropriate weight.  Some presenters echoed this view.

“Safety was (and still is) seen as being vital to the success of the industry and hence the analysis and assessment of safety became an integral part of management decision making” (p. 41); “. . . in daily practice: overriding priority to safety is the principle, to be taken into account before making any decision” (p. 66); and “The complexity of operations implies a systemic decision process.” (p. 227)

The relationship between leadership and decisions was mentioned.  “The line management are a very important area, as they must . . . realise how their own actions and decisions affect Safety Culture.  The wrong actions, or perceived messages could undermine the work of the team leaders” (p. 186); “. . . statements alone do not constitute support; in the intermediate and long-term, true support is demonstrated by behavior and decision and not by what is said.” (p. 732)

Risk was recognized as a factor in decision making.  “Risk culture yields insights that permit balanced safety vs. cost decisions to be made” (p. 325); “Rational decision making is based on facts, experience, cognitive (mental) models and expected outcomes giving due consideration to uncertainties in the foregoing and the generally probabilistic nature of technical and human matters.  Conservative decision making is rational decision making that is risk-averse.  A conservative decision is weighted in favor of risk control at the expense of cost.” (p. 435)
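To make the quoted notion of “conservative” decision making concrete, here is a minimal, hypothetical sketch of our own (it is not from the proceedings): a risk-averse ranking simply gives the bad outcome more weight than its raw probability alone would.  The option names, costs, probabilities and weighting factor below are all invented for illustration.

```python
# Illustrative sketch only (not from the conference proceedings).  It contrasts a
# pure expected-cost ranking with a risk-averse ("conservative") ranking that
# gives extra weight to the bad outcome.  All names and numbers are invented.

options = {
    # name: (cost if it goes well, cost if it goes badly, probability it goes badly)
    "repair_online":       (1.0, 30.0, 0.10),
    "shutdown_and_repair": (5.0,  6.0, 0.02),
}

RISK_AVERSION = 2.0  # k > 1 amplifies the downside term beyond its raw probability

def expected_cost(good, bad, p_bad):
    return (1 - p_bad) * good + p_bad * bad

def conservative_cost(good, bad, p_bad, k=RISK_AVERSION):
    # "Rational but risk-averse": weighted in favor of risk control at the expense of cost.
    return (1 - p_bad) * good + k * p_bad * bad

for name, (good, bad, p_bad) in options.items():
    print(f"{name:20s} expected={expected_cost(good, bad, p_bad):5.2f} "
          f"conservative={conservative_cost(good, bad, p_bad):5.2f}")
```

With these made-up numbers, the option that looks cheapest on a pure expected-cost basis is no longer preferred once the downside is weighted more heavily; that is, risk control is bought at the expense of cost.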

In sum, nuclear thought leaders knew what good decision making should look like—but we still see cases that do not live up to that standard.

Rewards

Rewards or compensation were mentioned by people from nuclear operating organizations.  Incentive-based compensation was included as a key aspect of the TEPCO management approach (p. 551) and a nuclear lab manager recommended using monetary compensation to encourage cooperation between organizational departments. (p. 643)  A presenter from a power plant said “A recognition scheme is in place . . . to recognise and reward individuals and teams for their contribution towards quality improvement and nuclear safety enhancement.” (p. 805)

Rewards were also mentioned by several presenters who did not come from power plants.  For example, the reward system should stress safety (p. 322); rewards should be given for exhibiting a “caring attitude” about SC (p. 348) and to people who call attention to safety problems. (p. 527)  On the flip side, a regulator complained about plants that rewarded behavior that might cause safety to erode. (pp. 651, 656) 

Even in 1995 the presentations could have been stronger since INSAG-4** is so clear on the topic: “Importantly, at operating plants, systems of reward do not encourage high plant output levels if this prejudices safety.  Incentives are therefore not based on production levels alone but are also related to safety performance.” (INSAG-4, p. 11)  Today, our own research has shown that nuclear executives’ compensation often favors production.   

Systems Approach

We have always favored nuclear organizational mental models that consider feedback loops, time delays, adaptation, evolution and learning—a systems approach.  Presenters’ references to a system include “commercial, public, and military operators of complex high reliability socio-technical systems” (p. 260); “. . . assess the organisational, managerial and socio-technical influences on the Safety Culture of socio-technical systems such as nuclear power plants” (p. 308); “Within the complex system such as . . . [a] nuclear power plant there is a vast number of opportunities for failures to stay hidden in the system” (p. 541); and “It is proposed that the plant should be viewed as an integrated sociotechnical system . . .” (p. 541)

There are three system-related presentations that we suggest you read in their entirety; they have too many good points to summarize here.  One is by Electricité de France (EdF) personnel (pp. 193-201), another by Constance Perin (pp. 330-336) and a third by John Carroll (pp. 338-345). 

Here’s a sample, from Perin: “Through self-analysis, nuclear organizations can understand how they currently respond socially, culturally, and technically to such system characteristics of complexity, density, obscured signals, and delayed feedback in order to assure their capacities for anticipating, preventing, and recovering from threats to safety.” (p. 330)  It could have been written yesterday.

The Role of the Regulator

By 1995 INSAG-4 had been published and generally accepted by the nuclear community but countries were still trying to define the appropriate role for the regulator; the topic merited a half-dozen presentations.  Key points included the regulator (1) requiring that an effective SC be established, (2) establishing safety as a top-level goal and (3) performing some assessment of a licensee’s safety management system (either directly or as part of ordinary inspection duties).  There was some uncertainty about how to proceed with compliance focus vs. qualitative assessment.

Today, at least two European countries are looking at detailed SC assessment, in effect, regulating SC.  In the U.S., the NRC issued a SC policy statement and performs back-door, de facto SC regulation through the “bring me another rock” approach.

So conditions have changed in regulatory space, arguably for the better when the regulator limits its focus to truly safety-significant activities.

Our Perspective

In 1995, some (but not all) people held what we’d call a contemporary view of SC.  For example, “Safety culture constitutes a state of mind with regard to safety: the value we attribute to it, the priority we give it, the interest we show in it.  This state of mind determines attitudes and behavior.” (p. 495)

But some things have changed.  For example, several presentations mentioned SC surveys—their design, administration, analysis and implications.  We now (correctly) understand that SC surveys are a snapshot of safety climate and only one input into a competent SC assessment.

And some things did not turn out well.  For example, a TEPCO presentation said “the decision making process is governed by the philosophy of valuing harmony highly so that a conclusion preferred by all the members is chosen as far as possible when there are divided opinions.” (p. 583)  Apparently harmony was so valued that no one complained that Fukushima site protection was clearly inadequate and essential emergency equipment was exposed to grave hazards. 


*  A. Carnino and G. Weimann, ed., “Proceedings of the International Topical Meeting on Safety Culture in Nuclear Installations,” April 24-28, 1995 (Vienna: ANS Austria Local Section, 1995).  Thanks to Bill Mullins for unearthing this document.

**  International Nuclear Safety Advisory Group, “Safety Culture,” Safety Series No. 75-INSAG-4, (Vienna: IAEA, 1991). INSAG-4 included a definition of SC, a description of SC components, and illustrative evidence that the components exist in a specific organization.

Friday, May 3, 2013

High Reliability Organizations and Safety Culture

On February 10th, we posted about a report covering lessons for safety culture (SC) that can be gleaned from the social science literature. The report's authors judged that high reliability organization (HRO) literature provided a solid basis for linking individual and organizational assumptions with traits and practices that can affect safety performance. This post explores HRO characteristics and how they can influence SC.

Our source is Managing the Unexpected: Resilient Performance in an Age of Uncertainty* by Karl Weick and Kathleen Sutcliffe. Weick is a leading contemporary HRO scholar. This book is clearly written, with many pithy comments, so lots of quotations are included below to present the authors' views in their own words.

What makes an HRO different?

Many organizations work with risky technologies where the consequences of problems or errors can be catastrophic, use complex management systems and exist in demanding environments. But successful HROs approach their work with a different attitude and practices, an “ongoing mindfulness embedded in practices that enact alertness, broaden attention, reduce distractions, and forestall misleading simplifications.” (p. 3)

Mindfulness

An underlying assumption of HROs is “that gradual . . . development of unexpected events sends weak signals . . . along the way” (p. 63) so constant attention is required. Mindfulness means that “when people act, they are aware of context, of ways in which details differ . . . and of deviations from their expectations.” (p. 32) HROs “maintain continuing alertness to the unexpected in the face of pressure to take cognitive shortcuts.” (p. 19) Mindful organizations “notice the unexpected in the making, halt it or contain it, and restore system functioning.” (p. 21)

It takes a lot of energy to maintain mindfulness. As the authors warn us, “mindful processes unravel pretty fast.” (p. 106) Complacency and hubris are two omnipresent dangers. “Success narrows perceptions, . . . breeds overconfidence . . . and reduces acceptance of opposing points of view. . . . [If] people assume that success demonstrates competence, they are more likely to drift into complacency, . . .” (p. 52) Pressure in the task environment is another potential problem. “As pressure increases, people are more likely to search for confirming information and to ignore information that is inconsistent with their expectations.” (p. 26) The opposite of mindfulness is mindlessness. “Instances of mindlessness occur when people confront weak stimuli, powerful expectations, and strong desires to see what they expect to see.” (p. 88)

Mindfulness can lead to insight and knowledge. “In that brief interval between surprise and successful normalizing lies one of your few opportunities to discover what you don't know.” (p. 31)**

Five principles

HROs follow five principles. The first three cover anticipation of problems and the remaining two cover containment of problems that do arise.

Preoccupation with failure

HROs “treat any lapse as a symptom that something may be wrong with the system, something that could have severe consequences if several separate small errors happened to coincide. . . . they are wary of the potential liabilities of success, including complacency, the temptation to reduce margins of safety, and the drift into automatic processing.” (p. 9)

Managers usually think surprises are bad, evidence of bad planning. However, “Feelings of surprise are diagnostic because they are a solid cue that one's model of the world is flawed.” (p. 104) HROs “Interpret a near miss as danger in the guise of safety rather than safety in the guise of danger. . . . No news is bad news. All news is good news, because it means that the system is responding.” (p. 152)

People in HROs “have a good sense of what needs to go right and a clearer understanding of the factors that might signal that things are unraveling.” (p. 86)

Reluctance to simplify

HROs “welcome diverse experience, skepticism toward received wisdom, and negotiating tactics that reconcile differences of opinion without destroying the nuances that diverse people detect. . . . [They worry that] superficial similarities between the present and the past mask deeper differences that could prove fatal.” (p. 10) “Skepticism thus counteracts complacency . . . .” (p. 155) “Unfortunately, diverse views tend to be disproportionately distributed toward the bottom of the organization, . . .” (p. 95)

The language people use at work can be a catalyst for simplification. A person may initially perceive something different in the environment but using familiar or standard terms to communicate the experience can raise the risk of losing the early warnings the person perceived.

Sensitivity to operations

HROs “are attentive to the front line, . . . Anomalies are noticed while they are still tractable and can still be isolated . . . . People who refuse to speak up out of fear undermine the system, which knows less than it needs to know to work effectively.” (pp. 12-13) “Being sensitive to operations is a unique way to correct failures of foresight.” (p. 97)

In our experience, nuclear plants are generally good in this regard; most include a focus on operations among their critical success factors.

Commitment to resilience

“HROs develop capabilities to detect, contain, and bounce back from those inevitable errors that are part of an indeterminate world.” (p. 14) “. . . environments that HROs face are typically more complex than the HRO systems themselves. Reliability and resilience lie in practices that reduce . . . environmental complexity or increase system complexity.” (p. 113) Because it's difficult or impossible to reduce environmental complexity, the organization needs to make its systems more complex.*** This requires clear thinking and insightful analysis. Unfortunately, actual organizational response to disturbances can fall short. “. . . systems often respond to a disturbance with new rules and new prohibitions designed to prevent the same disruption from happening in the future. This response reduces flexibility to deal with subsequent unpredictable changes.” (p. 72)

Deference to expertise

“Decisions are made on the front line, and authority migrates to the people with the most expertise, regardless of their rank.” (p. 15) Application of expertise “emerges from a collective, cultural belief that the necessary capabilities lie somewhere in the system and that migrating problems [down or up] will find them.” (p. 80) “When tasks are highly interdependent and time is compressed, decisions migrate down . . . Decisions migrate up when events are unique, have potential for very serious consequences, or have political or career ramifications . . .” (p. 100)

This is another ideal that can fail in practice. We've all seen decisions made by the highest ranking person rather than the most qualified one. In other words, “who is right” can trump “what is right.”

Relationship to safety culture

Much of the chapter on culture is based on the ideas of Schein and Reason so we'll focus on key points emphasized by Weick and Sutcliffe. In their view, “culture is something an organization has [practices and controls] that eventually becomes something an organization is [beliefs, attitudes, values].” (p. 114, emphasis added)

“Culture consists of characteristic ways of knowing and sensemaking. . . . Culture is about practices—practices of expecting, managing disconfirmations, sensemaking, learning, and recovering.” (pp. 119-120) A single organization can have different types of culture: an integrative culture that everyone shares, differentiated cultures that are particular to sub-groups and fragmented cultures that describe individuals who don't fit into the first two types. Multiple cultures support the development of more varied responses to nascent problems.

A complete culture strives to be mindful, safe and informed, with an emphasis on wariness. As HRO principles are ingrained in an organization, they become part of the culture. The goal is a strong SC that reinforces concern about the unexpected, is open to questions and reporting of failures, views close calls as failures, is fearful of complacency, resists simplifications, values diversity of opinions and focuses on imperfections in operations.

What else is in the book?

One chapter contains a series of audits (presented as survey questions) to assess an organization's mindfulness and appreciation of the five principles. The audits can show an organization's attitudes and capabilities relative to HROs and relative to its own self-image and goals.
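The audit items themselves are in the book; as a purely hypothetical sketch of how such an audit might be tallied, the snippet below groups invented survey responses under the five principles and averages them so the weakest areas stand out.  The item groupings, responses and 1-5 scale are our assumptions, not Weick and Sutcliffe's instrument.

```python
# Hypothetical tally of an HRO-style audit; the real audit questions are in the book.
# The groupings, responses, and 1-5 scale are invented for illustration.
from statistics import mean

# Responses per principle on a 1-5 scale (1 = not at all true of us, 5 = very true of us).
responses = {
    "preoccupation_with_failure": [4, 3, 5],
    "reluctance_to_simplify":     [2, 3, 2],
    "sensitivity_to_operations":  [4, 4, 5],
    "commitment_to_resilience":   [3, 3, 4],
    "deference_to_expertise":     [2, 2, 3],
}

scores = {principle: mean(vals) for principle, vals in responses.items()}

# List principles from weakest to strongest so the biggest gaps stand out.
for principle, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{principle:28s} {score:.1f}")
```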

The final chapter describes possible “small wins” a change agent (often an individual) can attempt to achieve in an effort to move his organization more in line with HRO practices, viz., mindfulness and the five principles. For example, “take your team to the actual site where an unexpected event was handled either well or poorly, walk everyone through the decision making that was involved, and reflect on how to handle that event more mindfully.” (p. 144)

The book's case studies include an aircraft carrier, a nuclear power plant,**** a pediatric surgery center and wildland firefighting.

Our perspective

Weick and Sutcliffe draw on the work of many other scholars, including Constance Perin, Charles Perrow, James Reason and Diane Vaughan, all of whom we have discussed in this blog. The book makes many good points. For example, the prescription for mindfulness and the five principles can contribute to an effective context for decision making although it does not comprise a complete management system. The authors recognize that reliability does not mean a complete lack of performance variation; instead, reliability follows from practices that recognize and contain emerging problems. Finally, there is evidence of a systems view, which we espouse, when the authors say “It is this network of relationships taken together—not necessarily any one individual or organization in the group—that can also maintain the big picture of operations . . .” (p. 142)

The authors would have us focus on nascent problems in operations, which is obviously necessary. But another important question is what are the faint signals that the SC is developing problems? What are the precursors to the obvious signs, like increasing backlogs of safety-related work? Could that “human error” that recently occurred be a sign of a SC that is more forgiving of growing organizational mindlessness?

Bottom line: Safetymatters says check out Managing the Unexpected and consider adding it to your library.


* K.E. Weick and K.M. Sutcliffe, Managing the Unexpected: Resilient Performance in an Age of Uncertainty, 2d ed. (San Francisco, CA: Jossey-Bass, 2007). Also, Wikipedia has a very readable summary of HRO history and characteristics.

** More on normalization and rationalization: “On the actual day of battle naked truths may be picked up for the asking. But by the following morning they have already begun to get into their uniforms.” E.A. Cohen and J. Gooch, Military Misfortunes: The Anatomy of Failure in War (New York: Vintage Books, 1990), p. 44, quoted in Managing the Unexpected, p. 31.

*** The prescription to increase system complexity to match the environment is based on the system design principle of requisite variety which means “if you want to cope successfully with a wide variety of inputs, you need a wide variety of responses.” (p. 113)

**** I don't think the authors performed any original research on nuclear plants. But the studies they reviewed led them to conclude that “The primary threat to operations in nuclear plants is the engineering culture, which places a higher value on knowledge that is quantitative, measurable, hard, objective, and formal . . . HROs refuse to draw a hard line between knowledge that is quantitative and knowledge that is qualitative.” (p. 60)

Friday, November 4, 2011

A Factory for Producing Decisions

The subject of this post is the compelling insights of Daniel Kahneman into issues of behavioral economics and how we think and make decisions.  Kahneman is one of the most influential thinkers of our time and a Nobel laureate.  Two links are provided for our readers who would like additional information.  One is a McKinsey Quarterly video interview* done several years ago; it runs about 17 minutes.  The second is a current review in The Atlantic** of Kahneman’s just-released book, Thinking, Fast and Slow.

Kahneman begins the McKinsey interview by suggesting that we think of organizations as “factories for producing decisions” and therefore, think of decisions as a product.  This seems to make a lot of sense when applied to nuclear operating organizations - they are the veritable “River Rouge” of decision factories.  What may be unusual for nuclear organizations is the large percentage of decisions that directly or indirectly include safety dimensions, dimensions that can be uncertain and/or significantly judgmental, and which often conflict with other business goals.  So nuclear organizations have to deliver two products: competitively priced megawatts and decisions that preserve adequate safety.

To Kahneman, treating decisions as a product logically raises the issue of quality control.  At one level, quality control might focus on mistakes and ensuring that decisions avoid repeating them.  But Kahneman sees the quality function going further, into the psychology of the decision process, to ensure, e.g., that the best information is available to decision makers, that the talents of the group surrounding the ultimate decision maker are used effectively, and that the decision-making environment is unbiased.

He notes that there is enormous resistance within organizations to improving decision processes.  People naturally feel threatened if their decisions are questioned or second-guessed.  So it may be very difficult, or even impossible, to improve the quality of decisions if the leadership feels too threatened.  But are there ways around this?  Kahneman suggests the “premortem” (think of it as the analog to a post mortem).  When a decision is being formulated (not yet made), convene a group meeting with the following premise: it is a year from now, we have implemented the decision under consideration, and it has been a complete disaster.  Have each individual write down “what happened?”

The objective of the premortem is to legitimize dissent and minimize the innate “bias toward optimism” in decision analysis.  It is based on the observation that as organizations converge toward a decision, dissent becomes progressively more difficult and costly and people who warn or dissent can be viewed as disloyal.  The premortem essentially sets up a competitive situation to see who can come up with the flaw in the plan.  In essence everyone takes on the role of dissenter.  Kahneman’s belief is that the process will yield some new insights - that may not change the decision but will lead to adjustments to make the decision more robust. 

Kahneman’s ideas about decisions resonate with our thinking that the most useful focus for nuclear safety culture is the quality of organizational decisions.  That focus contrasts with the response at a nuclear plant that recently ran afoul of the NRC (Browns Ferry) and is now tagged with a degraded cornerstone and increased inspections.  As usual in the nuclear industry, TVA has called on an outside contractor to come in and perform a safety culture survey, to “... find out if people feel empowered to raise safety concerns….”***  It may be interesting to see how people feel, but we believe it would be far more powerful and useful to analyze a significant sample of recent organizational decisions to determine if the decisions reflect an appropriate level of concern for safety.  Feelings (perceptions) are not a substitute for what is actually occurring in the decision process.

We have been working to develop ways to grade whether decisions support strong safety culture, including offering opportunities on this blog for readers to “score” actual plant decisions.  In addition, we have highlighted the work of Constance Perin, including her book, Shouldering Risks, which reveals the value of dissecting decision mechanics.  Perin’s observations about group and individual status and credibility, and their implications for dissent and information sharing, directly parallel Kahneman’s focus on the need to legitimize dissent.  We hope some of this thinking ultimately overcomes the current bias in nuclear organizations to reflexively turn to surveys and the inevitable retraining in safety culture principles.
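As a purely hypothetical illustration of what “scoring” a decision might involve (the criteria, weights and scores below are invented for this sketch and are not our published method), a decision can be rated against a few safety-culture-relevant criteria and rolled up into a weighted grade:

```python
# Hypothetical decision-grading rubric; criteria, weights, and scores are invented
# for illustration and do not represent the blog's actual scoring method.

criteria = {
    # criterion: (weight, reviewer score on a 1-5 scale)
    "safety_significance_fully_evaluated":  (0.35, 2),
    "dissenting_views_sought_and_heard":    (0.25, 3),
    "cost_schedule_pressure_made_explicit": (0.20, 2),
    "decision_basis_documented":            (0.20, 4),
}

weighted_grade = sum(weight * score for weight, score in criteria.values())
max_grade = 5 * sum(weight for weight, _ in criteria.values())

print(f"Weighted decision grade: {weighted_grade:.2f} out of {max_grade:.1f}")
for name, (weight, score) in criteria.items():
    print(f"  {name:38s} weight={weight:.2f} score={score}")
```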


*  "Daniel Kahneman on behavioral economics," McKinsey Quarterly video interview (May 2008).

** M. Popova, "The Anti-Gladwell: Kahneman's New Way to Think About Thinking," The Atlantic website (Nov. 1, 2011).

*** A. Smith, "Nuke plant inspections proceeding as planned," Athens [Ala.] News Courier website (Nov. 2, 2011).

Monday, September 12, 2011

Understanding the Risks in Managing Risks

Our recent blog posts have discussed the work of anthropologist Constance Perin.  This post looks at her book, Shouldering Risks: The Culture of Control in the Nuclear Power Industry.*  The book presents four lengthy case studies of incidents at three nuclear power plants and Perin’s analysis, which aims to explain the cultural attributes that facilitated the incidents’ occurrence or their unfavorable evolution.

Because they fit nicely with our interest in decision-making, this post will focus on the two case studies that concerned hardware issues.**  The first case involved a leaking, unisolable valve in the reactor coolant system (RCS) that needed repacking, a routine job.  The mechanics put the valve on its backseat, opened it, observed the packing moving up (indicating that the water pressure was too high or the backseat step hadn't worked), and closed it up.  After management meetings to review the situation, the mechanics tried again, packing came out, and the leak became more serious.  The valve stem and disc had separated, a fact that was belatedly recognized.  The leak was eventually sufficiently controlled so the plant could wait until the next outage to repair/replace the valve.  

The second case involved a switchyard transformer that exhibited a hot spot during a thermography examination.  Managers initially thought they had a circulating current issue, a common problem.  After additional investigations, including people climbing ladders alongside the transformer, a cover bolt was removed and the employee saw a glow inside the transformer, the result of a major short.  Transformers can, and have, exploded from such thermal stresses but the plant was able to safely shut down to repair/replace the transformer.

In both cases, there was at least one individual who knew (or strongly suspected) that something more serious was wrong from the get-go but was unable to get the rest of the organization to accept a more serious, i.e., costly, diagnosis.

Why were the plant organizations so willing, even eager, to assume the more conventional explanations for the problems they were seeing?  Perin provides a multidimensional framework that helps answer that question.

The first dimension is the tradeoff quandary, the ubiquitous tension between production and cost, including costs associated with safety.  Plant organizations are expected to be making electricity, at a budgeted cost, and that subtle (or not-so-subtle) pressure colors the discussion of any problem.  There is usually a preference for a problem explanation and corrective action that allows the plant to continue running.

Three control logics constitute a second dimension.  The calculated logics are the theory of how a plant is (or should be) designed, built, and operated.  The real-time logics consist of the knowledge of how things actually work in practice.  Policy logics come from above, and represent generalized guidelines or rules for behavior, including decision-making.  An “answer” that comes from calculated or policy logic will be preferred over one that comes from real-time logic, partly because the former have been developed by higher-status groups and partly because such answers are more defensible to corporate bosses and regulators.

Finally, traditional notions of group and individual status and a key status property, credibility, populate a third dimension: design engineers over operators over system engineers over maintenance over others; managers over individual contributors; old-timers over newcomers.  Perin creates a construct of the various "orders"*** in a plant organization, i.e., specialist groups such as operators or system engineers.  Each order has its own worldview, values and logics – optimum conditions for nurturing organizational silos.  Information and work flows are mediated among different orders via plant-wide programs (themselves products of calculated and policy logics).
 
Application to Cases

The aforementioned considerations can be applied to the two cases.  Because the valve was part of the RCS, it should have been subject to more detailed planning, including additional risk analysis and contingency preparation.  This was pointed out by a new-to-his-job work planner who was basically ignored because of his newcomer status.  And before the work was started, the system engineer (SE) observed that this type of valve (which had a problem history at this plant and elsewhere) was prone to disc/stem separation and this particular valve appeared to have the problem based on his visual inspection (it had one less thread visible than other similar valves).  But the SE did not make his observations forcefully and/or officially (by initiating a CR), so his (accurate) observation was not factored into the early decision-making.  Ultimately, their concerns did not sway the overall discussion, where the schedule was the highest priority.  A radiographic examination that would have shown the disc/stem separation was not performed early on because that was an Engineering responsibility and the valve repair was a Maintenance project.

The transformer is on the non-nuclear side of the plant, which makes the attitudes toward it less focused and critical than for safety-related equipment.  The hot spot was discovered by a tech who was working with a couple of thermography consultants.  Thermography was a relatively new technology at this plant and not well-understood by plant managers (or trusted because early applications had given false alarms).  The tech said that the patterns he observed were not typical for circulating currents but neither he nor the consultants (the three people on-site who understood thermography) were in the meetings where the problem was discussed.  The circulating current theory was popular because (a) the plant had experienced such problems in the past and (b) addressing it could be done without shutting down the plant.  Production pressure, the nature of past problems, and the lower status of roles and equipment that are not safety related all acted to suppress the emergent new knowledge of what the problem actually was.  

Lessons Learned

Perin’s analytic constructs are complicated and not light reading.  However, the interviews in the case studies are easy to read and very revealing.  It will come as no surprise to people with consulting backgrounds that the interviewees were capable of significant introspection.  In the harsh light of hindsight, lots of folks can see what should (and could) have happened.  

The big question is what did those organizations learn?  Will they make the same mistakes again?  Probably not.  But will they misinterpret future weak or ambiguous signals of a different nascent problem?  That’s still likely.  “Conventional wisdom” codified in various logics and orders and guided by a production imperative remains a strong force working against the open discussion of alternative explanations for new experiences, especially when problem information is incomplete or fuzzy.  As Bob Cudlin noted in his August 17, 2011 post: [When dealing with risk-imbued issues] “the intrinsic uncertainties in significance determination open the door to the influence of other factors - namely those ever present considerations of cost, schedule, plant availability, and even more personal interests, such as incentive programs and career advancement.”

   
*  C. Perin, Shouldering Risks: The Culture of Control in the Nuclear Power Industry, (Princeton, NJ: Princeton University Press, 2005).

**  The case studies and Perin’s analysis have been greatly summarized for this blog post.

***  The “orders” include outsiders such as NRC, INPO or corporate overseers.  Although this may not be totally accurate, I picture orders as akin to medieval guilds.

Wednesday, August 17, 2011

Additional Thoughts on Significance Culture

Our previous post introduced the work of Constance Perin,  Visiting Scholar in Anthropology at MIT, including her thesis of “significance culture” in nuclear installations.  Here we expand on the intersection of her thesis with some of our work. 

Perin places primary emphasis on the availability and integration of information to systematize and enhance the determination of risk significance.  This becomes the true organizing principle of nuclear operational safety and supplants the often hazy construct of safety culture.  We agree with the emphasis on more rigorous and informed assessments of risk as an organizing principle and focus for the entire organization. 

Perin observes: “Significance culture arises out of a knowledge-using and knowledge-creating paradigm. Its effectiveness depends less on “management emphasis” and “personnel attitudes” than on having an operational philosophy represented in goals, policies, priorities, and actions organized around effectively characterizing questionable conditions before they can escalate risk.” (Significance Culture, p. 3)*

We found a similar thought from Kenneth Brawn on a recent LinkedIn post under the Nuclear Safety Group.  He states, “Decision making, and hence leadership, is based on accurate data collection that is orchestrated, focused, real time and presented in a structured fashion for a defined audience….Managers make decisions based on stakeholder needs – the problem is that risk is not adequately considered because not enough time is taken (given) to gather and orchestrate the necessary data to provide structured information for the real time circumstances.” ** 

While seeing the potential unifying force of significance culture, we are also mindful that such determinations often are made under a cloak of precision that is not warranted or routinely achievable.  Such analyses are complex, uncertain, and subject to considerable judgment by the involved analysts and decision makers.  In other words, they are inherently fuzzy.  This limitation can only be partly remedied through better availability of information.  Nuclear safety does not generally include “bright lines” of acceptable or unacceptable risks, or finely drawn increments of risk.  Sure, PRA analyses and other “risk-informed” approaches provide the illusion of quantitative precision, and often provide useful insight for devising courses of action that do not pose “undue risk” to public safety.  But one does not have to read too many Licensee Event Reports (LERs) to see that risk determinations are ultimately shades of gray.  For one example, see the background information on our decision scoring example involving a pipe leak in a 30” moderate energy piping elbow and interim repair.  The technical justification for the interim fix included terms such as “postulated”, “best estimate” and “based on the assumption”.  A full reading of the LER makes clear that the risk determination involved considerable qualitative judgment by the licensee in making its case and the NRC in approving the interim measure.  That said, the NRC’s justification also rested in large part on a finding of “hardship or unusual difficulty” if a code repair were to be required immediately.
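To make the “shades of gray” point concrete, here is a minimal, hypothetical sketch: a point estimate of incremental risk can sit below a screening threshold while a sizable share of its uncertainty band sits above it.  The threshold, point estimate and spread below are invented and have nothing to do with the LER discussed above.

```python
# Hypothetical illustration of fuzzy risk significance; every number is invented.
# A point estimate can fall below a screening threshold while a sizable fraction
# of the analyst's uncertainty distribution falls above it.
import random

random.seed(1)
THRESHOLD = 1.0e-6       # assumed screening threshold (per reactor-year)
POINT_ESTIMATE = 6.0e-7  # assumed "best estimate" of the incremental risk

# Represent analyst uncertainty as a lognormal spread around the point estimate.
samples = [POINT_ESTIMATE * random.lognormvariate(0, 0.8) for _ in range(10_000)]
fraction_above = sum(s > THRESHOLD for s in samples) / len(samples)

print(f"Point estimate below threshold: {POINT_ESTIMATE < THRESHOLD}")
print(f"Fraction of uncertainty samples above threshold: {fraction_above:.0%}")
```

The decision then rests on judgment about how much of that spread to credit, which is exactly where cost, schedule and other priorities can creep in.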

Where is this leading us?  Are poor safety decisions the result of a lack of quality information?  Perhaps.  However, another scenario, at least equally likely, is that the appropriate risk information may not be pursued vigorously or the information may be interpreted in the light most favorable to the organization’s other priorities.  We believe that the intrinsic uncertainties in significance determination open the door to the influence of other factors - namely those ever present considerations of cost, schedule, plant availability, and even more personal interests, such as incentive programs and career advancement.  Where significance is fuzzy, it invites rationalization in the determination of risk and marginalization of the intrinsic uncertainties.  Thus a desired decision outcome could encourage tailoring the risk determination to achieve the desired fit.  It may mean that Perin’s focus on “effectively characterizing questionable conditions” must also account for the presence and potential influence of non-safety factors as part of the knowledge paradigm.

This brings us back to Perin’s ideas for how to pull the string and dig deeper into this subject.  She finds, “Condition reports and event reviews document not only material issues. Uniquely, they also document systemic interactions among people, priorities, and equipment — feedback not otherwise available.” (Significance Culture, p. 5)  This emphasis makes a lot of sense, and in her book, Shouldering Risks: The Culture of Control in the Nuclear Power Industry, she takes up the challenge of delving into the depths of a series of actual condition reports.  Stay tuned for our review of the book in a subsequent post.


*  C. Perin, “Significance Culture in Nuclear Installations,” a paper presented at the 2005 Annual Meeting of the American Nuclear Society (June 6, 2005).

**  You may be asked to join the LinkedIn Nuclear Safety group to view Mr. Brawn's comment and the discussion of which it is part.

Friday, August 12, 2011

An Anthropologist’s View

Academics in many disciplines study safety culture.  This post introduces to this blog the work of an MIT anthropologist, Constance Perin, and discusses a paper* she presented at the 2005 ANS annual meeting.

We picked a couple of the paper’s key recommendations to share with you.  First, Perin’s main point is to advocate the development of a “significance culture” in nuclear power plant organizations.  The idea is to organize knowledge and data in a manner that allows an organization to determine significance with respect to safety issues.  The objective is to increase an organization’s capabilities to recognize and evaluate questionable conditions before they can escalate risk.  We generally agree with this aim.  The real nub of safety culture effectiveness is how it shapes the way an organization responds to new or changing situations.

Perin understands that significance evaluation already occurs in both formal processes (e.g., NRC evaluations and PRAs) and in the more informal world of operational decisions, where trade-offs, negotiations, and satisficing behavior may be more dynamic and less likely to be completely rational.  She recommends that significance evaluation be ascribed a higher importance, i.e., be more formally and widely ingrained in the overall plant culture, and used as an organizing principle for defining knowledge-creating processes. 

Second, because of the importance of a plant's Corrective Action Program (CAP), Perin proposes making NRC assessment of the CAP the “eighth cornerstone” of the Reactor Oversight Process (ROP).  She criticizes the NRC’s categorization of cross-cutting issues for not being subjected to specific criteria and performance indicators.  We have a somewhat different view.  Perin’s analysis does not acknowledge that the industry places great emphasis on each of the cross-cutting issues in terms of performance indicators and monitoring, including self-assessment.**  The same is true of the other cornerstones, where plants use many more indicators to track and trend performance than the few included in the ROP.  In our opinion, a real problem with the ROP is that its few indicators do not provide a reliable or forward-looking picture of nuclear safety.

The fault line in the CAP itself may be better characterized as a lack of measurement and assessment of how well the CAP functions to sustain a strong safety culture.  Importantly, such an approach would evaluate whether decisions on conditions adverse to quality not only properly assessed significance but also balanced the influence of any competing priorities.  Perin also recognizes that competing priorities exist, especially in the operational world, but making the CAP a cornerstone might actually lead to increased false confidence in the CAP if its relationship with safety culture were left unexamined.

Prof. Perin has also written a book, Shouldering Risks: The Culture of Control in the Nuclear Power Industry,*** which is an ethnographic analysis of nuclear organizations and specific events they experienced.  We will be reviewing this book in a future post.  We hope that her detailed drill down on those events will yield some interesting insights, e.g., how different parts of an organization looked at the same situation but had differing evaluations of its risk implications.

We have to admit Prof. Perin wasn’t on our radar screen; she alerted us to her work.  Based on our limited review to date, we think we share similar perspectives on the challenges involved in attaining and maintaining a robust safety culture.


*  C. Perin, “Significance Culture in Nuclear Installations,” a paper presented at the 2005 Annual Meeting of the American Nuclear Society (June 6, 2005).

** The issue may be one of timing.  Prof. Perin based her CAP recommendation, in part, on a 2001 study that suggested licensees’ self-regulation might be inadequate.  We have the benefit of a more contemporary view.  

*** C. Perin, Shouldering Risks: The Culture of Control in the Nuclear Power Industry, (Princeton, NJ: Princeton University Press, 2005).