Sunday, February 10, 2013

Safety Culture - Lessons from the Social Science Literature

In 2011 the NRC contracted with the Pacific Northwest National Laboratory to conduct a review of social science literature related to safety culture (SC) and methods for evaluating interventions proposed to address issues identified during SC assessments.  The resultant report* describes how traits such as leadership, trust, respect, accountability, and continuous learning are discussed in the literature. 

The report is heavily academic but not impenetrable, and it is a good reference work on organizational culture theory and research.  I stumbled on this report in ADAMS and don't know why it hasn't had wider distribution.  Perhaps it's seen as too complicated or, more importantly, doesn't exactly square with the NRC/NEI/industry Weltanschauung when the authors say things like:

“There is no simple recipe for developing safety culture interventions or for assessing the likelihood that these interventions will have the desired effects.” (p. 2)

“The literature consistently emphasizes that effecting directed behavioral, cognitive, or cultural change in adults and within established organizations is challenging and difficult, requires persistence and energy, and is frequently unsuccessful.” (p. 7)

This report contains an extensive review of the literature and is impossible to summarize in a blog post.  We'll provide an overview of the content, focusing on interesting quotes and highlights, then revisit Schein's model and close with our two cents' worth.

Concept of safety culture

This section begins with the definition of SC and the nine associated traits in the NRC SC policy statement, and compares them with other organizations' (IAEA, NEI, DOE et al.) efforts.

The Schein model is proposed as a way to understand “why things are as they are” as a starting point upon which to build change strategies aimed at improving organizational performance.  An alternative approach is to define the characteristics of an ideal SC, then evaluate how much the target organization differs from the ideal, and use closing the gap as the objective for corrective strategies.  The NEI approach to SC assessment reflects the second conceptual model.  A third approach, said to bridge the difference between the first two, is proposed by holistic thinkers such as Reason who focus on overall organizational culture. 

This is not the usual “distinction without a difference” argument that academics often wage.  Schein's objective is to improve organizational performance; the idealists' objective is to make an organization correspond to the ideal model with an assumption that desired performance will follow. 

The authors eventually settle on the high reliability organization (HRO) literature as providing the best basis for linking individual and organizational assumptions with traits and mechanisms for affecting safety performance.  Why?  The authors say the HRO approach identifies some of the specific mechanisms that link elements of a culture to safety outcomes and identifies important relationships among the cultural elements. (p. 15)  A contrary explanation is that the authors wanted to finesse their observation that Schein (beloved by NRC) and NEI have different views of the basis that should be used for designing SC improvement initiatives.

Building blocks of culture 


The authors review the “building blocks” of culture, highlighting areas that correspond to the NRC safety culture traits.  If an organization wants to change its culture, it needs to decide which building blocks to address and how to make and sustain changes.

Organizational characteristics that correspond to NRC SC traits include leadership, communication, work processes, and problem identification and resolution.  Leadership and communication are recognized as important in the literature and are discussed at length.  However, the literature review offered thin gruel in the areas of work processes, and problem identification and resolution; in other words, the connections between these traits and SC are not well-defined. (pp. 20-25)

There is an extensive discussion of other building blocks including perceptions, values, attitudes, norms**, beliefs, motivations, trust, accountability and respect.  Implications for SC assessment and interventions are described, where available.  Adaptive processes such as sense making and double-loop learning are also mentioned.

Change and change management

The authors review theories of individual and organizational change and change management.  They note that planned interventions need to consider other changes that may be occurring because of dynamic processes between the organization and its environment and within the organization itself.

Many different models for understanding and effecting organizational change are described.  As the authors summarize: “. . . change is variously seen as either pushed by problems or pulled by visions or goals; as purposive and volitional or inadvertent and emergent; as a one-time event or a continuous process. It is never seen as easy or simple.” (p. 43)

The authors favor Montaño and Kasprzyk’s Integrated Behavioral Model, shown in the figure below, as a template for designing and evaluating SC interventions.  It may be hard to read here but suffice it to say a lot of factors go into an individual's decision to perform a new behavior and most or all of these factors should be considered by architects of SC interventions.  Leadership can provide input to many of these factors (through communication, modeling desired behavior, including decision making) and thus facilitate (or impede) desired behavioral changes.



[Figure: Integrated Behavioral Model.  From Montaño and Kasprzyk]

Resistance to change can be widespread.  Effective leadership is critical to overcoming resistance and implementing successful cultural changes.  “. . . leaders in formal organizations have the power and responsibility to set strategy and direction, align people and resources, motivate and inspire people, and ensure that problems are identified and solved in a timely manner.” (p. 54)

Lessons from initiatives to create other specific organizational cultures

The authors review the literature on learning organizations, total quality management and quality organizations, and sustainable organizations for lessons applicable to SC initiatives.  They observe that this literature “is quite consistent in emphasizing the importance of recognizing that organizations are multi-level, dynamic systems whose elements are related in complex and multi-faceted ways, and that culture mirrors this dynamic complexity, despite its role in socializing individuals, maintaining stability, and resisting change.” (p. 61)

“The studies conducted on learning, quality, and sustainable organizations and their corresponding cultures contain some badly needed information about the relationship among various traits, organizational characteristics, and behaviors that could help inform the assessment of safety cultures and the design and evaluation of interventions.” (p. 65)  Topics mentioned include management leadership and commitment, trust, respect, shared vision and goals, and a supportive learning environment.

Designing and evaluating targeted interventions 


This section emphasizes the potential value of the evaluation science*** approach (used primarily in health care) for the nuclear industry.  The authors go through the specific steps for implementing the evaluation science model, drilling down in spots to describe additional tools, such as logic modeling (to organize and visualize issues, interventions and expected outcomes), that can be used.  There is a lot of detail here including suggestions for how the NRC might use backward mapping and a review of licensee logic models to evaluate SC assessment and intervention efforts.  Before anyone runs off to implement this approach, there is a major caveat:

“The literature on the design, implementation, and evaluation of interventions to address identified shortcomings in an organization’s safety culture is sparse; there is more focus on creating a safety culture than on intervening to correct identified problems.” (p. 67)

Relation to Schein

Schein's model of culture (shown on p. 8) and prescriptions for interventions are the construct most widely known to the nuclear industry and its SC practitioners.  His work is mentioned throughout the PNNL report.  Schein assumes that cultural change is a top-down effort (so leadership plays a key role) focused on individuals.  Change is implemented using an unfreeze—replace/move—refreeze strategy.  Schein's model is recommended in the program theory-driven evaluation science approach.  The authors believe Schein's “description of organizational culture and change does one of the best jobs of conveying the “cultural” dimensions in a way that conveys its embeddedness and complexity.” (p. 108)  The authors note that Schein's cultural levels interact in complex ways, requiring a systems approach that relates the levels to each other, SC to the larger organizational culture, and culture to overall organizational functioning.

So if you're acquainted with Schein you've got solid underpinnings for reading this report even if you've never heard of any of the over 300 principal authors (plus public agencies and private entities) mentioned therein.  If you want an introduction to Schein, we have posted on his work here and here.

Conclusion

This is a comprehensive and generally readable reference work.  SC practitioners should read the executive summary and skim the rest to get a feel for the incredible number of theorists, researchers and institutions who are interested in organizational culture in general and/or SC in particular.  The report will tell you what a culture consists of and how you might go about changing it.

We have a few quibbles.  For example, there are many references to systems but very little to what we call systems thinking (an exception is Senge's mention of systems thinking on p. 58 and systems approach on p. 59).  There is no recognition of the importance of feedback loops.

The report refers multiple times to the dynamic interaction of the factors that comprise a SC but does not provide any model of those interactions.  There is limited connectivity between potentially successful interventions and desired changes in observable artifacts.  In other words, this literature review will not tell you how to improve your plant's decision making process or corrective action program, resolve goal conflicts or competing priorities, align management incentives with safety performance, or reduce your backlogs.


*  K.M. Branch and J.L. Olson, “Review of the Literature Pertinent to the Evaluation of Safety Culture Interventions” (Richland, WA: Pacific Northwest National Laboratory, Dec. 2011).  ADAMS ML13023A054

**  The authors note “The NRC safety culture traits could also be characterized as social norms.” (p. 28)

***  “. . . evaluation science focuses on helping stakeholders diagnose organization and social needs, design interventions, monitor intervention implementation, and design and implement an evaluation process to measure and assess the intended and unintended consequences that result as the intervention is implemented.” (p. 69)

Wednesday, January 30, 2013

Talking Sheep at Palisades

In Lewis Carroll’s Through the Looking Glass, Alice and the White Queen advance into the chessboard's fifth rank by crossing over a brook together, but at the very moment of the crossing, the Queen transforms into a talking sheep.  Alice soon finds herself struggling to handle the oars of a small rowboat, where the Sheep annoys her with nonsensical shouting.  Now consider the NRC’s Nov. 9, 2012 followup inspection report* at Palisades related to the DC panel event and the Service Water pump coupling failure.  It brings to mind a similar picture - in this case inspectors struggling to propel a small rowboat of substance on a river of nonsensical jargon and bureaucratese.

Reading this inspection report (IR) reveals endless repetition of process details and findings of other reports, and astonishingly little substance or basis for the inspectors' current findings and conclusions.  The IR “assesses” the findings of the Palisades root cause analysis and associated extent of condition and corrective actions.  The discussion is deeply ingrained with yellow findings, white findings, crosscutting this and cornerstone that, a liberal dose of safety culture traits and lots of significance determinations.  Frankly it’s hard to even remember what started the whole thing.  Perhaps of most interest, the IR notes that much of the Palisades management team was replaced in the period since these two events. (p. 23)  Why?  Were they deemed incompetent?  Unwilling to implement appropriate risk and safety priorities?  Or just sacrificial lambs (more sheep)?  It appears that these changes carried significant weight with the NRC inspectors, although this is not specifically stated.

Then there is this set of observations:

“During interviews the inspectors heard that there were concerns about staffing levels in multiple departments, but the site was aware and was actively working with Entergy corporate management to post and fill positions. . . Entergy Corporate was perceived by many on the site to be stifling progress in filling positions.  The many issues at Palisades and staffing problems have contributed to the organization becoming more reactive to addressing maintenance and equipment reliability issues versus being proactive in addressing possible problems.” (p. 23)

Which is it?  The site was actively working with Entergy or Entergy was stifling progress in filling positions?  Without further amplification or justification the IR delivers its conclusion: “The inspection team concluded the safety culture was adequate and improving.” (p. 24, emphasis added)  There is no discussion of how or on what basis the inspectors reached this conclusion.  In particular the finding of “improving” is hard to understand as it does not appear that this inspection team had previously assessed the safety culture at the site.

At one point the IR stumbles into a revealing and substantive issue that could provide significant insight into the problems at Palisades.  It describes another event at the plant with a lot of similarities to the DC panel. 

“The inspection team focused inspection efforts on ... an occurrence when, on May 14, 2012, workers erroneously placed a wire jumper between 115 Volt AC and 125 Volt DC circuits ...many of the actions and behaviors exhibited by the workers involved were similar in nature to the loss of DC bus event that occurred in September 2011...Those similar behaviors included the lack of a pre-job brief and discussion regarding the limitations of the work scope, workers taking action outside of the scope allowed by ‘toolpouch maintenance,’ supervisors failing to adequately challenge the workers, and workers proceeding in the face of uncertainty when unexpected conditions arose.” (p. 21)

So far so good.

“Many of the supervisors and managers the inspection team interviewed stated that the May 2012 near-miss was not a repeat event of the September 2011 event because the May 2012 near-miss involved only a handful of individuals, whereas the September 2011 occurrence involved multiple individuals across multiple organizations at Palisades. The inspectors agreed that the May 2012 near-miss involved fewer individuals, but there were individuals from several organizations involved in the near-miss. The inspectors concluded that the RCE assessment was narrow in that it stated only the field work team failed to internalize the cause and corrective actions from the September 2011 DC bus event. The inspectors concluded that other individuals, including the WCC SRO, CRS, and a non-licensed plant operator also exhibited behaviors similar to those of the September 2011 DC bus event.” (p. 21)

Still good but starting to wonder if the Palisades supervisors and managers really got the lessons learned from September 2011.

“The inspectors determined that, while the May 2012 near-miss shared some commonalities with the September 2011 event, the two conditions were not the result of the same basic causes. The inspectors reached this conclusion because the May 2012 near-miss did not result in a significant plant transient [emphasis added] and also did not exhibit the same site wide, organizational breakdowns in risk recognition and management that led to the September 2011 event.” (pp. 21-22)

Whoops.  First, what is the relevance of the outcome of the May 2012 event?  Why is it being alluded to as a cause?  Are the inspectors saying that if in September 2011 the Palisades personnel took exactly the actions they took but had the good fortune not to let the breaker stab slip it would not be a significant safety event?  

With regard to the extent of organizational breakdown, in the prior paragraph the inspectors had pushed back on this rationale - but now conclude the May 2012 event is different because it was not “site-wide”.  It is not clear how you square these arguments, particularly if one goes back to the original root cause of the DC panel event:

“...senior leaders had not established a sufficiently sensitive culture of risk recognition and management, which resulted in the plant’s managers, supervisors, and workers not recognizing, accounting for, or preparing for the industrial safety risk and plant operational nuclear risk…” (p. 1) and, quoting from the licensee root cause analysis “site leadership at all levels was not sufficiently intrusive into work on panel ED-11-2.” (p. 13)

It is hard to see how the May 2012 event didn’t exhibit these same causes.  In addition, the “Why Staircase” in the Palisades root cause analysis (p. 21) does not identify or allude to the extent of involvement of multiple organizations - at all.  While we do not believe that such linear, “why” thinking is adequate for a complex system, it is the basis for what Palisades found and what the NRC inspectors accepted.

We’re not really sure what to make of this inspection effort.  On its face it doesn’t provide much of a basis for its conclusion that the safety culture is adequate and improving.  Perhaps the real basis is the new management team?  Or perhaps the NRC doesn’t really have many options in this situation.  If the current inspection found the weaknesses not to have been resolved, what could the NRC do?  Is there such a thing as an “inadequate” safety culture?  Or just a safety culture that needs improvement?  It seems the NRC’s safety culture construct has created a Looking Glass-like inversion of reality - maybe a convenient trope within the agency but increasingly a baffling and unsatisfying distraction from achieving competent nuclear safety management.

Bottom line:  The NRC close out inspection is a baaaad report.


*  S. West (NRC) to A. Vitale (Entergy), “Palisades Nuclear Plant - NRC Supplemental Inspection Report 05000255/2012011; and Assessment Follow-up Letter” (Nov. 9, 2012) ADAMS ML12314A304.

Friday, January 25, 2013

Safety Culture Assessments: the Vit Plant vs. Other DOE Facilities

The Vit Plant

As you recall, the Defense Nuclear Facilities Safety Board (DNFSB) set off a little war with DOE when DNFSB published its blistering June 2011 critique* of the Hanford Waste Treatment Plant's (Vit Plant) safety culture (SC).  Memos were fired back and forth but eventually things settled down.  One of DOE's resultant commitments was to assess SC at other DOE facilities to see if SC concerns identified at the Vit Plant were also evident elsewhere.  Last month DOE transmitted the results of five assessments to DNFSB.**  The following facilities were evaluated:

• Los Alamos National Laboratory Chemistry and Metallurgy Research Replacement Project (Los Alamos)
• Y-12 National Security Complex Uranium Processing Facility Project (UPF)
• Idaho Cleanup Project Sodium Bearing Waste Treatment Project (Idaho)
• Office of Environmental Management Headquarters (EM)
• Pantex Plant
 


The same protocol was used for each of the assessments: DOE's Health, Safety and Security organization formed a team of its own assessors and two outside experts from the Human Performance Analysis Corporation (HPA).  Multiple data collection tools, including functional analysis, semi-structured focus group and individual interviews, observations and behavioral anchored rating scales, were used to assess organizational behaviors.  The external experts also conducted a SC survey at each site.

A stand-alone report was prepared for each facility, consisting of a summary and recommendation (ca. 5 pages) and the outside experts' report (ca. 25 pages).  The outside experts organized their observations and findings along the nine SC traits identified by the NRC, viz.,

• Leadership Safety Values and Actions
• Problem Identification and Resolution
• Personal Accountability
• Work Processes
• Continuous Learning
• Environment for Raising Concerns
• Effective Safety Communication
• Respectful Work Environment
• Questioning Attitude.

So, do Vit Plant SC concerns exist elsewhere?

That's up to the reader to determine.  The DOE submittal contained no meta-analysis of the five assessments, and no comparison to Vit Plant concerns.  As far as I can tell, the individual assessments made no attempt to focus on whether or not Vit Plant concerns existed at the reviewed facilities.

However, my back-of-the-envelope analysis (no statistics, lots of inference) of the reports suggests there are some Vit Plant issues that exist elsewhere but not to the degree that riled the DNFSB when it looked at the Vit Plant.  I made no effort to distinguish between issues mentioned by federal versus contractor employees, or by different contractors.  Following are the major Vit Plant concerns, distilled from the June 2011 DNFSB letter, and their significance at other facilities.

Schedule and/or budget pressure that can lead to suppressed issues or safety short-cuts
 

This is the most widespread and frequently mentioned concern.  It appears to be a significant issue at the UPF where the experts say “the project is being driven . . . by a production mentality.”  Excessive focus on financial incentives was also raised at UPF.  Some Los Alamos interviewees reported schedule pressure.  So did some folks at Idaho but others said safety was not compromised to make schedule; financial incentives were also mentioned there.  At EM, there were fewer comments on schedule pressure and at Pantex, interviewees opined that management shielded employees from pressure and tried to balance the message that both safety and production are important.

A chilled atmosphere adverse to safety exists

The atmosphere is cool at some other facilities, but it's hard to say the temperature is actually chilly.  There were some examples of perceived retaliation at Los Alamos and Pantex.  (Two Pantex employees reported retaliation for raising a safety concern; that's why Pantex, which was not on the original list of facilities for SC evaluation, was included.)  Fear of retaliation, but not actual examples, was reported at UPF and EM.  Fear of retaliation was also reported at Pantex. 

Technical dissent is suppressed

This is a minor issue.  There were some negative perceptions of the differing professional opinion (DPO) process at Los Alamos.  Some interviewees thought the DPO process at EM could be better utilized.  The experts said DPO needed to be better promoted at Pantex. 

Processes for raising and resolving SC-related questions exist but are neither trusted nor used

Another minor issue.  The experts said the procedures at Los Alamos should be reevaluated and enforced.

Conclusion

I did not read every word of this 155-page report but it appears some facilities have issues akin to those identified at the Vit Plant, though their scope and/or intensity generally appear to be less.

The DOE submittal is technically responsive to the DNFSB commitment but is not useful without further analysis.  The submittal evidences more foot dragging by DOE to cover up the likely fact that the Vit Plant's SC problems are more significant than other facilities' and buy time to attempt to correct those problems.


* Defense Nuclear Facilities Safety Board, Recommendation 2011-1 to the Secretary of Energy "Safety Culture at the Waste Treatment and Immobilization Plant" (Jun 9, 2011).  We have posted on the DOE-DNFSB imbroglio here, here and here.
   
**  G.S. Podonsky (DOE) to P.S. Winokur (DNFSB), letter transmitting five independent safety culture assessments (Dec. 12, 2012).

Monday, January 21, 2013

May Day

This is another in our series of posts following up the Upper Big Branch coal mine disaster in April 2010. As reported in the Wall Street Journal,* a former superintendent at the Massey Energy mine, Gary May, was sentenced to 21 months in prison for his part in the accident. Specifically May “warned miners that inspectors were coming and ordered subordinates to falsify a record book and disable a methane monitor so workers could keep mining coal.”

The U.S. attorney in charge of the case is basing criminal indictments on a conspiracy that he believes “certainly went beyond Upper Big Branch.” In other words the government is working its way up the food chain at Massey with lower level managers such as May pleading guilty and cooperating with prosecutors. The developments here are worth keeping an eye on as it is relatively rare to see the string pulled so extensively in cases of safety failures at the operating level. The role and influence of senior executives will ultimately come under scrutiny and their culpability determined not on the slogans they promulgated but on their actions.


* K. Maher, “Former Mine Official Sentenced to 21 Months,” Wall Street Journal (Jan. 17, 2013).

Thursday, January 17, 2013

Adm. Hyman Rickover – Systems Thinker

The TMI-2 accident occurred in 1979. In 1983 the plant owner, General Public Utilities Corp. (GPU), received a report* from Adm. Hyman Rickover (the “Father of the Nuclear Navy”) recommending that GPU be permitted by the NRC to restart the undamaged TMI Unit 1 reactor. We are not concerned with the report's details or conclusions but one part caught our attention.

The report begins by describing Rickover's seven principles for successful nuclear operation. One of these principles is the “Concept of Total Responsibility” which he explains as follows: “Operating nuclear plants safely requires adherence to a total concept wherein all elements are recognized as important and each is constantly reinforced. Training, equipment maintenance, technical support, radiological control, and quality control are essential elements but safety is achieved through integrating them effectively in operating decisions.” (p. 9, emphasis added)

We think the foregoing sounds like version 1.0 of points we have been emphasizing in this blog, namely:
  • Performance over time is the result of relationships and interactions among organizational components, in other words, the system is what's important.
  • Decisions are where the rubber meets the road in terms of goals, priorities and resource allocation; the extant safety culture provides a context for decision-making.
  • Safety performance is an emergent organizational property, a result of system activities, and cannot be predicted by examining individual system components.
We salute Adm. Rickover for his prescient insights.


* Adm. H.G. Rickover, “An Assessment of the GPU Nuclear Corporation Organization and Senior Management and Its Competence to Operate TMI-1” (Nov. 19, 1983). Available from Dickinson College library here.

Thursday, January 10, 2013

NRC Non-Regulation of Safety Culture: Fourth Quarter Update

NRC SC Brochure ML113490097

On March 17, July 3 and October 17, 2012 we posted on NRC safety culture (SC) related activities with individual licensees. This post highlights selected NRC actions during the fourth quarter, October through December 2012. We report on this topic to illustrate how the NRC squeezes plants on SC even if the agency is not officially regulating SC.

Prior posts mentioned Browns Ferry, Fort Calhoun and Palisades as plants where the NRC was undertaking significant SC-related activities. It appears none of those plants has resolved its SC issues.

Browns Ferry

An NRC supplemental inspection report* contained the following comment on a licensee root cause analysis: “Inadequate emphasis on the importance of regulatory compliance has contributed to a culture which lacks urgency in the identification and timely resolution of issues associated with non-compliant and potentially non-conforming conditions.” Later, the NRC observes “This culture change initiative [to address the regulatory compliance issue] was reviewed and found to still be in progress. It is a major corrective action associated with the upcoming 95003 inspection and will be evaluated during that inspection.” (Two other inspection reports, both issued November 30, 2012, noted the root cause analyses had appropriately considered SC contributors.)

An NRC-TVA public meeting was held December 5, 2012 to discuss the results of the supplemental inspections.** Browns Ferry management made a presentation to review progress in implementing their Integrated Improvement Plan and indicated they expected to be prepared for the IP 95003 inspection (which will include a review of the plant's third party SC assessment) in the spring of 2013.

Fort Calhoun

SC must be addressed to the NRC’s satisfaction prior to plant restart. The NRC's Oct. 2, 2012 inspection report*** provided details on the problems identified by the Omaha Public Power District (OPPD) in the independent Fort Calhoun SC assessment, including management practices that resulted “. . . in a culture that valued harmony and loyalties over standards, accountability, and performance.”

Fort Calhoun's revision 4 of its improvement plan**** (the first revision issued since Exelon took over management of the plant in September, 2012) reiterates management's previous commitments to establishing a strong SC and, in a closely related area, notes that “The Corrective Action Program is already in place as the primary tool for problem identification and resolution. However, CAP was not fully effective as implemented. A new CAP process has been implemented and root cause analysis on topics such as Condition Report quality continue to create improvement actions.”

OPPD's progress report***** at a Nov. 15, 2012 public meeting with the NRC includes over two dozen specific items related to improving or monitoring SC. However, the NRC restart checklist SC items remain open and the agency will be performing an IP 95003 inspection of Fort Calhoun SC during January-February, 2013.^

Palisades

Palisades is running but still under NRC scrutiny, especially for SC. The Nov. 9, 2012 supplemental inspection report^^ is rife with mentions of SC but eventually says “The inspection team concluded the safety culture was adequate and improving.” However, the plant will be subject to additional inspection efforts in 2013 to “. . . ensure that you [Palisades] are implementing appropriate corrective actions to improve the organization and strengthen the safety culture on site, as well as assessing the sustainability of these actions.”

At an NRC-Entergy public meeting December 11, Entergy's presentation focused on two plant problems (DC bus incident and service water pump failure) and included references to SC as part of the plant's performance recovery plan. The NRC presentation described Palisades SC as “adequate” and “improving.”^^^

Other Plants

NRC supplemental inspections can require licensees to assess “whether any safety culture component caused or significantly contributed to” some performance issue. NRC inspection reports note the extent and adequacy of the licensee’s assessment, often performed as part of a root cause analysis. Plants that had such requirements laid on them or had SC contributions noted in inspection reports during the fourth quarter included Braidwood, North Anna, Perry, Pilgrim, and St. Lucie. Inspection reports that concluded there were no SC contributors to root causes included Kewaunee and Millstone.

Monticello got a shout-out for having a strong SC. On the other hand, the NRC fired a shot across the bow of Prairie Island when the NRC PI&R inspection report included an observation that “. . . while the safety culture was currently adequate, absent sustained long term improvement, workers may eventually lose confidence in the CAP and stop raising issues.”^^^^ In other words, CAP problems are linked to SC problems, a relationship we've been discussing for years.

The NRC perspective and our reaction

Chairman Macfarlane's speech to INPO mentioned SC: “Last, I would like to raise ‘safety culture’ as a cross-cutting regulatory issue. . . . Strengthening and sustaining safety culture remains a top priority at the NRC. . . . Assurance of an effective safety culture must underlie every operational and regulatory consideration at nuclear facilities in the U.S. and worldwide.”^^^^^

The NRC claims it doesn't regulate SC but isn't “assurance” part of “regulation”? If NRC practices and procedures require licensees to take actions they might not take on their own, don't the NRC's activities pass the duck test (looks like a duck, etc.) and qualify as de facto regulation? To repeat what we've said elsewhere, we don't care if SC is regulated but the agency should do it officially, through the front door, and not by sneaking in the back door.


*  E.F. Guthrie (NRC) to J.W. Shea (TVA), “Browns Ferry Nuclear Plant NRC Supplemental Inspection Report 05000259/2012014, 05000260/2012014, 05000296/2012014” (Nov. 23, 2012) ADAMS ML12331A180.

**  E.F. Guthrie (NRC) to J.W. Shea (TVA), “Public Meeting Summary for Browns Ferry Nuclear Plant, Docket No. 50-259, 260, and 296” (Dec. 18, 2012) ADAMS ML12353A314.

***  M. Hay (NRC) to L.P. Cortopassi (OPPD), “Fort Calhoun - NRC Integrated Inspection Report Number 05000285/2012004” (Oct. 2, 2012) ADAMS ML12276A456.

****  T.W. Simpkin (OPPD) to NRC, “Fort Calhoun Station Integrated Performance Improvement Plan, Rev. 4” (Nov. 1, 2012) ADAMS ML12311A164.

*****  NRC, “Summary of November 15, 2012, Meeting with Omaha Public Power District” (Dec. 3, 2012) ADAMS ML12338A191.

^  M. Hay (NRC) to L.P. Cortopassi (OPPD), “Fort Calhoun Station – Notification of Inspection (NRC Inspection Report 05000285/2013008)” (Dec. 28, 2012) ADAMS ML12363A175.

^^  S. West (NRC) to A. Vitale (Entergy), “Palisades Nuclear Plant - NRC Supplemental Inspection Report 05000255/2012011; and Assessment Follow-up Letter” (Nov. 9, 2012) ADAMS ML12314A304.

^^^  O.W. Gustafson (Entergy) to NRC, Entergy slides to be presented at the December 11, 2012 public meeting (Dec. 7, 2012) ADAMS ML12342A350. NRC slides for the same meeting ADAMS ML12338A107.

^^^^  K. Riemer (NRC) to J.P. Sorensen (NSP), “Prairie Island Nuclear Generating Plant, Units 1 and 2; NRC Biennial Problem Identification and Resolution Inspection Report 05000282/2012007; 05000306/2012007” (Sept. 25, 2012) ADAMS ML12269A253.

^^^^^  A.M. Macfarlane, “Focusing On The NRC Mission: Maintaining Our Commitment to Safety” speech presented at the INPO CEO Conference (Nov. 6, 2012) ADAMS ML12311A496.

Thursday, January 3, 2013

The ETTO Principle: Efficiency-Thoroughness Trade-Off by Erik Hollnagel

This book* was suggested by a regular blog visitor. Below we provide a summary of the book followed by our assessment of how it comports with our understanding of decision making, system dynamics and safety culture.

Hollnagel describes a general principle, the efficiency-thoroughness trade-off (ETTO), that he believes almost all decision makers use. ETTO means that people and organizations routinely make choices between being efficient and being thorough. For example, if demand for production is high, thoroughness (time and other resources spent on planning and implementing an activity) is reduced until production goals are met. Alternatively, if demand for safety is high, efficiency (resources spent on production) is reduced until safety goals are met. (pp. 15, 28) Greater thoroughness is associated with increased safety.

ETTO is used for many reasons, including resource limitations, the need to maintain resource reserves, and social and organizational pressure. (p. 17) In practice, people use shortcuts, heuristics and rationalizations to make their decision-making more efficient. At the individual level, there are many ETTO rules, e.g., “It will be checked later by someone else,” “It has been checked earlier by someone else,” and “It looks like a Y, so it probably is a Y.” At the organizational level, ETTO rules include negative reporting (where the absence of reporting implies that everything is OK), cost reduction imperatives (which increase efficiency at the cost of thoroughness), and double-binds (where the explicit policy is “safety first” but the implicit policy is “production takes precedence when goal conflicts arise”). The use of any of these rules can lead to a compromise of safety. (pp. 35-36, 38-39) As decision makers ETTO, individual and organizational performance varies. Most of the time, things work out all right but sometimes failures occur. 

How do failures occur? 

Failures can happen when people, going about their work activities in a normal manner, create a series of ETTOs that ultimately result in unacceptable performance. These situations are more likely to occur the more complex and closely coupled the work system is. The best example (greatly simplified in the following) is an accident victim who arrived at an ER just before shift change on a Friday night. Doctor A examined her, ordered a head scan and X-rays, and communicated with the surgery, ICU and radiology residents and with her own relief, Doctor B; Doctor B transferred the patient to the ICU, with care to be provided by the ICU and surgery residents; these residents and other doctors and staff provided care over the weekend. The major error was that everyone thought somebody else would read the patient's X-rays and make the correct diagnosis or, in the case of the radiology doctors, did not carefully review the X-rays. On Monday, the rad tech who had taken the X-rays on Friday (and noticed an injury) asked the orthopedics resident about the patient; this resident had not heard of the case. Subsequent examination revealed that the patient had, along with her other injuries, a dislocated hip. (pp. 110-113) The book is populated with many other examples. 

Relation to other theorists 

Hollnagel refers to sociologist Charles Perrow, who believes some errors or accidents are unavoidable in complex, closely-coupled socio-technical organizations.** While Perrow used the term “interactiveness” (familiar vs unfamiliar) to grade complexity, Hollnagel updates it with “tractability” (knowable vs unknowable) to reflect his belief that in contemporary complex socio-technical systems, some of the relationships among internal variables and between variables and outputs are not simply “not yet specified” but “not specifiable.”

Both Hollnagel and Sidney Dekker identify with a type of organizational analysis called Resilience Engineering, which holds that complex organizations must be designed to safely adapt to environmental pressure and recover from inevitable performance excursions outside the zone of tolerance. Both authors reject the linear, deconstructionist approach of fault-finding after incidents or accidents, the search for human error or the broken part. 

Assessment 

Hollnagel is a psychologist so he starts with the individual and then extends the ETTO principle to consider group or organizational behavior, finally extending it to the complex socio-technical system. He notes that such a system interacts with, attempts to control, and adapts to its environment, ETTOing all the while. System evolution is a strength but also makes the system more intractable, i.e., less knowable, and more likely to experience unpredictable performance variations. He builds on Perrow in this area but neither is a systems guy and, quite frankly, I'm not convinced either understands how complex systems actually work.

I am ambivalent about Hollnagel's thesis. Has he provided a new insight into decision making as practiced by real people, or has he merely updated terminology from earlier work (most notably, Herbert Simon's “satisficing”) that revealed that the “rational man” of classical economic theory really doesn't exist? At best, Hollnagel has given a name to a practice we've all seen and used, and that is of some value in itself.

It's clear ETTO (or something else) can lead to failures in a professional bureaucracy, such as a hospital. ETTO is probably less obvious in a nuclear operating organization where “work to the procedure” is the rule and if a work procedure is wrong, then there's an administrative procedure to correct the work procedure. Work coordination and hand-offs between departments exhibit at least nominal thoroughness. But there is still plenty of room for decision-making shortcuts, e.g., biases based on individual experience, groupthink and, yes, culture. Does a strong nuclear safety culture allow or tolerate ETTO? Of course. Otherwise, work, especially managerial or professional work, would not get done. But a strong safety culture paints brighter, tighter lines around performance expectations so decision makers are more likely to be aware when their expedient approaches may be using up safety margin.

Finally, Hollnagel's writing occasionally uses strained logic to “prove” specific points, the book needs a better copy editor, and my deepest suspicion is he is really a peripatetic academic trying to build a career on a relatively shallow intellectual construct.


* E. Hollnagel, The ETTO Principle: Efficiency-Thoroughness Trade-Off (Burlington, VT: Ashgate, 2009).

** C. Perrow, Normal Accidents: Living with High-Risk Technologies (New York: Basic Books, 1984).