Thursday, June 16, 2016

Nuclear Safety Culture at ANO—the NRC Weighs In

Arkansas Nuclear One (credit: Edibobb)
On June 25, 2015 we posted about Arkansas Nuclear One’s (ANO) performance problems (a stator drop, inadequate flood protection and unplanned scrams) and the Nuclear Regulatory Commission’s (NRC's) reaction.  The NRC assigned ANO to column 4 of the Action Matrix where it receives the highest level of oversight for an operating plant.  As part of this increased oversight, the NRC conducted a comprehensive inspection of ANO performance, programs and processes.  A lengthy inspection report* was recently issued.

According to the NRC press release** the inspection team identified the following major issues:

“Resource reductions and leadership behaviors were the most significant causes for ANO’s declining performance. . . . ANO management did not reduce workloads through efficiencies or the elimination of unnecessary work, . . . Leaders . . . did not address expanding work backlogs***. . . . An unexpected increase in employee attrition between 2012 and 2014 caused a loss in experienced personnel, . . . Since 2007, the reduced resources created a number of changes that slowly began to impact equipment reliability.  The Entergy fleet reduced preventive maintenance and extended the time between some maintenance activities.”

The press release goes on to list numerous ANO corrective actions and NRC observations that suggest the potential for improved plant performance.

What About ANO’s Safety Culture?

The press release also mentions that the inspection team evaluated the adequacy of a 2015 Third Party Nuclear Safety Culture Assessment (TPNSCA) conducted at ANO.  The press release gives short shrift to the key role a weak safety culture (SC) played in creating ANO’s problems in the first place and the extensive SC questions raised and diagnostics performed by the NRC inspection team.

Last June, based on NRC and ANO meeting presentations, we concluded “the ANO culture endorses a ‘blame the contractor’ attitude, accepts incomplete investigations into actual events and potential problems, and is content to let the NRC point out problems for them.”  These are serious deficiencies.  Do the same or similar problems appear in the inspection report?  To answer that question, we need to dig into the details of the 243-page report.

The Cover Letter

Top-level SC problems are included in the NRC cover letter which says “The inspection team identified what it considered to be missed opportunities for ANO to have promptly initiated performance improvements since being placed in Column 4.  More specifically, ANO: 1) was slow to implement corrective actions to address the findings from the Corrective Action Program cause evaluation and the Third Party Nuclear Safety Culture Assessment; 2) did not perform an evaluation of the causes for safety culture problems; . . .” (letter, p. 2)

Executive Summary

The report's Executive Summary says “The Third Party Nuclear Safety Culture Assessment identified that ANO personnel tolerated, and at times normalized, degraded conditions.”  Expanding on the missed opportunities comment in the cover letter, “the NRC team’s independent safety culture evaluation noted limited improvement in safety culture since the completion of ANO’s independent Third Party Nuclear Safety Culture Assessment.” (report p. 5)  “ANO did not create a specific improvement plan to address the findings of the safety culture assessments, choosing to address selected safety culture attributes that were associated with root cause evaluations rather than treating the findings in the context of a separate problem area.  By not performing a cause evaluation for safety culture, ANO management missed the opportunity to address the full scope of safety culture weaknesses.” (pp. 5-6)

Review of ANO Recovery Plan 


The NRC’s critique of ANO’s Recovery Plan included “The NRC team questioned the recovery team’s decision not to perform causal evaluations of the PAs [Problem Areas].  In response, ANO performed apparent cause evaluations (ACEs) or gap analyses for each PA.  The NRC team questioned the recovery team’s decision not to perform causal evaluations for the safety culture attributes identified in [a 2014] . . . safety culture survey, the TPNSCA, and the RCEs [Root Cause Evaluations].  The team also questioned the recovery team’s decision not to treat safety culture as a separate problem area.” (p. 21)

This is an example where the NRC was still identifying ANO’s overarching problems for the plant staff.

Review of RCEs for Fundamental Problem Areas

“ANO’s Vendor Oversight RCE identified weak implementation of administrative controls and placing undue confidence in vendor services as common cause failures. However, ANO did not assess the underlying safety culture aspects.” (p. 110, emphasis added)

This is not “blame the vendor” but is a different serious problem, viz., an over-reliance on vendor activities to protect the customer.  (This problem is not unique to ANO; it also might exist at the Waste Isolation Pilot Plant.  See our May 3, 2016 post for details.)

Inspection Report Chapter on SC

The NRC team conducted its own assessment of ANO’s SC. The NRC team interviewed personnel at all levels, conducted focus group discussions, performed behavioral observations, reviewed documents and relevant plant programs, and evaluated plant management meetings.  Overall, they assessed all ten SC traits using the full set of SC attributes contained in NRC documentation.  For each trait, the report includes its attributes, inspection team observations and findings, and relevant ANO corrective actions.

The team also reviewed seven RCEs and concluded ANO addressed the major SC attributes identified in each RCE.  However, “The NRC team noted that ANO identified that some safety culture attributes were contributors to several of the RCE problem statements, but ANO did not consider the collective significance.” (p. 184)

ANO took the hint.  “In response to the NRC team’s concerns, ANO performed a common cause analysis of all of the safety culture attributes that were identified in the recovery RCEs in order to assess the collective significance and causes.” (p. 185)  ANO developed a SC Area Action Plan (AAP) and the NRC concluded “The corrective actions identified in the NSC AAP were comprehensive and appropriate to address the causes for safety culture weaknesses.” (p. 186)

“The NRC team’s graded safety culture assessment independently confirmed the results from the TPNSCA.” (p. 188)

“The NRC team was concerned that the SCLT’s [Safety Culture Leadership Team, senior managers] conclusion that ANO’s safety culture was “adequate” in August 2015 did not appropriately reflect the data provided by, or the recommendations from, the NSCMP [Nuclear Safety Culture Monitoring Panel, mid-level personnel].  This SCLT conclusion did not reflect the declining condition with respect to safety culture and indicated a lack of awareness that improvements in safety culture at ANO were needed.”  The SCLT eventually came around and in December 2015 declared that ANO’s SC was not acceptable. (p. 192)

Our Perspective

The NRC is optimistic that ANO has correctly identified the root causes of its performance problems and has undertaken corrective actions that will ultimately prove effective.  We hope so but we’ll go with “trust but verify” on this one.  ANO still exhibits problems with incomplete analyses and leaning on the NRC to identify systemic deficiencies.

The NRC team took a good look at ANO's SC.  Quite frankly, their effort was more comprehensive than we expected.  They used an acceptable methodology for their SC assessment.  The fact that their assessment findings were consistent with the TPNSCA is not surprising.  SC evaluation is a robust social science activity and qualified SC evaluators using similar techniques should obtain generally comparable results.

We believe the NRC’s SC professionals are qualified and competent but probably encouraged to support the overall inspection findings.  The elephant in the room is that SC is a policy, not a regulation.  Would the NRC keep a plant in column 4 based solely on their belief that the plant SC is deficient?  Look at the contortions the agency performed at Palisades as that plant’s SC somehow went from weak, with constant problems, to “improving” and, we inferred, acceptable.  (See our Jan. 30, 2013 post for details.)

There may have been a bit of similar magical thinking at ANO.  In the inspection report, every SC trait had examples of shortcomings but also had “appropriate” corrective actions to improve performance.****  How can this be when ANO (and Entergy) have been so slow to grasp the systemic nature of their SC problems?

Let’s close on a different note.  Earlier this year ANO named a full-time SC manager, a person whose background is in plant security.  On the surface, this is an “unfiltered” choice.  (See our March 10, 2016 post for a discussion of filtering in personnel decisions.)  He may be exactly the type of person ANO needs to make SC improvements happen.  We wish him well.


*  M. L. Dapas (NRC) to J. Browning (ANO), “Arkansas Nuclear One – NRC Supplemental Inspection Report 05000313/2016007 and 05000368/2016007” (June 9, 2016).  ADAMS ML16161B279.

**  V. Dricks, Press Release, “NRC Issues Comprehensive Inspection Report on Arkansas Nuclear One” (June 13, 2016).

***  We have often noted that large backlogs, especially of safety-related work, are an artifact of a weak SC.

****  One trait was judged to have no significant issues so corrective action was not needed.

Tuesday, June 7, 2016

The Criminalization of Safety (Part 3)


Our Perspective

The facts and circumstances of the events described in Table 1 in Part 1 point to a common driver - the collision of business and safety priorities, with safety being compromised.  Culture is inferred as the “cause” in several of the events but with little amplification or specifics.[1]  The compromises in some cases were intentional, others a product of a more complex rationalization.  The events have been accompanied by increased criminal prosecutions with varied success. 

We think it is fair to say that so far, criminalization of safety performance does not appear to be an effective remedy.  Statutory limitations and proof issues are significant obstacles with no easy solution.  The reality is that criminalization is at its core a “disincentive”.  To be effective it would have to deter actions or decisions that are not consistent with safety but not create a minefield of culpability.  It is also a blunt instrument requiring rather egregious behavior to rise to the level of criminality.  Its best use is probably as an ultimate boundary, to deter intentional misconduct but not be an unintended trap for bad judgment or inadequate performance.  In another vein, criminalization would also seem incompatible with the concept of a “just culture” other than for situations involving intentional misconduct or gross negligence.

Whether effective or not, criminalization reflects the urgency felt by government authorities to constrain excessive risk taking, intentional or not, and enhance oversight.  It is increasingly clear that current regulatory approaches are missing the mark.  All of the events catalogued in Table 1 occurred in industries that are subject to detailed safety and environmental regulation.  After the fact assessments highlight missed opportunities for more assertive regulatory intervention, and in the Flint cases there are actual criminal charges being applied to regulators.  The Fukushima event precipitated a complete overhaul of the nuclear regulatory structure in Japan, still a work in progress.  Post hoc punishments, no matter how severe, are not a substitute.

Nuclear Regulation Initiatives

Looking specifically at nuclear regulation in the U.S. we believe several specific reforms should be considered. It is always difficult to reform without the impetus of a major safety event, but we could see these actions as ones that could appear obvious in a post-event assessment if there was ever an “O-ring” moment in the nuclear industry.[2]

1. The NRC should include the safety management system in its regulatory activities.

The NRC has effectively constructed a cordon sanitaire around safety management by decreeing that “management” is beyond the scope of regulation.  The NRC relies on the fact that licensees bear the primary responsibility for safety and the NRC should not intrude into that role.  If one contemplates the trend of recent events scrutinizing the performance of regulators following safety events, this legalistic “defense” may not fare well in a situation where more intrusive regulation could have made the difference.

The NRC does monitor “safety culture” and often requires licensees to address weaknesses in culture following performance issues.  In essence safety culture has become an anodyne for avoiding direct confrontation of safety management issues.  Cynically one could say it is the ultimate conspiracy - where regulators and “stakeholders” come together to accept something that is non-contentious and conveniently abstract to prevent a necessary but unwanted (apparently by both sides) intrusion into safety management.

As readers of this blog know, our unyielding focus has been on the role of the complex socio-technical system that functions within a nuclear organization to operate nuclear plants effectively and safely.  This management system includes many drivers, variables, feedbacks, culture, and time delays in its processes, not all of which are explicit or linear.  The outputs of the system are the actions and decisions that ultimately produce tangible outcomes for production and safety.  Thus it is a safety system and a legitimate and necessary area for regulation.

NRC review of safety management need not focus on traditional management issues, which would remain the province of the licensee.  So organizational structure, personnel decisions, etc. need not be considered.[3]  But here we should heed the view of Daniel Kahneman, who suggests we think of organizations as “factories for producing decisions” and therefore think of decisions as a product.  (See our Nov. 4, 2011 post, A Factory for Producing Decisions.)  Decisions are in fact the key product of the safety management system.  Regulatory focus on how the management system functions and the decisions it produces could be an effective and proactive approach.

We suggest two areas of the management system that could be addressed as a first priority: (1) Increased transparency of how the management system produces specific safety decisions including the capture of objective data on each such decision, and (2) review of management compensation plans to minimize the potential for incentives to promote excessive risk taking in operations.

2. The NRC should require greater transparency in licensee management decisions with potential safety impacts.

Managing nuclear operations involves a continuum of decisions balancing a variety of factors including production and safety.  These decisions may occur with individuals or with larger groups in meetings or other forums.  Some may involve multiple reviews and concurrences.  But in general the details of decision making, i.e., how the sausage is made, are rarely captured in detail during the process or preserved for later assessment.[4]  Typically only decisions that happen to yield a bad outcome (e.g., prompt the issuance of an LER or similar), or actions that require specific, advance regulatory approval and an SER or equivalent, become subject to more intensive review and post mortem.[5]

Transparency is key.  Some say the true test of ethics is what people do when no one is looking.  Well the converse of that may also be true - do people behave better when they know oversight is or could be occurring?  We think a lot of the NRC’s regulatory scheme is already built on this premise, relying as it does on auditing licensee activities and work products.

Thinking back to the Davis Besse example, the criminal prosecutions of both the corporate entity and individuals were limited to providing false or incomplete information to the NRC.  There was no attempt to charge on the basis of the actual decisions to propose, advocate for, and attempt to justify, that the plant could continue to operate beyond the NRC’s specified date for corrective actions.  The case made by First Energy was questionable as presented to the NRC and simply unjustified when accounting for the real facts behind their vessel head inspections.

Transparency would be served by documenting and preserving the decision process on safety significant issues.  These data might include the safety significance and applicable criteria, the potential impact on business performance (plant output, cost, schedule, etc.), the alternatives considered, the participants and their inputs to the decision making process, and how a final decision was reached.  These are the specifics that are so hard or impossible to reproduce after the fact.[6]  The not unexpected result: blaming someone or something but not gaining insight into how the management system failed.

This approach would provide an opportunity for the NRC to audit decisions on a routine basis.  Licensee self assessment would also be served through safety committee review and other oversight including INPO.  Knowing that decisions will be subject to such scrutiny also can promote careful balancing of factors in safety decisions and serve to articulate how those balances are achieved and safety is served.  Having such tangible information shared throughout the organization could be the strongest way to reinforce the desired safety culture.

3. As part of its regulation of the safety management system, the NRC should restrict incentive compensation for nuclear management that is based on meeting business goals.

We started this series of posts focusing on criminalization of safety.  One of the arguments for more aggressive criminalization is essentially to offset the powerful pull of business-based incentives with the fear of criminal sanctions.  This has proved elusive.  Similarly, attempting to balance business incentives with safety incentives is also problematic.  The Transocean experience illustrates that quite vividly.[7]

Our survey several years ago of nuclear executive compensation indicated (1) the amounts of compensation are very significant for the top nuclear executives, (2) the compensation is heavily dependent on each year’s performance, and (3) business performance measured by EPS is the key to compensation, while safety performance is a minor contributor.  A corollary to the third point might be that in no cases that we could identify was safety performance a condition precedent or qualification for earning the business-based incentives.  (See our July 9, 2010 post, Nuclear Management Compensation (Part 2).)  With 60-70% of total compensation at risk, executives can see their compensation, and that of the entire management team, impacted by as much as several million dollars in a year.  Can this type of compensation structure impact safety?  Intuition says it creates both a risk and a perception problem.  Virtually every significant safety event in Table 1 has reference to the undue influence of production priorities on safety.  The issue was directly raised in at least one nuclear organization[8] which revised its compensation system to avoid undermining safety culture.

We believe a more effective approach is to minimize the business pressures in the first place.  We believe there is a need for a regulatory policy that discourages or prohibits licensee organizations from utilizing significant incentives based on financial performance.  Such incentives invariably target production and budget goals as they are fundamental to business success.  To the extent safety goals are included they are a small factor or based on metrics that do not reflect fundamental safety.  Assuring safety is the highest priority is not subject to easily quantifiable and measurable metrics - it is judgmental and implicit in many actions and decisions taken on a day-to-day basis at all levels of the organization.  Organizations should pay nuclear management competitively and generously and make informed judgments about their overall performance.

Others have recognized the problem and taken similar steps to address it.  For example, in the aftermath of the financial crisis of 2008 the Federal Reserve Board has been doing some arm twisting with U.S. financial services companies to adjust their executive compensation plans - and those plans are in fact being modified to cap bonuses associated with achieving performance goals. (See our April 25, 2013 post, Inhibiting Excessive Risk Taking by Executives.)

Nick Taleb (of Black Swan fame) believes that bonuses provide an incentive to take risks.  He states, “The asymmetric nature of the bonus (an incentive for success without a corresponding disincentive for failure) causes hidden risks to accumulate in the financial system and become a catalyst for disaster.”  Now just substitute “nuclear operations” for “the financial system”.

Central to Taleb’s thesis is his belief that management has a large informational advantage over outside regulators and will always know more about risks being taken within their operation. (See our Nov. 9, 2011 post, Ultimate Bonuses.)  Eliminating the force of incentives and providing greater transparency to safety management decisions could reduce risk and improve everybody’s insight into those risks deemed acceptable.

Conclusion

In industries outside the commercial nuclear space, criminal charges have been brought for bad outcomes that resulted, at least in part, from decisions that did not appropriately consider overall system safety (or, in the worst cases, simply ignored it).  Our suggestions are intended to reduce the probability of such events occurring in the nuclear industry.





[1] It raises the question whether anytime business priorities trump safety it is a case of deficient culture.  We have argued in other blog posts that sufficiently high business or political pressure can compromise even a very strong safety culture.  So reflexive resort to safety culture may be easy but not very helpful.
[2] Credit to Adam Steltzner, author of The Right Kind of Crazy, which recounts his and other engineers’ roles in the design of the Mars rovers.  His reference is to the failure of O-ring seals on the space shuttle Challenger.
[3] We do recognize that there are regulatory criteria for general organizational matters such as for the training and qualification of personnel. 
[4] In essence this creates a “safe harbor” for most safety judgments, one to which the NRC is effectively blind.
[5] In Davis Besse much of the “proof” that was relied on in the prosecutions of individuals was based on concurrence chains for key documents and NRC staff recollections of what was said in meetings.  There was no contemporaneous documentation of how First Energy made its threshold decision that postponing the outage was acceptable, who participated, and who made the ultimate decision.  Much was made of the fact that management was putting great pressure on maintaining schedule but there was no way to establish how that might have directly affected decision making.
[6] Kahneman believes there is “hindsight bias”.  Hindsight is 20/20 and it supposedly shows what decision makers could (and should) have known and done instead of their actual decisions that led to an unfavorable outcome, incident, accident or worse.  We now know that when the past was the present, things may not have been so clear-cut.  See our Dec. 18, 2013 post, Thinking, Fast and Slow by Daniel Kahneman.
[7] “Transocean, owner of the Deepwater Horizon oil rig, awarded millions of dollars in bonuses to its executives after ‘the best year in safety performance in our company’s history,’ according to an annual report … ‘Notwithstanding the tragic loss of life in the Gulf of Mexico, we achieved an exemplary statistical safety record as measured by our total recordable incident rate and total potential severity rate.’”  See our April 7, 2011 post for the original citation in Transocean's annual report and further discussion.
[8] “The reward and recognition system is perceived to be heavily weighted toward production over safety”.  The reward system was revised “to ensure consistent health of NSC”.  See our July 29, 2010 post, NRC Decision on FPL (Part 2).