Thursday, June 6, 2013

Implementing Safety Culture Policy Part 2

This post continues our discussion of the implementation of safety culture policy in day-to-day nuclear management decision making, begun in our post dated April 9, 2013.  In that post we introduced several parameters for quantitatively scoring decisions: decision quality, safety significance and significance uncertainty.  Here we update one label: “decision quality” is now “decision balance.”

To illustrate the application of the scoring method we used a set of twenty decisions based on issues taken from actual U.S. nuclear operating experience, typically issues reported in LERs.  As a baseline, we scored each issue for safety significance and uncertainty.  Each issue included three to four decision options for addressing the problem, and each option was annotated with the potential impacts of the decision on budgets, generation (e.g., potential outage time) and the corrective action program.  We scored each decision option for its decision balance (how well the option balances safety priority against the issue’s significance) and then identified the preferred decision option for each issue.  This constitutes what we refer to as the “preferred decision set.”  A pdf file of one example issue with decision choices and scoring inputs is available here.
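For readers who like to see such things in code, here is a minimal sketch of the data involved.  All class and field names are our own illustrative choices, not part of NuclearSafetySim.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionOption:
    """One candidate response to an issue (names here are illustrative)."""
    description: str
    balance: float            # how well the option's safety priority matches significance
    budget_impact: float      # e.g., incremental cost
    generation_impact: float  # e.g., potential outage time
    cap_impact: int           # e.g., corrective action program items created

@dataclass
class Issue:
    """An operating-experience issue scored as part of the baseline."""
    title: str
    safety_significance: float       # scored on a quantified scale
    significance_uncertainty: float
    options: list[DecisionOption] = field(default_factory=list)
    preferred: int = 0               # index of the option chosen by senior management

def preferred_decision_set(issues: list[Issue]) -> list[DecisionOption]:
    """The "preferred decision set" is simply each issue's preferred option."""
    return [i.options[i.preferred] for i in issues]
```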

Our assumption is that the preferred decision set would be established and approved by senior management based on their interpretation of the issues and their expectations for how organizational decisions should reflect safety culture.  The set of issues would then be used in a training environment for appropriate personnel.  For purposes of this example, we incorporated the preferred decision set into our NuclearSafetySim* simulator to illustrate the possible training experience.  The sim provides an overall operational context, tracking performance for cost, plant generation and the corrective action program (CAP) and incorporating performance goals and policies.

Chart 1
In the sim application a trainee is tasked with assessing an issue every three months over a 60-month operational period, while attempting to manage performance results to achieve specified goals.  For each issue the trainee reviews the issue facts, assigns values for significance and uncertainty, and selects a decision option.  Chart 1 compares the actual decisions (those made by the trainee) to those in the preferred set for our prototype session.  Note that approximately 40% of the time the actual decision matched the preferred decision (orange data points); for the remainder of the issues the trainee’s selected decisions differed.  Determining and understanding why the differences occurred is one way to gain insight into how culture manifests itself in management actions.
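The comparison in Chart 1 reduces to a simple match rate; a sketch (function and variable names are ours):

```python
def match_rate(actual: list[int], preferred: list[int]) -> float:
    """Fraction of issues where the trainee chose the preferred option."""
    matches = sum(a == p for a, p in zip(actual, preferred))
    return matches / len(actual)

# Twenty issues: one every three months over a 60-month period.
# For the prototype session described above, this returns roughly 0.40.
```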

As we indicated in the April 9 post, each decision is evaluated for its safety significance and uncertainty in accordance with quantified scales.  These serve as key inputs to determining the appropriate balance to be achieved in the decision.  In prior work in this area, reported in our posts dated July 15, 2011 and October 14, 2011, we solicited readers to score two issues for safety significance.  The reported scores ranged from 2 to 10 (most between 4 and 6) for one issue and from 5 to 10 (most between 6 and 8) for the other, reflecting the reality that perceptions of safety significance are subject to individual differences.  In the current exercise, similar variations in scoring were expected and led to differences between the trainee’s scores and the preferred decision set.  The variation may be due to the inherently subjective nature of assessing these attributes as well as other factors such as experience, expertise, biases and interpretations of the issue.  Since the decision process attempts to match action to significance, this is one likely source of difference between the trainee’s decision selections and the preferred set.

Another source could be the decision options themselves.  A trainee may have focused on what he or she felt was the “best” (i.e., most efficacious) decision rather than explicitly considering a safety priority commensurate with safety significance.  Additionally, decision choices may have been influenced by their potential impacts, particularly under conditions where performance was not on track to meet goals.


Chart 2
Taking this analysis a bit further, we looked at how decision balance varied over the course of the simulation.  As discussed in our April 9 post, decision balance is a quantitative measure of how well the goal of safety culture is incorporated in a specific decision - the extent to which the decision accords safety a priority commensurate with its safety significance.  In the current exercise, each decision option for each issue was assigned a balance value as part of the preferred scoresheet.**  Chart 2 shows a timeline of decision balances - one for the preferred decision set and the other for the actual decisions made by the trainee.  A smoothing function has been applied to the discrete balance values to provide a continuous track.
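We have not specified which smoothing function the sim applies; as one plausible example, an exponentially weighted moving average would turn the discrete balance values into a continuous track:

```python
def smooth(balances: list[float], alpha: float = 0.3) -> list[float]:
    """EWMA smoothing of discrete decision-balance values.

    The choice of an EWMA (and of alpha) is an assumption for illustration;
    the sim may use a different smoothing function.
    """
    track = [balances[0]]
    for b in balances[1:]:
        track.append(alpha * b + (1 - alpha) * track[-1])
    return track
```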

The plots illustrate how decision balance may vary over time, with specific decisions reflecting greater or lesser emphasis on safety.  During the first half of the sim the decision balances are in fairly close agreement, reflecting in part that in 5 of 8 cases the actual decisions matched the preferred decisions.  However, in the second half of the sim significant differences emerge, primarily in the direction of weaker balances in the trainee’s decisions.  Again, understanding why these differences emerge could provide insight into how safety culture is actually being practiced within the organization.  Chart 3 adds some additional context.

Chart 3
The yellow line plots “goal pressure,” which is simply a sum of the differences between actual performance in the sim and the goals for cost, generation and the CAP.  Higher values of pressure are associated with performance lagging the goals.  Inspection of the plot indicates that goal pressure was mostly modest in the first half of the sim before an initial spike up and further increases with time.  The blue line, the trainee’s decision balance, shows no response to the initial spike, but later in the sim the high goal pressure could be a contributor to decisions trending toward lower balances.  A final note: over the course of the entire sim, the average values of preferred and actual balance are fairly close for this player, perhaps suggesting reasonable overall alignment in safety priorities notwithstanding decision-to-decision variations.
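In code, goal pressure is a one-liner; how each metric is oriented and scaled in the sim is an assumption here:

```python
def goal_pressure(actual: dict[str, float], goals: dict[str, float]) -> float:
    """Sum of shortfalls against the cost, generation and CAP goals.

    Per the description above, higher values mean performance is lagging
    the goals; the sign convention per metric is our assumption.
    """
    return sum(max(goals[k] - actual[k], 0.0) for k in ("cost", "generation", "cap"))
```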

A variety of training benefits can flow from the decision simulation.  Comparisons of actual to preferred decisions provide a baseline indication of how well expected safety balances are being achieved in realistic decisions.  Consideration of contributing factors such as goal pressure may illustrate challenges for decision makers.  Comparisons of results among and across groups of trainees could provide further insights.  In all cases the results would provide material for discussion, team building and alignment on safety culture.

In our post dated November 4, 2011 we quoted Kahneman’s observation that organizations are “factories for producing decisions.”  In nuclear safety, the decision factory is the mechanism that actualizes safety culture into specific priorities and actions.  A critical element of achieving strong safety culture is the ability to identify differences between espoused values for safety (i.e., the traits typically associated with safety culture) and de facto values as revealed in actual decisions.  We believe this can be achieved by capturing decision data explicitly, including the judgments on significance and uncertainty and the operational context of the decisions.

The next step is to synthesize the decision and situational parameters into a useful systems-based measure of safety culture - a quantity that could be tracked in a simulation environment to illustrate safety culture response and provide feedback, and/or during nuclear operations to provide a real-time pulse of the organization’s culture.



* For more information on using system dynamics to model safety culture, please visit our companion website, nuclearsafetysim.com.

** It is possible for some decision options to have the same value of balance even though they incorporate different responses to the issue and different operational impacts. 

Friday, May 31, 2013

When the Big Dogs Refuse to Learn New Tricks

A regular reader asks: The middle managers at my facility think learning is great—for their subordinates.  How can I get them to accept that learning new ideas, viz., contemporary safety culture (SC) concepts or approaches, is also good for them?

This is a tough nut to crack but we'll give it a shot.  We'll take a brief look at why these managers resist changing their own behavior, i.e., find it hard to learn.  Then we'll present some strategies for engaging them in an effort to modify their attitudes and behavior.

Why managers resist learning 


Management theorists and practical experience provide some reasons why this problem arises.  We'll start with the individual and then look at the work group.

A classic article by management theorist Chris Argyris* provides a good starting point (although it's not a perfect fit with the question at hand).  Argyris' basic argument is as follows: Many highly skilled “professionals are almost always successful at what they do, . . . because they have rarely failed, they have never learned how to learn from failure. . . . they become defensive, screen out criticism, and put the ‘blame’ on anyone and everyone but themselves.” (p. 100)  In practice, they engage in defensive reasoning, guided by their true (as opposed to espoused) values: maintain control; maximize winning and minimize losing; suppress negative feelings and avoid open conflicts; and be as “rational” as possible, i.e., define clear objectives but allow limited freedom of choice in strategy selection.

In other words, past successes breed confidence in managers to continue doing what they've been doing and construct defenses against unwelcome, potentially threatening changes.

In the work group, these managers sit on top of their silos, controlling the careers and paychecks of those below them.  Recognized success in achieving assigned goals provides feedback that one's managerial approach is effective and correct.  Managers manage and subordinates comply.  If subordinates' performance is deficient, then progressive discipline is used to correct the situation.  Step one or two of a typical program is to require subordinates to learn (or relearn) applicable policies, programs, procedures, etc.  If the manager is dealing with such a situation, then he's OK and it's the subordinate who has a problem.  More success breeds more confidence.

What can the change agent do?


Maybe very little.  You can't fix stupid, but you can (occasionally) fix arrogance or over-confidence if the proper opportunity appears, or you can create it.  Opportunities may arise from the external environment, within the work group or internal to the manager.  Some candidate challenges include those confronting the entire organization (e.g. risk of shutdown), the work group (e.g., scoring significantly lower than other groups on the latest SC survey) or the individual manager (e.g., a new, unfamiliar boss, a top-down SC initiative or a review of his compensation plan).

What you're looking for is a teachable moment when the manager faces a significant challenge, preferably one that places him at existential risk in the organization.  Why?  Because comfortable people are motivated to change only when change is perceived as less dangerous than staying put.

You'll need an action plan to promote your value-add to the resolution of the particular challenge.  And an “elevator speech” to promote your plan when you bump into the boss coming out of the executive washroom. 

What if you don't have a crisis handy? 


Then you need to go hunting for a candidate to “help.”  We know that many middle managers are satisfied with their positions and content to remain at that level for the balance of their careers.  They are transactional, not transformational, characters.  But some are more ambitious; they may exhibit what psychologist David McClelland called a high “nAch,” the need for achievement.  And, as Maslow taught us, unsatisfied needs motivate behavior.  So these managers seek to prove their superior worth to their bosses, perhaps by undertaking some new task or program initiative.  If you can identify one of these people, you can work on educating him on the value of SC to his career; the goal is to get him to champion, promote and model SC.  And you need to talk to the senior managers to help get your champion recognized, rewarded, promoted or at least made more promotable.

In short, you can wait in the bushes, biding your time, until opportunities come along or you can try to initiate change or act as a change catalyst.  You don't need Sigmund Freud or Margaret Mead to help you figure out what to do.  You need patience, an action plan and the will to jump on opportunities when they arise.


*  C. Argyris, “Teaching Smart People to Learn,” Harvard Business Review (May-June, 1991), pp. 99-109.  The same Safetymatters reader who asked the initial question also recommended this article.  Kudos to him.

Argyris is perhaps best known for his concepts of single-loop and double-loop learning.  In single-loop learning, actions are designed to achieve specified goals and questions or conflicts about governing variables are suppressed.  In double-loop learning, the governing variables are subjected to scrutiny and, if necessary, actions are taken to attempt to transform them.  This can broaden the range of strategic choices for achieving the original goals.  Single-loop learning leads to defensive reasoning; double-loop learning reflects productive reasoning.  Productive reasoning is characterized by dissemination and consideration of complete information, minimal defensiveness and open confrontation of difficult issues.

Friday, May 24, 2013

How the NRC Regulates Safety Culture

We have long griped about the back door regulation of safety culture (SC) in the U.S.  This post describes how the NRC gets to and through the back door.  (Readers familiar with the NRC regulatory process can skip this post.  If we get it wrong, please let us know.)

Oversight of Reactor Operations*

The Action Matrix

The NRC's Operating Reactor Assessment Program collects information from inspections (baseline and supplemental) and performance indicators (PIs) to develop conclusions about a licensee's safety performance.  Depending on the results of the NRC's assessment, a plant is assigned to a column in the Action Matrix, a table that categorizes various levels of plant performance and, for each level, identifies required and optional NRC and licensee actions.

The Action Matrix has five columns; the safety significance of plant problems increases as one goes from column 1 to column 5.  Plants in column 1 receive the NRC baseline inspection program, plants in columns 2-4 receive increasing NRC attention and licensee requirements and plants in column 5 have unacceptable performance and are not allowed to operate.
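The Action Matrix is essentially a lookup from assessed performance to required oversight.  A condensed sketch follows; the one-line summaries paraphrase this post, not the NRC's own wording:

```python
# Column number -> oversight level (summaries paraphrased from this post).
ACTION_MATRIX = {
    1: "NRC baseline inspection program",
    2: "Baseline plus supplemental inspection (IP 95001); SC considered in causal evaluations",
    3: "Increased NRC attention; inspectors independently assess SC contributions",
    4: "Third-party SC assessment expected, plus graded NRC SC assessment",
    5: "Unacceptable performance; plant not allowed to operate",
}

def oversight(column: int) -> str:
    """Required and optional actions for a plant assigned to a given column."""
    return ACTION_MATRIX[column]
```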

SC first becomes a consideration in column 2 when the NRC conducts a Supplemental Inspection using IP 95001.  Licensees are expected to identify the causes of identified problems, including the contribution of any SC-related components, and place the problems in the plant's corrective action program (CAP).  NRC inspectors determine whether the licensee's causal evaluations appropriately considered SC components and, if any SC issues were identified, whether the corrective action is sufficient to address them.  If not, the inspection report is kept open until the licensee takes sufficient corrective action.
   
For a plant in column 3, the licensee is again expected to identify the causes of identified problems, including the contribution of any SC-related components, and place the problems in the plant's CAP.  NRC inspectors independently determine if SC components caused or significantly contributed to the identified performance problems.  If inspectors cannot make an independent determination (e.g., the licensee does not perform a SC analysis) the inspection is kept open until the licensee takes sufficient corrective action.

If the NRC concludes SC deficiencies caused or significantly contributed to the performance issues, and the licensee did not recognize it, the NRC may request that the licensee complete an independent** SC assessment.  In other words, it is an NRC option.

For plants in column 4 or 5, the licensee is expected to have a third-party** SC assessment performed.  The NRC will evaluate the third-party SC assessment and independently perform a graded assessment of the licensee's SC.  Inspectors can use the results from the licensee's third-party SC assessment to satisfy the inspection requirements if the staff has completed a validation of the third-party SC methodology and related factors.  If the inspectors conduct their own assessment, the scope may range from focusing on functional groups or specific SC components to conducting a complete SC assessment.
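The escalating SC follow-up described above can be condensed into a single conditional.  The flags and return strings below are our illustrative labels, not NRC terminology:

```python
def sc_followup(column: int, licensee_assessed_sc: bool, nrc_found_sc_issue: bool) -> str:
    """Condensed paraphrase of the column 2-5 supplemental inspection logic."""
    if column == 2:
        # NRC verifies the licensee's causal evaluations considered SC;
        # the report stays open until corrective action is sufficient.
        return "verify licensee SC evaluation; hold report open if corrective action insufficient"
    if column == 3:
        if not licensee_assessed_sc:
            return "inspection kept open until licensee takes sufficient corrective action"
        if nrc_found_sc_issue:
            return "NRC may request an independent SC assessment"
        return "licensee addresses SC contributors through the CAP"
    if column >= 4:
        return "third-party SC assessment expected; NRC performs graded SC assessment"
    return "baseline inspection only"
```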

Substantive Cross-Cutting Issues

The NRC evaluates performance for seven cornerstones that reflect the essential safety aspects of plant operation.  Some issues arise that cross two or more cornerstones and result in a Substantive Cross-Cutting Issue (SCCI) in the areas of Human Performance, Problem Identification and Resolution or Safety Conscious Work Environment.  Each SCCI has constituent components, e.g., the components of Human Performance are Decision-making, Resources, Work control and Work practices.  Each component is characterized, e.g., for Decision-making “Licensee decisions demonstrate that nuclear safety is an overriding priority,” and has defining attributes, e.g., “The licensee makes safety-significant or risk-significant decisions using a systematic process, . . . uses conservative assumptions . . . [and] communicates decisions and the basis for decisions . . .”

There are other components which are not associated with cross-cutting areas: Accountability, Continuous learning environment, Organizational change management and Safety policies.

Most important for our purpose, the NRC says the cross-cutting components and other components comprise the plant's SC components.
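Collected from the text above, the component structure looks like this (the components of the other two cross-cutting areas are not enumerated in this post, so those lists are left empty):

```python
# SC components per the NRC's scheme, as enumerated in this post.
CROSS_CUTTING_COMPONENTS = {
    "Human Performance": ["Decision-making", "Resources", "Work control", "Work practices"],
    "Problem Identification and Resolution": [],  # not enumerated here
    "Safety Conscious Work Environment": [],      # not enumerated here
}

OTHER_COMPONENTS = [
    "Accountability",
    "Continuous learning environment",
    "Organizational change management",
    "Safety policies",
]

# Per the NRC, these together comprise the plant's SC components.
SC_COMPONENTS = sum(CROSS_CUTTING_COMPONENTS.values(), []) + OTHER_COMPONENTS
```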

Thus, by definition analysis and remediation of SCCIs involve SC, sometimes directly.  For example, in the third consecutive assessment letter identifying the same SCCI, the NRC would typically request the licensee to perform an independent SC assessment.  (Such a request may be deferred if the licensee has made reasonable progress in addressing the issue but has not yet met the specific SCCI closure criteria.)

SCCIs are included with plants' annual and mid-cycle assessment letters.  Dana Cooley, a nuclear industry consultant, publishes a newsletter that summarizes new, continuing, closed and avoided SCCIs from the plant assessment letters.  The most recent report*** describes 15 new and continuing SCCIs, involving 6 plants.  Two plants (Browns Ferry and Susquehanna) have specific SC assessment requirements.

Our perspective

The NRC issued its SC Policy on June 14, 2011.  “The Policy Statement clearly communicates the Commission’s expectations that individuals at organizations performing or overseeing regulated activities establish and monitor a positive safety culture commensurate with the safety and security significance of their activities and the nature and complexity of their organizations and functions.”****

The SC Policy may be new to NRC licensees that do not operate nuclear reactors but as detailed above, the NRC's “expectations” have been codified in the operating reactor inspections for years.  (The SC language for the Action Matrix and SCCIs was added in 2006.)

Technically, there is no NRC regulation of SC because there are no applicable regulations.  As a practical matter, however, because the NRC can dig into (or force the licensees to dig into) possible SC contributions to safety-significant problems, and then require licensees to fix any identified SC issues, there is de facto regulation of SC.  SC is effectively regulated because licensees are forced to expend resources (time, money, personnel) on matters they might not otherwise pursue.

Because there is no direct, officially-recognized regulation of SC, it appears a weak SC alone will not get a plant moved to a more intrusive column of the Action Matrix.  However, failure to demonstrate a strong or strengthening SC can keep a plant from being promoted to a column with less regulatory attention.

Why does the industry go along with this system?  They probably fear that official regulation of SC might be even more onerous.  And it might be the camel's nose in the tent on NRC evaluation of licensee management competence, or looking at management compensation plans including performance incentives.  That's where the rubber meets the road on what is really important to a plant's corporate owners. 


*  This post is a high-level summary of material in the NRC Inspection Manual, Ch. 0305 “Operating Reactor Assessment Program” (Jun. 13, 2012), Ch. 0310 “Components Within the Cross-Cutting Areas” (Oct. 28, 2011) and NRC Inspection Procedures 95001 (Feb. 9, 2011), 95002 (Feb. 9, 2011) and 95003 (Feb. 9, 2011).  Many direct quotes are included but quotation marks have not been used in an effort to minimize clutter.

**  An independent SC assessment is performed by individuals who are members of the licensee's organization but have no direct authority over, and have not been responsible for, any of the areas being evaluated.  A third-party SC assessment is performed by individuals who are not members of the licensee's organization.  (IMC 0305, p. 4)

***  D.E. Cooley (SeaState Group), “NRC Reactor Oversight Program, Substantive Cross-Cutting Issues, 2012 Annual Assessment Letters, March 4, 2013 Data.” 

****  From the NRC website http://www.nrc.gov/about-nrc/regulatory/enforcement/safety-culture.html#programs

Wednesday, May 15, 2013

IAEA on Instituting Regulation of Licensee Safety Culture

The International Atomic Energy Agency (IAEA) has published a how-to report* for regulators who want to regulate their licensees' safety culture (SC).  This publication follows a series of meetings and workshops, some of which we have discussed (here and here).  The report is related to IAEA projects conducted “under the scope of the Regional Excellence Programme on Safe Nuclear Energy–Norwegian Cooperation Programme with Bulgaria and Romania. These projects have been implemented at the Bulgarian and Romanian regulatory bodies” (p. 1).

The report covers SC fundamentals, regulatory oversight features, SC assessment approaches, data collection and analysis.  We'll review the contents, highlighting IAEA's important points, then provide our perspective.

SC fundamentals

The report begins with the fundamentals of SC, starting with Schein's definition of SC and his tri-level model of artifacts, espoused values and basic assumptions.  Detail is added with a SC framework based on IAEA's five SC characteristics:

  • Safety is a clearly recognized value
  • Leadership for safety is clear
  • Accountability for safety is clear
  • Safety is integrated into all activities
  • Safety is learning driven.
The SC characteristics can be described using specific attributes.

Features of regulatory oversight of SC 


This covers what the regulator should be trying to achieve.  It's the most important part of the report so we excerpt the IAEA's words.

“The objective of the regulatory oversight of safety culture, focused on a dynamic process, is to consider and address latent conditions that could lead to potential safety performance degradation at the licensees’ nuclear installations. . . . Regulatory oversight of safety culture complements compliance-based control [which is limited to looking at artifacts] with proactive control activities. . . . ” (p. 6, emphasis added)

“[R]egulatory oversight of safety culture is based on three pillars:

Common understanding of safety culture. The nature of safety culture is distinct from, and needs to be dealt with in a different manner than a compliance-based control. . . .

Dialogue. . . . dialogue is necessary to share information, ideas and knowledge that is often qualitative. . . .

Continuousness. Safety culture improvement needs continuous engagement of the licensee. Regulatory oversight of safety culture therefore ideally relies on a process during which the regulator continuously influences the engagement of the licensee.” (p. 7)

“With regards to safety culture, the regulatory body should develop general requirements and enforce them in order to ensure the authorized parties have properly considered these requirements. On the other hand, the regulatory body should avoid prescribing detailed level requirements.” (p. 8)  The licensee always has the primary responsibility for safety.

Approaches for assessing SC

Various assessment approaches are currently being used or reviewed by regulatory bodies around the world. These approaches include: self-assessments, independent assessments, interaction with the licensee at a senior level, focused safety culture on-site reviews, oversight of management systems and integration into regulatory activities.  Most of these activities are familiar to our readers but a couple merit further definition.  The “management system” is the practices, procedures and people.**  “Integration into regulatory activities” means SC-related information is also collected during other regulatory actions, e.g., routine or special inspections.

The report includes a table (recreated below) summarizing, for each assessment approach, the accuracy of the resulting SC picture and the resources required: effort, management involvement, and human and organizational factors (HOF) and SC skills.  Accuracy is judged as realistic, medium or limited; resource requirements as high, medium or low.  The table thus shows the relative strengths and weaknesses of each approach.





| Approach | Accuracy of SC picture | Effort | Management involvement | HOF & SC skills |
|---|---|---|---|---|
| Self-assessment review | Medium | Low (depending on who initiates the self-assessment, regulator or licensee) | Low (to understand deliverables) | Medium (high experience and skills of the reviewers are assumed) |
| Independent assessment review | Medium | Low | Low (to understand deliverables) | Medium (high experience and skills of the reviewers are assumed) |
| Interaction with the licensee at senior level | Limited (however, can support a shared understanding) | Medium | High | Medium |
| Focused safety culture on-site review | Realistic (gives depth in a moment of time) | High | Medium | High |
| Oversight of management system implementation | Medium (reduced if only formal aspects are considered) | Low | Low | Medium |
| Integration into regulatory activities | Medium (when properly trended and analyzed) | Medium (after an intensive initial introduction) | Medium (with an intensive initial support) | Medium (specific training requirement and experience sharing) |
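Encoded as data, the table supports simple filtering, e.g., finding the approaches whose picture of SC is judged realistic.  The abbreviated key names are ours; the ratings are transcribed from the report, with qualifiers omitted for brevity:

```python
# The IAEA table as a dictionary, one entry per assessment approach.
APPROACHES = {
    "self-assessment review":           dict(accuracy="medium", effort="low", mgmt="low", skills="medium"),
    "independent assessment review":    dict(accuracy="medium", effort="low", mgmt="low", skills="medium"),
    "senior-level interaction":         dict(accuracy="limited", effort="medium", mgmt="high", skills="medium"),
    "focused SC on-site review":        dict(accuracy="realistic", effort="high", mgmt="medium", skills="high"),
    "management system oversight":      dict(accuracy="medium", effort="low", mgmt="low", skills="medium"),
    "integration into regulatory work": dict(accuracy="medium", effort="medium", mgmt="medium", skills="medium"),
}

def approaches_rated(criterion: str, rating: str) -> list[str]:
    """Names of approaches with the given rating on a criterion."""
    return [name for name, r in APPROACHES.items() if r[criterion] == rating]

# approaches_rated("accuracy", "realistic") -> ["focused SC on-site review"]
```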




Data collection, analysis and presenting findings to the licensee

The report encourages regulators to use multiple assessment approaches and multiple data collection methods and data sources.  Data collection methods include observations; interviews; reviews of events, licensee documents and regulator documents; discussions with management; and other sources such as questionnaires, surveys, third-party documents and focus groups.  The goal is to approach the target from multiple angles.  “The aim of data analysis is to build a safety culture picture based on the inputs collected. . . . It is a set of interpreted data regarding the organizational practices and the priority of safety within these practices.” (p. 17)

Robust data analysis “requires iterations [and] multi-disciplinary teams. A variety of expertise (technical, human and organizational factors, regulations) are necessary to build a reliable safety culture picture. . . . [and] protect against bias inherent to the multiple sources of data.” (p. 17)

The regulator's picture of SC is discussed with the licensee during periodic or ad hoc meetings.  The objective is to reach agreement on next steps, including the implementation of possible meeting actions and licensee commitments.

Our perspective

The SC content is pretty basic stuff, with zero new insight.  From our viewpoint, the far more interesting issue is the extension of regulatory authority into an admittedly soft, qualitative area.  This issue highlights the fact that the scope of regulatory authority is established by decisions that have socio-political, as well as technical, components.  SC is important, and certainly regulatable.  If a country wants to regulate nuclear SC, then have at it, but there is no hard science that says it is a necessary or even desirable thing to do.

Our big gripe is with the hypocrisy displayed by the NRC, which has a SC policy, not a regulation, but in some cases implements all the steps associated with regulatory oversight discussed in this IAEA report (except evaluation of management personnel).  For evidence, look at how they have been putting Fort Calhoun and Palisades through the wringer.


*  G. Rolina (IAEA), “Regulatory oversight of safety culture in nuclear installations” IAEA TECDOC 1707 (Vienna: International Atomic Energy Agency, 2013).

**  A management system is a “set of interrelated or interacting elements (system) for establishing policies and objectives and enabling the objectives to be achieved in an efficient and effective way. . . . These elements include the structure, resources and processes. Personnel, equipment and organizational culture as well as the documented policies and processes are parts of the management system.” (p. 30)

Wednesday, May 8, 2013

Safety Management and Competitiveness

Jean-Marie Rousseau
We recently came across a paper that should be of significant interest to nuclear safety decision makers.  “Safety Management in a Competitiveness Context” was presented in March 2008 by Jean-Marie Rousseau of the Institut de Radioprotection et de Surete Nucleaire (IRSN).  As the title suggests the paper examines the effects of competitive pressures on a variety of nuclear safety management issues including decision making and the priority accorded safety.  Not surprisingly:

“The trend to ignore or to deny this phenomenon is frequently observed in modern companies.” (p. 7)

The results presented in the paper came from a safety assessment performed by IRSN to examine safety management of EDF [Electricite de France] reactors, including:

“How real is the ‘priority given to safety’ in the daily arbitrations made at all nuclear power plants, particularly with respect to the other operating requirements such as costs, production, and radiation protection or environmental constraints?” (p. 2)

The pertinence is clear as “priority given to safety” is the linchpin of safety culture policy and expected behaviors.  In addition the assessment focused on decision-making processes at both the strategic and operational levels.  As we have argued, decisions can provide significant insights into how safety culture is operationalized by nuclear plant management. 

Rousseau views nuclear operations as a “highly complex socio-technical system” and his paper provides a brief review of historical data where accidents or near misses displayed indications of the impact of competing priorities on safety.  The author notes that competitiveness is necessary, just as safety is, and as such it represents another risk that must be managed at organizational and managerial levels.  This characterization is intriguing and merits further reflection, particularly by regulators in their pursuit of “risk informed regulation.”  Nominally, regulators apply a conceptualization of risk that is hardware and natural phenomena centric.  But safety culture and competitive pressures could also be justified as risks to assuring safety - in fact, much more dynamic risks - and thus be part of the framework of risk informed regulation.*  Often, as is the case with this paper, there is some tendency to assert that achievement of safety coincides with overall performance excellence - which in a broad sense it does - but there remain many instances of considerable tension - and potential risk.

Perhaps most intriguing in the assessment is the evaluation of EDF’s a posteriori analyses of its decision making processes as another dimension of experience feedback.**   We quote the paper at length:

“The study has pointed out that the OSD***, as a feedback experience tool, provides a priori a strong pedagogic framework for the licensee. It offers a context to organize debates about safety and to share safety representations between actors, illustrated by a real problematic situation. It has to be noticed that it is the only tool dedicated to “monitor” the safety/competitiveness relationship.

"But the fundamental position of this tool (“not to make judgment about the decision-maker”) is too restrictive and often becomes “not to analyze the decision”, in terms of results and effects on the given situation.

"As the existence of such a tool is judged positively, it is necessary to improve it towards two main directions:
- To understand the factors favouring the quality of a decision-making process. To this end, it is necessary to take into account the decision context elements such as time pressure, fatigue of actors, availability of supports, difficulties in identifying safety requirements, etc.
- To understand why a “qualitative decision-making process” does not always produce a “right decision”. To this end, it is necessary to analyze the decision itself with the results it produces and the effects it has on the situation.” (p. 8)

We feel this is a very important aspect that currently receives insufficient attention.  Decisions can provide a laboratory of safety management performance and safety culture actualization.  But how often are decisions adequately documented, preserved, critiqued and shared within the organization?  Decisions that yield a bad (reportable) result may receive scrutiny internally and by regulators but our studies indicate there is rarely sufficient forensic analysis - cause analyses are almost always one dimensional and hardware and process oriented.  Decisions with benign outcomes - whether the result of “good” decision making or not - are rarely preserved or assessed.  The potential benefits of detailed consideration of decisions have been demonstrated in many of the independent assessments of accidents (Challenger, Columbia, BP Texas Oil Refinery, etc.) and in research by Perin and others. 

We would go a step further than the proposed enhancements to the OSD.  As Rousseau notes, there are downsides to routine post-hoc scrutiny of actual decisions - for one, it will likely identify management errors even in the absence of a bad decision outcome.  This would be one more pressure on managers already challenged by a highly complex decision environment.  An alternative is to provide managers the opportunity to “practice” making decisions in an environment that supports learning and dialogue on achieving the proper balances in decisions - in other words, in a safety management simulator.  The industry requires licensed operators to practice operations decisions on a simulator for similar reasons - why not nuclear managers charged with making safety decisions?



*  As the IAEA has noted, “A danger of concentrating too much on a quantitative risk value that has been generated by a PSA [probabilistic safety analysis] is that...a well-designed plant can be operated in a less safe manner due to poor safety management by the operator.”  IAEA-TECDOC-1436, Risk Informed Regulation of Nuclear Facilities: Overview of the Current Status, February 2005.

**  EDF implemented safety-availability-radiation protection-environment observatories (SAREOs) to increase awareness of the arbitration between safety and other performance factors.  SAREOs analyze in each station the quality of the decision-making process and propose actions to improve it and to guarantee compliance with rules in any circumstances.  [“Nuclear Safety: our overriding priority,” EDF Group’s file responding to FTSE4Good nuclear criteria]


***  Per Rousseau, “The OSD (Observatory for Safety/Availability) is one of the “safety management levers” implemented by EDF in 1997. Its objective is to perform retrospective analyses of high-stake decisions, in order to improve decision-making processes.” (p. 7)

Friday, May 3, 2013

High Reliability Organizations and Safety Culture

On February 10th, we posted about a report covering lessons for safety culture (SC) that can be gleaned from the social science literature. The report's authors judged that high reliability organization (HRO) literature provided a solid basis for linking individual and organizational assumptions with traits and practices that can affect safety performance. This post explores HRO characteristics and how they can influence SC.

Our source is Managing the Unexpected: Resilient Performance in an Age of Uncertainty* by Karl Weick and Kathleen Sutcliffe. Weick is a leading contemporary HRO scholar. This book is clearly written, with many pithy comments, so lots of quotations are included below to present the authors' views in their own words.

What makes an HRO different?

Many organizations work with risky technologies where the consequences of problems or errors can be catastrophic, use complex management systems and exist in demanding environments. But successful HROs approach their work with a different attitude and practices, an “ongoing mindfulness embedded in practices that enact alertness, broaden attention, reduce distractions, and forestall misleading simplifications.” (p. 3)

Mindfulness

An underlying assumption of HROs is “that gradual . . . development of unexpected events sends weak signals . . . along the way” (p. 63) so constant attention is required. Mindfulness means that “when people act, they are aware of context, of ways in which details differ . . . and of deviations from their expectations.” (p. 32) HROs “maintain continuing alertness to the unexpected in the face of pressure to take cognitive shortcuts.” (p. 19) Mindful organizations “notice the unexpected in the making, halt it or contain it, and restore system functioning.” (p. 21)

It takes a lot of energy to maintain mindfulness. As the authors warn us, “mindful processes unravel pretty fast.” (p. 106) Complacency and hubris are two omnipresent dangers. “Success narrows perceptions, . . . breeds overconfidence . . . and reduces acceptance of opposing points of view. . . . [If] people assume that success demonstrates competence, they are more likely to drift into complacency, . . .” (p. 52) Pressure in the task environment is another potential problem. “As pressure increases, people are more likely to search for confirming information and to ignore information that is inconsistent with their expectations.” (p. 26) The opposite of mindfulness is mindlessness. “Instances of mindlessness occur when people confront weak stimuli, powerful expectations, and strong desires to see what they expect to see.” (p. 88)

Mindfulness can lead to insight and knowledge. “In that brief interval between surprise and successful normalizing lies one of your few opportunities to discover what you don't know.” (p. 31)**

Five principles

HROs follow five principles. The first three cover anticipation of problems and the remaining two cover containment of problems that do arise.

Preoccupation with failure

HROs “treat any lapse as a symptom that something may be wrong with the system, something that could have severe consequences if several separate small errors happened to coincide. . . . they are wary of the potential liabilities of success, including complacency, the temptation to reduce margins of safety, and the drift into automatic processing.” (p. 9)

Managers usually think surprises are bad, evidence of bad planning. However, “Feelings of surprise are diagnostic because they are a solid cue that one's model of the world is flawed.” (p. 104) HROs “Interpret a near miss as danger in the guise of safety rather than safety in the guise of danger. . . . No news is bad news. All news is good news, because it means that the system is responding.” (p. 152)

People in HROs “have a good sense of what needs to go right and a clearer understanding of the factors that might signal that things are unraveling.” (p. 86)

Reluctance to simplify

HROs “welcome diverse experience, skepticism toward received wisdom, and negotiating tactics that reconcile differences of opinion without destroying the nuances that diverse people detect. . . . [They worry that] superficial similarities between the present and the past mask deeper differences that could prove fatal.” (p. 10) “Skepticism thus counteracts complacency . . . .” (p. 155) “Unfortunately, diverse views tend to be disproportionately distributed toward the bottom of the organization, . . .” (p. 95)

The language people use at work can be a catalyst for simplification. A person may initially perceive something different in the environment but using familiar or standard terms to communicate the experience can raise the risk of losing the early warnings the person perceived.

Sensitivity to operations

HROs “are attentive to the front line, . . . Anomalies are noticed while they are still tractable and can still be isolated . . . . People who refuse to speak up out of fear undermine the system, which knows less than it needs to know to work effectively.” (pp. 12-13) “Being sensitive to operations is a unique way to correct failures of foresight.” (p. 97)

In our experience, nuclear plants are generally good in this regard; most include a focus on operations among their critical success factors.

Commitment to resilience

“HROs develop capabilities to detect, contain, and bounce back from those inevitable errors that are part of an indeterminate world.” (p. 14) “. . . environments that HROs face are typically more complex than the HRO systems themselves. Reliability and resilience lie in practices that reduce . . . environmental complexity or increase system complexity.” (p. 113) Because it's difficult or impossible to reduce environmental complexity, the organization needs to make its systems more complex.*** This requires clear thinking and insightful analysis. Unfortunately, actual organizational response to disturbances can fall short. “. . . systems often respond to a disturbance with new rules and new prohibitions designed to prevent the same disruption from happening in the future. This response reduces flexibility to deal with subsequent unpredictable changes.” (p. 72)

Deference to expertise

“Decisions are made on the front line, and authority migrates to the people with the most expertise, regardless of their rank.” (p. 15) Application of expertise “emerges from a collective, cultural belief that the necessary capabilities lie somewhere in the system and that migrating problems [down or up] will find them.” (p. 80) “When tasks are highly interdependent and time is compressed, decisions migrate down . . . Decisions migrate up when events are unique, have potential for very serious consequences, or have political or career ramifications . . .” (p. 100)

This is another ideal that can fail in practice. We've all seen decisions made by the highest ranking person rather than the most qualified one. In other words, “who is right” can trump “what is right.”

Relationship to safety culture

Much of the chapter on culture is based on the ideas of Schein and Reason so we'll focus on key points emphasized by Weick and Sutcliffe. In their view, “culture is something an organization has [practices and controls] that eventually becomes something an organization is [beliefs, attitudes, values].” (p. 114, emphasis added)

“Culture consists of characteristic ways of knowing and sensemaking. . . . Culture is about practices—practices of expecting, managing disconfirmations, sensemaking, learning, and recovering.” (pp. 119-120) A single organization can have different types of culture: an integrative culture that everyone shares, differentiated cultures that are particular to sub-groups and fragmented cultures that describe individuals who don't fit into the first two types. Multiple cultures support the development of more varied responses to nascent problems.

A complete culture strives to be mindful, safe and informed with an emphasis on wariness. As HRO principles are ingrained in an organization, they become part of the culture. The goal is a strong SC that reinforces concern about the unexpected, is open to questions and reporting of failures, views close calls as a failure, is fearful of complacency, resists simplifications, values diversity of opinions and focuses on imperfections in operations.

What else is in the book?

One chapter contains a series of audits (presented as survey questions) to assess an organization's mindfulness and appreciation of the five principles. The audits can show an organization's attitudes and capabilities relative to HROs and relative to its own self-image and goals.

The final chapter describes possible “small wins” a change agent (often an individual) can attempt to achieve in an effort to move his organization more in line with HRO practices, viz., mindfulness and the five principles. For example, “take your team to the actual site where an unexpected event was handled either well or poorly, walk everyone through the decision making that was involved, and reflect on how to handle that event more mindfully.” (p. 144)

The book's case studies include an aircraft carrier, a nuclear power plant,**** a pediatric surgery center and wildland firefighting.

Our perspective

Weick and Sutcliffe draw on the work of many other scholars, including Constance Perin, Charles Perrow, James Reason and Diane Vaughan, all of whom we have discussed in this blog. The book makes many good points. For example, the prescription for mindfulness and the five principles can contribute to an effective context for decision making although it does not comprise a complete management system. The authors recognize that reliability does not mean a complete lack of performance variation; instead, reliability follows from practices that recognize and contain emerging problems. Finally, there is evidence of a systems view, which we espouse, when the authors say “It is this network of relationships taken together—not necessarily any one individual or organization in the group—that can also maintain the big picture of operations . . .” (p. 142)

The authors would have us focus on nascent problems in operations, which is obviously necessary. But another important question is what are the faint signals that the SC is developing problems? What are the precursors to the obvious signs, like increasing backlogs of safety-related work? Could that “human error” that recently occurred be a sign of a SC that is more forgiving of growing organizational mindlessness?

Bottom line: Safetymatters says check out Managing the Unexpected and consider adding it to your library.


* K.E. Weick and K.M. Sutcliffe, Managing the Unexpected: Resilient Performance in an Age of Uncertainty, 2d ed. (San Francisco, CA: Jossey-Bass, 2007). Also, Wikipedia has a very readable summary of HRO history and characteristics.

** More on normalization and rationalization: “On the actual day of battle naked truths may be picked up for the asking. But by the following morning they have already begun to get into their uniforms.” E.A. Cohen and J. Gooch, Military Misfortunes: The Anatomy of Failure in War (New York: Vintage Books, 1990), p. 44, quoted in Managing the Unexpected, p. 31.

*** The prescription to increase system complexity to match the environment is based on the system design principle of requisite variety which means “if you want to cope successfully with a wide variety of inputs, you need a wide variety of responses.” (p. 113)

**** I don't think the authors performed any original research on nuclear plants. But the studies they reviewed led them to conclude that “The primary threat to operations in nuclear plants is the engineering culture, which places a higher value on knowledge that is quantitative, measurable, hard, objective, and formal . . . HROs refuse to draw a hard line between knowledge that is quantitative and knowledge that is qualitative.” (p. 60)