Wednesday, June 30, 2010

Can Safety Culture Be Regulated? (Part 2)

Part 1 of this topic covered the factors important to safety culture and amenable to measurement or assessment, the “known knowns.”   In this Part 2 we’ll review other factors we believe are important to safety culture but cannot be assessed very well, if at all, the “known unknowns” and the potential for factors or relationships important to safety culture that we don’t know about, the “unknown unknowns.”

Known Unknowns

These are factors that are probably important to regulating safety culture but cannot be assessed or cannot be assessed very well.  The hazard they pose is that deficient or declining performance may, over time, damage and degrade a previously adequate safety culture.

Measuring Safety Culture

This is the largest issue facing a regulator.  There is no meter or method that can be applied to an organization to obtain the value of some safety culture metric.  It’s challenging (impossible?) to robustly and validly assess, much less regulate, a variable that cannot be measured.  For a more complete discussion of this issue, please see our June 15, 2010 post.

Trust

If the plant staff does not trust management to do the right thing, even when it costs significant resources, then safety culture will be negatively affected.  How does one measure trust, with a survey?  I don’t think surveys offer more than an instantaneous estimate of any trust metric’s value.

Complacency

Organizations that accept things as they are, or always have been, and see no opportunity or need for improvement are guilty of complacency or, worse, hubris.  Lack of organizational reinforcement for a questioning attitude, especially when the questions may result in lost production or financial costs, is a de facto endorsement of complacency.  Complacency is often easy to see a posteriori but hard to detect as it occurs.

Management competence

Does management implement and maintain consistent and effective management policies and processes?  Is the potential for goal conflict recognized and dealt with (i.e., are priorities set) in a transparent and widely accepted manner?  Organizations may get opinions on their managers’ competence, but not from the regulator.

The NRC does not evaluate plant or owner management competence.  They used to, or at least appeared to be trying to.  Remember the NRC senior management meetings, trending letters, and the Watch List?  While all the “problem” plants had material or work process issues, I believe a contributing factor was that the regulator had lost confidence in the competence of plant management.  This system led to the epidemic of shutdown plants in the 1990s.*  In reaction, politicians became concerned over the financial losses to plant owners and employees, and the Commission became concerned that the staff’s explicit/implicit management evaluation process was neither robust nor valid.

So the NRC replaced a data-informed subjective process with the Reactor Oversight Process (ROP), which looks at a set of “objective” performance indicators and a more subjective inference of cross-cutting issues: human performance, finding and fixing problems (CAP, a known), and management attention to safety and workers' ability to raise safety issues (SCWE, part known and part unknown).  I don’t believe that anyone, especially an outsider like a regulator, can get a reasonable picture of a plant’s safety culture from the “Rope.”  There most certainly are no leading or predictive safety performance indicators in this system.

External influences

These factors include changes in plant ownership, financial health of the owner, environmental regulations, employee perceptions about management’s “real” priorities, third-party assessments, local socio-political pressures and the like.  Any change in these factors could have some effect on safety culture.

Unknown Unknowns

These are the factors that affect safety culture but we don’t know about.  While a lot of smart people have invested significant time and effort in identifying factors that influence safety culture, new possibilities can still emerge.

For example, a new factor has just appeared on our radar screen: executive compensation.  Bob Cudlin has been researching the compensation packages for senior nuclear executives and some of the numbers are eye-popping, especially in comparison to historical utility norms.  Bob will soon post on his findings, including where safety figures into the compensation schemes, an important consideration since much executive compensation is incentive-based.

In addition, there may well be interactions (feedback loops and the like) among the known and unknown factors, perhaps varying in structure and intensity over time, that shape the evolutionary arc of an organization’s safety culture.  Because of such interactions, our hope that safety culture is essentially stable, with a relatively long decay time, may be false; safety culture may be susceptible to sudden drop-offs.

The Bottom Line

Can safety culture be regulated?  At the current state of knowledge, with some “known knowns” but no standard approach to measuring safety culture and no leading safety performance indicators, we’d have to say “Yes, but only to some degree.”  The regulator may claim to have a handle on an organization’s safety culture through SCWE observations and indirect evidence, but we don’t think the regulator is in a good position to predict or even anticipate the next issue or incident related to safety culture in the nuclear industry. 

* In the U.S. in 1997, one couldn’t swing a dead cat without hitting a shut-down nuclear power plant.  17 units were shut down during all or part of that year, out of a total population of 108 units.

Tuesday, June 29, 2010

Regulatory Failure

...or how I learned to stop worrying and love the ROP.  A June 28, 2010 Wall Street Journal column titled “Drilling for Better Information, The Financial Crisis and BP Share a Common Attribute: Regulatory Failure” (link below), while directed at the named cases, could (should) be considered by every regulatory body overseeing high risk technologies.  The highlighted quote near the end of the article provides the necessary impetus: “When there is uncertainty about big risks, regulation ‘perpetually overshoots or undershoots its goals.’”  If true, is there a legitimate question for nuclear regulation as to which it is doing?

Arguably the first issue would be: is there uncertainty about the big risks in nuclear power plant safety?  There is substantial reliance on PRA models and analyses of the plants’ hardware and safety systems, and on that basis “risk-informed” regulatory decisions are made.  The models are sophisticated, highly refined and accepted by most experts.  But does that mean we have confirmation of the risk values calculated in this manner?  Not clear.  Looking beyond the hardware to human performance, it becomes clearer that there is substantial uncertainty about this risk component.  The NRC and the industry have acknowledged both the importance of safety culture to nuclear safety and the lack of metrics to measure it.  (I suspect this will be addressed in an upcoming post by my colleague Lewis Conner when he talks about “known unknowns.”)  The ROP is the standard bearer for nuclear plant safety performance metrics but does not address culture or management performance.  The ROP indicators are almost universally green for all nuclear plants, making one wonder how well the ROP can differentiate performance.  Nor is it easy to recall an instance when ROP indicators provided a leading signal of degrading safety performance.  The NRC seems content to assign responsibility for safety culture to licensees and regulate on the basis of outcomes.

If one concludes on this basis there is uncertainty about nuclear risks, is there reason to believe the NRC is overshooting or undershooting?  I admit that I don’t really know.  At the most recent Commission meeting there was a colloquy regarding the value of the ROP vis-à-vis safety culture.  My recollection is that one of the NRR managers offered that the consistently good ROP metrics across the industry seemed to confirm that safety culture must be pretty good as well . . . then quickly amended his remark to note that no correlation between ROP and safety culture had been established.  So my concern is, does the NRC know where it is?  Undershooting or overshooting?

Monday, June 28, 2010

Can Safety Culture Be Regulated? (Part 1)

One of our recent posts questioned whether safety culture is measurable.  Now we will slide out a bit further on a limb and wonder aloud if safety culture can be effectively regulated.  We are not alone in thinking about this.  In fact, one expert has flatly stated “Since safety culture cannot be ‘regulated’, appraisal of the safety culture in operating organizations becomes a major challenge for regulatory authorities.”*

The recent incidents in the coal mining and oil drilling industries reinforce the idea that safety culture may not be amenable to regulation in the usual sense of the term, i.e., as compliance with rules and regulations based on behavior or artifacts that can be directly observed and judged.  The government can count regulatory infractions and casually observe employees, but can it look into an organization, assess what is there and then, if necessary, implement interventions that can be defended to the company, Congress and the public?

There are many variables, challenges and obstacles to consider in the effective regulation of safety culture.  To facilitate discussion of these factors, I have adapted the Rumsfeld (yes, that one) typology** and sorted some of them into “known knowns”, “known unknowns”, and “unknown unknowns.”  The set of factors listed is intended to be illustrative and is not claimed to be complete.

Known Knowns

These are factors that are widely believed to be important to safety culture and are amenable to assessment in some robust (repeatable) and valid (accurate) manner.  An adequate safety culture will not long tolerate sub-standard performance in these areas.  Conversely, deficient performance in any of these areas will, over time, damage and degrade a previously adequate safety culture.  We’re not claiming that these factors will always be accurately assessed but we’ll argue that it should be possible to do so.

Corrective action program (CAP)

This is the system for fixing problems.  Increasing corrective action backlogs, repeated occurrences of the same or similar problems, and failure to address the root causes of problems are signs that the organization can’t or won’t solve its problems.  In an adequate safety culture, the organization will fix the current instance of a problem and take steps to prevent the same or similar problems from recurring in the future.

Process reviews

The work of an organization gets done by implementing processes.  Procedural deficiencies, workarounds, and repeated human errors indicate an organization that can’t or won’t align its documented work processes with the way work is actually performed.  An important element of safety culture is that employees have confidence in procedures and processes. 

Self assessments

An adequate safety culture is characterized by few, if any, limits on the scope of assessments or the authority of assessors.  Assessments do not repeatedly identify the same or similar opportunities for improvement or promote trivial improvements (aka “rearranging the deck chairs”).  In addition, independent external evaluations are used to confirm the findings and recommendations of self assessments.

Management commitment

In an adequate safety culture, top management exhibits a real and visible commitment to safety management and safety culture.  Note that this is more limited than the state of overall management competence, which we’ll cover in part 2.

Safety conscious work environment (SCWE)

Are employees willing to make complaints about safety-related issues?  Do they fear retribution if they do so?  Are they telling the truth to regulators or surveyors?  In an adequate safety culture, the answers are “yes,” “no” and “yes.”  We are not convinced that SCWE is a true "known known" given the potential issues with the methods used to assess it (click the Safety Culture Survey label to see our previous comments on surveys and interviews) but we'll give the regulator the benefit of the doubt on this one.

A lot of information can be reliably collected on the “known knowns.”  For our purpose, though, there is a single strategic question with respect to them, viz., do the known knowns provide a sufficient dataset for assessing and regulating an organization’s safety culture?  We’ll hold off answering that question until part 2 where we’ll review other factors we believe are important to safety culture but cannot be assessed very well, if at all, and the potential for factors or relationships that are important to safety culture but we don’t even know about.

* Annick Carnino, "Management of Safety, Safety Culture and Self Assessment," Top Safe, 15-17 April 1998, Valencia, Spain.  Ms. Carnino is the former Director, Division of Nuclear Installation Safety, International Atomic Energy Agency.  This is a great paper, covering every important aspect of safety management, and reads like it was recently written.  It’s hard to believe it is over ten years old.

** NATO HQ, Brussels, Press Conference by U.S. Secretary of Defense Donald Rumsfeld, June 6, 2002. The exact quote: “There are known unknowns. That is to say, there are things we now know we don’t know. But there are also unknown unknowns.  These are the things we do not know we don’t know.”  Referenced by Errol Morris in a New York Times Opinionator article, “The Anosognosic’s Dilemma: Something’s Wrong but You’ll Never Know What It Is (Part 1)”, June 20, 2010.

Thursday, June 24, 2010

When Money Motivates

A website that has caught our attention is called “Nudge,” whose focus is on improving decisions and “choice architecture.”  A June 1, 2010 post titled “When Money Motivates Employees and When It Doesn’t” includes some material that bears on issues of safety culture and safety management.  The post is actually an animated video presentation that runs about 11 minutes.

As is common in the business world, and increasingly in the nuclear generation business, incentives are part of many employees’ compensation packages, and in the case of senior management can be quite substantial.  Incentives can be cash bonuses, stock participation awards, or similar.  Incentives are tied to the achievement of certain specified performance objectives that can target broad corporate-level metrics as well as more specific ones associated with a manager’s responsibilities, e.g., plant capacity factor and budget.  Safety performance metrics may or may not be explicitly part of the incentive program; safety may be viewed as an absolute performance requirement and sometimes as a condition precedent to accessing other performance incentives.

So, as the Nudge post asks, do money incentives provide an appropriate motivation for employees?  Their short answer: only under certain limited circumstances.  If an employee’s responsibilities are relatively simple and straightforward, involving mechanical skills or rudimentary cognitive skills, more reward leads to more performance.  On the other hand, if complex or sophisticated cognitive skills or creative thinking are required, then not only is the direct connection lost, but the reverse may even be true - the incentive results in poorer performance.

Now this is social “science” theory put forward by a team of economists and sociologists, and it appears to be a “neat” formulation.  But it is a little hard to judge the validity of the findings as only minimal information regarding the studies is provided.  The authors claim that many, many similar studies come to similar conclusions, so the body of research may in fact be persuasive.  I would have to say that my own experience is not necessarily consistent with these findings.  Incentives can be a powerful driver of results - albeit often limited to the specific results targeted by the incentives - leading to unintended consequences in other performance areas that are not targeted or that may get sacrificed in the pursuit of the targeted goals.

This leads to what I found to be the more interesting observation from the “Money Motivates” post: the researchers believe the best approach is to pay people enough to effectively “take money off the table”, allowing them to balance all relevant job priorities. The researchers concluded that people are basically “purpose maximizers” meaning that we are motivated to achieve the overarching goals and purpose of our jobs. Problems arise when purpose and profit motives are not aligned or profit is paramount. Where safety is a vital component of purpose, it is possible to see where incentives can lead to compromises.

What is interesting to us is the intersection of incentives and the related issue of competing priorities and pressures on nuclear managers when balancing safety and other business objectives. Incentives are really just a different form of pressure, individualized to personal success and gain. Are there implications for nuclear safety management? How common are incentives for nuclear managers and what specific performance goals are targeted? How does safety performance factor in and is there the potential for safety and incentives not to be aligned? Has there been any assessment of the impact of incentives in cases where safety culture problems have occurred?

Friday, June 18, 2010

Assessing Safety Culture

In our June 15th post, we reported on Wahlström and Rollenhagen’s* concern that trying to measure safety culture could do more harm than good. However, the authors go on to assert that safety culture can and should be assessed. They identify different methods that can be used to perform such assessments, including peer reviews and self assessments. They conclude “Ideally safety culture assessments should be carried out as an interaction between an assessment team and a host organization and it should be aimed at the creation of an awareness of potential safety threats . . . .” (§ 7) We certainly agree with that observation.

We are particularly interested in their comments on safety (performance) indicators, another tool for assessing safety culture. We agree that “. . . most indicators are lagging in the sense that they summarize past safety performance” (§ 6.2) and thus may not be indicative of future performance. In an effort to improve performance indicators, the authors suggest “One approach towards leading safety indicators may be to start with a set of necessary conditions from which one can obtain a reasonable model of how safety is constructed. The necessary conditions would then suggest a set of variables that may be assessed as precursors for safety. An assessment could then be obtained using an ordinal scale and several variables could be combined to set an alarm level.” (ibid.)

We believe the performance indicator problem should be approached somewhat differently.  Safety culture, safety management and safety performance do not exist in a vacuum.  We advocate using the principles of system dynamics to construct an organizational performance model that shows safety as both input to and output from other, sometimes competing, organizational goals, resource constraints and management actions.  This is a more robust approach because it can show not only that safety culture is getting stronger or slipping, but also why, i.e., which other organizational factors are causing the change.  If the culture is slipping, then analysis of system information can suggest where the most cost-effective interventions can be made.  For more information on using system dynamics to model safety culture, please visit our companion website, nuclearsafetysim.com.

* Björn Wahlström, Carl Rollenhagen. Assessments of safety culture – to measure or not? Paper presented at the 14th European Congress of Work and Organizational Psychology, May 13-16, 2009, Santiago de Compostela, Spain. The authors are also connected with the LearnSafe project, which we have discussed in earlier posts (click the LearnSafe label to see them.)

Tuesday, June 15, 2010

Can Measuring Safety Culture Harm It?

That’s a question raised in a paper by Björn Wahlström and Carl Rollenhagen.* Among other issues, the authors question the reliability and validity of safety culture measurement tools, especially the questionnaires and interviews often used to assess safety culture. One problem is that such measurement tools, when applied by outsiders such as regulators, can result in the interviewees trying to game the outcome. “. . . the more or less explicit threat to shut down a badly performing plant will most likely at least in a hostile regulatory climate, bring deceit and delusion into a regulatory assessment of safety culture.” (§ 5.3)

Another potential problem is created by a string of good safety culture scores.  We have often said success breeds complacency and an unjustified confidence that past results will lead to future success.  The nuclear industry does not prepare for surprises, yet, as the authors note, the current state of safety thinking was inspired by two major accidents, not incremental progress. (§ 5.2)  Where is the next Black Swan lurking?

Surprise after success can occur on a much smaller scale. After the recent flap at Vermont Yankee, evaluators spent considerable time poring over the plant’s most recent safety culture survey to see what insight it offered into the behavior of the staff involved with the misleading report on leaking pipes. I don’t think they found much. Entergy’s law firm conducted interviews at the plant and concluded the safety culture was and is strong. See the opening paragraph for a possible interpretation.

The authors also note that if safety culture is an emergent property of an organization, then it may not be measurable at all because emergent properties develop without conscious control actions. (§ 4.2) See our earlier post for a discussion of safety culture as emergent property.

While safety culture may not be measurable, it is possible to assess it. The authors’ thoughts on how to perform useful assessments will be reviewed in a future post.

* Björn Wahlström, Carl Rollenhagen. Assessments of safety culture – to measure or not? Paper presented at the 14th European Congress of Work and Organizational Psychology, May 13-16, 2009, Santiago de Compostela, Spain. The authors are also connected with the LearnSafe project, which we have discussed in earlier posts (click the LearnSafe label to see them.)

Saturday, June 12, 2010

HBR and BP

There’s a good essay on a Harvard Business Review blog describing how decision-making in high risk enterprises may be affected by BP’s disaster in the Gulf. Not surprisingly, the author’s observations include creating a robust safety culture “where the most stringent safety management will never be compromised for economic reasons.” However, as our Bob Cudlin points out in his comment below the article, such a state may represent a goal rather than reality because safety must co-exist in the same success space as other business and practical imperatives. The real, and arguably more difficult question is: How does safety culture ensure a calculus of safety and risk so that safety measures and management are adequate for the task at hand?

Friday, June 11, 2010

Safety Culture Issue at Callaway. Or Not.

We just read a KBIA* report on the handling of an employee’s safety concern at the Callaway nuclear plant that piqued our interest. This particular case was first reported in 2009 but has not had widespread media attention so we are passing it on to you.

The history seems straightforward: an employee raised a safety concern after an operational incident, was rebuffed by management, drafted a discrimination complaint for the U.S. Dept. of Labor, received (and accepted) a $550K settlement offer from the plant owner, and went to work elsewhere. The owner claimed the settlement fell under the NRC’s Alternative Dispute Resolution (ADR) process, and the NRC agreed.

We have no special knowledge of, nor business interest in, this case.  It may be a tempest in a teapot but we think it raises some interesting questions from a safety culture perspective.

First, here is another instance where an employee feels he must go outside the organization to get attention to a safety concern.  The issue didn’t seem to be that significant, at most an oversight by the operators or a deficient procedure.  Why couldn’t the plant’s safety culture process his concern, determine an appropriate resolution, and move on?

Second, why was the company so ready to pony up $550K? That is a lot of dough and seems a bit strange. Even the employee noted that it was a generous offer. It makes one wonder what else was going on in the background. To encourage licensees to participate in ADR, the NRC closes investigations into alleged discrimination against employees when an ADR settlement is reached. Can safety essentially be for sale under ADR if an owner can settle with an employee?

Third, what happened to the original safety concern? According to another source,** the NRC found the operators’ actions to be “not prudent” but did not penalize any operators. Did the plant ever take any steps to address the issue to avoid repetition?


* P. Sweet and R. Townsend, KBIA Investigative Report: Looking Into Callaway Nuclear Power Plant’s “Safety Culture” (May 24, 2010).  KBIA is an NPR-member radio station owned by the U. of Missouri school of journalism.

**  FOCUS/Midwest website, "Did Ameren pay a whistleblower to shut up and go away?" (Jan. 4, 2009).

Tuesday, June 8, 2010

Toothpaste and Oil Slicks

At the end of last week came the surprise announcement from the former Dominion engineer, David Collins, that he was withdrawing his allegations regarding his former employer’s safety management and the NRC’s ability to provide effective oversight of safety culture.* The reasons for the withdrawal are still unclear though Collins cited lack of support by local politicians and environmental groups.

What is to be made of this? As we stated in a post at the time of the original allegations, we don’t have any specific insight into the bases for the allegations. We did indicate that how Dominion and the NRC would go about addressing the allegations might present some challenges.

What can be said about the allegations with more certainty is that they will not go away.  Like the proverbial toothpaste, allegations can’t be put back into the tube, and they will need to be addressed on their merits.  We assume that Collins acted in good faith in raising the allegations.  In addition, a strong safety culture at Dominion and the NRC should almost welcome the opportunity to evaluate and respond to such matters.  A linchpin of any robust safety culture is the encouragement for stakeholders to raise safety concerns and for the organization to respond to them in an open and effective manner.  Even if the allegations turn out not to have merit, it will still have been an opportunity for the process to work.

In a somewhat similar vein, the fallout (I am mixing my metaphors) from the oil released into the Gulf from the BP spill will remain and have to be dealt with long after the source is capped or shut off.  It will serve as an ongoing reminder of the consequences of decisions where safety and business objectives try to occupy a very limited success space.  In recent days there have been extensive pieces* in the Wall Street Journal and New York Times delineating in considerable detail the events and decision making leading up to the blowout.  These accounts are worthy of reading and digesting by anyone involved in high risk industries.  Two things made a particular impression.  One, it is clear that the environment leading up to the blowout included fairly significant schedule and cost pressures.  What is not clear at this time is to what extent those business pressures contributed to the outcome.  There are numerous cited instances where best practices were not followed and concerns or recommendations for prudent actions were brushed aside.  One wishes the reporters had pursued this issue in more depth to find out “Why?”  Two, the eventual catastrophic outcome was the result of a series of many seemingly less significant decisions and developments.  In other words, it was a cumulative process that apparently never flashed an unmistakable warning alarm.  In this respect it reminds us of the need for safety management to maintain a highly developed “systems” understanding with the ability to connect the dots of risk.

* Links below



Thursday, June 3, 2010

25 Standard Deviation Moves

A Reuters Breakingviews commentary in today’s New York Times makes some interesting arguments about the consequences of the BP oil spill on the energy industry.  The commentary draws parallels between BP and the financial implosion that led to the Lehman Brothers bankruptcy.  ". . . flawed risk management, systemic hazard, and regulatory incompetence" are cited as the common causes, along with business models that did not take account of the possibility of "25 standard deviation moves".  These factors will inevitably lead to government intervention and industry consolidation, as the estimated $27 billion in claims (a current estimate for the BP spill) is ". . . a liability no investor will be comfortable taking, . . ."

While much of this commentary makes sense, we think it is missing a big part of the picture by not focusing on the essential need for much more rigorous safety management. By all reports, the safety performance of BP is a significant outlier in the oil industry; maybe not 25 sigma but 2 or 3 sigma at least. We have posted previously about BP and its safety deficiencies and its apparent inability to learn from past mistakes. There has also been ample analysis of the events leading up to the spill to suggest that a greater commitment to safety could, and likely would, have avoided the blowout. Safety commitment and safety culture provide context, direction and constraints for risk calculations. The potential consequences of a deep sea accident will remain very large, but the probability of the event can and should be brought much lower. Simply configuring energy companies with vastly deep pockets seems unlikely to be a sufficient remedy. For one, money damages are at best an imperfect response to such a disaster. More important, a repeat of this type of event would likely result in a ban on deep sea drilling regardless of the financial resources of the driller.

In the nuclear industry the potentially large consequences of an incident have, so far, been assumed by the government.  In this respect there is something of a parallel to the financial crisis where the government stepped in to bail out the "too large to fail" entities.  Aside from the obvious lessons of the BP spill, nuclear industry participants have to ensure that their safety commitment is both reality and public perception, or there may be some collateral damage as policy makers think about how liabilities in high risk industries, including nuclear, are being apportioned.

Tuesday, June 1, 2010

Underestimating Risk and Cost

Good article in today's New York Times Magazine Preview about economic decision making in general and the oil industry in particular. In summary, when an event is difficult to imagine (e.g., the current BP disaster), people tend to underestimate the probability of it occurring; when it's easier to imagine (e.g., a domestic terrorist attack after 9/11), people tend to overestimate the probability. Now add government caps on liability and decision-making can get really skewed, with unreasonable estimates of both event-related probabilities and costs.

The relevance of this decision-making model to the nuclear industry is obvious but we want to focus on something the article didn't mention: the role of safety culture. Nuclear safety culture guides planning for and reacting to unexpected, negative events. On the planning side, culture can encourage making dispassionate, fact-based decisions regarding unfavorable event probabilities and potential consequences. However, if such an event occurs, then affected personnel will respond consistent with their training and cultural expectations.