Wednesday, April 28, 2010

Safety Culture: Cause or Context (part 2)

In an earlier post, we discussed how “mental models” of safety culture affect perceptions about how safety culture interacts with other organizational factors and what interventions can be taken if safety culture issues arise. We also described two mental models, the Causal Attitude and Engineered Organization. This post describes a different mental model, one that puts greater emphasis on safety culture as a context for organizational action.

Safety Culture as Emergent and Indeterminate

If the High Reliability Organization model is basically optimistic, the Emergent and Indeterminate model is more skeptical, even pessimistic, as some authors believe that accidents are unavoidable in complex, closely linked systems. In this view, “the consequences of safety culture cannot be engineered and only probabilistically predicted.” Further, “safety is understood as an elusive, inspirational asymptote, and more often only one of a number of competing organizational objectives.” (p. 356)* Safety culture is not a cause of action, but provides the context in which action occurs. Efforts to exhaustively model (and thus eventually manage) the organization are doomed to failure because the organization is constantly adapting and evolving.

This model sees that the same processes that produce the ordinary and routine stuff of everyday organizational life also produce the messages of impending problems. But the organization’s necessary cognitive processes tend to normalize and homogenize; the organization can’t very well be expected to treat every input as novel or not previously experienced. In addition, distributed work processes and official security policies can limit the information available to individuals. Troublesome information may be buried or discredited. And finally, “Dangers that are neither spectacular, sudden, nor disastrous, or that do not resonate with symbolic fears, can remain ignored and unattended, . . . . “ (p. 357)

We don’t believe safety significant events are inevitable in nuclear organizations but we do believe that the hubris of organizational designers can lead to specific problems, viz., the tendency to ignore data that does not comport with established categories. In our work, we promote a systems approach, based on system dynamics and probabilistic thinking, but we recognize that any mental or physical model of an actual, evolving organization is just that, a model. And the problem with models is that their representation of reality, their “fit,” can change with time. With ongoing attention and effort, the fit may become better but that is a goal, not a guaranteed outcome.

Lessons Learned

What are the takeaways from this review? First, mental models are important. They provide a framework for understanding the world and its information flows, a framework that the holder may believe to be objective but is actually quite subjective and creates biases that can cause the holder to ignore information that doesn’t fit into the model.

Second, the people who are involved in the safety culture discussion do not share a common mental model of safety culture. They form their models with different assumptions, e.g., some think safety culture is a force that can and does affect the vector of organizational behavior, while others believe it is a context that influences, but does not determine, organizational and individual decisions.

Third, safety culture cannot be extracted from its immediate circumstances and examined in isolation. Safety culture always exists in some larger situation, a world of competing goals and significant uncertainty with respect to key factors that determine the organization’s future.

Fourth, there is a risk of over-reliance on surveys to provide some kind of "truth" about an organization’s safety culture, especially if actual experience is judged or minimized to fit the survey results. Since there is already debate about what surveys measure (safety culture or safety climate?), we advise caution.

Finally, in addition to appropriate models and analyses, training, supervision and management, a vital component of the safety system is the individual who senses that something is just not right and who is supported by an organization that allows, rather than vilifies, alternative interpretations of data.

* This post draws on Susan S. Silbey, "Taming Prometheus: Talk of Safety and Culture," Annual Review of Sociology, Volume 35, September 2009, pp. 341-369.

Monday, April 26, 2010

The Massey Mess

A postscript to our prior posts re the Massey coal mine explosion two weeks ago. The fallout from the safety issues at Massey mines is reaching a crescendo, as even the President of the United States is quoted as calling the accident "a failure, first and foremost, of management."

The full article is embedded in this post. It is clear that Massey will be under the spotlight for some time to come with Federal and state investigations being initiated. One wonders if the CEO will or should survive the scrutiny. For us the takeaway from this, and other examples such as Vermont Yankee, is a reminder not to underestimate the potential consequences of safety culture failures. They point directly at the safety management system including management itself. Once that door is opened, the ripple effects can range far downstream and throughout a business.

“Serious Systemic Safety Problem” at BP

Yes, it’s BP in the news again. In case you hadn’t noticed, last week’s fire, explosion and loss of life occurred at an offshore drilling rig working for BP. In the words of an OSHA official, BP still has a “serious, systemic safety problem” across the company. Check the embedded web page for the story.

Thursday, April 22, 2010

Classroom in the Sky

The opening of much of the European airspace in the last several days has provided a rare opportunity to observe in real time some dynamics of safety decision making. On the one hand there have been the airlines, which have been contending it is safe to resume flights, and on the other the regulatory authorities, who had been taking a more conservative stance. The question posed in our prior post was to what extent business pressures were influencing the airlines’ position and to what extent that pressure, perhaps transmuting into political pressure, might influence regulators’ decisions. Perhaps most importantly, how could one tell?

We have extracted some interesting quotes from recent media reporting of the issue.

Civil aviation officials said their decision to reopen terminals where thousands of weary travelers had camped out was based on science, not on the undeniable pressure put on them by the airlines. "The only priority that we consider is safety. We were trying to assess the safe operating levels for aircraft engines with ash," said Eamonn Brennan, chief executive of the Irish Aviation Authority. "Pressure to restart flights had been intense." Despite their protests, the timing of some reopenings seemed dictated by airlines' commercial pressures.

"It's important to realize that we've never experienced in Europe something like this before. . . . We needed the four days of test flights, the empirical data, to put this together and to understand the levels of ash that engines can absorb." Even as airports reopened, a debate swirled about the safety of flying without more extensive analysis of the risks, as it appeared that governments were operating without consistent international guidelines based on solid data. "What's missing is some sort of standard, based on science, that gives an indication of a safe level of volcanic ash. . . ." "Some safety experts said pressure from hard-hit airlines and stranded passengers had prompted regulators to venture into uncharted territory with respect to the ash. In the past, the key was simply to avoid ash plumes."

How can it be both ways - regulators did not respond to pressure, or regulators acted only on new analysis of data? The answer may lie in our old friend, the theory of normalization of deviance (ref. The Challenger Launch Decision). As we have discussed in prior posts, normalization is a process whereby an organization’s safety standards may be reinterpreted (to a lower or more accommodating level) over time due to complex interactions of cultural, organizational, regulatory and environmental factors. The fascinating aspects are that this process is not readily apparent to those involved, and decisions then made in accordance with the reinterpreted standards are not viewed as deviant or inconsistent with, e.g., “safety is our highest priority.” Thus events which heretofore were viewed as discrepant no longer are, and are thus “normalized.” Aviation authorities believe their decisions are entirely safety-based. Yet to observers outside the organization it very much appears that the bar has been lowered, since what is considered safe today was not yesterday.

Monday, April 19, 2010

The View On The Ground

A brief follow-up to the prior post re situational factors that could be in play in reaching a decision about resuming airline flights in Europe.  Fox News has been polling to assess the views of the public and the results of the poll are provided in the adjacent box.  Note that the overwhelming sentiment is that safety should be the priority.  Also note the wording of the choices, where the “yes” option appears to imply that flights should be resumed based on the “other” priorities such as money and passenger pressure, while the “no” option is based on safety being the priority.  Obviously the wording makes the “yes” option appear to be one where safety may be sacrificed.

So the results are hardly surprising.  But what do the results really mean?  For one it reminds us of the importance of the wording of questions in a survey.  It also illustrates how easy it is to get a large positive response that “safety should be the priority”.  Would the responses have been different if the “yes” option made it clear that airlines believed it was safe to fly and had done test flights to verify?  Does the wording of the “no” option create a false impression that an “all clear” (presumably by regulators) would equate to absolute safety, or at least be arrived at without consideration of other factors such as the need to get air travel resumed? 

Note:  Regulatory authorities in Europe agreed late today to allow limited resumption of air travel starting Tuesday.  Is this an “all clear” or a more nuanced determination that it is safe enough?

The View from 30,000 Feet

News over the last few days has been dominated by the ongoing eruption of the volcano in Iceland and its impact on air travel across Europe.  The safety issue is the potential for the ash cloud, at around 30,000 feet altitude, to seriously damage aircraft jet engines.  Thus the air safety regulators have closed the airspace for the last 4 days, creating huge backlogs of passengers and costing airlines $200 million per day.  So we have a firsthand study in the situational dynamics of safety culture.

For the first several days there appeared to be general consensus among the airlines, the safety regulators and politicians that closing the airspace was necessary and prudent.  On Sunday the airlines broke ranks and began openly lobbying for resumption of flights.  Several of the major airlines flew test flights to assess the performance of aircraft and now contend it is safe to fly.

Regulators so far have insisted on keeping the airspace shut down with the earliest resumption possible by Monday evening. 

Are the airlines and the regulators simply reaching different conclusions based on the same information?  Or are airlines feeling the pressures of money and customers and the regulators are not?  Or are the regulators simply being more conservative?  Interestingly there does not appear to have been much overt political pressure to date.  Would you expect regulators to be more sensitive to this source of pressure?  Or immune from it?  How would you know?

I don’t know the answers to these questions but I do think it is unlikely that the situational parameters are not playing some role here.  In fact it seems hard to explain the different points of view without them.  But if it is true, does it necessarily mean that the airlines’ safety cultures are not robust - or is it also possible that the airlines have done exactly what safety culture demands - according safety its appropriate priority but still reaching a decision that flights can be resumed safely?  A robust safety culture does not demand insulation from situational factors, just that they not inappropriately skew the balancing of safety and other business needs.  How exactly one does that in a transparent manner is perhaps the most important indicator of safety culture.

Sunday, April 18, 2010

Safety Culture: Cause or Context (part 1)

As we have mentioned before, we are perplexed that people are still spending time working on safety culture definitions. After all, it’s not because of some definitional issue that problems associated with safety culture arise at nuclear plants. Perhaps one contributing factor to the ongoing discussion is that people hold different views of what the essence of safety culture is, views that are influenced by individuals’ backgrounds, experiences and expectations. Consultants, lawyers, engineers, managers, workers and social scientists can and do have different perceptions of safety culture. Using a term from system dynamics, they have different “mental models.”

Examining these mental models is not an empty semantic exercise; one’s mental model of safety culture determines (a) the degree to which one believes it is measurable, manageable or independent, i.e. separate from other organizational features, (b) whether safety culture is causally related to actions or simply a context for actions, and (c) most importantly, what specific strategies for improving safety performance might work.

To help identify different mental models, we will refer to a 2009 academic article by Susan Silbey,* a sociology professor at MIT. Her article does a good job of reviewing the voluminous safety culture literature and assigning authors and concepts into three main categories: Culture as (a) Causal Attitude, (b) Engineered Organization, and (c) Emergent and Indeterminate. To fit into our blog format, we will greatly summarize her paper, focusing on points that illustrate our notion of different mental models, and publish this analysis in two parts.

Safety Culture as Causal Attitude

In this model, safety culture is a general concept that refers to an organization’s collective values, beliefs, assumptions, and norms, often assessed using survey instruments. Explanations of accidents and incidents that focus on or blame an organization’s safety culture are really saying that the then-existing safety culture somehow caused the negative events to occur or can be linked to the events by some causal chain. (For an example of this approach, refer to the Baker Report on the 2005 BP Texas City refinery accident.)

Adopting this mental model, it follows logically that the corrective action should be to fix the safety culture. We’ve all seen, or been a part of, this – a new management team, more training, different procedures, meetings, closer supervision – all intended to fix something that cannot be seen but is explicitly or implicitly believed to be changeable and to some extent measurable.

This approach can and does work in the short run. Problems can arise in the longer-term as non-safety performance goals demand attention; apparent success in the safety area breeds complacency; or repetitive, monotonous reinforcement becomes less effective, leading to safety culture decay. See our post of March 22, 2010 for a discussion of the decay phenomenon.

Perhaps because this model reinforces the notion that safety culture is an independent organizational characteristic, the model encourages involved parties (plant owners, regulators, the public) to view safety culture with a relatively narrow field of view. Periodic surveys and regulatory observations conclude a plant’s safety culture is satisfactory and everyone who counts accepts that conclusion. But then an event occurs like the recent situation at Vermont Yankee and suddenly people (or at least we) are asking: How can eleven employees at a plant with a good safety culture (as indicated by survey) produce or endorse a report that can mislead reviewers on a topic that can affect public health and safety?

Safety Culture as Engineered Organization

This model is evidenced in the work of the High Reliability Organization (HRO) writers. Their general concept of safety culture appears similar to the Causal Attitude camp but HRO differs in “its explicit articulation of the organizational configuration and practices that should make organizations more reliably safe.” (Silbey, p. 353) It focuses on an organization’s learning culture where “organizational learning takes place through trial and error, supplemented by anticipatory simulations.” Believers are basically optimistic that effective organizational prescriptions for achieving safety goals can be identified, specified and implemented.

This model appears to work best in a command and control organization, i.e., the military. Why? Primarily because a specific military service is characterized by a homogeneous organizational culture, i.e., norms are shared both hierarchically (up and down) and across the service. Frequent personnel transfers at all organizational levels remove people from one situation and reinsert them into another, similar situation. Many of the physical settings are similar – one ship of a certain type and class looks pretty much like another; military bases have a common set of facilities.

In contrast, commercial nuclear plants represent a somewhat different population. Many staff members work more or less permanently at a specific plant, and the industry could not have come up with more varied physical plant configurations if it had tried. Perhaps it is not surprising that HRO research, including reviews of nuclear plants, has shown strong cultural homogeneity within individual organizations but a lack of shared culture across organizations.

At its best, the model can instill “processes of collective mindfulness” or “interpretive work directed at weak signals.” At its worst, if everyone sees things alike, an organization can “[drift] toward[s] inertia without consideration that things could be different.” (Weick 1999, quoted in Silbey, p.354) Because HRO is highly dependent on cultural homogeneity, it may be less conscious of growing problems if the organization starts to slowly go off the rails, a la the space shuttle Challenger.

We have seen efforts to implement this model at individual nuclear plants, usually by trying to get everything done “the Navy way.” We have even promoted this view when we talked back in the late 1990s about the benefits of industry consolidation and the best practices that were being implemented by Advanced Nuclear Enterprises (a term Bob coined in 1996). Today, we can see that this model provides a temporary, partial answer but can face challenges in the longer run if it does not constantly adjust to the dynamic nature of safety culture.

Stay tuned for Safety Culture: Cause or Context (part 2).

* Susan S. Silbey, "Taming Prometheus: Talk of Safety and Culture," Annual Review of Sociology, Volume 35, September 2009, pp. 341-369.

Thursday, April 15, 2010

Safety Culture Assessments and Surveys

It appears that the safety culture at Vermont Yankee continues to interest the NRC, given the discussion during the recent public meeting related to VY's safety culture survey. We will soon be posting our additional thoughts on the nature of safety culture but for some background material, we suggest our posts of August 17, 2009 (Safety Culture Assessment), August 24, 2009 (Assessment Results) and August 26, 2009 (Can Assessments Identify/Breed Complacency). You might want to check these out in light of the attention safety culture surveys are receiving.

Tuesday, April 13, 2010

Vermont Yankee (part 5) - Muddy Water

In our April 5, 2010 post re Vermont Yankee we provided some initial thoughts on the report of the independent investigator regarding misleading statements provided by Entergy personnel to Vermont regulators, as contained in Entergy’s March 31, 2010 response to a March 1, 2010 NRC Demand for Information.* The Entergy filing also provides more detail on follow-up actions including an assessment of current site safety culture. In this post, we offer some additional observations and questions.
First, in our initial March 3, 2010 post regarding the VY situation, we disputed a third party’s prediction that the administrative actions Entergy took against certain employees might have a detrimental effect on the safety culture at the plant due to the way Entergy was treating its employees. In reality it appeared to us that any detrimental impact on safety culture would be more likely if Entergy had not taken appropriate actions. In its report to the NRC, Entergy provides the results of a follow-up assessment confirming that after the personnel actions employees were even more likely to raise concerns.

Also in our initial post we speculated that the Vermont Yankee events could have consequences for Entergy’s proposed spinout of six nuclear plants into a separate subsidiary. Since then Entergy has announced the cancellation of the spinout after a decision by New York re the extension of permits for its Indian Point plants.

However, after a careful review of the March 31 Entergy response, we are still left with water that is more than a little bit muddy. Entergy says a Synergy assessment a few months before the reporting event found safety culture at Vermont Yankee to be strong. After the event, Entergy states safety culture is as strong or stronger, and with regard to the replaced staff, Entergy “continues to have confidence in the integrity of the affected employees.” Strong safety culture and organizational integrity are not supposed to add up to this kind of outcome. How then did things go wrong? How did the misleading statements to Vermont regulators come about, and what was the cause?

A fundamental element of all nuclear plant problem resolution/corrective action programs is a determination of not just what happened, but why. Cause, in other words, and in significant situations, the root cause. The root cause that led to eleven employees, including managers and site executives, being relieved of duties and disciplined is not contained in the Entergy materials. In fact, most of the focus appears to be on the safety culture of the plant staff both before and after the incident came to light and the personnel actions taken. Those actions and information appear to be reassuring in regard to the plant staff - but the plant staff was not where the problem occurred. There is also considerable emphasis on the fact that the managers have been replaced with competent substitutes. But haven’t those new managers been placed in exactly the same situation as the former managers? If it is not clear why the former managers failed to meet performance standards, then how can one be confident that the replacements will do so?

As we have pointed out in other posts, the response to safety culture failures too often stresses the “values and beliefs” of personnel as the beginning and end of safety culture. We have argued that the situational parameters, including competing goals and interests, are at least as important, if not paramount, in trying to understand such issues.

What was the situation at Vermont Yankee and to what extent, if any, did it have an effect? The VY management team was operating in an environment where significant business decisions were in play. One was the extension of the operating license for VY, which required approval by both the NRC and the Vermont Senate. A second was the pending proposed spinout of nuclear units, including VY, into a separate subsidiary, a spinout that was expected to be worth billions of dollars to Entergy. SEC and other regulatory filings had been made by Entergy for the spinout, and approvals were being sought from state regulators and the NRC.

Entergy’s March 31 NRC submittal also states, “Finally, neither the underlying report of investigation which led to the discipline, nor the interviews of the AFEs, identified any credible evidence to suggest that any weakness in the work environment or site safety culture contributed to a reluctance by anyone to provide clarifying or supplemental information to the relevant state officials. Indeed, there is no credible evidence that any of the AFEs are -- or were -- reluctant to report safety concerns or any other matter of potential regulatory significance or legal non-compliance.”

Does this mean that situational factors such as business priorities were evaluated and found not to be a contributor? If so, how was this done and what is the basis for such a conclusion? Or were such competing priorities acknowledged as potential influences and able to be dealt with as part of the management system? What other situational factors might have been present, and to what effect?

*ADAMS Accession Number ML100910420
**ADAMS Accession Number ML100990409

Friday, April 9, 2010

“Safety is Job One” at Massey

Non-fatal days lost (NFDL) rates are the benchmark used by the coal industry to measure safety. And the industry average is 3.31. (Imagine the NRC's ROP including an indicator like NFDL.) Violations (cited by the Mine Safety and Health Administration) are also an indicator for mine safety. But according to Massey CEO Blankenship, “Violations are unfortunately a normal part of the mining process.”* And “We don’t pay much attention to the violation count.”**

Massey’s commitment to safety came under scrutiny back in 2005 after Mr. Blankenship sent a memorandum to his deep mine superintendents stating:

What do you think was the takeaway by the organization as a result of the two memos?

Massey is an easy target at the moment and we are not using these quotes to pile onto the outrage associated with the recent mine explosion. What is obvious is that the avowals by Massey that “Safety is Job One” are meaningless in the face of the actual behavior of the corporation. This was the point in our March 12, 2010 post re BP and their refinery safety issues. A very real problem is that virtually everyone engaged in complex and risky industrial activities makes the same safety pronouncements whether or not they live by them. Thus, the pronouncements are robbed of any real significance or value - not just to those who disregard them, but to all. It sounds a lot like stuff that politicians say and which no one believes, because they all say it and none of them means it.

So our takeaway is a caution to nuclear organizations not to reflexively broadcast and re-emphasize their commitment to safety as a response or correction to an identified safety culture problem. Or at least any such emphasis needs to be in a context coupled to specific actions that actually sustain and reflect that commitment. As we comment regularly in this blog, we view safety culture as a dynamic system and one aspect of that system is the interplay of management reinforcement and organizational trust. Reinforcement of safety priority tends to be the focus of a lot of communications and training, reasserting values and beliefs, etc. while trust tends to be determined by people’s perceptions of actual decisions and actions. When reinforcement and actions are congruent, trust is elevated. When management says one thing but acts in ways that are inconsistent, or appear inconsistent, trust evaporates and the attempt at reinforcement may make things worse.

* “Deaths at West Virginia Mine Raise Issues About Safety,” NY Times (April 6, 2010).
** “Massey’s Long History of Coal Mine Violations," The Energy Source blog
at (April 6, 2010).

Tuesday, April 6, 2010

Safety Culture as Competitive Capability

A recent McKinsey survey describes companies' desire to use training to build competitive capabilities; however, most training actually goes toward lower-priority areas more aligned with the organizations' culture. For example, a company that should be focusing its training on developing project management instead focuses on pricing, because price leadership is viewed as an important component of company culture.

This caused us to wonder: How many nuclear managers believe their plant's safety culture is a competitive capability, and where is safety culture on the training priority list? We believe that safety culture is actually a competitive asset of nuclear organizations in that safety performance is inextricably linked to overall performance. But how many resources are allocated to safety culture training? How is training effectiveness measured? We fear the traditional tools for such training may not be that effective in actually moving the culture dial, and thus may not yield measurable competitive benefit.

Hope exists. One unsurprising survey conclusion is "When senior leaders set the agenda for building capabilities, those agendas are more often aligned with the capability most important to performance." (p. 7) The challenge is to get senior nuclear managers to recognize and act on the importance of safety culture training.

Monday, April 5, 2010

Huh? aka Vermont Yankee (part 4)

On March 31, 2010, Entergy transmitted to the NRC key findings of the Morgan, Lewis & Bockius LLP investigation of misstatements by Entergy employees at Vermont Yankee.* The investigation concluded that no Entergy employees "intentionally misled" Vermont regulators and "The investigation also concluded that no one made any intentionally false statements in state regulatory proceedings." That’s all fairly clear.

But then the same paragraph continues: “The report found, however, that certain ENVY [Entergy Nuclear Vermont Yankee ] personnel did not clarify certain understandings and assumptions, which resulted in misunderstandings, when viewed in a context different from the one understood to be relevant to the CRA [Comprehensive Reliability Assessment]."

I was fine up to the “however”. I just don’t understand the law firm’s tortured phrasing of what did happen. Will anyone else?

*ADAMS Accession Number ML100910420

Friday, April 2, 2010

NRC Briefing on Safety Culture - March 30, 2010

It would be difficult to come up with an attention-grabbing headline for the March 30 Commission briefing on safety culture. Not much happened. There were a lot of high fives for the perceived success of the staff’s February workshop and its main product, a strawman definition of nuclear safety culture. The only provocative remarks came from a couple of outside-the-mainstream “stakeholders”: the union rep for the NRC employees (and this was really limited to perceptions of internal NRC safety culture) and long-time nuclear gadfly Billie Garde (commended by Commissioner Svinicki for her consistency of position on safety culture spanning the last 20 years). Otherwise the discussions were heavily process-oriented, with very light questioning by the two currently seated Commissioners.

The main thrust of the briefing was the definition of safety culture produced in the workshop. That strawman is different from the one proposed by the NRC staff or, for that matter, from those used by other nuclear organizations such as INPO and INSAG. The workshop process sounded much more open and collegial than recent legislative processes on Capitol Hill.

Perhaps the one quote of the session that yields some insight as to where the Commission may be headed was from Chairman Jaczko; his comments can be viewed in the video below. Later in the briefing the staff demurred on endorsing the workshop product (versus the original staff proposal) pending additional input from internal and external sources.