Tuesday, May 10, 2011

Shifting the Burden

Pitot tube
This post emanates from the ongoing investigations of the crash of Air France flight 447 from Rio de Janeiro to Paris.  In some respects it is a follow-up to our January 27, 2011 post on Air France’s safety culture.  An article in the New York Times Sunday Magazine* explores some of the mysteries surrounding the loss of the plane in mid-Atlantic.  One of the possible theories for the crash involves the pitot tubes used on the Airbus plane.  Pitot tubes are instruments used on aircraft to measure air speed.  The pitot tube measures the difference between total (stagnation) and static pressure to determine dynamic pressure and therefore velocity of the air stream.  Care must be taken to ensure that the pitot tubes do not become clogged with ice or other foreign matter, since a blockage would interrupt or corrupt the airspeed signal provided to the pilots and the auto-pilot system.
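
For readers who want the arithmetic behind that description, here is a minimal sketch, assuming the simple incompressible form of Bernoulli's equation, v = sqrt(2 * (p_total - p_static) / rho).  That is our simplification: at airliner cruise speeds and altitudes compressibility matters, and real air data systems apply corrections we do not attempt here.  The function name and the sample numbers are ours, chosen purely for illustration, and are not taken from the Times article or from any Airbus or Thales documentation.

```python
import math


def airspeed_from_pitot(total_pressure_pa, static_pressure_pa, air_density_kg_m3):
    """Estimate airspeed (m/s) from pitot-static pressures.

    Assumes incompressible flow: v = sqrt(2 * (p_total - p_static) / rho).
    """
    dynamic_pressure = total_pressure_pa - static_pressure_pa
    if dynamic_pressure < 0:
        raise ValueError("total pressure cannot be below static pressure")
    return math.sqrt(2.0 * dynamic_pressure / air_density_kg_m3)


# Hypothetical sea-level values, for illustration only.
static_pa = 101325.0      # standard sea-level static pressure
density = 1.225           # standard sea-level air density, kg/m^3
dynamic_pa = 612.5        # assumed dynamic pressure seen by a healthy probe

speed = airspeed_from_pitot(static_pa + dynamic_pa, static_pa, density)
print(f"healthy probe: {speed:.1f} m/s")

# Simplified blockage case: the total-pressure reading has bled down to
# static pressure, so the computed airspeed collapses to zero even though
# the aircraft is still flying.
blocked = airspeed_from_pitot(static_pa, static_pa, density)
print(f"blocked probe: {blocked:.1f} m/s")
```

The second call is the point of the example: the calculation has no way to tell a stationary aircraft from a blocked probe, which is why a corrupted pressure reading translates directly into a corrupted airspeed signal.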

On the flight 447 aircraft, three Thales AA model pitot tubes were in use.  They are produced by a French company and cost approximately $3500 each.  The Times article goes on to explain:

"...by the summer of 2009, the problem of icing on the Thales AA was known to be especially common….Between 2003 and 2008, there were at least 17 cases in which the Thales AA had problems on the Airbus A330 and its sister plane, the A340.  In September 2007, Airbus issued a ‘service bulletin’ suggesting that airlines replace the AA pitots with a newer model, the BA, which was said to work better in ice.”

Air France responded to the service bulletin by establishing a policy to replace the AA tubes “only when a failure occurred”.  A year later Air France asked Airbus for “proof” that the model BA tubes worked better in ice.  It took Airbus another 6-7 months to perform tests that demonstrated the superior performance of the BA tubes, following which Air France proceeded with implementing the recommended change for its A330 aircraft.  Unfortunately the new probes had not yet been installed at the time of flight 447.

Much is still unknown about whether in fact the pitot tubes played a role in the crash of flight 447 and of the details of Air France’s consideration of deploying replacements.  But there is a sufficient framework to pose some interesting questions regarding how safety considerations were balanced in the process, and what might be inferred about the Air France safety culture.  Most clearly it highlights how fundamental the decision making process is to safety culture.

What is clear is that Air France’s approach to this problem “shifted the burden” from assuring that something was safe to proving that it was unsafe.  In legal usage this involves transferring the obligation to prove a fact in controversy from one party to another.  Or in systems thinking (which you may have noticed we strongly espouse) it denotes a classic dynamic archetype - a problem arises, and it can be ameliorated through either a short term, symptom based response or a fundamental solution that may take additional time and/or resources to implement.  Choosing the short term fix provides relief and reinforces the belief in the efficacy of the response.  Meanwhile the underlying problem goes unaddressed.  For Air France, the service bulletin created a problem.  Air France could have immediately replaced the pitot tubes or undertaken its own assessment of pitot tubes with replacement to follow.  This would have taken time and resources.  Nor did Air France appear to try to address the threshold question of whether the existing AA model instruments were adequate - in nuclear industry terms, were they “operable” and able to perform their safety function?  Air France apparently did not even implement interim measures such as retraining to improve pilots’ recognition of and response to pitot tube failures or incorrect readings.  Instead, Air France shifted the burden back to Airbus to “prove” their recommendation.  The difference between showing that something is not safe versus that it is safe is as wide as, well, the Atlantic Ocean.

What we find particularly interesting about shifting the burden is that it is just another side of the complacency coin.  Most people engaged in safety culture science recognize that complacency is a potential contributor to the decay and loss of effectiveness of safety culture.  Everything appears to be going OK so there is less need to pursue issues, particularly those lacking safety impact clarity.  Not pursuing root causes, not verifying corrective action efficacy, loss of questioning attitude and lack of resources could all be telltale signs of complacency.  The interesting thing about shifting the burden is that it yields much the same result - but with the appearance that action is being taken. 

The footnote to the story is the response of Air Caraibes to similar circumstances in this time frame.  The Times article indicates Air Caraibes experienced two “near misses” with Thales AA pitot tubes on A330 aircraft.  They immediately replaced the parts and notified regulators.

*  W.S. Hylton, "What Happened to Air France Flight 447?" New York Times Magazine (May 4, 2011).

Sunday, April 10, 2011

On the Other Hand

Our prior post on the award of safety performance bonuses at Transocean may have left you, and us, wondering about the ability of large corporations to walk the talk.  Well, better news today with an article from the Wall Street Journal* recounting the decision by Southwest Airlines to preemptively ground its 737s after a fuselage tear on one of the planes.  

As told in the article, the Southwest management appears to have rapidly responded to the event (over a weekend) with technical assessment including advice from Boeing.  The bottom line on the technical side was uncertainty regarding the cause of the failure and the implications for other similar 737s.  It was also clear that Southwest placed the burden on an affirmative showing that the planes were safe rather than requiring evidence that they weren’t.  With the issue “up in the air” the CEO acted quickly and decisively with the grounding order and the conduct of inspections as recommended by Boeing.  

The decision resulted in the cancellation of over 600 flights and no doubt inconvenienced many Southwest passengers, and will have a substantial cost impact to the airline.  The action by Southwest was described as “unusual” as it did not wait for a directive from the government or Boeing to remove planes from service. 

(Ed. note:  Southwest’s current approach is even more remarkable in light of how recently their practices were not exactly on the side of the angels.  In 2008, the FAA fined Southwest $7.5 million for knowingly flying planes that were overdue for mandatory structural inspections.)

*  T.W. Martin, A. Pasztor and P. Sanders, "Southwest's Solo Flight in Crisis," wsj.com (Apr 8, 2011).

Thursday, April 7, 2011


“...notwithstanding the tragic loss of life in the Gulf of Mexico, we [Transocean] achieved an exemplary statistical safety record as measured by our total recordable incident rate (‘‘TRIR’’) and total potential severity rate (‘‘TPSR’’).  As measured by these standards, we recorded the best year in safety performance in our Company’s history, which is a reflection on our commitment to achieving an incident free environment, all the time, everywhere.”*

Good grief.  Did Transocean really say this?  Eleven people including nine Transocean employees died in the Deepwater Horizon oil rig explosion.  The quote is from Transocean’s 2010 Annual Report and Proxy recently filed with the SEC.  It provides another illuminating example where the structure and award of management incentives speak much greater volumes than corporate safety rubrics.  (For our report on compensation structures within nuclear power companies and the extent to which such compensation included incentives other than safety, look here and here.)  Or as the saying goes, “Follow the money”.

To fully comprehend how Transocean’s incentive program purports to encourage safety performance we are providing the following additional quotes from its Annual Report.

“Safety Performance.  Our business involves numerous operating hazards and we remain committed to protecting our employees, our property and the environment. Our ultimate goal is expressed in our Safety Vision of ‘‘an incident-free workplace—all the time, everywhere…..

"The [Compensation] Committee measures our safety performance through a combination of our total recordable incident rate (‘‘TRIR’’) and total potential severity rate (‘‘TPSR’’).

•    "TRIR is an industry standard measure of safety performance that is used to measure the frequency of a company’s recordable incidents and comprised 50% of the overall safety metric. TRIR is measured in number of recordable incidents per 200,000 employee hours worked.

•    "TPSR is a proprietary safety measure that we use to monitor the total potential severity of incidents and comprised 50% of the overall safety metric. Each incident is reviewed and assigned a number based on the impact that such incident could have had on our employees and contractors, and the total is then combined to determine the TPSR.

"The occurrence of a fatality may override the safety performance measure.

"….Based on the foregoing safety performance measures, the actual TRIR was 0.74 and the TPSR was 35.4 for 2010. These outcomes together resulted in a calculated payout percentage of 115% for the safety performance measure for 2010. However, due to the fatalities that occurred in 2010, the Committee exercised its discretionary authority to modify the TRIR payout component to zero, which resulted in a modified payout percentage of 67.4% for the safety performance measure." (p. 45)

The treatment of bonuses for Transocean execs was picked up in various media outlets and met with, shall we say, skepticism.  Transocean responded to the blowback with the following:

“We acknowledge that some of the wording in our 2010 proxy statement may have been insensitive in light of the incident that claimed the lives of eleven exceptional men last year and we deeply regret any pain that it may have caused...” **

Note that the apology is directed at the “wording” of the proxy, not to the actual award of bonus compensation for safety performance.  We are tempted here to make some reference to “density” but it is self-evident.

Perhaps realizing that something more would be appropriate, Transocean announced yesterday that members of the senior management team would be donating their bonuses to the Deepwater Horizon Memorial Fund.*** 

Oops, actually they will be donating just the “safety portion” of their bonuses to the fund.  All other bonus amounts and incentive awards are not affected and the Transocean incentive structure for safety performance remains unchanged for 2011.
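
For readers who want to check the arithmetic in the payout figures quoted above, here is a rough sketch.  The TRIR formula (recordable incidents per 200,000 employee hours) comes straight from the quoted text.  Everything else is our assumption: we suppose the overall safety payout is a simple 50/50 weighted average of a TRIR component and a TPSR component, which the quoted weights suggest but the excerpt does not spell out.  The function names are ours, and the incident count and hours in the TRIR example are hypothetical numbers picked only because they reproduce a rate of 0.74; they are not Transocean's actual figures.

```python
def trir(recordable_incidents, hours_worked):
    """Total recordable incident rate per 200,000 employee hours worked."""
    return recordable_incidents * 200_000 / hours_worked


def weighted_payout(trir_payout_pct, tpsr_payout_pct, trir_weight=0.5):
    """Overall payout, ASSUMING a weighted average of the two components."""
    return trir_weight * trir_payout_pct + (1.0 - trir_weight) * tpsr_payout_pct


# Hypothetical inputs that happen to give the reported rate of 0.74.
example_rate = trir(recordable_incidents=37, hours_worked=10_000_000)

# Figures quoted from the Annual Report.
calculated_pct = 115.0   # payout before the Committee's adjustment
modified_pct = 67.4      # payout after the TRIR component was set to zero

# Under our 50/50 assumption, work backward to the component payouts.
implied_tpsr_pct = modified_pct / 0.5
implied_trir_pct = (calculated_pct - modified_pct) / 0.5

print(f"example TRIR: {example_rate:.2f}")
print(f"implied TPSR component payout: {implied_tpsr_pct:.1f}%")
print(f"implied TRIR component payout: {implied_trir_pct:.1f}%")
```

If our weighting assumption is right, zeroing the TRIR component still leaves the TPSR component paying out well above target, which is how a year with eleven fatalities can end with a safety payout of 67.4%.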

***  Announcement by Transocean Ltd. Senior Management Team, Zug, Switzerland (Apr 5, 2011 MARKETWIRE via COMTEX).

Monday, April 4, 2011

Combustible Gas

As we observed in our prior blog post, the publication by the Union of Concerned Scientists of their new study of nuclear near misses would likely generate a combustible gas that could find some ignition sources, at least among like-minded nuclear critics.  Thus the March 22, 2011 article* in The Nation magazine was predictable, including the comments by Henry Meyers that the UCS study is evidence of a lack of “serious oversight for twenty years” by the NRC.  Evidence of this includes the reduction in NRC violations and fines in the late 1990s and the contention that then-Chairman Dr. Shirley Jackson caved to political pressure.  Disregarded are the facts that many nuclear plants underwent enormous performance improvement programs in that period and that nuclear ownership was consolidated under a small number of advanced nuclear enterprises.**  These nuclear operators had the significant management, technical and financial resources to ensure operating excellence in their plants, resulting in much better regulatory compliance.

But it would be a mistake to dismiss lightly the direction that UCS and Christian Parenti of The Nation are taking the post-Fukushima discussion of nuclear safety.  Their thesis is that the current risky state of the nuclear industry in the U.S. (“a fleet of old nuclear plants and the 40,000 tons of nuclear waste they have created”) is due to the lack of strong safety culture, and that the NRC has been compromised through political pressure and the corrosive influence of an inadequate industry safety culture.  Thus,

“...it is imperative to overhaul the inadequate, industry-dominated safety culture that has developed over the past twenty years.  This eroded safety culture is a source of serious danger—and it must be fixed.”

Approaching the current state of nuclear safety from this direction has the potential to open a Davis-Besse size hole in the carefully constructed safety record of the nuclear industry.  By its nature, a critique of safety culture is perhaps the most far ranging indictment of safety, far more extensive than any of the specific technical issues that have historically been the target of nuclear critics.  It targets an unprotected flank of both the industry and the NRC, including the recent process in which the NRC has emphasized consensus and stakeholder involvement to the point that the above quote will gain traction.  The product, a safety culture policy statement by the NRC, something that is not even enforceable, will be framed as a continuation of a lack of “serious oversight” and serve well the newly energized anti-nuclear community.

*  C. Parenti, "After Three Mile Island: The Rise and Fall of Nuclear Safety Culture," The Nation (Mar 22, 2011).

**  Nuclear industry consolidation was predicted and described in a paper I co-authored with NYPA's Bob Schoenberger, "Capturing Stranded Value in Nuclear Plant Assets," The Electricity Journal 9 (June 1996): 59-65.

Monday, March 21, 2011

Never Let a Good Crisis Go To Waste

“You don’t ever want a crisis to go to waste; it’s an opportunity to do important things that you would otherwise avoid.” So said Rahm Emanuel, memorably, several years ago.  Perhaps taking a page from the Emanuel book, the Union of Concerned Scientists took the opportunity last Thursday to release a report chronicling a series of problems it had investigated at U.S. nuclear plants.*  Apparently the events in Japan pumped plenty of fresh oxygen into the UCS war room in time for them to trot out their latest list of concerns regarding nuclear plant safety.

[UCS senior scientist Edwin] “Lyman was speaking in a conference call with reporters on the release of a report examining critical problems — known as “near misses” — at various nuclear facilities in the United States last year, and the N.R.C.’s handling of critical problems”

The article describes the report, written by David Lochbaum, the director of the nuclear safety program for the organization, as follows:

[The report] “also suggested that federal regulators needed to do more to investigate why problems existed in the first place — including examining the overall safety culture of companies that operate nuclear power plants — rather than simply order them to be fixed.”

It could be that the UCS is aiming at the heart of the recent discussions surrounding the NRC’s new policy statement on safety culture.  It is clear that the NRC has little appetite to regulate the safety culture of its licensees, instead urging licensees to maintain a strong safety culture and taking action only if “results” are not acceptable.  UCS would like specific issues, such as the “near misses” in their report, to be broadly interpreted to establish a more fundamental, cultural flaw in the enterprise itself.

Perhaps the larger question raised by the events in Japan is the dominance of natural phenomena in challenging man-made structures, and whether safety culture provides any insulation.  While the earthquake itself seemed fairly well contained at the nuclear plants, the tsunami easily overpowered the sea wall at the facility and caused widespread disability of crucial plant systems.  Does this sound familiar?  Does it remind one of a Category 5 hurricane sweeping aside the levees in New Orleans?  Or the overwhelming forces of an oil well blowout brushing aside the isolation capability of a blowout preventer?

John McPhee’s 1990 book The Control of Nature chronicles a number of instances of man’s struggle against nature - in his view, one that is inevitably bound to fail.  Often the very acts undertaken to “control nature” contribute to future failures of that control.  McPhee cites the leveeing of the Mississippi, leading to faster channel flows, more silting, more leveeing, and ultimately the kind of macro disaster that occurred with Katrina.  Or the “debris basins” built in the canyons above Los Angeles communities.  The basins fill over successive storms, eventually leading to failures of the basins themselves and catastrophic mud and debris floods in the downstream valleys.

It is probably inevitable that in the aftermath of Japan there will be calls to up the design criteria of nuclear plants to higher levels of earthquakes and other natural phenomena.  The expectation will be that this will provide the absolute protection desired by the public or groups such as UCS.  Until of course the next storm or earthquake that is incrementally larger, or in a worse location or in combination with some other event, that supersedes the more stringent assumptions.

Safety culture cannot deliver on an expectation that safety is absolute or without limits.  It can and should emphasize the priority and unflagging attention to safety that maximizes the capacity of a facility and its staff to withstand unforeseen challenges.  We know that the Japan event proves the former.  It will be equally important to determine if it also showed the latter.

*  T. Zeller Jr., "Citing Near Misses, Report Faults Both Nuclear Regulators and Operators," New York Times, Green: A Blog About Energy and the Environment (Mar 17, 2011, 1:50 PM).

Friday, March 11, 2011

Safety Culture Performance Indicators

In our recent post on safety culture management in the DOE complex, we concentrated on documents created by the DOE team.  But there was also some good material in the references assembled by the team.  For example, we saw some interesting thoughts on performance indicators in a paper by Andrew Hopkins, a sociology professor at The Australian National University.*  Although the paper was prepared for an oil and gas industry conference, the focus on overall process safety has parallels with nuclear power production.

Contrary to the view of many safety culture pundits, including ourselves, Professor Hopkins is not particularly interested in separating lagging from leading indicators; he says that trying to separate them may not be a useful exercise.  Instead, he is interested in a company’s efforts to develop a set of useful indicators that in total measure or reflect the state of the organization’s risk control system.  In his words, “. . . the important thing is to identify measures of how well the process safety controls are functioning.  Whether we call them lead or lag indicators is a secondary matter.  Companies I have studied that are actively seeking to identify indicators of process safety do not make use of the lead/lag distinction in any systematic way. They use indicators of failure in use, when these are available, as well as indicators arising out of their own safety management activities, where appropriate, without thought as to whether they be lead or lag. . . . Improving performance in relation to these indicators must enhance process safety. [emphasis added]” (p. 11)

Are his observations useful for people trying to evaluate the overall health of a nuclear organization’s safety culture?  Possibly.  Organizations use a multitude of safety culture assessment techniques including (but not limited to) interviews, observations, surveys, assessments of the CAP and other administrative processes, and management metrics such as maintenance performance, all believed to be correlated to safety culture.  Maybe it would be OK to dial back our concern with identifying which of them are leading (if any) and which are lagging.  More importantly, perhaps we should be asking how confident we are that an improvement in any one of them implies that the overall safety culture is in better shape.

*  A. Hopkins, "Thinking About Process Safety Indicators," Working Paper 53, National Research Centre for OHS Regulation, Australian National University (May 2007).  We have referred to Professor Hopkins’ work before (here and here).

Monday, March 7, 2011

Culture Wars

We wanted to bring to our readers’ attention an article from the McKinsey Quarterly (March 2011) that highlights the ability of management simulators to be powerful business tools.  The context is the use of such “war games” in assisting management teams to accomplish their business goals, but we would allow that their utility extends to other challenges such as managing safety culture.

“Well-designed war games, though not a panacea, can be powerful learning experiences that allow managers to make better decisions.”

“...the company designed a game to answer the more strategic question: how can we win market share given the budget pressures on the Department of Defense and the moves of competitors? The game tested levers such as pricing, contracting, operational improvements, and partnerships.  The outcome wasn’t a tactical playbook—a list of things to execute and monitor—but rather strategic guidance on the industry’s direction, the most promising types of moves, the company’s competitive strengths and weaknesses, and where to focus further analysis.” (p. 3)

We have often used the term “levers” to bring attention to the need for managers to understand when and how to take actions to bring about a desired safety culture result.  Levers connote control and, as with any control system, control must be based on an understanding of the system’s dynamics.  Importantly, the above quote makes clear that the outcome of the simulated experience is not a “playbook” but “guidance” (to which we would add a deeper understanding and developed skills) that can be applied in the real world.

Interestingly the article mentions the use of games to facilitate or achieve organizational alignment around a strategic decision.  This treads very close to our contention that using a safety culture simulator offers a powerful environment within which managers can interact including developing common mental models and understanding of culture dynamics.  As noted in the article, “This shared experience...has continued to stimulate discussions across the company…” (p. 4)  What could be more valuable for reinforcing safety culture than informed and broad based discussion within the organization?  As Horn says, “It’s often beneficial, however, to repeat a game for the sake of organizational alignment ... usually, the wider group of employees who will implement the decision. Most people learn better by doing, and when they have shared experiences, they are more likely to embrace change.”

Thursday, March 3, 2011

Safety Culture in the DOE Complex

This post reviews a Department of Energy (DOE) effort to provide safety culture assessment and improvement tools for its own operations and those of its contractors.


The DOE is responsible for a vast array of organizations that work on DOE’s programs.  These organizations range from very small to huge in size and include private contractors, government facilities, specialty shops, niche manufacturers, labs and factories.  Many are engaged in high-hazard activities (including nuclear) so DOE is interested in promoting an effective safety culture across the complex.

To that end, a task team* was established in 2007 “to identify a consensus set of safety culture principles, along with implementation practices that could be used by DOE . . .  and their contractors. . . . The goal of this effort was to achieve an improved safety culture through ISMS [Integrated Safety Management System] continuous improvement, building on operating experience from similar industries, such as the domestic and international commercial nuclear and chemical industries.”  (Final Report**, p. 2)

It appears the team performed most of its research during 2008, conducted a pilot program in 2009 and published its final report in 2010.  Research included reviewing the space shuttle and Texas City disasters, the Davis-Besse incident, works by gurus such as James Reason, and guidance and practices published by NASA, NRC, IAEA, INPO and OSHA.

Major Results

The team developed a definition of safety culture and described a process whereby using organizations could assess their safety culture and, if necessary, take steps to improve it.

The team’s definition of safety culture:

“An organization’s values and behaviors modeled by its leaders and internalized by its members, which serve to make safe performance of work the overriding priority to protect the workers, public, and the environment.” (Final Report, p. 5)

After presenting this definition, the report goes on to say “The Team believes that voluntary, proactive pursuit of excellence is preferable to regulatory approaches to address safety culture because it is difficult to regulate values and behaviors. DOE is not currently considering regulation or requirements relative to safety culture.” (Final Report, pp. 5-6)

The team identified three focus areas that were judged to have the most impact on improving safety and production performance within the DOE complex: Leadership, Employee/Worker Engagement, and Organizational Learning. For each of these three focus areas, the team identified related attributes.

The overall process for a using organization is to review the focus areas and attributes, assess the current safety culture, select and use appropriate improvement tools, and reinforce results. 

The list of tools to assess safety culture includes direct observations, causal factors analysis (CFA), surveys, interviews, review of key processes, performance indicators, Voluntary Protection Program (VPP) assessments, stream analysis and Human Performance Improvement (HPI) assessments.***  The Final Report also mentioned performance metrics and workshops. (Final Report, p. 9)

Tools to improve safety culture include senior management commitment, clear expectations, ISMS training, managers spending time in the field, coaching and mentoring, Behavior Based Safety (BBS), VPP, Six Sigma, the problem identification process, and HPI.****  The Final Report also mentioned High Reliability Organization (HRO), Safety Conscious Work Environment (SCWE) and Differing Professional Opinion (DPO). (Final Report, p. 9)  Whew.

The results of a one-year pilot program at multiple contractors were evaluated and the lessons learned were incorporated in the final report.

Our Assessment

Given the diversity of the DOE complex, it’s obvious that no “one size fits all” approach is likely to be effective.  But it’s not clear that what the team has provided will be all that effective either.  The team’s product is really a collection of concepts and tools culled from the work of outsiders, combined with DOE’s existing management programs, and repackaged as a combination of overall process and laundry lists.  Users are left to determine for themselves exactly which sub-set of tools might be useful in their individual situations.

It’s not that the report is bad.  For example, the general discussion of safety culture improvement emphasizes the importance of creating a learning organization focused on continuous improvement.  In addition, a major point they got right was recognizing that safety can contribute to better mission performance.  “The strong correlation between good safety performance with good mission performance (or productivity or reliability) has been observed in many different contexts, including industrial, chemical, and nuclear operations.” (Final Report, p. 20)

On the other hand, the team has adopted the works of others but does not appear to recognize how, in a systems sense, safety culture is interwoven into the fabric of an organization.  For example, feedback loops from the multitude of possible interventions to overall safety culture are not even mentioned.  And this is not a trivial issue.  An intervention can provide an initial boost to safety culture but then safety culture may start to decay because of saturation effects, especially if the organization is hit with one intervention after another.
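
To make the saturation point concrete, here is a toy sketch of the kind of feedback we have in mind.  It is emphatically not a validated model and nothing in it comes from the DOE report: the single culture "level", the fatigue term that shrinks the boost from each new intervention, the decay back toward a baseline, and every parameter value are our own hypothetical assumptions, chosen only to illustrate the dynamic.

```python
def simulate_culture(intervention_weeks, weeks=52, baseline=0.5,
                     base_boost=0.2, decay_rate=0.05,
                     fatigue_step=1.0, fatigue_fade=0.9):
    """Toy model: interventions boost a culture level that then decays.

    All parameters are hypothetical. Each intervention adds "fatigue",
    which reduces the boost from the next one and fades over time.
    """
    level = baseline
    fatigue = 0.0
    history, boosts = [], []
    for week in range(weeks):
        if week in intervention_weeks:
            boost = base_boost / (1.0 + fatigue)   # saturation effect
            level += boost
            boosts.append(boost)
            fatigue += fatigue_step
        level -= decay_rate * (level - baseline)   # drift back to baseline
        fatigue *= fatigue_fade                    # fatigue wears off
        history.append(level)
    return history, boosts


# Four interventions back to back versus four spread across the year.
clustered_history, clustered_boosts = simulate_culture({0, 1, 2, 3})
spaced_history, spaced_boosts = simulate_culture({0, 13, 26, 39})

print("clustered boosts:", [round(b, 3) for b in clustered_boosts])
print("spaced boosts:   ", [round(b, 3) for b in spaced_boosts])
```

In this toy, piling interventions on top of one another yields a smaller total boost than spacing them out, and in both cases the level drifts back toward baseline once the interventions stop.  Whether real organizations behave this way is exactly the kind of question we think deserves attention.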

In addition, some of the major, omnipresent threats to safety culture do not get the emphasis they deserve.  Goal conflict, normalization of deviance and institutional complacency are included in a list of issues from the Columbia, Davis-Besse and Texas City events (Final Report, pp. 13-15) but the authors do not give them the overarching importance they merit.  Goal conflict, often expressed as safety vs mission, should obviously be avoided but its insidiousness is not adequately recognized; the other two factors are treated in a similar manner.

Two final picky points:  First, the report says it’s difficult to regulate behavior.  That’s true but companies and government do it all the time.  DOE could definitely promulgate a behavior-based safety culture regulatory requirement if it chose to do so.  Second, the final report (p. 9) mentions leading (vs lagging) indicators as part of assessment but the guidelines do not provide any examples.  If someone has some useful leading indicators, we’d definitely like to know about them. 

Bottom line, the DOE effort draws from many sources and probably represents consensus building among stakeholders on an epic scale.  However, the team provides no new insights into safety culture and, in fact, may not be taking advantage of the state of the art in our understanding of how safety culture interacts with other organizational attributes. 

*  Energy Facility Contractors Group (EFCOG)/DOE Integrated Safety Management System (ISMS) Safety Culture Task Team.

**  J. McDonald, P. Worthington, N. Barker, G. Podonsky, “EFCOG/DOE ISMS Safety Culture Task Team Final Report”  (Jun 4, 2010).

***  EFCOG/DOE ISMS Safety Culture Task Team, “Assessing Safety Culture in DOE Facilities,” EFCOG meeting handout (Jan 23, 2009).

****  EFCOG/DOE ISMS Safety Culture Task Team, “Activities to Improve Safety Culture in DOE Facilities,” EFCOG meeting handout (Jan 23, 2009).