Thursday, May 26, 2011

Upper Big Branch 1

A few days ago the Governor’s Independent Investigation Panel issued its report on the Upper Big Branch coal mine explosion of April 5, 2010.  The report runs over 100 pages and contains considerable detail on the events and circumstances leading up to the disaster, as well as on coal mining technology and safety issues.  It is well worth reading for anyone in the business of assuring safety in a complex, high-risk enterprise.  We anticipate doing several blog posts on material from the report but want to start with a brief quote from the foreword, summarizing its main conclusions.

“A genuine commitment to safety means not just examining miners’ work practices and behaviors.  It means evaluating management decisions up the chain of command - all the way to the boardroom - about how miners’ work is organized and performed.”*

We believe this conclusion is very much on the mark for safety management, and for the safety culture that supports it in a well-managed organization.  It highlights what has appeared to us to be an over-emphasis in the nuclear industry on worker practices, behaviors and “values”.  And it focuses attention on management decisions - decisions that give appropriate weight to safety in a world of competing priorities and interests - as the sine qua non of safety.  As we have discussed in many of our posts, we are concerned by the nuclear industry’s emphasis on safety culture surveys and training in safety culture principles and values as the primary tools for assuring a strong safety culture.  Rarely do culture assessments focus on the decisions that underlie the management of safety, or examine the context and influence of factors such as impacts on operations, availability of resources, personnel incentives and advancement, corporate initiatives and goals, and outside pressures such as politics.  The Upper Big Branch report delves into these issues and builds a compelling basis for the above conclusion - a conclusion that is not limited to the coal industry.


*  Governor’s Independent Investigation Panel, “Report to the Governor: Upper Big Branch,” National Technology Transfer Center, Wheeling Jesuit University (May 2011), p. 4.

Thursday, May 19, 2011

Mental Models and Learning

A recent New York Times article on teaching methods* caught our eye.  It reported an experiment by college physics professors to improve their freshman students’ understanding and retention of introductory material.  The students were enrolled in two large (260+ student) classes that were usually taught via lectures.  For one week, teaching assistants used a collaborative, team-oriented approach in one of the classes.  Afterward, that group scored higher on a subsequent test than the group that had received the traditional lectures.

One of the instructors reported, “. . . this class actively engages students and allows them time to synthesize new information and incorporate it into a mental model . . . . When they can incorporate things into a mental model, we find much better retention.”

We are big believers in mental models, those representations of the world that people create in their minds to make sense of information and experience.  They are a key component of our system dynamics approach to understanding and modeling safety culture.  Our NuclearSafetySim model illustrates how safety culture interacts with other variables in organizational decision-making; a primary purpose of this computer model is to create a realistic mental model in users’ minds.

Because this experiment helped the students form more useful mental models, our reaction to it is generally favorable.  On the other hand, why is the researchers’ “insight” even news?  Why wouldn’t a more engaging approach lead to a better understanding of any subject?  Don’t most of you develop a better understanding when you do the lab work, code your own programs, write the reports you sign, or practice decision-making in a simulated environment?

*  B. Carey, “Less Talk, More Action: Improving Science Learning,” New York Times (May 12, 2011).

Tuesday, May 10, 2011

Shifting the Burden

[Image: a pitot tube]
This post stems from the ongoing investigations of the crash of Air France flight 447 from Rio de Janeiro to Paris.  In some respects it is a follow-up to our January 27, 2011 post on Air France’s safety culture.  An article in the New York Times Sunday Magazine* explores some of the mysteries surrounding the loss of the plane in the mid-Atlantic.  One of the possible theories for the crash involves the pitot tubes used on the Airbus plane.  Pitot tubes are instruments used on aircraft to measure airspeed.  A pitot tube measures the difference between total (stagnation) and static pressure to determine dynamic pressure and therefore the velocity of the air stream.  Care must be taken to assure that the pitot tubes do not become clogged with ice or other foreign matter, as a blockage would interrupt or corrupt the airspeed signal provided to the pilots and the autopilot system.
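
For the technically curious, the computation is straightforward Bernoulli arithmetic.  Here is a minimal sketch (our own illustration, not from the Times article; the function name and sample numbers are hypothetical):

```python
import math

def airspeed_from_pitot(total_pressure_pa: float,
                        static_pressure_pa: float,
                        air_density_kg_m3: float = 1.225) -> float:
    """Airspeed from Bernoulli: p_total - p_static = 0.5 * rho * v**2.

    The default density is sea-level standard air; density at cruise
    altitude is far lower, which is why real air data computers also
    use temperature and altitude inputs.
    """
    dynamic_pressure = total_pressure_pa - static_pressure_pa
    if dynamic_pressure <= 0:
        # An iced or clogged tube can trap pressure and collapse the
        # sensed difference, corrupting the airspeed signal.
        raise ValueError("no usable dynamic pressure - check pitot tube")
    return math.sqrt(2.0 * dynamic_pressure / air_density_kg_m3)

# Example: a 6,000 Pa pressure difference at sea-level density works out
# to roughly 99 m/s, about 192 knots.
print(airspeed_from_pitot(107_325.0, 101_325.0))
```

The point for the safety discussion is that the entire airspeed signal hangs on one small pressure difference; ice in the tube does not announce itself, it simply feeds the pilots and the autopilot a wrong number.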

On the flight 447 aircraft, three Thales model AA pitot tubes were in use.  They are produced by a French company and cost approximately $3,500 each.  The Times article goes on to explain:

"...by the summer of 2009, the problem of icing on the Thales AA was known to be especially common….Between 2003 and 2008, there were at least 17 cases in which the Thales AA had problems on the Airbus A330 and its sister plane, the A340.  In September 2007, Airbus issued a ‘service bulletin’ suggesting that airlines replace the AA pitots with a newer model, the BA, which was said to work better in ice.”

Air France’s response to the service bulletin was a policy of replacing the AA tubes “only when a failure occurred”.  A year later Air France asked Airbus for “proof” that the model BA tubes worked better in ice.  It took Airbus another 6-7 months to perform tests demonstrating the superior performance of the BA tubes, after which Air France proceeded to implement the recommended change on its A330 aircraft.  Unfortunately the new probes had not yet been installed at the time of flight 447.

Much is still unknown about whether the pitot tubes in fact played a role in the crash of flight 447, and about the details of Air France’s consideration of deploying replacements.  But there is a sufficient framework to pose some interesting questions regarding how safety considerations were balanced in the process, and what might be inferred about the Air France safety culture.  Most clearly, the episode highlights how fundamental the decision-making process is to safety culture.

What is clear is that Air France’s approach to this problem “shifted the burden” from assuring that something was safe to proving that it was unsafe.  In legal usage this involves transferring the obligation to prove a fact in controversy from one party to another.  In systems thinking (which you may have noticed we strongly espouse) it denotes a classic dynamic archetype: a problem arises and can be ameliorated either through a short term, symptom-based response or through a fundamental solution that may take additional time and/or resources to implement.  Choosing the short term fix provides relief and reinforces the belief in the efficacy of the response.  Meanwhile the underlying problem goes unaddressed.  For Air France, the service bulletin created a problem.  Air France could have immediately replaced the pitot tubes, or undertaken its own assessment of pitot tubes with replacement to follow.  Either would have taken time and resources.  Nor did Air France appear to address the threshold question of whether the existing AA model instruments were adequate - in nuclear industry terms, were they “operable” and able to perform their safety function?  Air France apparently did not even implement interim measures, such as retraining to improve pilots’ recognition of and response to pitot tube failures or incorrect readings.  Instead, Air France shifted the burden back to Airbus to “prove” its recommendation.  The difference between showing that something is unsafe and showing that it is safe is as wide as, well, the Atlantic Ocean.
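
For readers who like to see an archetype run, here is a minimal sketch of shifting the burden (a toy model of our own devising, not NuclearSafetySim; the stocks and coefficients are invented purely for illustration):

```python
# Toy "shifting the burden" simulation.  The quick fix keeps the visible
# symptom looking managed; only the fundamental fix also drains the
# underlying problem.  All names and coefficients are hypothetical.

def simulate(periods: int, use_fundamental_fix: bool) -> list:
    symptom = 1.0      # visible problem level (e.g., reported instrument faults)
    underlying = 1.0   # latent problem (e.g., a marginal component design)
    history = []
    for _ in range(periods):
        if use_fundamental_fix:
            underlying = max(0.0, underlying - 0.2)  # slow, costly, but real
        quick_relief = 0.5 * symptom                 # e.g., replace-on-failure
        symptom = symptom - quick_relief + 0.3 * underlying
        history.append((round(symptom, 2), round(underlying, 2)))
    return history

# Quick fix only: the symptom settles near 0.6 - it looks managed, but the
# latent problem never moves and the organization pays for relief forever.
print(simulate(10, use_fundamental_fix=False))

# With the fundamental fix, both the symptom and the underlying problem
# decay toward zero.
print(simulate(10, use_fundamental_fix=True))
```

The reinforcing twist in the archetype shows up in the first run: because the symptom stabilizes, the quick fix looks effective, which further reduces the perceived need for the fundamental solution.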

What we find particularly interesting about shifting the burden is that it is just the other side of the complacency coin.  Most people engaged in safety culture science recognize that complacency is a potential contributor to the decay and loss of effectiveness of safety culture.  Everything appears to be going OK, so there is less perceived need to pursue issues, particularly those whose safety impact is unclear.  Not pursuing root causes, not verifying corrective action efficacy, loss of questioning attitude and lack of resources could all be telltale signs of complacency.  The interesting thing about shifting the burden is that it yields much the same result - but with the appearance that action is being taken.

A footnote to the story is the response of Air Caraibes to similar circumstances during the same time frame.  The Times article indicates Air Caraibes experienced two “near misses” involving Thales AA pitot tubes on A330 aircraft.  It immediately replaced the parts and notified regulators.


*  W.S. Hylton, "What Happened to Air France Flight 447?" New York Times Magazine (May 4, 2011).

Sunday, April 10, 2011

On the Other Hand

Our prior post on the award of safety performance bonuses at Transocean may have left you, and us, wondering about the ability of large corporations to walk the talk.  Well, better news today with an article from the Wall Street Journal* recounting the decision by Southwest Airlines to preemptively ground its 737s after a fuselage tear on one of the planes.  

As told in the article, Southwest management responded rapidly to the event (over a weekend) with a technical assessment that included advice from Boeing.  The bottom line on the technical side was uncertainty regarding the cause of the failure and its implications for other, similar 737s.  It was also clear that Southwest placed the burden on an affirmative showing that the planes were safe, rather than requiring evidence that they weren’t.  With the issue “up in the air,” the CEO acted quickly and decisively, ordering the grounding and the inspections recommended by Boeing.

The decision resulted in the cancellation of over 600 flights, no doubt inconvenienced many Southwest passengers, and will have a substantial cost impact on the airline.  The action by Southwest was described as “unusual” because the airline did not wait for a directive from the government or Boeing to remove planes from service.

(Ed. note:  Southwest’s current approach is even more remarkable in light of how recently their practices were not exactly on the side of the angels.  In 2008, the FAA fined Southwest $7.5 million for knowingly flying planes that were overdue for mandatory structural inspections.)


*  T.W. Martin, A. Pasztor and P. Sanders, "Southwest's Solo Flight in Crisis," wsj.com (Apr 8, 2011).

Thursday, April 7, 2011

Incredible

“...notwithstanding the tragic loss of life in the Gulf of Mexico, we [Transocean] achieved an exemplary statistical safety record as measured by our total recordable incident rate (‘TRIR’) and total potential severity rate (‘TPSR’).  As measured by these standards, we recorded the best year in safety performance in our Company’s history, which is a reflection on our commitment to achieving an incident free environment, all the time, everywhere.”*

Good grief.  Did Transocean really say this?  Eleven people, including nine Transocean employees, died in the Deepwater Horizon oil rig explosion.  The quote is from Transocean’s 2010 Annual Report and Proxy recently filed with the SEC.  It provides another illuminating example of how the structure and award of management incentives speak far louder than corporate safety rubrics.  (For our report on compensation structures within nuclear power companies and the extent to which such compensation included incentives other than safety, look here and here.)  Or as the saying goes, “Follow the money”.

To fully comprehend how Transocean’s incentive program purports to encourage safety performance we are providing the following additional quotes from its Annual Report.

“Safety Performance.  Our business involves numerous operating hazards and we remain committed to protecting our employees, our property and the environment. Our ultimate goal is expressed in our Safety Vision of ‘an incident-free workplace—all the time, everywhere.’ . . .

"The [Compensation] Committee measures our safety performance through a combination of our total recordable incident rate (‘TRIR’) and total potential severity rate (‘TPSR’).

•    "TRIR is an industry standard measure of safety performance that is used to measure the frequency of a company’s recordable incidents and comprised 50% of the overall safety metric. TRIR is measured in number of recordable incidents per 200,000 employee hours worked.

•    "TPSR is a proprietary safety measure that we use to monitor the total potential severity of incidents and comprised 50% of the overall safety metric. Each incident is reviewed and assigned a number based on the impact that such incident could have had on our employees and contractors, and the total is then combined to determine the TPSR.

"The occurrence of a fatality may override the safety performance measure.

"….Based on the foregoing safety performance measures, the actual TRIR was 0.74 and the TPSR was 35.4 for 2010. These outcomes together resulted in a calculated payout percentage of 115% for the safety performance measure for 2010. However, due to the fatalities that occurred in 2010, the Committee exercised its discretionary authority to modify the TRIR payout component to zero, which resulted in a modified payout percentage of 67.4% for the safety performance measure." (p. 45)
The treatment of bonuses for Transocean execs was picked up in various media outlets and met with, shall we say, skepticism.  Transocean responded to the blowback with the following:

“We acknowledge that some of the wording in our 2010 proxy statement may have been insensitive in light of the incident that claimed the lives of eleven exceptional men last year and we deeply regret any pain that it may have caused...” **

Note that the apology is directed at the “wording” of the proxy, not to the actual award of bonus compensation for safety performance.  We are tempted here to make some reference to “density” but it is self-evident.

Perhaps realizing that something more would be appropriate, Transocean announced yesterday that members of the senior management team would be donating their bonuses to the Deepwater Horizon Memorial Fund.*** 

Oops, actually they will be donating just the “safety portion” of their bonuses to the fund.  All other bonus amounts and incentive awards are not affected and the Transocean incentive structure for safety performance remains unchanged for 2011.



***  Announcement by Transocean Ltd. Senior Management Team, Zug, Switzerland (Apr 5, 2011 MARKETWIRE via COMTEX).

Monday, April 4, 2011

Combustible Gas

As we observed in our prior blog post, the Union of Concerned Scientists’ publication of its new study of nuclear near misses was likely to generate a combustible gas that would find ignition sources, at least among like-minded nuclear critics.  Thus the March 22, 2011 article* in The Nation magazine was predictable, including the comments by Henry Meyers that the UCS study is evidence of a lack of “serious oversight for twenty years” by the NRC.  The evidence offered includes the reduction in NRC violations and fines in the late 1990s and the contention that then-Chairman Dr. Shirley Jackson caved to political pressure.  Disregarded are the facts that many nuclear plants underwent enormous performance improvement programs in that period and that nuclear ownership was consolidated under a small number of advanced nuclear enterprises.**  These operators had the management, technical and financial resources to ensure operating excellence in their plants, resulting in much better regulatory compliance.

But it would be a mistake to dismiss lightly the direction in which UCS and Christian Parenti of The Nation are taking the post-Fukushima discussion of nuclear safety.  Their thesis is that the current risky state of the nuclear industry in the U.S. (“a fleet of old nuclear plants and the 40,000 tons of nuclear waste they have created”) is due to the lack of a strong safety culture, and that the NRC has been compromised by political pressure and the corrosive influence of an inadequate industry safety culture.  Thus,

“...it is imperative to overhaul the inadequate, industry-dominated safety culture that has developed over the past twenty years.  This eroded safety culture is a source of serious danger—and it must be fixed.”

Approaching the current state of nuclear safety from this direction has the potential to open a Davis-Besse size hole in the carefully constructed safety record of the nuclear industry.  By its nature, safety culture is perhaps the most far-ranging possible indictment of safety, far more extensive than the specific technical issues that have historically been the targets of nuclear critics.  It exposes an unprotected flank of both the industry and the NRC, including the recent policy development process in which the NRC emphasized consensus and stakeholder involvement to the point that the above quote will gain traction.  The product, an NRC safety culture policy statement that is not even enforceable, will be framed as a continuation of the lack of “serious oversight” and will serve the newly energized anti-nuclear community well.

*  C. Parenti, "After Three Mile Island: The Rise and Fall of Nuclear Safety Culture," The Nation (Mar 22, 2011).

**  Nuclear industry consolidation was predicted and described in a paper I co-authored with NYPA's Bob Schoenberger, "Capturing Stranded Value in Nuclear Plant Assets," The Electricity Journal 9 (June 1996): 59-65.

Monday, March 21, 2011

Never Let a Good Crisis Go To Waste

“You don’t ever want a crisis to go to waste; it’s an opportunity to do important things that you would otherwise avoid.” So said Rahm Emanuel, memorably, several years ago.  Perhaps taking a page from the Emanuel book, the Union of Concerned Scientists took the opportunity last Thursday to release a report chronicling a series of problems it had investigated at U.S. nuclear plants.*  Apparently the events in Japan pumped plenty of fresh oxygen into the UCS war room in time for them to trot out their latest list of concerns regarding nuclear plant safety.

[UCS senior scientist Edwin] “Lyman was speaking in a conference call with reporters on the release of a report examining critical problems — known as “near misses” — at various nuclear facilities in the United States last year, and the N.R.C.’s handling of critical problems.”

David Lochbaum, the author of the report and the director of the organization’s nuclear safety program, was paraphrased in the article as follows:

[The report] “also suggested that federal regulators needed to do more to investigate why problems existed in the first place — including examining the overall safety culture of companies that operate nuclear power plants — rather than simply order them to be fixed.”

It could be that the UCS is aiming at the heart of the recent discussions surrounding the NRC’s new policy statement on safety culture.  It is clear that the NRC has little appetite to regulate the safety culture of its licensees; instead it urges licensees to maintain a strong safety culture and takes action only if “results” are not acceptable.  UCS would like specific issues, such as the “near misses” in its report, to be broadly interpreted to establish a more fundamental, cultural flaw in the enterprise itself.

Perhaps the larger question raised by the events in Japan is the dominance of natural phenomena in challenging man-made structures, and whether safety culture provides any insulation.  While the earthquake itself seemed fairly well contained at the nuclear plants, the tsunami easily overpowered the sea wall at the facility and disabled crucial plant systems across the site.  Does this sound familiar?  Does it remind one of a Category 5 hurricane sweeping aside the levees in New Orleans?  Or the overwhelming forces of an oil well blowout brushing aside the isolation capability of a blowout preventer?

John McPhee’s 1990 book The Control of Nature chronicles a number of instances of man’s struggle against nature - in his view, a struggle inevitably bound to fail.  Often the very acts undertaken to “control nature” contribute to future failures of that control.  McPhee cites the leveeing of the Mississippi, leading to faster channel flows, more silting, more leveeing, and ultimately the kind of macro disaster that occurred with Katrina.  Or the “debris basins” built in the canyons above Los Angeles communities: the basins fill over successive storms, eventually leading to failures of the basins themselves and catastrophic mud and debris floods in the downstream valleys.

It is probably inevitable that in the aftermath of Japan there will be calls to raise the design criteria of nuclear plants to withstand more severe earthquakes and other natural phenomena.  The expectation will be that this provides the absolute protection desired by the public and by groups such as UCS.  Until, of course, the next storm or earthquake comes along that is incrementally larger, in a worse location, or combined with some other event, and supersedes the more stringent assumptions.

Safety culture cannot deliver on an expectation that safety is absolute or without limits.  It can and should sustain the priority of, and unflagging attention to, safety that maximizes the capacity of a facility and its staff to withstand unforeseen challenges.  We know the Japan event proved the former; it will be equally important to determine whether it also demonstrated the latter.

*  T. Zeller Jr., "Citing Near Misses, Report Faults Both Nuclear Regulators and Operators," New York Times, Green: A Blog About Energy and the Environment (Mar 17, 2011, 1:50 PM).

Friday, March 11, 2011

Safety Culture Performance Indicators

In our recent post on safety culture management in the DOE complex, we concentrated on documents created by the DOE team.  But there was also some good material in the references assembled by the team.  For example, we saw some interesting thoughts on performance indicators in a paper by Andrew Hopkins, a sociology professor at The Australian National University.*  Although the paper was prepared for an oil and gas industry conference, the focus on overall process safety has parallels with nuclear power production.

Contrary to the view of many safety culture pundits, including ourselves, Professor Hopkins is not particularly interested in separating lagging from leading indicators; he says that trying to separate them may not be a useful exercise.  Instead, he is interested in a company’s efforts to develop a set of useful indicators that in total measure or reflect the state of the organization’s risk control system.  In his words, “. . . the important thing is to identify measures of how well the process safety controls are functioning.  Whether we call them lead or lag indicators is a secondary matter.  Companies I have studied that are actively seeking to identify indicators of process safety do not make use of the lead/lag distinction in any systematic way. They use indicators of failure in use, when these are available, as well as indicators arising out of their own safety management activities, where appropriate, without thought as to whether they be lead or lag. . . . Improving performance in relation to these indicators must enhance process safety. [emphasis added]” (p. 11)

Are his observations useful for people trying to evaluate the overall health of a nuclear organization’s safety culture?  Possibly.  Organizations use a multitude of safety culture assessment techniques, including (but not limited to) interviews, observations, surveys, assessments of the corrective action program (CAP) and other administrative processes, and management metrics such as maintenance performance, all believed to be correlated with safety culture.  Maybe it would be OK to dial back our concern with identifying which of them are leading (if any) and which are lagging.  More importantly, perhaps we should be asking how confident we are that an improvement in any one of them implies that the overall safety culture is in better shape.
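
As a concrete illustration of that last concern, consider a naive composite index (entirely hypothetical; the metrics, weights and scores are ours, not Hopkins’):

```python
# Hypothetical composite safety culture index.  The metrics, weights and
# scores are invented for illustration and do not come from Hopkins.

WEIGHTS = {
    "survey_score": 0.25,           # periodic safety culture survey
    "cap_health": 0.25,             # corrective action program backlog/quality
    "observation_quality": 0.25,    # field observation program
    "maintenance_performance": 0.25,
}

def composite(scores: dict) -> float:
    """Weighted average of indicator scores normalized to a 0-1 scale."""
    return sum(WEIGHTS[name] * value for name, value in scores.items())

last_year = {"survey_score": 0.70, "cap_health": 0.80,
             "observation_quality": 0.60, "maintenance_performance": 0.70}
this_year = {"survey_score": 0.95, "cap_health": 0.65,
             "observation_quality": 0.60, "maintenance_performance": 0.70}

print(composite(last_year))   # 0.70
print(composite(this_year))   # 0.725 - the index "improved" even though CAP
                              # health, arguably the control closest to real
                              # risk, declined.
```

Hopkins’ point survives the arithmetic: what matters is whether each indicator measures a functioning risk control, not which side of the lead/lag line it sits on or how neatly the set rolls up.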

*  A. Hopkins, "Thinking About Process Safety Indicators," Working Paper 53, National Research Centre for OHS Regulation, Australian National University (May 2007).  We have referred to Professor Hopkins’ work before (here and here).