Thursday, May 29, 2014

A Systems View of Two Industries: Nuclear and Air Transport

We have long promoted a systems view of nuclear facilities and the overall industry.  One consequence of that view is an openness to possible systemic problems as the root causes of incidents in addition to searching for malfunctioning components, both physical and human.

One system where we see this openness is the air transport industry—the air carriers and the Federal Aviation Administration (FAA).  The FAA has two programs for self-reporting of incidents and problems: the Voluntary Disclosure Reporting Program (VDRP) and the Aviation Safety Action Program (ASAP).  These programs are discussed in a recent report* by the FAA’s Office of Inspector General (OIG) and are at least superficially similar to the NRC’s Licensee Event Reporting and Employee Concerns Program.

What’s interesting is that VDRP is receptive to the reporting of both individual and systemic issues.  The OIG report says the difference between individual and systemic is “important because if the issue is systemic, the carrier will have to develop a detailed fix to address the system as a whole—whereas if the issue is more isolated or individual, the fix will be focused more at the employee level, such as providing counseling or training.” (p. 7)  In addition, it appears both FAA programs  are imbued with the concept of a “just culture,” another topic we have posted about on several occasions and which is often associated with a systems view.  A just culture is one where people are encouraged to provide essential safety-related information, the blame game is aggressively avoided, and a clear line exists between acceptable and unacceptable behavior.

Now the implementation of the FAA programs is far from perfect.  As the OIG points out, the FAA doesn't ensure root causes are identified or corrective actions are sufficient and long-lived, and safety data is not analyzed to identify trends that represent risks.  Systemic issues may not always be reported by the carriers or recognized by the FAA.  But overall, there appears to be an effort at open, comprehensive communication between the regulator and the regulated.

So why does the FAA encourage a just culture while the nuclear industry seems fixated on a culture of blame?  One factor might be the NRC’s focus on hardware-centric performance measures.  If these are improving over time, one might infer that any incidents are more likely caused by non-hardware, i.e., humans. 

But perhaps we can gain greater insight into why one industry is more accepting of systemic issues by looking at system-level factors, specifically the operational (or actual) coupling among industry participants versus their coupling as perceived by external observers.**

As a practical matter, the nuclear industry is loosely coupled, i.e., each plant operates more or less independently of the others (even though plants with a common owner are subject to the same policies as other members of the fleet).  There is seldom any direct competition between plants.  However, the industry is viewed by many external observers, especially anti-nukes, as a singular whole, i.e., tightly coupled.  Insiders reinforce this view when they say things like “an accident at one plant is an accident for all.”  And, in fact, one incident (e.g., Davis-Besse) can have industry-wide implications although the physical risk may be entirely local.  In such a socio-political environment, there is implicit pressure to limit or encapsulate the causes of any incidents or irregularities to purely local sources and avoid the mention of possible systemic issues.  This leads to a search for the faulty component, the bad employee, a failure to update a specific procedure or some other local problem that can be fixed by improved leadership and oversight, clearer expectations, more attention to detail, training, etc.  The result of this approach (plus other industry-wide factors, e.g., the lack of transparency in certain oversight practices*** and the “special and unique” mantra) is basically a closed system whose client, i.e., the beneficiary of system efforts, is itself.

In contrast, the FAA’s world has two parts: the set of air carriers, whose relationship with one another is loosely coupled, similar to the nuclear industry, and the air traffic control (ATC) sub-system, which is more tightly coupled because all the carriers share the same airspace and ATC.  Because of loose coupling, a systemic problem at a single carrier affects only that carrier and does not infect the rest of the industry.  What is most interesting is that a single airline accident (in the tightly coupled portion of the system) does not lead to calls to shut down the industry.  Air transport has no organized opposition to its existence.  Air travel is such an integral part of so many people’s lives that pressure exists to keep the system running even in the face of possible hazards.  As a consequence, the FAA has to occasionally reassert its interest in keeping safety risks from creeping into the system.  Overall, we can say the air transport industry is relatively open, able to admit the existence of problems, even systemic ones, without taking an inadvertent existential risk.

The foregoing is not intended to be a comprehensive comparison of the two industries.  Rather, it is meant to illustrate how one can apply a simple systems concept to gain some insights into why participants in different industries behave differently.  While both the FAA and NRC are responsible for identifying systemic issues in their respective industries, it appears the FAA has an easier time of it.  This is not likely to change given the top-level factors described above.


*  FAA Office of Inspector General, “Further Actions are Needed to Improve FAA’s Oversight of the Voluntary Disclosure Reporting Program” Report No. AV-2014-036 (April 10, 2014).  Thanks to Bill Mullins for pointing out this report to us.

“VDRP provides air carriers the opportunity to voluntarily report and correct areas of non-compliance without civil penalty. The program also provides FAA important safety information that might not otherwise come to its attention.” (p. 1)  ASAP “allows individual aviation employees to disclose possible safety violations to air carriers and FAA without fear that the information will be used to take enforcement or disciplinary action against them.” (p. 2)

**  “Coupling” refers to the amount of slack, buffer or give between two items in a system.

***  For example, INPO’s board of directors is comprised of nuclear industry CEOs, INPO evaluation reports are delivered in confidence to its members and INPO has basically unfettered access to the NRC.  This is not exactly a recipe for gaining public trust.  See J.O. Ellis Jr. (INPO CEO), Testimony before the National Commission on the BP Deepwater Horizon Oil Spill and Offshore Drilling (Aug. 25, 2010).  Retrieved from NEI website May 27, 2014.

1 comment:

  1. Lew,

    Thanks for this review; I think you sketched some key differences very nicely; the one involving the FAA's employment of a Just Culture Strategy and Approach in the service of illuminating more systemic challenges is particularly useful, I'd say.

    Viewed from the converse side I might conjecture that NRC's Safety Culture Policy, which ends up being dominated by "individual responsibility" (cf. the NUREG and INPO 12-012) leads to (or perhaps emerges from) this preference, in the day-to-day execution of the nuclear energy enterprise, for "blaming" first and fixing the system later.

    As I've commented elsewhere, if one starts with INSAG-4 and its orientation toward Issues Management (rather than "defect corrective action") then implementation definitely would encompass the inclusion of a Just Culture S&A as an integral feature, not just a hoped for outcome. The latter of course is what NRC goes for with its NSC "expectations." Few in the NS blogosphere seem willing to engage this distinction which seems very clear and substantive to me.

    One comment caught my ear - the description of the ATC as a "subsystem."

    I would frame the relationship between the two co-governance parts of FAA (ATC and Certification Authority) this way:

    For purposes of public "safety issues having over-riding priority" being promptly addressed, the driver in the FAA world comes from the ATC system. This is the Return on Objective domain of air transport; it sets the governance priorities throughout the entire FAA domain and it is continually risk-informed.

    It is the weather first, and second the instantaneous system traffic levels in terminal airspace and taxiways which are the externalities (to an individual Certificate holding airline) which are coordinated by FAA systemically. This happens "upstream" for risk reckoning purposes of the individual carriers crafting and implementing their commercial Return on Investment schemes.

    Thus we have the situation where Commercial Reliability is the Left Hand of the system; that is primarily a case of tight technical and other process coupling (e.g. flight turnaround at an intermediate destination) - lots of emphasis on efficiency in operations and consistency in material compliance of aircraft and operators with "licensing" commitments.

    On the Right Hand is ATC Resilience seeking (i.e., effectiveness comes ahead of efficiency), where adaptation to coevolving factors is the day-to-day work; risk reckoning over the scope of the full system occurs at a level above that of the individual profit-seeking airline. The FAA leadership is the "brains" of the full system.

    One way of describing this might be that Reliability-Efficiency seeking are externally regulated and Resilience-Effectiveness seeking are self-regulated. It seems to work.

