Showing posts with label Goal Conflict.

Friday, August 4, 2023

Real Systems Pursue Goals

System Model Control Panel
On March 10, 2023 we posted about a medical journal editorial that advocated for incorporating more systems thinking in hospital emergency rooms’ (ERs) diagnostic processes.  Consistent with Safetymatters’ core beliefs, we approved of using systems thinking in complicated decision situations such as those arising in the ER. 

The article prompted a letter to the editor in which the author said the approach described in the original editorial wasn’t a true systems approach because it wasn’t specifically goal-oriented.  We agree with that author’s viewpoint.  We often argue for more systems thinking and describe mental models of systems with components, dynamic relationships among the components, feedback loops, control functions such as rules and culture, and decision maker inputs.  What we haven’t emphasized as much, probably because we tend to take it for granted, is that a bona fide system is teleological, i.e., designed to achieve a goal. 

It’s important to understand what a system’s goal is.  This may be challenging because the system’s goal may contain multiple sub-goals.  For example, a medical clinician may order a certain test.  The lab has a goal: to produce accurate, timely, and reliable results for tests that have been ordered.  But the clinician’s goal is different: to develop a correct diagnosis of the patient’s condition.  The goal of the hospital of which the clinician and lab are components may be something else: to produce generally acceptable patient outcomes, at reasonable cost, without incurring undue legal problems or regulatory oversight.  System components (the clinician and the lab) may have their own goals, which ideally support, or are at least consistent with, the overall system goals.

The top-level system, e.g., a healthcare provider, may not have a single goal; it may have multiple, independent goals that conflict with one another.  Achieving the best quality may conflict with keeping costs within budgets.  Achieving perfect safety may conflict with the need to make operational decisions under time pressure and with imperfect or incomplete information.  One of the most important responsibilities of top management is defining how the system recognizes and deals with goal conflict.

In addition to goals, we need to discuss two other characteristics of full-fledged systems: a measure of performance and a defined client.* 

The measure of performance shows the system designers, users, managers, and overseers how well the system’s goal(s) are being achieved through the functioning of system components as affected by the system’s decision makers.  Like goals, the measure of performance may have multiple dimensions or sub-measures.  In a well-designed system, the summation of the set of sub-measures should be sufficient to describe overall system performance.  
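
To make these ideas concrete, here is a minimal sketch in Python of a performance measure built from weighted sub-measures.  It assumes a hypothetical hospital system; the sub-measure names, weights, and scores are our own invention, not Churchman’s formalism.

```python
# A minimal sketch of a system performance measure composed of weighted
# sub-measures.  The hospital example, weights, and scores are hypothetical.

from dataclasses import dataclass

@dataclass
class SubMeasure:
    name: str
    weight: float   # relative importance assigned by the system's designers
    score: float    # observed performance, normalized to 0.0-1.0

def overall_performance(measures: list[SubMeasure]) -> float:
    """Weighted sum of sub-measure scores; assumes the weights sum to 1.0."""
    return sum(m.weight * m.score for m in measures)

hospital = [
    SubMeasure("patient outcomes",     weight=0.5, score=0.90),
    SubMeasure("cost control",         weight=0.3, score=0.60),
    SubMeasure("legal and regulatory", weight=0.2, score=0.95),
]

print(f"Overall performance: {overall_performance(hospital):.2f}")  # 0.82
```

Note that the weights are themselves a policy statement: they encode how top management has decided to trade off the system’s competing goals.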

The client is the entity whose interests are served by the system.  Identifying the client can be tricky.  Consider a city’s system for serving its unhoused population.  The basic system consists of a public agency to oversee the services, entities (often nongovernmental organizations, or NGOs) that provide the services, suppliers (e.g., landlords who offer buildings for use as housing), and the unhoused population.  Who is the client of this system, i.e., who benefits from its functioning?  The politicians, running for re-election, who authorize and sustain the public agency?  The public agency bureaucrats angling for bigger budgets and more staff?  The NGOs who are looking for increased funding?  Landlords who want rent increases?  Or the unhoused who may be looking for a private room with a lockable door, or may be resistant to accepting any services because of their mental, behavioral, or social problems?  It’s easy to see that many system participants do better, i.e., get more pie, if the “homeless problem” is never fully resolved.

For another example, look at the average public school district in the U.S.  At first blush, the students are the client.  But what about the elected state commissioner of education and the associated bureaucracy that establish standards and curricula for the districts?  And the elected district directors and district bureaucracy?  And the parents’ rights organizations?  And the teachers’ unions?  All of them claim to be working to further the students’ interests but what do they really care about?  How about political or organizational power, job security, and money?  The students could be more of a secondary consideration.

We could go on.  The point is that we are surrounded by social-legal-political-technical systems, and the parties they actually serve may not be the ones they purport to serve.


*  These system characteristics are taken from the work of a systems pioneer, Prof. C. West Churchman of UC Berkeley.  For more information, see his The Design of Inquiring Systems (New York: Basic Books, 1971).

Thursday, May 3, 2018

Nuclear Safety Culture and the Hanford Waste Treatment Plant: the Saga Continues

WTP at Hanford
In April 2018 the U.S. Government Accountability Office (GAO) released a report* on shortcomings in the quality assurance (QA) program at the Department of Energy’s (DOE) Waste Treatment Plant (WTP aka the Vit Plant) in Hanford, Washington.  QA problems exist at both Bechtel, the prime contractor since 2000, and the DOE’s Office of River Protection (ORP), the on-site overseer of the WTP project.

The report describes DOE actions to identify and address QA problems at the WTP, and examines the extent to which (a) DOE has ensured that all QA problems have been identified and will not recur and (b) ORP’s organizational structure provides sufficient independence to effectively oversee Bechtel’s QA program.

Why do we care about QA?  The GAO investigation did not target culture and there is only one specific mention of culture in the report.**  However, the entire report reflects the weak nuclear safety culture (NSC) at Hanford.

There is a lot of history here (GAO has been ragging DOE about the need for effective oversight at DOE facilities since 2008) but let’s begin with ORP’s 2012 stop work order to address WTP’s most significant technical challenges.  Then, in 2013, ORP’s QA division issued two Priority One findings with respect to Bechtel’s QA program, viz., both the program and Bechtel’s Corrective Action Program to address QA problems were “not fully effective.” (p. 3)  This was followed by a DOE Office of Enforcement investigation which, in turn, led to a 2015 Consent Order and Bechtel Management Improvement Program (MIP).  The Order specified that all corrective actions had to be implemented by April 20, 2016.  Currently, 13 of the 52 corrective measures have not been completed, and some that Bechtel claimed were complete have proven not to be.  In addition, “. . . in some areas where [Bechtel] has stated that corrective measures are now in place, ORP continues to encounter quality assurance problems similar to those it encountered in the past.” (p. 25)

Why doesn’t ORP stop work again?  ORP senior managers plan to evaluate the extent of Bechtel’s implementation of MIP corrective measures over the next year and have allowed work to continue in the meantime because they believe Bechtel’s QA is “generally adequate.” (p. 22)  We’ll reveal the real reason later.

The shortcomings are not limited to Bechtel.  “ORP’s actions have not ensured that all quality assurance problems have been identified at the WTP, and some previously identified problems are recurring.” (p. 16)  “When and where problems have recurred, ORP has not always required [Bechtel] to determine the extent to which the problems may affect all parts of the WTP.” (p. 25)  Why not?  Here’s a hint: ORP’s “Quality Assurance Division is not fully separate and independent from the upper management of the WTP project, which manages cost and schedule performance.” (p. 22)

Our Perspective

An article*** in the local Hanford newspaper summarizes the report’s contents.  However, the problems described are not new news.  Technical, quality and culture problems have swirled around the WTP for years.  In 2011 we started reporting on WTP issues and the sluggish responses from both DOE and Bechtel.  Click on the Vit Plant label to see our previous posts.

Goal conflict (cost and schedule vs. QA and a strong NSC) has always been the overarching issue at the WTP.  Through fiscal year 2017, DOE spent $11 billion on WTP construction.  It will cost approximately $16.8 billion to complete the first phase of the WTP, which transfers low-level radioactive waste to the low-level vitrification facility.  No one knows how much it will cost to complete the WTP or when it will be functioning.

GAO gives its subjects an opportunity to respond to GAO’s reports and recommendations.  The DOE response is an unsurprising continuation of its traditional rope-a-dope strategy: concur with GAO recommendations, rationalize or minimize the current extent of condition, exaggerate current corrective actions, promise to investigate identified issues and do better in the future, wait for GAO’s attention to turn elsewhere, then continue with business as usual.  What DOE needs to do is issue a stop order for the money train—that would get the attention of everyone, especially Bechtel and ORP managers.

How does your QA department stack up?  Does it add value by identifying and helping to solve real problems?  Is it a distracting irritant, enamored of its own authority and administrivia?  Or is it simply impotent?


*  U.S. Government Accountability Office, “Hanford Waste Treatment Plant: DOE Needs to Take Further Actions to Address Weaknesses in Its Quality Assurance Program,” GAO-18-241 (April 2018).

**  “One [ORP] quality assurance expert specified that ORP’s culture does not encourage staff to identify quality assurance problems or ineffective corrective measures. This expert said that people who discover problems are not rewarded; rather, their findings are met with resistance, which has created a culture where quality assurance staff are hesitant to identify quality assurance problems or problems with corrective measures.” (p. 24)  This quote exposes the core NSC issue at the WTP.

***  A. Cary, “Feds bash Hanford nuclear waste plant troubles, question DOE priorities,” Tri-City Herald (April 24, 2018).  Retrieved May 1, 2018.

Tuesday, June 20, 2017

Learning About Nuclear Safety Culture from the Web, Maybe

The Internet (Source: Wikipedia)
We’ve come across some Internet content (one website, one article) that purports to inform the reader about nuclear safety culture (NSC).  This post reviews the content and provides our perspective on its value.

NSC Website

It appears the title of this site is “Nuclear Safety Culture”* and the primary target is journalists who want an introduction to NSC concepts, history and issues.  It is a product of a group of European entities.  It is a professional-looking site that covers four major topics; we’ll summarize them in some detail to show their wide scope and shallow depth.

Nuclear Safety Culture covers five sub-topics:

History traces the shift in attitudes toward and protection from ionizing radiation as the possible consequences became better known, but the story ends in the 1950s.

Key actions describes the roles of internal and external stakeholders during routine operations and emergency situations.  The focus is on power production although medicine, industrial uses and weapons are also mentioned.

Definition of NSC starts with INSAG (esp. INSAG-4), then adds INPO’s directive to emphasize safety over competing goals, and a familiar list of attributes from the Nuclear Safety Journal.  As usual, there is nothing in the attributes about executive compensation or the importance of a systems view.

IAEA safety principles are self-explanatory.

Key scientific concepts covers the units of radiation for dose, intake and exposure.  Some values are shown for typical activities but only one legal limit, for US airport X-rays, is included.**  There is no information in this sub-topic on how much radiation a person can tolerate or the regulatory limits for industrial exposure.

From Events to Accidents has two sub-topics:

From events to accidents describes the 7-level International Nuclear Event Scale (from a minor anomaly to major accident) but the scale itself is not shown.  This is a major omission.

Defence in depth discusses this important concept but provides only one example, the levels of physical protection between a fuel rod in a reactor and the environment outside the containment.

Controversies has two sub-topics:

Strengths and Weaknesses discusses some of the nuclear industry’s issues and characteristics: industry transparency is a double-edged sword, where increased information on events may be used to criticize a plant owner; general radiation protection standards for the industry; uncertainties surrounding the health effects of low radiation doses; the usual nuclear waste issues; technology evolution through generations of reactors; stress tests for European reactors; supply chain realities where a problem anywhere is used against the entire industry; the political climate, focusing on Germany and France; and energy economics that have diminished nuclear’s competitiveness.  Overall, this is a hodgepodge of topics and a B- discussion.

The human factor provides a brief discussion of the “blame culture” and the need for a systemic view, followed by summaries of the Korean and French document falsification events.

Stories summarizes three events: the Brazilian theft of a radioactive source, Chernobyl and Fukushima.  They are all reported in an overly dramatic style although the basic facts are probably correct.

The authors describe what they call the “safety culture breach” for each event.  The problem is they commingle overarching cultural issues, e.g., TEPCO’s overconfident management, with far more specific failures, e.g., violations of safety and security rules, and with consequences of a weak NSC, e.g., plant design inadequacies.  It makes one wonder if the author(s) of this section have a clear notion of what NSC is.

It isn’t apparent how helpful this site will be for newbie journalists; it is certainly not a complete “toolkit.”  Some topics are presented in an over-simplified manner and others are missing key figures.  In terms of examples, the site emphasizes major accidents (the ultimate trailing indicators) and ignores the small events, normalization of deviance, organizational drift and other dynamics that make up the bulk of daily life in an organization.  Overall, the toolkit looks a bit like a rush job or unedited committee work, e.g., the section on the major accidents is satisfactory but others are incomplete.  Importantly (or perhaps thankfully) the authors offer no original observations or insights with respect to NSC.  It’s worrisome that what the site creators call NSC is often just the safety practices that evolved as the hazards of radiation became better known.

NSC Article

There is an article on NSC in the online version of Power magazine.  We are not publishing a link to the article because it isn’t very good; it looks more like a high schooler’s Internet-sourced term paper than a thoughtful reference or essay on NSC.

However, like the stopped clock that shows the correct time twice per day, there can be a worthwhile nugget in such an article.  After summarizing a research paper that correlated plants’ performance indicators with assessments of their NSC attributes (which paper we reviewed on Oct. 5, 2014), the author says “There are no established thresholds for determining whether a safety culture is ‘healthy’ or ‘unhealthy.’”  That’s correct.  After NSC assessors consolidate their interviews, focus groups, observations, surveys and document reviews, they always identify some improvement opportunities but the usual overall grade is “pass.”***  There’s no point score, meter or gauge.  Perhaps there should be.
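
For what it’s worth, here is a sketch of what such a gauge might look like; the assessment dimensions, weights, scores, and the 0.75 threshold are entirely our invention, not an established industry standard.

```python
# Hypothetical NSC "gauge": roll assessment inputs into one score and compare
# it to a pass/fail threshold.  All dimensions, weights, scores, and the
# threshold below are illustrative assumptions.

ASSESSMENT = {             # dimension: (weight, score on a 0-1 scale)
    "interviews":      (0.25, 0.70),
    "surveys":         (0.25, 0.80),
    "observations":    (0.20, 0.65),
    "document review": (0.30, 0.85),
}
THRESHOLD = 0.75           # an arbitrary line between "healthy" and "unhealthy"

score = sum(weight * value for weight, value in ASSESSMENT.values())
verdict = "healthy" if score >= THRESHOLD else "unhealthy"
print(f"Composite NSC score: {score:.2f} -> {verdict}")   # 0.76 -> healthy
```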

Our Perspective

Don’t waste your time with pap.  Go to primary sources; an excellent starting point is the survey of NSC literature performed by a U.S. National Laboratory (which we reviewed on Feb. 10, 2013).  Click on our References label to get other possibilities and follow folks who actually know something about NSC, like Safetymatters.


*  Nuclear Safety Culture was developed as part of the NUSHARE project under the aegis of the European Nuclear Education Network.  Retrieved June 19, 2017.

**  The airport X-ray limit happens to be the same as the amount of radiation emitted by an ordinary banana.

***  A violation of the Safety Conscious Work Environment (SCWE) regulations is quite different.  There it’s zero tolerance and if there’s a credible complaint about actual retaliation for raising a safety issue, the licensee is in deep doo-doo until they convince the regulator they have made the necessary adjustments in the work environment.

Tuesday, June 9, 2015

Training....Yet Again

U.S. Navy SEALS in Training
We have beaten the drum on the value of improved and innovative training techniques for improving safety management performance for some time, really since the inception of this blog, where our paper, “Practicing Nuclear Safety Management,”* was one of the seminal perspectives we wanted to bring to our readers.  We continue to encounter knowledgeable sources that advocate practice-based approaches and so continue to bring them to our readers’ attention.  The latest is an article from the Harvard Business Review that calls attention to, and distinguishes, “training” as an essential dimension of organizational learning.  The article is “How the Navy SEALS Train for Leadership Excellence.”**  The author, Michael Schrage,*** is a research fellow at MIT who reached out to a former SEAL, Brandon Webb, who transformed SEAL training.  The author contends that training, as opposed to just education or knowledge, is necessary to promote deep understanding of a business or market or process.  Training in this sense refers to actually performing and practicing necessary skills.  It is the key to achieving high levels of performance in complex environments.

One of Webb’s themes that really struck a chord was: “successful training must be dynamic, open and innovative…. ‘It’s every teacher’s job to be rigorous about constantly being open to new ideas and innovation’, Webb asserts.”  It is very hard to think about much of the training in the nuclear industry on safety culture and related issues as meeting these criteria.  Even the auto industry has recently stepped up to require the conduct of decision simulations to verify the effectiveness of corrective actions in the wake of the ignition switch-related accidents (see our May 22, 2014 post).

In particular, the reluctance of the nuclear industry and its regulator to address the presence and impact of goal conflicts on safety continues to perplex us and, we hope, many others in the industry.  It was on the mind of Carlo Rusconi more than a year ago when he observed: “Some of these conflicts originate high in the organization and are not really amenable to training per se” (see our Jan. 9, 2014 post).  However, a certain type of training could be very effective in neutralizing such conflicts: practicing making safety decisions against realistic, fact-based scenarios.  As we have advocated on many occasions, this process would actualize safety culture principles in the context of real operational situations.  For the reasons cited by Rusconi, it builds teamwork and develops shared viewpoints.  If, as we have also advocated, both operational managers and senior managers participated in such training, senior management would be on the record for its assessment of the scenarios, including how they weighed, incorporated and assessed conflicting goals in their decisions.  This could have the salutary effect of empowering lower level managers to make tough calls where assuring safety has real impacts on other organizational priorities.

Perhaps senior management would prefer to simply preach goals and principles, and leave the tough balancing that is necessary to implement the goals to their management chain.  If decisions become shaded in the “wrong” direction but there are no bad outcomes, senior management looks good.  But if there is a bad outcome, lower level managers can be blamed, more “training” prescribed, and senior management can reiterate its “safety is the first priority” mantra.
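
To sketch what “on the record” might look like in practice, the fragment below logs each simulation exercise together with the goal weights the decision makers actually applied; the scenario, field names, and numbers are hypothetical.

```python
# A sketch of recording decision-simulation exercises so that management's
# weighting of conflicting goals is on the record.  All content is invented.

from dataclasses import dataclass

@dataclass
class TrainingScenario:
    description: str
    goal_weights: dict[str, float]   # how the decision makers weighed each goal
    decision: str
    rationale: str

exercise_log: list[TrainingScenario] = []

exercise_log.append(TrainingScenario(
    description="Degraded cooling pump; outage window closes in 48 hours.",
    goal_weights={"safety": 0.60, "schedule": 0.25, "cost": 0.15},
    decision="Defer restart until the pump failure mode is confirmed.",
    rationale="Uncertain failure mode; the safety weighting dominates.",
))

# Reviewing the log over time shows whether the stated priority ("safety
# first") matches the weights actually applied under pressure.
for s in exercise_log:
    print(s.decision, s.goal_weights)
```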


*  In the paper we quote from a Wall Street Journal review of Dietrich Dörner’s book The Logic of Failure: “Most experts made things worse.  Those managers who did well gathered information before acting, thought in terms of complex-systems interactions instead of simple linear cause and effect, reviewed their progress, looked for unanticipated consequences, and corrected course often. Those who did badly relied on a fixed theoretical approach, did not correct course and blamed others when things went wrong.”  Wall Street Journal (Oct. 22, 2005), p. 10.  For a comprehensive review of the practice of nuclear safety, see our paper “Practicing Nuclear Safety Management” (March 2008).

**  M. Schrage, "How the Navy SEALS Train for Leadership Excellence," Harvard Business Review (May 28, 2015).

***  Michael Schrage, a research fellow at MIT Sloan School’s Center for Digital Business, is the author of the book Serious Play among others.  Serious Play refers to experiments with models, prototypes, and simulations.

Sunday, March 29, 2015

Nuclear Safety Assessment Principles in the United Kingdom

A reader sent us a copy of “Safety Assessment Principles for Nuclear Facilities” (SAPs) published by the United Kingdom’s Office for Nuclear Regulation (ONR).*  For documents like this, we usually jump right to the treatment of safety culture (SC).  However, in this case we were impressed with the document’s accessibility, organization and integrated (or holistic) approach so we want to provide a more general review.

ONR uses the SAPs during technical assessments of nuclear licensees’ safety submissions.  The total documentation package developed by a licensee to demonstrate high standards of nuclear safety is called the “safety case.”

Accessibility

The language is clear and intended for newbies as well as those already inside the nuclear tent.  For example, “The SAPs contain principles and guidance.  The principles form the underlying basis for regulatory judgements made by inspectors, and the guidance associated with the principles provides either further explanation of a principle, or their interpretation in actual applications and the measures against which judgements can be made.” (p. 11) 

Also furthering ease of use, the document is not strewn with acronyms.  As a consequence, one doesn’t have to sit with glossary in hand just to read the text.

Organization

ONR presents eight fundamental principles including responsibility for safety, limitation of risks to individuals and emergency planning.  We’ll focus on another fundamental principle, Leadership and Management (L&M) because (a) L&M activities create the context and momentum for a positive SC and (b) it illustrates holistic thinking.

L&M comprises four subordinate (but still high-level) inter-related principles: leadership, capable organization, decision making and learning.  “Because of their inter-connected nature there is some overlap between the principles. They should therefore be considered as a whole and an integrated approach will be necessary for their delivery.” (p. 18)

Drilling down further, the guidance for leadership includes many familiar attributes.  We want to highlight attributes that we have been emphasizing on Safetymatters or that reflect new thoughts.  Specifically, leaders must recognize and resolve conflict between safety and other goals; ensure that the reward systems promote the identification and management of risk; encourage safe behavior and discourage unsafe behavior or complacency; and establish a common purpose and collective social responsibility for safety. (p. 19)

Decision making (another Safetymatters hot button issue) receives a good treatment.  Topics covered include explicit recognition of goal conflict; appreciating the potential for error, uncertainty and the unexpected; and the essential functions of active challenges and a questioning attitude.

We do have one bone to pick under L&M: we would like to see words to the effect that safety performance and SC should be significant components of the senior management reward system.

Useful Points

Helpful nuggets pop up throughout the text.  A few examples follow.

“The process of analysing safety requires creativity, where people can envisage the variety of routes by which radiological risks can arise from the technology. . . . Safety is achieved when the people and physical systems together reliably control the radiological hazards inherent in the technology. Therefore the organizational systems (ie interactions between people) are just as important as the physical systems, . . . “ (pp. 25-26)

“[D]esigners and/or dutyholders may wish to put forward safety cases that differ from [SAP] expectations.   As in the past, ONR inspectors should consider such submissions on their individual merits. . . . ONR will need to be assured that such cases demonstrate equivalence to the outcomes associated with the use of the principles here,. . .” (p. 14)  The unstated principle here is equifinality; in more colorful words, there is more than one way to skin a cat.

There are echoes of other lessons we’ve been preaching on Safetymatters.  For example “The principle of continuous improvement is central to achieving sustained high standards of nuclear safety. . . . Seeking and applying lessons learned from events, new knowledge and experience, both nationally and internationally, must be a fundamental feature of the safety culture of the nuclear industry.” (p. 13)

And, in a nod to Nassim Nicholas Taleb, if a “hazard is particularly high, or knowledge of the risk is very uncertain, ONR may choose to concentrate primarily on the hazard.” (p. 8)

Our Perspective

Most of the content of the SAPs will be familiar to Safetymatters readers.  We suggest you skim the first 23 pages of the document covering introductory material and Leadership & Management.  SAPs is an excellent example of a regulator actually trying to provide useful information and guidance to current and would-be licensees and is far better than the simple-minded laundry lists promulgated by IAEA.


*  Office for Nuclear Regulation, “Safety Assessment Principles for Nuclear Facilities” Rev. 0 (2014).  We are grateful to Bill Mullins for forwarding this document to us.

Thursday, January 29, 2015

Safety Culture at Chevron’s Richmond, CA Refinery



The U.S. Chemical Safety and Hazard Investigation Board (CSB) released its final report* on the August 2012 fire at the Chevron refinery in Richmond, CA caused by a leaking pipe.  In the discussion around the CSB’s interim incident report (see our April 16, 2013 post) the agency’s chairman said Chevron’s safety culture (SC) appeared to be a factor in the incident.  This post focuses on the final report findings related to the refinery’s SC.

During their investigation, the CSB learned that some personnel were uncomfortable working around the leaking pipe because of potential exposure to the flammable fluid.  “Some individuals even recommended that the Crude Unit be shut down, but they left the final decision to the management personnel present.  No one formally invoked their Stop Work Authority.  In addition, Chevron safety culture surveys indicate that between 2008 and 2010, personnel had become less willing to use their Stop Work Authority. . . . there are a number of reasons why such a program may fail related to the ‘human factors’ issue of decision-making; these reasons include belief that the Stop Work decision should be made by someone else higher in the organizational hierarchy, reluctance to speak up and delay work progress, and fear of reprisal for stopping the job.” (pp. 12-13) 

The report also mentioned decision making that favored continued production over safety. (p. 13)  In the report’s details, the CSB described the refinery organization’s decisions to keep operating under questionable safety conditions as “normalization of deviance,” a term popularized by Diane Vaughan and familiar to Safetymatters readers. (p. 105)

The report included a detailed comparison of the refinery’s 2008 and 2010 SC surveys.  In addition to the decrease in employees’ willingness to use their Stop Work Authority, surveyed operators and mechanics reported an increased belief that using such authority could get them into trouble (p. 108) and that equipment was not properly cared for. (p. 109) 

Our Perspective

We like the CSB.  They’re straight shooters and don’t mince words.  While we are not big fans of SC surveys, the CSB’s analysis of Chevron’s SC surveys appears to show a deteriorating SC between 2008 and 2010. 

Chevron says it agrees with some CSB findings; however, it believes “the CSB has presented an inaccurate depiction of the Richmond Refinery’s current process safety culture.”  Chevron says “In a third-party survey commissioned by Contra Costa County, when asked whether they feel free to use Stop Work Authority during any work activity, 93 percent of Chevron refinery workers responded favorably.  The overall results for the process safety survey exceeded the survey taker’s benchmark for North American refineries.”**  Who owns the truth here?  The CSB?  Chevron?  Both?

In 2013, the city of Richmond adopted an Industrial Safety Ordinance (RISO) that requires Chevron to conduct SC assessments, preserve records and develop corrective actions.  The CSB’s recommendations include beefing up the RISO to evaluate the quality of Chevron’s action items and their actual impact on SC. (p. 116)

Chevron continues to receive blowback from the incident.  The refinery is the largest employer and taxpayer in Richmond.  It’s not a company town but Chevron has historically had a lot of political sway in the city.  That’s changed, at least for now.  In the recent city council election, none of the candidates backed by Chevron was elected.***

As an aside, the CSB report referenced a 2010 study**** that found a sample of oil and gas workers directly intervened in only about 2 out of 5 of the unsafe acts they observed on the job.  How diligent are you and your colleagues about calling out safety problems?


*  CSB, “Final Investigation Report Chevron Richmond Refinery Pipe Rupture and Fire,” Report No. 2012-03-I-CA (Jan. 2015).

**  M. Aldax, “Survey finds Richmond Refinery safety culture strong,” Richmond Standard (Jan. 29, 2015).  Retrieved Jan. 29, 2015.  The Richmond Standard is a website published by Chevron Richmond.

***  C. Jones, “Chevron’s $3 million backfires in Richmond election,” SFGate (Nov. 5, 2014).  Retrieved Jan. 29, 2015.

****  R.D. Ragain, P. Ragain, Mike Allen and Michael Allen, “Study: Employees intervene in only 2 of 5 observed unsafe acts,” Drilling Contractor (Jan./Feb. 2011).  Retrieved Jan. 29, 2015.

Monday, October 13, 2014

Systems Thinking in Air Traffic Management


A recent white paper* presents ten principles to consider when thinking about a complex socio-technical system, specifically European Air Traffic Management (ATM).  We review the principles below, highlighting aspects that might provide some insights for nuclear power plant operations and safety culture (SC).

Before we start, we should note that ATM is truly a complex** system.  Decisions involving safety and efficiency occur on a continuous basis.  There is always some difference between work-as-imagined and work-as-done.

In contrast, we have argued that a nuclear plant is a complicated system but it has some elements of complexity.  To the extent complexity exists, treating nuclear like a complicated machine via “analysing components using reductionist methods; identifying ‘root causes’ of problems or events; thinking in a linear and short-term way; . . . [or] making changes at the component level” is inadequate. (p. 5)  In other words, systemic factors may contribute to observed performance variability and frustrate efforts to achieve the goal in nuclear of eliminating all differences between work-as-planned and work-as-done.

Principles 1-3 relate to the view of people within systems – our view from the outside and their view from the inside.

1. Field Expert Involvement
“To understand work-as-done and improve how things really work, involve those who do the work.” (p. 8)
2. Local Rationality
“People do things that make sense to them given their goals, understanding of the situation and focus of attention at that time.” (p. 10)
3. Just Culture
“Adopt a mindset of openness, trust and fairness. Understand actions in context, and adopt systems language that is non-judgmental and non-blaming.” (p. 12)

Nuclear is pretty good at getting line personnel involved.  Adages such as “Operations owns the plant” are useful to the extent they are true.  Cross-functional teams can include operators or maintenance personnel.  An effective CAP that allows workers to identify and report problems with equipment, procedures, etc. is good; an evaluation and resolution process that involves members from the same class of workers is even better.  Having someone involved in an incident or near-miss go around to the tailgates and classes to share the lessons learned can be convincing.

But when something unexpected or bad happens, nuclear tends to spend too much time looking for the malfunctioning component (usually human).   “The assumption is that if the person would try harder, pay closer attention, do exactly what was prescribed, then things would go well. . . . [But a] focus on components becomes less effective with increasing system complexity and interactivity.” (p. 4)  An outside-in approach ignores the context in which the human performed, the information and time available, the competition for focus of attention, the physical conditions of the work, fatigue, etc.  Instead of insight into system nuances, the result is often limited to more training, supervision or discipline.

The notion of a “just culture” comes from James Reason.  It’s a culture where employees are not punished for their actions, omissions or decisions that are commensurate with their experience and training, but where gross negligence, willful violations and destructive acts are not tolerated.

Principles 4 and 5 relate to the system conditions and context that affect work.

4. Demand and Pressure
“Demands and pressures relating to efficiency and capacity have a fundamental effect on performance.” (p. 14)
5. Resources & Constraints
“Success depends on adequate resources and appropriate constraints.” (p. 16)

Fluctuating demand creates far more varied and unpredictable problems for ATM than it does in nuclear.  However, in nuclear the potential for goal conflicts between production, cost and safety is always present.  The problem arises from acting as if these conflicts don’t exist.

ATM has to “cope with variable demand and variable resources,” a situation that is also different from nuclear with its base load plants and established resource budgets.  The authors opine that for ATM, “a rigid regulatory environment destroys the capacity to adapt constantly to the environment.” (p. 2)  Most of us think of nuclear as quite constrained by procedures, rules, policies, regulations, etc., but an important lesson from Fukushima was that under unforeseen conditions, the organization must be able to adapt according to local, knowledge-based decisions.  Even the NRC recognizes that “flexibility may be necessary when responding to off-normal conditions.”***

Principles 6 through 10 concern the nature of system behavior, with 9 and 10 more concerned with system outcomes.  These do not have specific implications for SC other than keeping an open mind and being alert to systemic issues, e.g., complacency, drift or emergent behavior.

6. Interactions and Flows
“Understand system performance in the context of the flows of activities and functions, as well as the interactions that comprise these flows.” (p. 18)
7. Trade-Offs
“People have to apply trade-offs in order to resolve goal conflicts and to cope with the complexity of the system and the uncertainty of the environment.” (p. 20)
8. Performance variability
“Understand the variability of system conditions and behaviour.  Identify wanted and unwanted variability in light of the system’s need and tolerance for variability.” (p. 22)
9. Emergence
“System behaviour in complex systems is often emergent; it cannot be reduced to the behaviour of components and is often not as expected.” (p. 24)
10. Equivalence
“Success and failure come from the same source – ordinary work.” (p. 26)

Work flow certainly varies in ATM but is relatively well-understood in nuclear.  There’s really not much more to say on that topic.

Trade-offs occur in decision making in any context where more than one goal exists.  One useful mental model for conceptualizing trade-offs is Hollnagel’s efficiency-thoroughness construct, basically doing things quickly (to meet the production and cost goals) vs. doing things well (to meet the quality and possibly safety goals).  We reviewed his work on Jan. 3, 2013.
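
A toy example shows how the construct works: score each option on both axes and note how the weighting, i.e., the production pressure, flips the preferred choice.  The options and numbers are invented for illustration.

```python
# Toy rendering of Hollnagel's efficiency-thoroughness trade-off (ETTO).
# Options and scores are invented; the point is that the weighting flips
# the "best" choice.

options = {
    "quick fix":      {"efficiency": 0.9, "thoroughness": 0.40},
    "full diagnosis": {"efficiency": 0.3, "thoroughness": 0.95},
}

def preference(option: dict[str, float], w_eff: float) -> float:
    """Linear trade-off; w_eff is the weight placed on efficiency."""
    return w_eff * option["efficiency"] + (1.0 - w_eff) * option["thoroughness"]

for w_eff in (0.7, 0.3):   # high production pressure vs. low
    best = max(options, key=lambda name: preference(options[name], w_eff))
    print(f"efficiency weight {w_eff}: choose '{best}'")
```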

Performance variability occurs in all systems, including nuclear, but the outcomes are usually successful because a system has a certain range of tolerance and a certain capacity for resilience.  Performance drift happens slowly, and can be difficult to identify from the inside.  Dekker’s work speaks to this and we reviewed it on Dec. 5, 2012.
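
One reason drift is hard to see from the inside is that each day looks much like the day before.  The sketch below, using invented data and an arbitrary alert margin, shows how comparing a recent average against a longer baseline can surface a slow decline.

```python
# Sketch: detect slow drift by comparing a short-window average to the
# longer-run baseline.  The data series and 0.05 margin are invented.

def drift_alert(history: list[float], window: int = 10, margin: float = 0.05) -> bool:
    """Alert when the recent average falls below the baseline by more than margin."""
    if len(history) < 2 * window:
        return False                      # not enough data to compare
    baseline = sum(history[:-window]) / (len(history) - window)
    recent = sum(history[-window:]) / window
    return recent < baseline - margin

# A metric drifting slowly downward from 0.95; no single day looks alarming.
metric = [0.95 - 0.004 * day for day in range(40)]
print(drift_alert(metric))   # True once the accumulated drift exceeds the margin
```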

Nuclear is not fully complex but surprises do happen, some of them not caused by component failure.  Emergence (problems that arise from new or unforeseen system interactions) is more likely to occur following the implementation of new technical systems.  We discussed this possibility in a July 6, 2013 post on a book by Woods, Dekker et al.

Equivalence means that work that results in both good and bad outcomes starts out the same way, with people (saboteurs excepted) trying to be successful.  When bad things happen, we should cast a wide net in looking for different factors, including systemic ones, that aligned (like Swiss cheese slices) in the subject case.

The white paper also includes several real and hypothetical case studies illustrating the application of the principles to understanding safety performance challenges.

Our Perspective 

The authors draw on a familiar cast of characters, including Dekker, Hollnagel, Leveson and Reason.  We have posted about all these folks, just click on their label in the right hand column.

The principles are intended to help us form a more insightful mental model of a system under consideration, one that includes non-linear cause and effect relationships, and the possibility of emergent behavior.  The white paper is not a “must read” but may stimulate useful thinking about the nature of the nuclear operating organization.


*  European Organisation for the Safety of Air Navigation (EUROCONTROL), “Systems Thinking for Safety: Ten Principles” (Aug. 2014).  Thanks to Bill Mullins for bringing this white paper to our attention.

**  “[C]omplex systems involve large numbers of interacting elements and are typically highly dynamic and constantly changing with changes in conditions. Their cause-effect relations are non-linear; small changes can produce disproportionately large effects. Effects usually have multiple causes, though causes may not be traceable and are socially constructed.” (pp. 4-5)

Also see our Oct. 14, 2013 discussion of the California Independent System Operator for another example of a complex system.

***  “Work Processes,” NRC Safety Culture Trait Talk, no. 2 (July 2014), p. 1.  ADAMS ML14203A391.  Retrieved Oct. 8, 2014.