Tuesday, April 2, 2024

Systems Engineering’s Role in Addressing Society’s Problems

Guru Madhavan, a National Academy of Engineering senior scholar, has a new book about how engineering can contribute to solving society’s most complex and intractable problems.  He published a related article* on the National Academies website.  The author describes four different types of problems, i.e., decision situations.  Importantly, he advocates a systems engineering** perspective for addressing each type.  We will summarize his approach and provide our perspective on it.

He begins with a metaphor of clocks and clouds.  Clocks operate on logical principles and underlie much of our physical world.  Clouds form and reform; no two are alike; they defy logic; only their momentary appearance is real – a metaphor for many of our complex social problems.
 
Hard problems

Hard problems can be essentially bounded.  The systems engineer can identify components, interrelationships, processes, desired outcomes, and measures of performance.  The system can be optimized by applying mathematics, scientific knowledge, and experience.  The system designers’ underlying belief is that a best outcome exists and is achievable.  In our view, this is a world of clocks.

Soft problems

Soft problems arise in the field of human behavior, which is complicated by political and psychological factors.  Because goals may be unclear, and constraints complicate system design, soft problems cannot be solved like hard problems.

Soft problems involve technology, psychology, and sociology and resolving them may yield an outcome that’s not the best (optimal) but good enough.  Results are based on satisficing, an approach that satisfies and suffices.  We’d say clouds are forming overhead.
 
Messy problems

Messy problems emerge from divisions created by people’s differing value sets, belief systems, ideologies, and convictions.  An example would be trying to stop the spread of a pathogen while respecting a culture’s traditional burial practices.  In these situations, the system designer must try to transform the nature of the entity and/or its environment by dissolving the problem into manageable elements and moving them toward a desired state in which the problem no longer arises.  In the example above, this might mean creating dignified burial rituals and promoting safe public health practices.

Wicked problems

The cloudiest problems are the “wicked” ones.  A wicked problem emerges when hard, soft, and messy problems exist simultaneously.  This means optimal solutions, satisficing resolutions, and dissolution may also co-exist.  A comprehensive model of a wicked problem might show solution(s) within a resolution, and a dissolution might contain resolutions and solutions.  As a consequence, engineers need to possess “competency—and consciousness— . . . to develop a balanced blend of hard solutions, soft resolutions, and messy dissolutions to wicked problems.”

Our perspective

People form their mental models of the world based on their education, training, and lived experiences.  These mental models are representations of how the world works.  They are usually less than totally accurate because of people’s cognitive limitations and built-in biases.

We have long argued that technocrats who traditionally manage and operate complicated industrial facilities, e.g., nuclear power plants, have inadequate mental models, i.e., they are clock people.  Their models are limited to cause-effect thinking; their focus is on fixing the obvious hard problems in front of them.  As a result, their fixes are limited: change a procedure or component design, train harder, supervise more closely, and apply discipline, including getting rid of the bad apples, as necessary.  Rinse and repeat.

In contrast, we assert that problem solving must recognize the existence of complex socio-technical systems.  Fixes need to address both physical issues and psychological and social concerns.  Analysts must consider relationships between hard and soft system components.  Problem solvers need to be cloud people.  

Proper systems thinking understands that problems seldom exist in isolation.  They are surrounded by a task environment that may contain conflicting goals (e.g., production vs. safety) and a solution space limited by company policies, resource limitations, and organizational politics.  The external legal-political environment can also influence goals and further constrain the solution space.

Madhavan has provided some good illustrations of mental models for problem solving, starting with the (relatively) easiest “hard” physical problems and moving through more complicated models to the realm of wicked problems that may, in some cases, be effectively unsolvable.

Bottom line: this is a good refresher for people who are already systems thinkers and a good introduction for people who aren’t.


*  G. Madhavan, “Engineering Our Wicked Problems,” National Academy of Engineering Perspectives (March 6, 2024).  Online only.

**  In Madhavan’s view, systems engineering considers all facets of a problem, recognizes sensitivities, shapes synergies, and accounts for side effects.

Saturday, March 2, 2024

Boeing’s Safety Culture Under the FAA’s Microscope

The Federal Aviation Administration (FAA) recently released its report* on the safety culture (SC) at Boeing.  The FAA Expert Panel was tasked with reviewing SC after two crashes involving the latest models of Boeing’s 737 MAX airplanes.  The January 2024 door plug blowout happened as the report was nearing completion and reinforces the report’s findings.

737 MAX door plug

The report has been summarized and widely reported in mainstream media and we will not review all its findings and recommendations here.  We want to focus on two parts of the report that address topics we have long promoted as keys to understanding how strong (or weak) an organization’s SC is, viz., its decision-making processes and executive compensation.  In addition, we will discuss a topic that’s new to us: how to ensure the independence of employees whose work includes assessing company work products from the regulator’s perspective.

Decision-making

An organization’s decision-making processes create some of the most visible artifacts of the organization’s culture: a string of decisions (guided by policies, procedures, and priorities) and their consequences.

The report begins with a clear FAA description of decision-making’s important role in a Safety Management System (SMS) and an organization’s overall management.  In part, an “SMS is all about decision-making. Thus it has to be a decision-maker's tool, not a traditional safety program separate and distinct from business and operational decision making.” (p. 10)

However, the panel’s finding on Boeing’s SMS is a mixed bag.  “Boeing provided evidence that it is using its SMS to evaluate product safety decisions and some business decisions. The Expert Panel’s review of Boeing’s SMS documentation revealed detailed procedures on how to use SMS to evaluate product safety decisions, but there are no detailed procedures on how to determine which business decisions affect safety or how they should be evaluated under SMS.” (emphasis added) (p. 35)

The associated recommendation is “Develop detailed procedures to determine which business activities should be evaluated under SMS and how to evaluate those decisions.” (ibid.)  We think the recommendation addresses the specific problem identified in the finding.

One of the major inputs to a decision-making system is an organization’s priorities.  The FAA says safety should always be the top priority, but Boeing’s commitment to safety has arguably weakened over time.

“Boeing provided the Expert Panel with a copy of the Boeing Safety Management System Policy, dated April 2022, which states, in part, ‘… we make safety our top priority.’ Boeing revised this policy in August 2023 with . . . a change to the message ‘we make safety our top priority’ to ‘safety is our foundation.’” (p. 29)

Lowering the bar did not help.  “The [Expert] panel observed documentation, survey responses, and employee interviews that did not provide objective evidence of a foundational commitment to safety that matched Boeing’s descriptions of that objective.” (p. 22)

Boeing also sowed seeds of confusion for its safety decision makers by implementing its SMS to operate alongside (and not replace or integrate with) its existing safety program.

“During interviews, Boeing employees highlighted that SMS implementation was not to disrupt existing safety program or systems.  SMS operating procedure documents spoke of SMS as the overarching safety program but then also provided segregation of SMS-focused activities from legacy safety activities . . .” (p. 24)

Executive compensation

We have long said that if safety performance is important to an organization then its senior managers’ compensation should have a safety performance-related component.

Boeing has included safety in its executive financial incentive program.  Safety is one of five factors comprising operational performance which, in turn, is combined with financial performance to determine company-level performance.  Because of the weights used in the incentive model, “The Product Safety measure comprised approximately 4% of the overall 2022 Annual Incentive Award.” (p. 28)
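
To see how a single safety measure can end up as such a thin slice of the award, here is a minimal arithmetic sketch in Python.  The only figure taken from the report is the ~4% end result; the equal weighting of the five operational factors and the 20/80 operational-versus-financial split are illustrative assumptions on our part, not Boeing’s actual formula.

    # Hypothetical incentive weights -- only the ~4% end result comes from the FAA report.
    operational_factors = {          # assumed: five factors, equally weighted
        "product_safety": 0.20,
        "other_factor_1": 0.20,
        "other_factor_2": 0.20,
        "other_factor_3": 0.20,
        "other_factor_4": 0.20,
    }
    operational_share_of_award = 0.20   # assumed split between operational
    financial_share_of_award = 0.80     # and financial performance
    assert operational_share_of_award + financial_share_of_award == 1.0

    # Safety's slice of the overall award is its weight within the operational
    # score times the operational score's weight in the blended award.
    safety_share = operational_factors["product_safety"] * operational_share_of_award
    print(f"Product safety share of overall award: {safety_share:.0%}")   # prints 4%

Under these assumed weights, safety’s contribution to the award is diluted twice: once within the operational score and again when operational performance is blended with financial performance.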

Is 4% enough to influence executive behavior?  You be the judge.

Employee independence from undue management influence   

Boeing’s relationship with the FAA has an aspect that we don’t see in other industries. 

Boeing holds an Organization Designation Authorization (ODA) from the FAA. This allows Boeing to “make findings and issue certificates, i.e., perform discretionary functions in engineering, manufacturing, operations, airworthiness, or maintenance on behalf of the [FAA] Administrator.” (p. 12)

Basically, the FAA delegates some of its authority to Boeing employees, the ODA Unit Members (UMs), who then perform certain assessment and certification tasks.  “When acting as a representative of the Administrator, an individual is required to perform in a manner consistent with the policies, guidelines, and directives of the FAA. When performing a delegated function, an individual is legally distinct from, and must act independent of, the ODA holder.” (ibid.)  These employees are supposed to take the FAA’s view of situations and apply the FAA’s rules even if the FAA’s interests are in conflict with Boeing’s business interests. 

This might work in a perfect world, but in Boeing’s world it has had, and still has, problems, primarily “Boeing’s restructuring of the management of the ODA unit decreased opportunities for interference and retaliation against UMs, and provides effective organizational messaging regarding independence of UMs. However, the restructuring, while better, still allows opportunities for retaliation to occur, particularly with regards to salary and furlough ranking.” (emphasis added) (p. 5)  In addition, “The ability to comply with the ODA’s approved procedures is present; however, the integration of the SMS processes, procedures, and data collection requirements has not been accomplished.” (p. 26)

To an outsider, this looks like bad organizational design and practices. 

The U.S. commercial nuclear industry offers a useful contrast.  The regulator, the Nuclear Regulatory Commission (NRC), expects its licensees to follow established procedures, perform required tests and inspections, and report any problems to the NRC.  Self-reporting is key to an effective relationship built on a base of trust.  However, it’s “trust but verify.”  The NRC has its own full-time employees in all the power plants, performing inspections, monitoring licensee operations, and interacting with licensee personnel.  The inspectors’ findings can lead, and have led, to increased NRC oversight of licensee activities.

Our perspective

It’s obvious that Boeing has emphasized production over safety.  The problems described above are evidence of broad systemic issues which are not amenable to quick fixes.  Integrating SC into everyday decision-making is hard work of the “continuous improvement” variety; it will not happen by management fiat.  Adjusting the compensation plan will require the Board to take safety more seriously.  Reworking the ODA program to eliminate all pressures and goal conflicts may not be possible; this is a big problem because the FAA has effectively deputized 1,000 people to perform FAA functions at Boeing. (p. 25)

The report only covers the most visible SC issues.  Complacency, normalization of deviation, the multitude of biases that can affect decision-making, and other corrosive factors are perennial threats to a strong SC and can affect “the natural drift in organizations.” (p. 40)  Such drift may lead to everything from process inefficiencies to tragic safety failures.

Boeing has taken one step: it fired the head of the 737 MAX program.**  Organizations often toss a high-level executive into a volcano to appease the regulatory gods and buy some time.  Boeing’s next challenge: the FAA has given the company 90 days to fix the quality problems highlighted by the door plug blowout.***

Bottom line: Grab your popcorn, the show is just starting.  Boeing is probably too big to fail but it is definitely going to be pulled through the wringer. 


*  “Section 103 Organization Designation Authorizations (ODA) for Transport Airplanes Expert Panel Review Report,” Federal Aviation Administration (Feb. 26, 2024).

**  N. Robertson, “Boeing fires head of 737 Max program,” The Hill (Feb. 21, 2024).

***  D. Shepardson and V. Insinna, “FAA gives Boeing 90 days to develop plan to address quality issues,” Reuters (Feb. 28, 2024).