Tuesday, April 2, 2024
Systems Engineering’s Role in Addressing Society’s Problems
Guru Madhavan, a National Academy of Engineering senior scholar, has a new book about how engineering can contribute to solving society’s most complex and intractable problems. He published a related article* on the National Academies website in which he describes four different types of problems, i.e., decision situations. Importantly, he advocates a systems engineering** perspective for addressing each type. We will summarize his approach and provide our perspective on it.
He begins with a metaphor of clocks and clouds. Clocks operate on logical principles and underlie much of our physical world. Clouds form and re-form, no two are alike, and they defy logic; only their momentary appearance is real. They are a metaphor for many of our complex social problems.
Hard problems
Hard problems can be essentially bounded. The systems engineer can identify components, interrelationships, processes, desired outcomes, and measures of performance. The system can be optimized by applying mathematics, scientific knowledge, and experience. The system designers’ underlying belief is that a best outcome exists and is achievable. In our view, this is a world of clocks.
Soft problems
Soft problems arise in the field of human behavior, which is complicated by political and psychological factors. Because goals may be unclear, and constraints complicate system design, soft problems cannot be solved like hard problems.
Soft problems involve technology, psychology, and sociology, and resolving them may yield an outcome that is not the best (optimal) but is good enough. Results are based on satisficing, an approach that satisfies and suffices. We’d say clouds are forming overhead.
Messy problems
Messy problems emerge from divisions created by people’s differing value sets, belief systems, ideologies, and convictions. An example would be trying to stop the spread of a pathogen while respecting a culture’s traditional burial practices. In these situations, the system designer must try to transform the nature of the entity and/or its environment by dissolving the problem into manageable elements and moving them toward a desired state in which the problem no longer arises. In the example above, this might mean creating dignified burial rituals and promoting safe public health practices.
Wicked problems
The cloudiest problems are the “wicked” ones. A wicked problem emerges when hard, soft, and messy problems exist simultaneously. This means optimal solutions, satisficing resolutions, and dissolution may also co-exist. A comprehensive model of a wicked problem might show solution(s) within a resolution, and a dissolution might contain resolutions and solutions. As a consequence, engineers need to possess “competency—and consciousness— . . . to develop a balanced blend of hard solutions, soft resolutions, and messy dissolutions to wicked problems.”
Our perspective
People form their mental models of the world based on their education, training, and lived experiences. These mental models are representations of how the world works. They are usually less than totally accurate because of people’s cognitive limitations and built-in biases.
We have long argued that technocrats who traditionally manage and operate complicated industrial facilities, e.g., nuclear power plants, have inadequate mental models, i.e., they are clock people. Their models are limited to cause-effect thinking; their focus is on fixing the obvious hard problems in front of them. As a result, their fixes are limited: change a procedure or component design, train harder, supervise more closely, and apply discipline, including getting rid of the bad apples, as necessary. Rinse and repeat.
In contrast, we assert that problem solving must recognize the existence of complex socio-technical systems. Fixes need to address both physical issues and psychological and social concerns. Analysts must consider relationships between hard and soft system components. Problem solvers need to be cloud people.
Proper systems thinking understands that problems seldom exist in isolation. They are surrounded by a task environment that may contain conflicting goals (e.g., production vs. safety) and a solution space limited by company policies, resource limitations, and organizational politics. The external legal-political environment can also influence goals and further constrain the solution space.
Madhavan has provided some good illustrations of mental models for problem solving, starting with the (relatively) easiest “hard” physical problems and moving through more complicated models to the realm of wicked problems that may, in some cases, be effectively unsolvable.
Bottom line: this is a good refresher for people who are already systems thinkers and a good introduction for people who aren’t.
* G. Madhavan, “Engineering Our Wicked Problems,” National Academy of Engineering Perspectives (March 6, 2024). Online only.
** In Madhavan’s view, systems engineering considers all facets of a problem, recognizes sensitivities, shapes synergies, and accounts for side effects.
Saturday, March 2, 2024
Boeing’s Safety Culture Under the FAA’s Microscope
The Federal Aviation Administration (FAA) recently released its report* on the safety culture (SC) at Boeing. The FAA Expert Panel was tasked with reviewing SC after two crashes involving the latest models of Boeing’s 737 MAX airplanes. The January 2024 door plug blowout happened as the report was nearing completion and reinforces the report’s findings.
[Image: 737 MAX door plug]
The report has been summarized and widely reported in mainstream media and we will not review all its findings and recommendations here. We want to focus on two parts of the report that address topics we have long promoted as being keys to understanding how strong (or weak) an organization’s SC is, viz., an organization’s decision-making processes and executive compensation. In addition, we will discuss a topic that’s new to us, how to ensure the independence of employees whose work includes assessing company work products from the regulator’s perspective.
Decision-making
An organization’s decision-making processes create some of the most visible artifacts of the organization’s culture: a string of decisions (guided by policies, procedures, and priorities) and their consequences.
The report begins with a clear FAA description of decision-making’s important role in a Safety Management System (SMS) and an organization’s overall management. In part, an “SMS is all about decision-making. Thus it has to be a decision-maker's tool, not a traditional safety program separate and distinct from business and operational decision making.” (p. 10)
However, the panel’s finding on Boeing’s SMS is a mixed bag. “Boeing provided evidence that it is using its SMS to evaluate product safety decisions and some business decisions. The Expert Panel’s review of Boeing’s SMS documentation revealed detailed procedures on how to use SMS to evaluate product safety decisions, but there are no detailed procedures on how to determine which business decisions affect safety or how they should be evaluated under SMS.” (emphasis added) (p. 35)
The associated recommendation is “Develop detailed procedures to determine which business activities should be evaluated under SMS and how to evaluate those decisions.” (ibid.) We think the recommendation addresses the specific problem identified in the finding.
One of the major inputs to a decision-making system is an organization’s priorities. The FAA says safety should always be the top priority but Boeing’s commitment to safety has arguably weakened over time.
“Boeing provided the Expert Panel with a copy of the Boeing Safety Management System Policy, dated April 2022, which states, in part, ‘… we make safety our top priority.’ Boeing revised this policy in August 2023 with . . . a change to the message ‘we make safety our top priority’ to ‘safety is our foundation.’” (p. 29)
Lowering the bar did not help. “The [Expert] panel observed documentation, survey responses, and employee interviews that did not provide objective evidence of a foundational commitment to safety that matched Boeing’s descriptions of that objective.” (p. 22)
Boeing also sowed confusion for its safety decision makers by implementing its SMS to operate alongside, rather than replace or integrate with, its existing safety program.
“During interviews, Boeing employees highlighted that SMS implementation was not to disrupt existing safety program or systems. SMS operating procedure documents spoke of SMS as the overarching safety program but then also provided segregation of SMS-focused activities from legacy safety activities . . .” (p. 24)
Executive compensation
We have long said that if safety performance is important to an organization, then its senior managers’ compensation should include a safety performance-related component.
Boeing has included safety in its executive financial incentive program. Safety is one of five factors comprising operational performance which, in turn, is combined with financial performance to determine company-level performance. Because of the weights used in the incentive model, “The Product Safety measure comprised approximately 4% of the overall 2022 Annual Incentive Award.” (p. 28)
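To see how a safety measure can end up at such a small share, here is a back-of-the-envelope sketch in Python. The weights and factor names below are our own illustrative assumptions, not Boeing’s actual plan; they simply show how nesting safety inside an operational score, which is itself only one input to the company-level score, dilutes its effect.

```python
# Illustrative only: hypothetical weights showing how nested scoring can
# dilute a safety measure's influence on an annual incentive award.
# These numbers and factor names are assumptions, not Boeing's actual plan.

operational_factors = {
    "product_safety": 0.20,   # one of five equally weighted factors (assumed)
    "quality": 0.20,
    "delivery": 0.20,
    "employee_safety": 0.20,
    "productivity": 0.20,
}

operational_weight = 0.20  # assumed share of company-level performance
financial_weight = 0.80    # assumed share from financial performance

# Effective weight of product safety in the overall award:
safety_share = operational_factors["product_safety"] * operational_weight
print(f"Product safety share of the award: {safety_share:.0%}")  # -> 4%
```

Under these assumed weights the safety measure works out to 4% of the award, in line with the figure the panel reported.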
Is 4% enough to influence executive behavior? You be the judge.
Employee independence from undue management influence
Boeing’s relationship with the FAA has an aspect that we don’t see in other industries.
Boeing holds an Organization Designation Authorization (ODA) from the FAA. This allows Boeing to “make findings and issue certificates, i.e., perform discretionary functions in engineering, manufacturing, operations, airworthiness, or maintenance on behalf of the [FAA] Administrator.” (p. 12)
Basically, the FAA delegates some of its authority to Boeing employees, the ODA Unit Members (UMs), who then perform certain assessment and certification tasks. “When acting as a representative of the Administrator, an individual is required to perform in a manner consistent with the policies, guidelines, and directives of the FAA. When performing a delegated function, an individual is legally distinct from, and must act independent of, the ODA holder.” (ibid.) These employees are supposed to take the FAA’s view of situations and apply the FAA’s rules even if the FAA’s interests are in conflict with Boeing’s business interests.
This might work in a perfect world, but in Boeing’s world it has had, and continues to have, problems. Primarily, “Boeing’s restructuring of the management of the ODA unit decreased opportunities for interference and retaliation against UMs, and provides effective organizational messaging regarding independence of UMs. However, the restructuring, while better, still allows opportunities for retaliation to occur, particularly with regards to salary and furlough ranking.” (emphasis added) (p. 5) In addition, “The ability to comply with the ODA’s approved procedures is present; however, the integration of the SMS processes, procedures, and data collection requirements has not been accomplished.” (p. 26)
To an outsider, this looks like bad organizational design and practices.
The U.S. commercial nuclear industry offers a useful contrast. The regulator (Nuclear Regulatory Commission) expects its licensees to follow established procedures, perform required tests and inspections, and report any problems to the NRC. Self-reporting is key to an effective relationship built on a base of trust. However, it’s “trust but verify.” The NRC has its own full-time employees in all the power plants, performing inspections, monitoring licensee operations, and interacting with licensee personnel. The inspectors’ findings can lead, and have led, to increased oversight of licensee activities by the NRC.
Our perspective
It’s obvious that Boeing has emphasized production over safety. The problems described above are evidence of broad systemic issues which are not amenable to quick fixes. Integrating SC into everyday decision-making is hard work of the “continuous improvement” variety; it will not happen by management fiat. Adjusting the compensation plan will require the Board to take safety more seriously. Reworking the ODA program to eliminate all pressures and goal conflicts may not be possible; this is a big problem because the FAA has effectively deputized 1,000 people to perform FAA functions at Boeing. (p. 25)
The report only covers the most visible SC issues. Complacency, normalization of deviation, the multitude of biases that can affect decision-making, and other corrosive factors are perennial threats to a strong SC and can affect “the natural drift in organizations.” (p. 40) Such drift may lead to everything from process inefficiencies to tragic safety failures.
Boeing has taken one step: they fired the head of the 737 MAX program.** Organizations often toss a high-level executive into a volcano to appease the regulatory gods and buy some time. Boeing’s next challenge is that the FAA has given Boeing 90 days to fix its quality problems highlighted by the door plug blowout.***
Bottom line: Grab your popcorn, the show is just starting. Boeing is probably too big to fail but it is definitely going to be pulled through the wringer.
* “Section 103 Organization Designation Authorizations (ODA) for Transport Airplanes Expert Panel Review Report,” Federal Aviation Administration (Feb. 26, 2024).
** N. Robertson, “Boeing fires head of 737 Max program,” The Hill (Feb. 21, 2024).
*** D. Shepardson and V. Insinna, “FAA gives Boeing 90 days to develop plan to address quality issues,” Reuters (Feb. 28, 2024).
Friday, October 6, 2023
A Straightforward Recipe for Changing Culture
[Image source: COS website]
We recently came across a clear, easily communicated road map for implementing cultural change.* We’ll provide some background information on the author’s motivation for developing the road map, a summary of it, and our perspective on it.
The author, Brian Nosek, is executive director of the Center for Open Science (COS). The mission of COS is to increase the openness, integrity, and reproducibility of scientific research. Specifically, they propose that researchers publish the initial description of their studies so that original plans can be compared with actual results. In addition, researchers should “share the materials, protocols, and data that they produced in the research so that others could confirm, challenge, extend, or reuse the work.” Overall, the COS proposes a major change from how much research is presently conducted.
Currently, a lot of research is done in private, i.e., more or less in secret, usually with the objective of getting results published, preferably in a prestigious journal. Frequent publishing is fundamental to getting and keeping a job, being promoted, and obtaining future funding for more research, in other words, having a successful career. Researchers know that publishers generally prefer findings that are novel, positive (e.g., a treatment is effective), and tidy (the evidence fits together).
Getting from the present to the future requires a significant change in the culture of scientific research. Nosek describes the steps to implement such change using a pyramid, shown below, as his visual model. Similar to Abraham Maslow’s Hierarchy of Needs, a higher level of the pyramid can only be achieved if the lower levels are adequately satisfied (see the sketch after the list below).
[Figure: Nosek’s culture-change pyramid. Source: “Strategy for Culture Change”]
Each level represents a different step for changing a culture:
• Infrastructure refers to an open source database where researchers can register their projects, share their data, and show their work.
• The User Interface of the infrastructure must be easy to use and compatible with researchers' existing workflows.
• New research Communities will be built around new norms (e.g., openness and sharing) and behavior, supported and publicized by the infrastructure.
• Incentives refer to redesigned reward and recognition systems (e.g., research funding and prizes, and institutional hiring and promotion schemes) that motivate desired behaviors.
• Public and private Policy changes codify and normalize the new system, i.e., specify the new requirements for conducting research.
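The dependency Nosek describes, where a higher level only counts if the levels beneath it are adequately satisfied, can be expressed as a simple ordered check. The Python sketch below is our own illustration; the level names come from the pyramid, but the scores and the 0.7 threshold are hypothetical placeholders.

```python
# Toy sketch of Nosek's change pyramid: a level only "counts" if every level
# beneath it is adequately satisfied. The scores and threshold are
# hypothetical placeholders, not part of Nosek's model.

PYRAMID = ["infrastructure", "user_interface", "communities", "incentives", "policy"]

def highest_achievable_level(scores, threshold=0.7):
    """Return the highest pyramid level reachable given the lower-level scores."""
    achieved = None
    for level in PYRAMID:                  # walk from bottom to top
        if scores.get(level, 0.0) < threshold:
            break                          # a weak lower level blocks everything above it
        achieved = level
    return achieved

# Example: strong infrastructure and user interface, weak community adoption.
print(highest_achievable_level({
    "infrastructure": 0.9, "user_interface": 0.8, "communities": 0.4,
    "incentives": 0.9, "policy": 0.9,
}))  # -> user_interface
```

In this toy example, weak community adoption caps progress at the user-interface level no matter how strong the incentives and policies above it are, which is the point of the pyramid.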
Our Perspective
As long-time consultants to senior managers, we applaud Nosek’s change model. It is straightforward and adequately complete, and can be easily visualized. We used to spend a lot of time distilling complicated situations into simple graphics that communicated strategically important points.
We also totally support his call to change the reward system to motivate the new, desirable behaviors. We have been promoting this viewpoint for years with respect to safety culture: If an organization or other entity values safety and wants safe activities and outcomes, then they should compensate the senior leadership accordingly, i.e., pay for safety performance, and stop promoting the nonsense that safety is intrinsic to the entity’s functioning and leaders should provide it basically for free.
All that said, implementing major cultural change is not as simple as Nosek makes it sound.
First off, the status quo can have enormous sticking power. Nosek acknowledges it is defined by strong norms, incentives, and policies. Participants know the rules and how the system works; in particular, they know what they must do to obtain rewards and recognition. Open research is anathema to many researchers and their sponsors, especially when a project is aimed at creating some kind of competitive advantage for the researcher or the institution. Secrecy is also valued when researchers may (or do) come up with the “wrong answer”: findings that show a product is not effective or has dangerous side effects, or that an entire industry’s functioning is hazardous for society.
Second, the research industry exists in a larger environment of social, political and legal factors. Many elected officials, corporate and non-profit bosses, and other thought leaders may say they want and value a world of open research but in private, and in their actions, believe they are better served (and supported) by the existing regime. The legal system in particular is set up to reinforce the current way of doing business, e.g., through patents.
Finally, systemic change means fiddling with the system dynamics, the physical and information flows, inter-component interfaces, and feedback loops that create system outcomes. To the extent such outcomes are emergent properties, they are created by the functioning of the system itself and cannot be predicted by examining or adjusting separate system components. Large-scale system change can be a minefield of unexpected or unintended consequences.
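To make the feedback-loop point concrete, here is a deliberately tiny and entirely made-up stock-and-flow sketch in Python, loosely themed on the research example: pushing harder on publication output feeds a balancing loop in which errors erode trust, so the larger local “improvement” produces a smaller system-level result. Every parameter is invented; the only claim is the qualitative one.

```python
# Minimal, invented stock-and-flow sketch: a push for more output feeds a
# balancing loop in which errors erode trust. All parameters are made up to
# illustrate how a local "improvement" can degrade the system-level outcome.

def simulate(publish_pressure, steps=20):
    trust = 1.0    # community trust in published results (a stock)
    value = 0.0    # cumulative useful, trusted findings
    for _ in range(steps):
        papers = 10 * publish_pressure                 # more pressure, more papers...
        error_rate = 0.05 + 0.40 * publish_pressure    # ...and more errors
        trust = max(0.0, trust - 0.3 * error_rate)     # errors erode trust (feedback)
        value += papers * (1 - error_rate) * trust     # only trusted, correct work counts
    return value

print(f"moderate pressure: {simulate(0.5):.1f}")   # ~23.2
print(f"maximum pressure:  {simulate(1.0):.1f}")   # ~17.7, a worse overall outcome
```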
Bottom line: A clear model for change is essential but system redesigners need to tread carefully.
* B. Nosek, “Strategy for Culture Change,” blog post (June 11th, 2019).
Friday, August 4, 2023
Real Systems Pursue Goals
[Image: System Model Control Panel]
The article prompted a letter to the editor in which the author said the approach described in the original editorial wasn’t a true systems approach because it wasn’t specifically goal-oriented. We agree with that author’s viewpoint. We often argue for more systems thinking and describe mental models of systems with components, dynamic relationships among the components, feedback loops, control functions such as rules and culture, and decision maker inputs. What we haven’t emphasized as much, probably because we tend to take it for granted, is that a bona fide system is teleological, i.e., designed to achieve a goal.
It’s important to understand what a system’s goal is. This may be challenging because the system’s goal may contain multiple sub-goals. For example, a medical clinician may order a certain test. The lab has a goal: to produce accurate, timely, and reliable results for tests that have been ordered. But the clinician’s goal is different: to develop a correct diagnosis of a patient’s condition. The goal of the hospital of which the clinician and lab are components may be something else: to produce generally acceptable patient outcomes, at reasonable cost, without incurring undue legal problems or regulatory oversight. System components (the clinician and the lab) may have goals which are hopefully supportive of, or at least consistent with, overall system goals.
The top-level system, e.g., a healthcare provider, may not have a single goal; it may have multiple, independent goals that can conflict with one another. Achieving the best quality may conflict with keeping costs within budgets. Achieving perfect safety may conflict with the need to make operational decisions under time pressure and with imperfect or incomplete information. One of the most important responsibilities of top management is defining how the system recognizes and deals with goal conflict.
In addition to goals, we need to discuss two other characteristics of full-fledged systems: a measure of performance and a defined client.*
The measure of performance shows the system designers, users, managers, and overseers how well the system’s goal(s) are being achieved through the functioning of system components as affected by the system’s decision makers. Like goals, the measure of performance may have multiple dimensions or sub-measures. In a well-designed system, the summation of the set of sub-measures should be sufficient to describe overall system performance.
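These ideas can be pulled together in a small sketch. The Python below is our own illustration, loosely based on the healthcare example above: the system goal is decomposed into weighted sub-measures whose weighted sum serves as the overall measure of performance. The sub-measures, weights, and scores are hypothetical.

```python
# Toy sketch: a system goal decomposed into weighted sub-measures whose
# weighted sum serves as the overall measure of performance. The sub-measures,
# weights, and scores are hypothetical illustrations.

SUB_MEASURES = {
    # name: (weight, current score in [0, 1])
    "patient_outcomes":    (0.5, 0.82),
    "cost_control":        (0.3, 0.70),
    "regulatory_findings": (0.2, 0.95),
}

def overall_performance(sub_measures):
    """Weighted sum of sub-measure scores; the weights should sum to 1."""
    assert abs(sum(w for w, _ in sub_measures.values()) - 1.0) < 1e-9
    return sum(weight * score for weight, score in sub_measures.values())

print(f"Overall performance: {overall_performance(SUB_MEASURES):.2f}")  # -> 0.81
```

Choosing the weights is exactly where the goal conflicts described above get resolved, whether explicitly by top management or implicitly by whoever builds the scorecard.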
The client is the entity whose interests are served by the system. Identifying the client can be tricky. Consider a city’s system for serving its unhoused population. The basic system consists of a public agency to oversee the services, entities (often nongovernmental organizations, or NGOs) that provide the services, suppliers (e.g., landlords who offer buildings for use as housing), and the unhoused population. Who is the client of this system, i.e., who benefits from its functioning? The politicians, running for re-election, who authorize and sustain the public agency? The public agency bureaucrats angling for bigger budgets and more staff? The NGOs who are looking for increased funding? Landlords who want rent increases? Or the unhoused who may be looking for a private room with a lockable door, or may be resistant to accepting any services because of their mental, behavioral, or social problems? It’s easy to see that many system participants do better, i.e., get more pie, if the “homeless problem” is never fully resolved.
For another example, look at the average public school district in the U.S. At first blush, the students are the client. But what about the elected state commissioner of education and the associated bureaucracy that establish standards and curricula for the districts? And the elected district directors and district bureaucracy? And the parents’ rights organizations? And the teachers’ unions? All of them claim to be working to further the students’ interests but what do they really care about? How about political or organizational power, job security, and money? The students could be more of a secondary consideration.
We could go on. The point is that we are surrounded by many social-legal-political-technical systems, and whom and what they actually serve may not be those they purport to serve.
* These system characteristics are taken from the work of a systems pioneer, Prof. C. West Churchman of UC Berkeley. For more information, see his The Design of Inquiring Systems (New York: Basic Books) 1971.