What Healthcare Can Learn from Boeing 737 MAX Crashes
The airline industry has historically been a paragon of safety engineering from which medicine has learned. In the decade before the crashes, a shift in Boeing’s corporate culture put profits before safety. Risks were under-reported, which led to devastating outcomes.
Pilots were not integrated into the design and deployment of the AI system from the start, creating conditions ripe for systemic failure. When new AI systems are integrated into workflows, every worker those workflows affect must be notified and trained to identify and respond to potential failures in the systems.
With the FDA’s capacity weakened, risk mitigation is shifting to industry. There is now a need to build new intra-industry collaborations that can develop robust guidelines for risk mitigation and other patient safety measures to fill the gaps left by the FDA.
The airline industry depends heavily on safety and on developing robust risk mitigation measures to prevent fatalities. For this reason, medicine often looks to the airline industry for insights on safety innovation. In recent years, however, Boeing’s reputation has been hit hard by two major crashes of its 737 MAX aircraft, which resulted in the fleet being grounded for a substantial amount of time.
When it comes to software adoption in mission-critical use cases, one of the classic lessons from the airline industry is that pilots can rely too heavily on autopilot against their own judgement; this has been at the root of several crashes throughout aviation history. With the development of AI in these systems, there are even more lessons to be learned about how humans and machines interact in the context of AI/ML.
Are there lessons from Boeing’s experience with AI/ML in the 737 MAX design that are relevant to healthcare and to how we mitigate risk? What were the causes of the crashes and the design flaws in Boeing’s systems, and are they relevant to AI/ML in healthcare? These are the questions this blog post will attempt to address.
Some Lessons for Radiology
In 2020, two radiologists from UCSF published their interpretation of the lessons of the Boeing 737 MAX. As a refresher, the Maneuvering Characteristics Augmentation System (MCAS) was created with the goal of improving aircraft safety and reducing the risk of pilot error; in other words, an automated system for improving safety. This very system failed and led to 737 MAX crashes in Indonesia and Ethiopia.
The first lesson the authors distilled is that we need to assess AI systems independently of their intended purpose and proper function. We cannot assume that the worst-case failure of a system with AI is equivalent to the system functioning without AI; in other words, AI can introduce new types of failures independent of the original function.
Second, the output of an AI system is only as good as its inputs. In the case of Boeing, the planes that crashed did not have additional sensors, considered optional, that could in reality have helped pilots identify the malfunction sooner. Having more inputs to the system, as well as visual displays to cross-check the validity of those inputs, can help users identify a malfunctioning system earlier.
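The cross-checking idea can be sketched in code. The following is a minimal, hypothetical example, not a description of any real avionics or clinical system (the function names and the tolerance threshold are assumptions): an automated system consumes a reading only when two redundant sensors agree, and otherwise surfaces a disagreement alert to the human operator.

```python
def cross_check(sensor_a: float, sensor_b: float, max_disagreement: float) -> bool:
    """Return True if two redundant sensor readings agree within tolerance."""
    return abs(sensor_a - sensor_b) <= max_disagreement

def validated_input(sensor_a: float, sensor_b: float, max_disagreement: float = 5.0):
    """Feed the automated system only when redundant inputs agree.

    Returns (value, alert): a usable averaged reading when the sensors
    agree, or no value plus an alert for the human operator when they
    disagree (the condition that, unsurfaced, doomed MCAS).
    """
    if cross_check(sensor_a, sensor_b, max_disagreement):
        return (sensor_a + sensor_b) / 2, None
    return None, "SENSOR DISAGREE: manual review required"

value, alert = validated_input(12.1, 12.4)    # readings agree: averaged value, no alert
value, alert = validated_input(12.1, 74.5)    # large disagreement: no value, alert raised
```

The design point is that disagreement is surfaced to the user rather than silently resolved, which is precisely the visibility the crashed planes lacked.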
Third, when AI is added to a clinical workflow, everyone in that workflow needs to be aware of the AI in the system and receive training that can help them identify when it is malfunctioning. The pilots of the planes that crashed were unaware of MCAS and had not received adequate training. Boeing executives at one point claimed that they did not see how providing pilots with documentation on the system would have helped; this is a major failure of judgement.
Fourth, MCAS was a closed system with no human in the loop. In medicine, this could be very problematic. At a minimum, a closed-loop system needs to alert users when it is activated and provide opt-out mechanisms for users who feel they cannot trust the system or whose medical judgement it contradicts.
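One way to read this requirement in code is a decision-support wrapper that always announces when automation is active and lets the clinician override it. This is a minimal sketch under assumed names (`Recommendation`, `decide`, and the example actions are illustrative, not drawn from any real clinical system):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Recommendation:
    action: str          # what the automated system suggests
    rationale: str       # why, so the user can judge the suggestion
    system_active: bool  # explicit signal that automation is engaged

def decide(rec: Recommendation, clinician_override: Optional[str] = None) -> str:
    """Apply an automated recommendation only with a human in the loop.

    The user is always alerted when the system is active, and a
    clinician's override (the opt-out) takes precedence over it.
    """
    if rec.system_active:
        print(f"ALERT: automated recommendation active: {rec.action} ({rec.rationale})")
    if clinician_override is not None:
        return clinician_override  # human judgement wins
    return rec.action

rec = Recommendation("reduce dose", "predicted toxicity risk", system_active=True)
decide(rec)                                      # alerts, then returns "reduce dose"
decide(rec, clinician_override="maintain dose")  # opt-out: returns "maintain dose"
```

The two properties the paragraph above demands are both visible in the control flow: activation is never silent, and the opt-out path short-circuits the automation.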
Fifth, the Federal Aviation Administration (FAA) had delegated 90% of regulatory compliance for new aircraft designs to the manufacturer. The FDA’s Pre-Cert program, the authors argue, is very similar to the FAA’s approach. A review of Boeing’s risk assessments for MCAS showed that Boeing systematically under-reported risks. The drive to market and profits overruled the need for risk mitigation and safety, leading to tragic outcomes.
Building a Risk Mitigation Culture
The authors of the above article point to a number of Boeing’s errors that demonstrate the breakdown of an institutional culture that had once put safety first. The airline industry has an extremely good safety record compared to virtually any other industry. In the decade preceding the crashes, Boeing had relocated its headquarters and shifted the locus of power away from engineers toward business leadership that privileged short-term earnings over the tradition of building the best, safest technology. This broader shift in corporate culture should not be overlooked.
User input from the outset is vital to AI in healthcare, which is why human-centered design for AI is growing in importance. In the case of Boeing, pilot unions expressed their disagreement: the system was installed without notifying pilots or providing the requisite training. This was cutting corners, to put it mildly.
The regulatory oversight issue is a major failing as well. Unfortunately, Congress has cut the FDA’s budget at a time when a technology that is scalable, error-prone, and deployed in the context of medical decision-making is becoming more commonplace. If the FDA can no longer play a leading role, industry itself must step in. This requires all stakeholders: frontline providers, payers, vendors, and researchers. Part of building a culture of safety with AI is forming intra-industry consortia that can provide an extra layer of validation beyond what the FDA has the capacity to certify.
Finally, algorithms increasingly determine what we see and, conversely, what we do not see. Without thorough systems for control and validation, we set ourselves up for new types of failures and accidents that can be easily amplified as software systems scale. Rushing products to market without thorough safety checks and guidelines is unlikely to end well, as Boeing has shown.