Monday, March 11, 2019

Safety Experts Weigh in on the Boeing 737 MAX

Max Prosperi (via Yan Zhu):

The preliminary investigation following Lion Air Flight 610 revealed that prior to the crash, a system called Maneuvering Characteristics Augmentation System or MCAS had engaged, without the pilots’ knowledge. The MCAS lowers the nose automatically to prevent a stall, or the loss of lift, if it detects that the angle of the plane’s nose is too high relative to the ground. A malfunctioning sensor may have led the MCAS to engage repeatedly, countering the pilots’ maneuvers.

[…]

Diehl recalled that leading up to the implementation of the MCAS, an FAA official came to him and asked whether or not he thought the automation of aircraft was safe. Diehl’s advice: “Automation, if done right, is great, but it can also bite you.”

After the Lion Air crash, Boeing denied that it had failed to properly communicate to pilots the addition of the MCAS to the MAX-series 737s, a major difference from previous models of the airplane. (That position contradicts what some airlines have said.)

David Fickling:

A software update intended to fix the problem identified in the Lion Air crash still hasn’t been rolled out. The fact that the crew on Flight 610 are likely to have been aware of the known issues with the aircraft, too, raises the more worrying possibility that there’s an unknown complication.

Andrew:

Two hull loss incidents in under a year on a brand-new aircraft type with only 350 in service.

Compare that to the 787 which, despite serious development problems, has about 800 aircraft in service for many more years, and still zero hull loss incidents.

Or compare the MAX 8 with its predecessor, the 737-800, which has had 16 hull loss incidents in a fleet of 5,000 aircraft across more than two decades of service.

The 737-MAX seems less safe to operate, whether the reason is an aircraft defect, difficulty of operation, or lack of adequate training.

vbscript2:

Statistics do funny things with such low sample sizes.

Take the 777, for example. It went nearly 20 years, much of that time as the most popular widebody flying, before its first accident resulting in a passenger death. Then it had 3 in a year. It hasn’t had another in the 5 years since that time. Was the 777 any less safe in 2013-2014 than in its other 25 years of service history? Obviously not. Similarly, the A320 family had a streak of fatal crashes in the last few years, yet there’s no reason to believe the A320 isn’t safe, let alone that it’s any less safe than it has been for the rest of its service history.
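The competing arguments in this thread are easy to make concrete. Below is a rough back-of-the-envelope sketch using the fleet figures quoted above; the aircraft-year exposure estimates are my own assumptions, and the calculation ignores differences in accident causes, so it is only order-of-magnitude:

```python
from math import exp

# Fleet figures from the thread above; the aircraft-year exposure
# estimates are my own rough assumptions.
max8_losses, max8_aircraft_years = 2, 350 * 1.0    # ~350 jets, ~1 yr avg service
ng_losses, ng_aircraft_years = 16, 5000 * 10.0     # 737-800, 20+ yrs of service

max8_rate = max8_losses / max8_aircraft_years
ng_rate = ng_losses / ng_aircraft_years
print(f"MAX 8 hull-loss rate:   {max8_rate:.5f} per aircraft-year")
print(f"737-800 hull-loss rate: {ng_rate:.5f} per aircraft-year")

# vbscript2's small-sample caution, quantified: if the MAX were exactly
# as safe as the 737-800, how likely is seeing 2+ losses in this much
# exposure purely by chance? (Poisson tail, P(N >= 2).)
lam = ng_rate * max8_aircraft_years
p_two_plus = 1 - exp(-lam) * (1 + lam)
print(f"P(2+ losses by chance) = {p_two_plus:.4f}")
```

The point of the sketch is not a verdict but that both intuitions are checkable: the raw rate gap is large, while the chance-cluster probability depends heavily on the exposure assumptions.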

Update (2019-03-12): See also: Jon Ostrower (via Hacker News) and New York Times (via Hacker News).

McCloud:

I think required reading should be Normal Accidents: Living with High-Risk Technologies by Charles Perrow.

Mac McClellan (via Martin Steiger):

Though the pitch system in the MAX is somewhat new, the pilot actions after a failure are exactly the same as they would be for a runaway trim in any 737 built since the 1960s. As pilots we really don’t need to know why the trim is running away, but we must know, and practice, how to disable it.

The problem for Boeing, and maybe eventually all airplane designers, is that FBW avoids these issues. FBW removes the pilot as a critical part of the system and relies on multiple computers to handle failures.

Boeing is now faced with the difficult task of explaining to the media why pilots must know how to intervene after a system failure. And also to explain that airplanes have been built and certified this way for many decades. Pilots have been the last line of defense when things go wrong.

What makes that such a tall order is that FBW airplanes – which include all the recent Airbus fleet, and the 777 and 787 from Boeing – don’t rely on the pilots to handle flight control system failures. FBW uses at least a triple redundant computer control system to interpret the inputs of the cockpit controls by pilots into movement of the airplane flight controls, including the trim. If part of the FBW system fails, the computer identifies the faulty elements and flies on without the human pilots needing to know how to disable the failed system.
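The fault-masking idea McClellan describes is often implemented as mid-value selection: with three independent channels, the median reading automatically outvotes one faulty channel. A toy sketch of the concept, not any real FBW implementation:

```python
# Toy illustration of triple-channel fault masking via mid-value
# selection. Values and tolerances are invented for the example.

def mid_value_select(a: float, b: float, c: float) -> float:
    """Return the median of three channel readings, masking one outlier."""
    return sorted((a, b, c))[1]

def disagree(a: float, b: float, c: float, tolerance: float = 5.0) -> bool:
    """Flag when any channel strays too far from the selected value."""
    m = mid_value_select(a, b, c)
    return any(abs(x - m) > tolerance for x in (a, b, c))

# Two healthy angle-of-attack channels near 5 degrees; one failed high.
healthy = mid_value_select(5.1, 4.9, 74.5)
print(healthy)                   # 5.1 -- the runaway channel is outvoted
print(disagree(5.1, 4.9, 74.5))  # True -- raise a maintenance/crew alert
```

With this structure the system keeps flying on the two agreeing channels, which is why, as McClellan notes, the pilots need not know how to disable the failed one.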

Update (2019-03-13): Dallas Morning News (Hacker News):

Pilots repeatedly voiced safety concerns about the Boeing 737 Max 8 to federal authorities, with one captain calling the flight manual “inadequate and almost criminally insufficient” several months before Sunday’s Ethiopian Air crash that killed 157 people, an investigation by The Dallas Morning News found.

Update (2019-03-15): Jon Ostrower (via John Gruber):

Every airplane development is a series of compromises, but to deliver the 737 Max with its promised fuel efficiency, Boeing had to fit 12 gallons into a 10 gallon jug. Its bigger engines made for creative solutions as it found a way to mount the larger CFM International turbines under the notoriously low-slung jetliner.

See also: Hacker News (3)

tuna-piano:

Assuming the author is correct, and the reaction to the MCAS issues is a simple reaction that every pilot should know by memory: Is it really acceptable that once every 3 months a 737-Max will attempt a nose dive and require a vigilant pilot who can identify and correct the issue before the plane crashes into the ground?

And this likely happened at least twice, while there were 300 MAXs in service. If there were 3,000 MAXs in service, MCAS misfires would presumably be happening 3x a month worldwide - each misfire requiring a proper pilot reaction. How can you defend Boeing in that case?
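tuna-piano's scaling arithmetic can be spelled out directly. The roughly six-month exposure window below is my own assumption based on the dates in this post:

```python
# Making the scaling argument above explicit. The ~6-month window
# between the known misfire events is an assumption, not a sourced figure.
events = 2
fleet = 300
months = 6.0

rate_per_aircraft_month = events / (fleet * months)
fleetwide_interval_months = 1 / (rate_per_aircraft_month * fleet)
print(f"One misfire somewhere in the fleet every "
      f"{fleetwide_interval_months:.1f} months")

bigger_fleet = 3000
misfires_per_month = rate_per_aircraft_month * bigger_fleet
print(f"At {bigger_fleet} aircraft: {misfires_per_month:.1f} misfires/month")
```

Under these assumptions the numbers reproduce the comment's figures: one misfire every three months at 300 aircraft, and roughly three per month at 3,000.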

amluto:

Here’s what I don’t get about this whole situation:

AIUI 737 MAX has an instability such that, in near stall conditions, some attempts to recover can make the stall worse. To mitigate this, Boeing added MCAS, and MCAS can malfunction with a single sensor failure. Imagine that this failure occurs and the pilot successfully turns off MCAS but ends up in a dive, too close to the ground, or otherwise in a bad situation. Now the pilot has to recover, but they are facing a faulty AoA indicator (if they have one at all) as well as a plane that, because MCAS is off, is unstable in near-stall conditions. And the pilot has never been trained in the handling of type 737 MAX under these conditions.

Am I wrong for some reason, or is this a potentially rather dangerous situation that could be caused by a single instrument failure?

Update (2019-03-21): Dominic Gates (via Nick Visser):

Current and former engineers directly involved with the evaluations or familiar with the document shared details of Boeing’s “System Safety Analysis” of MCAS, which The Seattle Times confirmed.

The safety analysis:

Understated the power of the new flight control system, which was designed to swivel the horizontal tail to push the nose of the plane down to avert a stall. When the planes later entered service, MCAS was capable of moving the tail more than four times farther than was stated in the initial safety analysis document.

Failed to account for how the system could reset itself each time a pilot responded, thereby missing the potential impact of the system repeatedly pushing the airplane’s nose downward.

Assessed a failure of the system as one level below “catastrophic.” But even that “hazardous” danger level should have precluded activation of the system based on input from a single sensor — and yet that’s how it was designed.
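The list's third point, activation of the system from a single sensor, can be contrasted with a simple two-channel cross-check. The threshold and tolerance values below are invented for illustration; this is a sketch of the certification argument, not Boeing's actual logic:

```python
# Contrast: single-sensor trigger vs. a two-channel cross-check.
# STALL_AOA_DEG and DISAGREE_TOLERANCE are invented example values.

STALL_AOA_DEG = 15.0        # typical critical angle of attack
DISAGREE_TOLERANCE = 5.5    # invented tolerance between the two vanes

def activate_single(aoa_left: float) -> bool:
    """Single-sensor logic: one bad vane can command nose-down trim."""
    return aoa_left > STALL_AOA_DEG

def activate_cross_checked(aoa_left: float, aoa_right: float) -> bool:
    """Cross-checked logic: both vanes must agree before acting."""
    if abs(aoa_left - aoa_right) > DISAGREE_TOLERANCE:
        return False  # sensors disagree: inhibit activation and alert
    return min(aoa_left, aoa_right) > STALL_AOA_DEG

# A stuck-high left vane alongside a healthy right vane:
print(activate_single(74.5))              # True  -- spurious activation
print(activate_cross_checked(74.5, 4.0))  # False -- activation inhibited
```

The cross-checked version trades a spurious nose-down command for an inhibited system plus an alert, which is the behavior the "hazardous" classification would seem to demand.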

Trevor Sumner (Hacker News):

BEST analysis of what really is happening on the #Boeing737Max issue from my brother in law @davekammeyer, who’s a pilot, software engineer & deep thinker. Bottom line don’t blame software that’s the band aid for many other engineering and economic forces in effect.👇🎖🤔

See also: The Talk Show.

John Cassidy:

Early on, employees of the F.A.A. and Boeing decided how to divide up the certification work. But, partway through the process, a former F.A.A. safety engineer told the Seattle Times, “we were asked by management to re-evaluate what would be delegated. Management thought we had retained too much at the FAA.” The engineer said that “there was constant pressure to re-evaluate our initial decisions,” and “even after we had reassessed it … there was continued discussion by management about delegating even more items down to the Boeing Company.”

Even the work that was retained, such as reviewing technical documents provided by Boeing, was sometimes curtailed. “There wasn’t a complete and proper review of the documents,” the former engineer added. “Review was rushed to reach certain certification dates.”

Alan Levin and Harry Suhartono:

That extra pilot, who was seated in the cockpit jumpseat, correctly diagnosed the problem and told the crew how to disable a malfunctioning flight-control system and save the plane, according to two people familiar with Indonesia’s investigation.

The next day, under command of a different crew facing what investigators said was an identical malfunction, the jetliner crashed into the Java Sea killing all 189 aboard.

[…]

Airline mechanics tried four times to fix related issues on the plane starting Oct. 26, according to the Indonesia preliminary report. After pilots reported issues with incorrect display of speeds and altitude in the two prior flights, workers in Denpasar, Bali, replaced a key sensor that is used by the Boeing plane to drive down its nose if it senses an emergency.

Flight data shows the sensor, called the “angle of attack” vane, which measures whether air is flowing parallel to the length of the fuselage or at an angle, was providing inaccurate readings after that.

Steven Ashley:

At Boeing, safety really is an option: Optional cockpit ‘disagree lights’ would have alerted pilots that the anti-stall, angle-of-attack (AOA) sensors were not in agreement. But after 2 crashes, they’re suddenly standard equipment...

Update (2019-04-05): Boeing CEO Dennis Muilenburg (via Hacker News):

The full details of what happened in the two accidents will be issued by the government authorities in the final reports, but, with the release of the preliminary report of the Ethiopian Airlines Flight 302 accident investigation, it's apparent that in both flights the Maneuvering Characteristics Augmentation System, known as MCAS, activated in response to erroneous angle of attack information.

Update (2019-04-09): Chris Woodyard:

Boeing “violated a basic principle of aircraft design by allowing a single point failure to trigger a sequence of events that could result in a loss of control,” said Brian Alexander, an attorney for a law firm specializing in aviation accidents, Kreindler & Kreindler in New York, that is contemplating lawsuits on behalf of victims’ families in the Ethiopian Airlines crash.

Update (2019-04-18): Philip Greenspun (via Hacker News):

Had the systems engineers and programmers checked Wikipedia, for example, (or maybe even their own web site) they would have learned that “The critical or stalling angle of attack is typically around 15° – 20° for many airfoils.” Beyond 25 degrees, therefore, it is either sensor error or the plane is stalling/spinning and something more than a slow trim is going to be required.

[…]

We fret about average humans being replaced by robots, but consider the Phoenix resident who sees that the outdoor thermometer is reading 452 degrees F on a June afternoon. Will the human say “Arizona does get hot in the summer so I’m not going to take my book outside for fear that it will burst into flames”? Or “I think I need to buy a new outdoor thermometer”?
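Greenspun's argument amounts to a plausibility check on the sensor reading before acting on it. A minimal sketch, with the 25-degree cutoff taken from his quote and the response labels invented for illustration:

```python
# Sketch of Greenspun's plausibility check. The cutoff values follow his
# argument (critical AoA typically 15-20 degrees); the returned labels
# are invented for this example.

CRITICAL_AOA_DEG = 20.0
PLAUSIBLE_MAX_AOA_DEG = 25.0

def classify_aoa(aoa_deg: float) -> str:
    if aoa_deg > PLAUSIBLE_MAX_AOA_DEG:
        # The 452-degree-thermometer case: distrust the instrument.
        return "implausible: suspect sensor failure, inhibit slow trim"
    if aoa_deg > CRITICAL_AOA_DEG:
        return "stall regime: slow trim alone is not an adequate response"
    return "normal envelope"

print(classify_aoa(74.5))
print(classify_aoa(22.0))
print(classify_aoa(5.0))
```

The point mirrors the thermometer analogy: a reading far outside the physically meaningful range should change the system's interpretation of the sensor, not just its trim commands.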

Update (2019-05-01): Gregory Travis (via Hacker News):

I have been a pilot for 30 years, a software developer for more than 40. I have written extensively about both aviation and software engineering. Now it’s time for me to write about both together.
