Annals of quantitative overconfidence, Boeing edition

By Felix Salmon
March 8, 2013

On January 7, the auxiliary power unit (APU) of a Boeing 787 caught fire at Logan airport. The APU is a lithium-ion battery, roughly 1-foot cube, and the consequences of a fire can easily be catastrophic. There was no one on the plane at the time, which is lucky, because the fire was extremely difficult to extinguish, with firefighters encountering “no visibility” thanks to thick smoke. What’s more, the “quick-disconnect knob” was melted. In flight, these batteries control critical flight systems: they cannot fail.

And yet, twice in 58,000 hours of usage, the lithium batteries on the new 787 contrived to catch fire; this is obviously not something the FAA — or even Boeing, for that matter — can risk happening again. There’s really only one thing to be done: all lithium batteries on the 787 must be swapped out for nickel-cadmium or lead-acid batteries, which have the great advantage that they don’t catch fire.

The bigger story here, however, is about engineers’ hubris and regulatory capture. As the interim report from the National Transportation Safety Board says, the FAA was well aware, when Boeing said it wanted to use lithium batteries, that such batteries are inherently dangerous and have a tendency to catch fire whenever they are used elsewhere.

But Boeing persisted, and came up with some hilariously overprecise probability estimates. The batteries would only emit gas or smoke once every 10 million hours, the company calculated, and would only catch fire once every billion hours. The reasoning is bonkers: Boeing’s analysis “determined that overcharging was the only known failure mode that could result in cell venting with fire”. They then contrived to conclude that if they put in overcharge protections, the risk of overcharging would be brought down to one in a billion, and that therefore the risk of a fire would also be brought down to one in a billion.

As Steve LeVine notes, Nassim Taleb would take one look at that reasoning and simply laugh. For one thing, how on earth is it possible to determine that the risk of an overcharge is less than or equal to one in a billion? Probabilities that small simply can’t be measured. And more importantly, how did Boeing determine that the probability of a fire absent an overcharge was zero? There’s good evidence that neither of the battery fires were caused by an overcharge — but Boeing seems to have decided that fires caused for any non-overcharge reason were, literally, impossible. Once again, it’s incredibly hard to conceive of any coherent line of reasoning which could come to that conclusion.
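
To put rough numbers on that, here is a minimal Python sketch. It is not Boeing's or the FAA's method. It assumes fires arrive as a Poisson process with a constant hourly rate, takes the one-in-a-billion-hours claim and the 58,000 hours of usage quoted above as inputs, and uses hypothetical function names. It asks two things: how surprising two fires would be if the claimed rate were right, and how many fire-free hours it would take to support such a claim at 95% confidence.

```python
import math

CLAIMED_FIRE_RATE = 1e-9   # fires per hour, the estimate quoted above
FLEET_HOURS = 58_000       # hours of 787 battery usage quoted above
OBSERVED_FIRES = 2

def poisson_tail(k_min, mean, terms=50):
    """P(N >= k_min) for N ~ Poisson(mean).

    Sums the tail term by term so a tiny probability is not lost to
    rounding; 50 terms is ample only when the mean is small, as here.
    """
    return sum(
        math.exp(-mean) * mean**k / math.factorial(k)
        for k in range(k_min, k_min + terms)
    )

def failure_free_hours_needed(rate, confidence=0.95):
    """Hours with zero failures needed before `rate` is a credible upper bound.

    With zero events in T hours, the upper confidence bound on a Poisson
    rate is -ln(1 - confidence) / T (the "rule of three" at 95%).
    """
    return -math.log(1.0 - confidence) / rate

expected_fires = CLAIMED_FIRE_RATE * FLEET_HOURS
p_two_or_more = poisson_tail(OBSERVED_FIRES, expected_fires)
hours_needed = failure_free_hours_needed(CLAIMED_FIRE_RATE)

print(f"Expected fires in {FLEET_HOURS:,} hours at the claimed rate: {expected_fires:.1e}")
print(f"Chance of {OBSERVED_FIRES} or more fires if the claim held: {p_two_or_more:.1e}")
print(f"Fire-free hours needed to back the claim at 95%: {hours_needed:.1e}")
```

Under those assumptions, the chance of seeing two fires is itself on the order of one in a billion, and backing the claim empirically would take roughly three billion fire-free hours, far more than any test program could log.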

But somehow the FAA accepted Boeing’s analysis at face value, and allowed Boeing to install lithium batteries on its planes, just as long as certain safeguards were put in place.

This is the same kind of literal quantitative thinking which helped cause the financial crisis. Put engineers in charge of something, and they’ll measure what they can measure, they won’t measure what they can’t measure, and they’ll protect against only the things they managed to foresee. And as all of us who spend our lives surrounded by electronic devices know, sometimes they fail. In a sense it doesn’t matter what the reason is: failure is just a fact of life, which is a real problem when failure could mean the fiery death of hundreds of people.

Statistically speaking, airplanes are safer today than they’ve ever been. And electronics are a key part of that trend: they might occasionally fail, but they are also increasingly good at preventing human error, or just at doing the things that fallible humans used to do, only much more reliably. That said, as airplane engineers stop being grease monkeys and start being coders, we’re losing a certain amount of holistic and heuristic understanding of how to ensure real-world safety.

If you basically outsource an entire airplane, as Boeing did, you lose your institutional ability to ensure that airplane is safe. And sadly, it seems that Boeing’s failures on that front will automatically cascade down to the FAA. The reports and post-mortems surrounding the lithium batteries’ safety will be very deep. Let’s hope the FAA is just as critical when it comes to its own decision to accept Boeing’s analysis at face value.

23 comments

“There’s really only one thing to be done: all lithium batteries on the 787 must be swapped out for nickel-cadmium or lead-acid batteries, which have the great advantage that they don’t catch fire.”

You know, jet fuel can catch fire under the right conditions. It doesn’t happen often on a plane, except when one bumps into something else unexpectedly. The aircraft designers also keep sparks and flames away from the fuel, but I’m guessing there is a minute chance that under the right conditions, the fuel can catch fire.

People hold phones containing lithium-ion batteries close to their heads and other body parts, and in all kinds of places where you wouldn’t want them to catch fire. There is a statistical risk that it will happen, but designers make choices that minimize those risks. The risks aren’t zero.

When Boeing says there is a 1 in a billion chance that the batteries can catch fire, that is based on their known failure modes. Obviously, they cannot calculate the possibility of a fire from some unknown cause, just like you cannot predict the possibility of your ceiling collapsing on you (but the building designers could probably calculate the chances of the collapse under specific conditions). But that probability is most likely valid, using the same methodology used to predict things like MTBF (mean time between failures), used by engineers for almost every product they design. Every component has known failure modes (like a capacitor in the power supply of a computer will only last so long before it fails and renders your computer worthless), and those failure modes and probabilities can be used to calculate the probability of a specific fault occurring. It’s the same kind of math and science that Ford used to estimate the cost of re-designing the fuel tank on a Pinto so it wouldn’t explode, which they decided not to do because the cost of a safer design would be greater than the expenses they would face when people died when the Pinto was rear-ended and caught fire.

I don’t argue that Boeing’s development and business strategy was flawed, but that wasn’t the cause of this battery problem. And I doubt airplane engineers have been grease monkeys for at least the past 90 years or so, especially those who designed Boeing’s jets.

I also don’t know how you can expect the FAA to be as expert as Boeing is on all facets of the jet’s design; that would imply an awful lot of engineering knowledge, and it’s unlikely they possess that much.

Posted by KenG_CA

Richard Feynman’s appendix to the Rogers Commission’s Challenger explosion report covers this same sort of issue in damning detail, and is excellent reading in general.

http://science.ksc.nasa.gov/shuttle/missions/51-l/docs/rogers-commission/Appendix-F.txt

Posted by gregbrown

All that being said by me, not testing the battery system is inexcusable:

http://gizmodo.com/5989580/boeing-never-fully-tested-the-design-of-the-dreamliner-battery-that-caught-fire

Posted by KenG_CA

Felix,

1) It’s not about ‘engineers’ hubris’. Too many of them (with experience) have been heaved overboard by Boeing ‘management’ in its misguided strategy to become a civil aerospace maquiladora. Of greater concern is their completely clueless response to this near catastrophe, replete with shameless lies. Take this Boeing exec, for example: http://blogs.crikey.com.au/planetalking/2013/03/06/boeings-mr-conner-has-some-figuring-out-to-explain/
2) And, Ken G.: Being expert on all the technologies deployed in civil airliners is PRECISELY the FAA’s job.

Posted by crocodilechuck

chuck, they don’t have the budget for that. Not even close.

But hey, maybe they can outsource it.

Posted by KenG_CA

The batteries are not the APU. The APU is an independent gas turbine that provides electrical power for ground operations and to start the engines. The aft battery is used to start the APU. The forward battery provides electrical power on the ground when the APU isn’t running.

In flight, the batteries power critical systems only in the event that neither engine, or the APU, is capable of generating electrical power – an extremely unlikely event. Batteries can fail in flight with no impact on the airplane so long as the generators are working. What they can’t do is catch on fire, firstly because any inflight fire is dangerous, and secondly because Li-ion battery fires burn very hot and are extremely hard to extinguish.

The problem with Boeing is not, as Felix implies, that engineers are making decisions. It appears that Boeing had an extremely strong engineering culture, extending all the way to the top, prior to its merger with McDonnell Douglas – a merger that’s been described as MD buying Boeing with Boeing’s money. The old Boeing would never have outsourced so much of the airplane as the new Boeing did with the 787.

Posted by bratschewurst

Ken,

They already did.

;)

Posted by crocodilechuck

I think you might be slandering engineers in this post. I don’t know the details of the Boeing thing, but what you write reminds me very much of Edward Tufte’s analysis of how the Challenger disaster happened: engineers being careful, and managers overruling them.

Posted by seanmatthews

“There’s really only one thing to be done: all lithium batteries on the 787 must be swapped out for nickel-cadmium or lead-acid batteries, which have the great advantage that they don’t catch fire.”

If only it were that simple then they’d have solved the problem months ago.

You can’t get NiCd or lead acid batteries of the required power into the space that they’ve designed for the batteries. They’d need to redesign the actual plane to make enough space to get such batteries in there. That’s why they’re all fiddling around with lithium still.

“Edward Tufte’s analysis of how the Challenger ”

??

Feynman and O Rings wasn’t it?

Posted by TimWorstall

I’m not trying to defend Boeing here, but there are so many technical inconsistencies in this article that it is almost as bad as the 787 electrical system itself.

The APU is NOT a lithium-ion battery. The APU is the auxiliary power unit.

The batteries are not operating in flight, unless there is an electrical failure.

There are lithium-ion batteries and there are lithium-ion batteries. Depending on the chemical interface used, they differ in flammability. The one used by Boeing was very aggressive. That doesn’t mean that all lithium-ion batteries are bad.

There is no such thing as a 100% foolproof system. Fuel can explode, composites can and will delaminate, navigation equipment will fail.

All I see here is some free Boeing bashing. Trust me, I’m not a big fan of them, but there are so many technical mistakes in your article that pointing a finger at their engineers is quite hilarious.

Posted by F14TCT

“‘There’s really only one thing to be done: all lithium batteries on the 787 must be swapped out for nickel-cadmium or lead-acid batteries, which have the great advantage that they don’t catch fire.’

If only it were that simple then they’d have solved the problem months ago.

You can’t get NiCd or lead acid batteries of the required power into the space that they’ve designed for the batteries. They’d need to redesign the actual plane to make enough space to get such batteries in there. That’s why they’re all fiddling around with lithium still.”

First of all, all battery types can cause fires. Take your car battery, put a wrench across both terminals, and it will short-circuit with one hell of a flash. Batteries, like fuel tanks, are bombs simply waiting for the proper fuse to come along and detonate them (or, technically, deflagrate them).

The space available is not the issue at this point. Any redesign of the electrical system would require lots of time, with additional time for recertification. And it would be lots heavier, cutting into payload. There were lots of problems when Ni-Cad batteries began to be used in aviation as well; now it’s the safe and proven technology.

Posted by bratschewurst

seanmatthews said

” Edward Tufte’s analysis of how the Challenger disaster happened: engineers being careful, and managers overruling them. ”

Yup. The Li-ion disaster at Boeing is a management disaster, of outsourcing at all costs and shriveling the in-house know-how for the sake of managerial ideology. Nothing to do with engineers.

When engineers are in charge at Boeing, you get the 747 and the B52 (introduced in February 1955 and still in use and irreplaceable 58 years later). When managers run the show, you get nightmares like the “Dreamliner”.

Posted by Frwip

After reading the first few sentences, I can’t take this article seriously. An APU is not a battery…not even close. The APU is a small gas turbine engine that drives an electric generator that feeds the electric bus.
Typical half-a$$ed journalism. 5 minutes of research would have made you look like you actually tried on this article.

Posted by otherelbow

must be nice to be FS, can criticize people, and no one around with any real knowledge to fight back….

If you had bothered to spend about 5 seconds reading up, you would discover that the last time new battery technology was brought on board – when lead acid was replaced by nicad – there were all sorts of problems; that’s what happens with new technology.
Planes used to fall out of the sky regularly; recently, disasters have been few, so we are accustomed to a much lower failure rate.

know what ETOPS means?
the Li batteries are important because they allow the plane to fly a route where it can be up to 5 hours from an airport.
if you bother to look at a globe you see that, esp. in the Pacific, this opens up all sorts of new routes for this plane, which is a big point to the customers: with the ability to fly a route where the plane is 5 hours from any available runway, they can do all sorts of new direct flights…

Posted by ezra567

@KenG_CA: “Obviously, they cannot calculate the possibility of a fire from some unknown cause.” But they did so calculate, and they calculated that possibility of the unknown at zero.

“But that probability is most likely valid” The difference between never/1-in-a-billion and 1-in-29,000-hours is not a matter of confusing statistical probability with observed occurrences; it is the difference between “most likely valid” and “not valid at all”. That is, between right and demonstrably wrong.

@TimWorstall:

Feynman did lead the overall Challenger review (a national treasure, he was). Edward Tufte later expanded on Feynman’s description of the O-ring decisions, and focused on the failures of engineering communications methods that led to the bad decisions to begin with. Tufte showed that the charts created by Morton about the O-rings, and used by NASA engineers to overrule a few Morton engineers’ objections to launch, were terrible and misleading. Literally, a better chart would have probably saved the Challenger.

Posted by SteveHamlin

Steve, it’s not that they calculated the possibility of a fire from some unknown cause to be zero, they were merely stating they didn’t know of any other way the batteries could catch fire:

“determined that overcharging was the only known failure mode that could result in cell venting with fire”.

It’s not a calculation of a probability, it’s just them saying that the only way they think batteries could catch fire is if they overcharged them.

Posted by KenG_CA

This post seemed real interesting until I saw that Taleb was being appealed to for authority. Then I knew it must be factually wrong.
Then I learned from comments that this battery is only used on the ground, that the managers overruled the engineers, and it was not engineers who over measure but managers who overrule.
Typical. Once a Taleb point is made the post must be wrong. That is a good mechanical observation.

Posted by bwickes
