Pedantry and numeracy in journalism

By Felix Salmon
October 31, 2013

Anthony DeRosa retweeted this photo on Wednesday morning, which came with the caption “Math is difficult for many journalists”. I was genuinely confused: I couldn’t see any math errors in the screenshot. So I asked DeRosa where the error was. He replied:

Just as I couldn’t see a math error, I couldn’t see anything remotely egregious. Thus began quite a long Twitter conversation, large parts of which DeRosa Storified for me. I proved very bad at getting my point across in tweets, so I promised to explain everything in this post.

The problem that DeRosa had with the stories about the Norwegian man with bitcoin, it turns out, was that they didn’t agree on exactly how many dollars’ worth of bitcoin he bought back in 2009. Some said $22, some said $26, some said $27. That discrepancy, in and of itself, was proof enough, for DeRosa, that many journalists were committing an “egregious error”.

Now the facts of the story were not in dispute at all. The Norwegian man spent 150 Norwegian kroner on bitcoin in 2009 while writing a thesis on encryption, forgot about them, and then, in April 2013, during full bitcoin fever, discovered that his digital wallet contained coins worth some 5 million kroner. Nice! In dollar terms, his investment went from being worth about $25 to being worth about $900,000.

But DeRosa wanted to know exactly how much the coins were worth at purchase: if one journalist said $22 and another said $26, then at least one of them, and possibly both, were, in his eyes, clearly wrong. You needed to be looking at multiple versions of the story to even see that there was a disparity here — but that’s exactly what DeRosa was doing. And rather than simply ask why there was a disparity, he decided that the individual journalists were doing something very bad.

It turns out that the reason for the disparity is very simple: the dollar-krone exchange rate fluctuated quite a lot in 2009, and it was unclear exactly when the bitcoins were purchased, so no one knows exactly how much the coins were worth, in dollar terms, when purchased. They might have been worth $22, or they might have been worth $27. Really, it doesn’t make any difference: the man made a profit of well over $850,000 whatever his initial investment was.
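To see how much exchange-rate movement the different dollar figures actually imply, here is a minimal sketch that backs out the rate behind each reported number. The 150 kroner and the $22, $26 and $27 figures come from the stories; the function name and the rates it prints are illustrative back-calculations, not actual 2009 market quotes.

```python
KRONER_SPENT = 150  # purchase price reported in the stories

def implied_rate(kroner: float, dollars: float) -> float:
    """Kroner per dollar implied by a reported dollar figure (hypothetical helper)."""
    return kroner / dollars

# Dollar values quoted by different outlets for the same 150-kroner purchase.
for dollars in (22, 26, 27):
    rate = implied_rate(KRONER_SPENT, dollars)
    print(f"${dollars} implies about {rate:.2f} kroner per dollar")
```

Every one of the reported figures is consistent with the same purchase, just converted at a different point in the year.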

But there’s a superficial exactness to numbers that doesn’t exist in words, and so people have a tendency to believe that all numbers are much more precise than in fact they are. If the Labor Department releases a report saying that payrolls rose by 148,000 in September, then a reporter who said that payrolls rose by 150,000 would be considered to have her facts wrong — even though the headline number is only accurate to within 100,000 people either way. The actual number of new jobs could easily be anywhere between 44,000 and 252,000 — and indeed there’s a 5% chance that it’s outside even that large range. But because everybody insists on one hard number, one hard number is what they get.
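The payrolls point can be checked with a few lines of arithmetic. This sketch takes the headline estimate and the range exactly as quoted above and asks whether a rounded figure such as 150,000 is distinguishable from the headline; the names are illustrative, and it treats the range as given rather than modelling the survey's sampling error.

```python
# Figures as quoted in the paragraph above.
HEADLINE_ESTIMATE = 148_000
INTERVAL_LOW = 44_000
INTERVAL_HIGH = 252_000

def is_consistent(reported: int, low: int = INTERVAL_LOW, high: int = INTERVAL_HIGH) -> bool:
    """True if a reported figure falls inside the quoted range (hypothetical helper)."""
    return low <= reported <= high

# Half the width of the quoted range: how far the true figure may plausibly sit
# from the headline in either direction.
half_width = (INTERVAL_HIGH - INTERVAL_LOW) // 2

# How far a reporter's rounded figure is from the headline number.
rounding_gap = abs(150_000 - HEADLINE_ESTIMATE)

print(f"Range half-width: {half_width:,}")
print(f"Rounding gap:     {rounding_gap:,}")
print(f"150,000 inside the range? {is_consistent(150_000)}")
```

The gap introduced by rounding is tiny compared with the uncertainty already baked into the headline number.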

One of the most important skills in financial journalism is numeracy — having a basic feel for numbers. In this case, the reporters covering the story got the numbers right: they should be applauded for that, rather than having brickbats thrown at them. After all, it’s not hard to find examples of reporters getting numbers very wrong. Consider this story, from the New York Post, under the headline “Verizon increases cell bills 7.1% for 95M customers”:

Verizon didn’t sign up as many new cell phone customers in the third quarter as Wall Street expected — but it still earned more than forecast as it managed to increase the average bill of its 95.2 million wireless customers by 7.1 percent.

The average Verizon Wireless bill jumped to $155.75 a month as of Sept. 30 from $154.63 last year, the company said Thursday.

Now that is a math error — and evidence of deep innumeracy on the part of the journalist who wrote it, as well as a whole series of editors. If you want to work out exactly what the increase is, in percentage terms, of going from $154.63 to $155.75, then you might need a calculator. But if you were numerate, you would know intuitively that it’s very small, on the order of 1%, and that it’s nowhere near 7%. If you get a result of 7.1%, then that means you’ve pressed a wrong button somewhere, and you should do your sums again.
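For anyone who does want to do the sum, a minimal sketch of the percentage calculation follows. The two bill amounts are the ones quoted from the Post; the function name is illustrative.

```python
def percent_change(old: float, new: float) -> float:
    """Change from old to new, expressed as a percentage of old."""
    return (new - old) / old * 100.0

old_bill = 154.63  # average monthly bill a year earlier
new_bill = 155.75  # average monthly bill as of Sept. 30

change = percent_change(old_bill, new_bill)
print(f"Actual increase: {change:.2f}%")  # prints "Actual increase: 0.72%"

# For comparison: the bill a genuine 7.1% increase would have produced.
bill_at_claimed_rate = old_bill * 1.071
print(f"A 7.1% increase would mean a bill of about ${bill_at_claimed_rate:.2f}")
```

The actual rise is well under 1%, roughly a tenth of the figure in the headline.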

The problem is that we naturally associate numbers with mathematics, and mathematics with accuracy — and we therefore assume that whenever we see a number, we’re dealing with something which is either right or wrong — just as it was in elementary-school arithmetic. When numbers describe the real world, however, they always have error bars; they’re basically shorthand for a probability distribution. So long as the number that’s printed is plausibly somewhere reasonably likely to be in the fat bit of the distribution, it doesn’t make sense for critics like DeRosa to call it out for being inaccurate. After all, pretty much all numbers are inaccurate, especially if you’re trying to measure something (like the value of a certain number of bitcoins) in terms of something else (like dollars). Journalists should work on the basis of the identity of indiscernibles: so long as the meaning of the story isn’t changed, the exact number being used really doesn’t matter.

Let’s say that you saw various news reports about an event, and that different words were used to describe the weather: some said it was “cold”, others “brisk”, others “frosty”, others “wintry”, and so on. You wouldn’t raise an eyebrow: you’d see that they were all describing the same thing, in slightly different language, and you wouldn’t demand an explanation for the “discrepancy”. Well, numbers in news articles behave like words: they’re trying to describe the state of the world. That’s why the NYT has banned the use of “record” or “largest” unless inflation is taken into account. What matters is not the mathematical relationship between abstract numbers, but rather the state of the world that is being described.

In the case of the bitcoin, there was never any doubt about what was being described, and so the journalism did exactly what it was meant to do. There are far too many real problems with genuinely flawed news articles for critics to start playing “gotcha” whenever they see a couple of numbers which say exactly the same thing, even if they’re not mathematically identical.

Comments

I agree that the difference between $22 and $27 in the story is inconsequential, but not all numbers are probabilistic. Results from sampling are probabilistic. Estimates are probabilistic. But I spent $16.50 on dinner – not $16.5000001, not $16.00 +/- 0.5. There are plenty of unambiguous, absolute numbers in the real world.

In the Bitcoin story, the uncertainty doesn’t stem from anything mathematical – it’s just because he doesn’t remember the date, and journalists couldn’t really do anything about it. But where they can, they should, right? After all, I can spell “Barak Obama” wrong and it doesn’t really change the meaning, but it is nonetheless reasonable to strive for accuracy wherever you can.

Posted by keith_ng

“So long as the number that’s printed is plausibly somewhere reasonably likely to be in the fat bit of the distribution, it doesn’t make sense for critics like DeRosa to call it out for being inaccurate.”

I tend to argue that as long as you’ve got the first digit correct and the correct number of digits then you’re doing well enough.

Certainly it’s a standard high enough that large portions of the press don’t manage to achieve…..

Posted by TimWorstall

Surely it is of note that Mr. DeRosa’s employer is called Circa. With his zeal for hyperdeterminism, does it not trouble him that his own company’s very name is shorthand for approximation?

Posted by 12thStDavid

In my back and forth with Felix on Twitter I already said a lot of what Felix wrote here in his blog:

I’m not asking for exactness, I was asking for transparency of how they got to their number. I also said it only matters when the margin of error changes the context of the story.

Those little facts might have got in the way of what was still a very interesting blog post!

You can go back and check the Storify that Felix linked to and see for yourself.

Posted by Soup

i’m only a dog (iOAD™) but Someone told me a joke He heard when He worked for a large company. His boss, when giving a presentation, used to tell two jokes to find out whether the audience were technical people or businessmen. (There were no businesswomen in those days, sorry.) If they laughed at one joke, they were scientists. If they laughed at the other, they were businessmen. If they laughed at both, it was a mixed audience. (He never said what happened if they didn’t laugh at all.) Anyway, here’s the one businessmen would laugh at.

There was a guy whose business was buying companies and making them more profitable. He visited a small company whose owner was retiring with the idea of making an offer. He examined the books. Last year: fabulous profits. The year before: fabulous profits. And so on as far back as he could see. So he said to the owner, “You have been amazingly successful for many years. I’m really impressed. How do you do it?”

His reply was, “Well, I make ‘em for 3 cents and sell ‘em for a nickel, and you’d be amazed how that two percent adds up.”

Posted by samadamsthedog

“Let’s say that you saw various news reports about an event, and that different words were used to describe the weather: some said it was “cold”, others “brisk”, others “frosty”, others “wintry”, and so on. You wouldn’t raise an eyebrow: you’d see that they were all describing the same thing, in slightly different language, and you wouldn’t demand an explanation for the “discrepancy”. Well, numbers in news articles behave like words: they’re trying to describe the state of the world.”

THIS.

As a scientist, I see the same thing in science journalism. Science is actually mixed here: sometimes that third decimal place is very important, and other times it doesn’t matter at all. The key is knowing which is which; when the important details are qualitative and when they’re quantitative.

So in popular arguments over (particular controversial science subjects), I see a lot of people being pedantic over some number that doesn’t matter that much, or *not* being pedantic over one that does. The truth is, the layperson can’t know the difference, because they don’t have the depth of knowledge in the subject or the experience with science to know what’s important and what’s not.

And there’s another connection with what you’re saying, Felix: Many people have this expectation that science (like math) is all about that 3rd, 5th, or Nth decimal place, and if the scientist gets that decimal place wrong, well, then, their conclusions are also all “wrong”. That’s generally really not the case, but again, it comes from a lack of actual experience with math and science. And the layperson’s tendency is definitely towards pedantry, rather than actually understanding the (generally qualitative) picture of what’s actually happening in a physical system.

Pedantry where it doesn’t belong: it has the side effect of working quite well with one’s confirmation bias, since it allows you considerable flexibility in what kind of data you accept as true, or not.

Posted by Windchasers