The problems with measuring traffic congestion

By Felix Salmon
October 17, 2012

Back in July, I gave a cautious welcome to TomTom’s congestion indices. The amount of congestion in any given city at any given time does have a certain randomness to it, but more data, and more public data, is always a good thing.

Or so I thought. I never did end up having the conversation with TomTom that I expected back in July, but I did finally speak to TomTom’s Nick Cohn last week, after they released their data for the second quarter of 2012.

In the first quarter, Edmonton saw a surprisingly large drop in congestion; in the second quarter it was New York which saw a surprisingly large rise in congestion. During the evening peak, the New York congestion index was 41% in the first quarter; that rose to 54% in the second quarter, helping the overall New York index rise from 17% to 25%. (The percentages are meant to give an indication of how much longer a journey will take, compared to the same journey in free-flowing traffic.) As a result, New York is now in 8th place on the league table of the most congested North American cities; it was only in 15th place last quarter, out of 26 cities overall.
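For concreteness, the arithmetic behind those percentages can be sketched in a few lines of Python (the 20-minute baseline below is an invented example, not a TomTom figure):

```python
def congestion_index(actual_minutes, free_flow_minutes):
    """Extra travel time, expressed as a percentage of the free-flow time."""
    return 100 * (actual_minutes - free_flow_minutes) / free_flow_minutes

# A trip taking 30.8 minutes against a 20-minute free-flow baseline scores
# about 54 -- roughly New York's second-quarter evening peak.
print(round(congestion_index(30.8, 20)))
```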

So what’s going on here? A congestion index like this one serves two purposes. The first is to compare a city to itself, over time: is congestion getting better, or is it getting worse? The second is to compare cities to each other: is congestion worse in Washington than it is in Boston?

And it turns out that this congestion index, at least, is pretty useless on both fronts. First of all there are measurement issues, of course. Cohn explained that when putting together the index, TomTom only looks at road segments where it has a large sample of traffic speeds — big enough to give “statistically sound results”. And later on a spokeswoman explained that TomTom’s speed measurements validate quite nicely against other speed measures, from sources like induction-loop systems.

But measuring speed on individual road segments is only the first step in measuring congestion. The next step is weighting the different road segments, giving most weight to the most-travelled bits of road. And that’s where TomTom data is much less reliable. After all, on any given stretch of road, cars generally travel at pretty much the same speed. You can take a relatively small sample of all cars, and get a very accurate number for what speeds are in that place. But if you want to work out where a city’s drivers drive the most and drive the least, then you need a much larger and much more representative sample.
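To see why the weighting matters so much, here’s a toy sketch (all the trip counts are invented): the same per-segment speeds yield very different city-wide numbers depending on whose driving the sample represents.

```python
def city_index(segments):
    """Trip-weighted average of per-segment congestion percentages.

    segments: list of (congestion_pct, daily_trips) pairs. The trip counts
    are exactly the weights a representative sample is needed to estimate.
    """
    total = sum(trips for _, trips in segments)
    return sum(pct * trips for pct, trips in segments) / total

# If the clogged arterial really carries most of the traffic...
print(city_index([(60, 90_000), (10, 10_000)]))  # 55.0
# ...but a biased sample over-counts the quiet road instead:
print(city_index([(60, 40_000), (10, 60_000)]))  # 30.0
```

Getting the speeds right on each segment does nothing to fix a biased estimate of the weights.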

And this is where TomTom faces its first problem: its sample is far from representative. Most of it comes from people who have installed TomTom navigation devices in their cars, and there’s no reason to believe those people drive in the same way that a city’s drivers as a whole do. Worse, most of the time TomTom only gets data when the devices are turned on and being used. Which means that if you have a standard school run, say, and occasionally have to make a long journey to the other side of town, then there’s a good chance that TomTom will ignore all your school runs and think that most of your driving is those long journeys. (TomTom is trying to encourage people to have their devices on all the time they drive, but I don’t think it’s had much success on that front.)

In general, TomTom is always going to get data weighted heavily towards people who don’t know where they’re going — out-of-towners, or drivers headed to unfamiliar destinations. That’s in stark contrast to the majority of city traffic, which is people who know exactly where they’re going, and what the best ways of getting there are. There might in theory be better routes for those people, and TomTom might even be able to identify those routes. But for the time being, I don’t think we can really trust TomTom to know where a city as a whole is driving the most.

I asked Cohn about the kind of large intra-city moves that we’ve seen in cities like Edmonton and New York. Did they reflect genuine changes in congestion, I asked, or were they just the natural variation that one sees in many datasets? Specifically, when TomTom comes out with a specific-sounding number like 25% for New York’s congestion rate, how accurate is that number? What are the error bars on it?

Cohn promised me that he’d get back to me on that, and today I got an email, saying that “unfortunately, we cannot provide you with a specific number”:

The Congestion Index is calculated at the road segment level, using the TomTom GPS speed measurements available for each road segment within each given time frame. As the sample size varies by road segment, time period and geography, it would be impossible to calculate overarching confidence levels for the Congestion Index as a whole.

It seems to me that if you don’t know what your confidence levels are, your index is pretty much useless. All of the cities on the list are in a pretty narrow range: the worst congestion is in Los Angeles, on 34%, while the least is in Phoenix, on 12%. If the error bars on those numbers were, say, plus-or-minus 10 percentage points, then the whole list becomes largely meaningless.
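A quick sketch of why that would matter (the ±10-point margin is the hypothetical from the paragraph above, not a published TomTom figure): with error bars that wide, almost every pair of cities on the table becomes statistically indistinguishable.

```python
def interval(pct, margin):
    """A congestion score with a symmetric error bar, in percentage points."""
    return (pct - margin, pct + margin)

def overlaps(a, b):
    """True if two (low, high) intervals intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

margin = 10
la, new_york, phoenix = interval(34, margin), interval(25, margin), interval(12, margin)
print(overlaps(la, new_york))       # True: 1st and 8th place are indistinguishable
print(overlaps(new_york, phoenix))  # True: so are 8th and last
print(overlaps(la, phoenix))        # False: only the two extremes can be told apart
```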

And trying to compare congestion between cities is even more pointless than trying to measure changes in congestion within a single city, over time. As JCortright noted in my comments in July, measuring congestion on a percentage basis tends to make smaller, denser cities seem worse than they actually are. If you have a 45-minute commute in Atlanta, for instance, as measured on a congestion-free basis, and you’re stuck in traffic for an extra half an hour, then that’s 67% congestion. Whereas if you’re stuck in traffic for 15 minutes on a drive that would take you 15 minutes without traffic, that’s 100% congestion.
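The arithmetic is easy to check — it’s just the index definition applied to a single commute:

```python
def congestion_pct(delay_minutes, free_flow_minutes):
    """Delay as a percentage of the free-flow trip time."""
    return 100 * delay_minutes / free_flow_minutes

print(round(congestion_pct(30, 45)))  # 67: sprawling city, 30 minutes lost
print(round(congestion_pct(15, 15)))  # 100: dense city, only 15 minutes lost
```

The dense-city driver loses half as much time, yet scores far worse on the index.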

Cohn told me that TomTom has no measure of average trip length, so he can’t adjust for that effect. And even he admitted that “comparing Istanbul to Stuttgart is a little strange”, even though that’s exactly what TomTom does, in its European league table. (Istanbul, apparently, has congestion of 57%, with an evening peak of 125%, while Stuttgart has congestion of 33%, with an evening peak of 70%.)

All of which says to me that the whole idea of congestion charging has a very big problem at its core. There’s no point in implementing a congestion charge unless you think it’s going to do some good — unless, that is, you think that it’s going to decrease congestion. But measuring congestion turns out to be incredibly difficult — and it’s far from clear that anybody can actually do it in a way that random natural fluctuations and errors won’t dwarf the real-world effects of a charge.

When London increases its congestion charge, then, or when New York pedestrianizes Broadway in Times Square, or when any city does anything with the stated aim of helping traffic flow, don’t be disappointed if the city can’t come out and say with specificity whether the plan worked or not. Congestion is a tough animal to pin down and measure, and while it’s possible to be reasonably accurate if you’re just looking at a single intersection or stretch of road, it’s basically impossible to be accurate — or even particularly useful — if you’re looking at a city as a whole.

Comments
5 comments so far

Agreed on your criticisms of TomTom’s methodology (which is frankly laughable), Felix, but I think you go wrong in then concluding that it’s extremely difficult/impossible for cities to measure congestion effectively.

The road-embedded piezoelectric sensors that Los Angeles and many other cities have (a technology that dates all the way back to the 1950s), which detect every time a vehicle rolls over them, are quite effective at measuring traffic flow — given that they’re installed at almost every intersection controlled by a stoplight and there are reams of historical data on them.

Certainly, knowing how many cars are traversing every intersection in real time is meaningful. As we know, LA slows to a crawl during rush hour (and the city knows exactly what degree of “crawl”, based on its historical sensor data). Therefore, any congestion pricing which then changed the number of cars crossing a given intersection — to a statistically significant degree — would be both easily measured and methodologically sound.

It’s not as though cities rely on semi-used TomToms to measure data.

Posted by LA_Banker

I think CA DoT is using data collected from people’s FasTrak passes — which are a heck of a lot more common than a TomTom — to look at how long it takes commuters to traverse particular road segments. You don’t get continuous position and speed data, but I think they’ve deployed detectors at each exit off major highways, so they can tell where you got on, where you got off, and how long it took. With large numbers of people carrying the electronic toll device, coming on and off various exits constantly throughout rush hour, that should work pretty well for giving you a sense, over time, of the trip length for each exit pair. And of course, since we usually care most about congestion in relation to commuting, this is precisely the group of people you want to be studying.

Maybe you could see if they’ll give you some of their data.

In re: TomTom, I have to say that, at least based on experience, tagging LA as congested and PHX as less so certainly strikes me as correct. (My folks live in PHX, and I have a bunch of friends down in LA, and it’s definitely worse down there than here in SF.)

It seems like the correct measurement of congestion is probably aggregate person-hours lost to traffic * the average wage for local commuters.

Posted by Auros

I don’t understand the big dilemma… If you want to measure rush-hour congestion on a certain road, install cameras to measure the flow of traffic. As best I can tell, they are already in place and working reasonably well here. (They are used to give travel times.)

You don’t need to compare the road to some random highway in Albuquerque New Mexico. You don’t need to determine whether congestion has increased or decreased this year. You can measure it directly.

Determining the effectiveness of a policy is a trickier question, especially since properly controlled experiments are impossible. Still, any effective policy ought to have a sufficiently noticeable effect that you don’t need a careful study to detect it.

If you put a congestion charge on the Boston highways, I suspect they would be in effect from 6 am to 9 pm daily. Which wouldn’t be a terrible thing, as it might discourage people from driving into the city.

Posted by TFF

2 things

1) TomTom could get access to data to help them improve their measurements.

2) I think what they are trying to measure is the wrong thing. The data could be used to make a sensible measure – not necessarily what they want, but one that works with the data.

Of course, they need to pay me for the advice.

Big hint, think capacity not congestion.

Posted by tqft

Auros is right. Between counting cars going past specific points, and accurate point-to-point times, you can make some pretty good estimates of congestion, even if you don’t know the distribution of cars along each route.

Posted by AngryInCali