Twitter datapoint of the day

By Felix Salmon
November 17, 2010

I work for a global information company which makes billions of dollars a year selling valuable data to banks, hedge funds, and other people in the financial markets, often at very high prices: $2,000 a month or even more.

And then there’s Twitter, which jealously guards access to its full stream of tweets (roughly 1,000 per second, these days). As of now, however, it’s signed a deal with Gnip whereby you can get a randomly-selected 50% of those tweets for $360,000 a year, which works out at $30,000 a month. You’re not allowed to republish them, but that’s OK—the people willing to spend that kind of money are likely to be high-frequency trading shops who want to keep the data as private as possible in any case.
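The pricing arithmetic above can be checked with a short sketch. The tweet rate, sample fraction and annual price are the figures quoted in the post; the 30-day month and all function names are assumptions made purely for illustration.

```python
# Figures quoted in the post.
TWEETS_PER_SECOND = 1_000      # "roughly 1,000 per second"
SAMPLE_FRACTION = 0.50         # the randomly selected 50% feed
ANNUAL_PRICE_USD = 360_000

# Assumptions for illustration only.
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30


def monthly_price(annual_price: float) -> float:
    """Spread an annual price evenly over twelve months."""
    return annual_price / 12


def tweets_per_month(rate_per_second: float, fraction: float) -> float:
    """Approximate number of tweets a sampled feed delivers in a 30-day month."""
    return rate_per_second * fraction * SECONDS_PER_DAY * DAYS_PER_MONTH


def price_per_million_tweets(annual_price: float, rate_per_second: float,
                             fraction: float) -> float:
    """Monthly price divided by monthly volume, expressed per million tweets."""
    millions = tweets_per_month(rate_per_second, fraction) / 1_000_000
    return monthly_price(annual_price) / millions


if __name__ == "__main__":
    print(monthly_price(ANNUAL_PRICE_USD))
    print(tweets_per_month(TWEETS_PER_SECOND, SAMPLE_FRACTION))
    print(round(price_per_million_tweets(
        ANNUAL_PRICE_USD, TWEETS_PER_SECOND, SAMPLE_FRACTION), 2))
```

Under those assumptions the 50% feed delivers about 1.3 billion tweets a month, so the $30,000 monthly price comes to roughly $23 per million tweets.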

I don’t have a problem with Twitter monetizing my public tweets in this manner; as I understand it, DMs aren’t included, and neither are any tweets from protected accounts. But it’s quite astonishing how much those tweets are worth, when they’re aggregated into a fat pipe. And it’s also interesting to me how much more 50% of the full stream is worth than 5%, which you can get for just $5,000 a month. Given the rapidly-diminishing marginal returns of each additional Twitter stream, I wonder where the added value comes from. I’d imagine that if a topic starts trending on the 50% feed, it will almost certainly be trending on the 5% feed as well.
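One rough way to put numbers on the 5%-versus-50% question is sketched below. It assumes each tweet is included in a feed independently with probability equal to the sample fraction (a simple reading of "randomly-selected", not something the deal terms confirm), and the 10,000-tweet topic is a made-up example. The function names are hypothetical.

```python
import math

# Monthly prices and sample fractions quoted in the post.
PRICE_50_USD, FRACTION_50 = 30_000, 0.50
PRICE_5_USD, FRACTION_5 = 5_000, 0.05


def relative_standard_error(true_count: int, fraction: float) -> float:
    """Relative error of a topic count estimated from a sampled feed.

    Assumption: each tweet is kept independently with probability `fraction`,
    so the sampled count is binomial and the scaled-up estimate (sampled count
    divided by fraction) has relative standard error
    sqrt((1 - fraction) / (true_count * fraction)).
    """
    return math.sqrt((1 - fraction) / (true_count * fraction))


if __name__ == "__main__":
    # Price grows sixfold while volume grows tenfold.
    print("price ratio:", PRICE_50_USD / PRICE_5_USD)
    print("volume ratio:", FRACTION_50 / FRACTION_5)

    # Hypothetical topic mentioned in 10,000 tweets across the full stream.
    topic_tweets = 10_000
    for fraction in (FRACTION_5, FRACTION_50):
        err = relative_standard_error(topic_tweets, fraction)
        print(f"{fraction:.0%} feed: relative error about {err:.1%}")
```

On these assumptions the 50% feed costs six times as much for ten times the volume, and a topic with 10,000 mentions is measured to within roughly 4% on the 5% feed versus roughly 1% on the 50% feed. That is consistent with the intuition that a sizeable trend shows up on either feed; the sketch says nothing about small topics or per-user analysis, where the smaller sample is far noisier.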

I do, on the other hand, have a problem with other sites—Facebook in particular—monetizing my private information. I worried that Mint might be doing that kind of thing back in March, and in general if any website wants to sell any information of mine which isn’t public, I want them to ask my permission first. As Twitter shows, aggregated user data can be very valuable indeed. And with that kind of money on the table, there’s a lot of incentive to be ethically flexible.

Comments
7 comments so far

Is the 5% feed 5% of ALL tweets, or the tweets of 5% of users at a given time? Does it include retweets, which I imagine would be quite valuable to know?

I alluded, when you were casting around for post ideas, to the role of such social media in news propagation. Certainly, a news story that certain people break is far more likely to make it into an echo chamber and be market moving. If I was an HFT shop, I’d want the feeds off them.

Posted by Danny_Black

guess i won’t be entering any block equity orders into twitter before getting them filled!

Posted by q_is_too_short

You completely misunderstand how HFT works. News streams may form part of the input for algorithmic trading strategies, some of which operate at high frequencies, but most HFT is done without regard either to fundamentals or to news of any sort. Indeed, most traders tend to disparage news-sourced automated trading strategies because they are unproven at best.

If you’d like to talk about HFT some time, drop me an email.

Posted by DrWex

Hi Felix,
first off, a disclaimer. As a fellow data geek in Boulder I’m friends with the Gnip guys, but I have no financial or business relationship with them, and no access to non-public information on their business model. I also did one of the RWW articles about this announcement:
http://www.readwriteweb.com/hack/2010/11/why-is-twitter-partnering-with-gnip.php

So, with all that out of the way, I think you’re wrong, or at least ahead of your time, when you assume HFTs are the customers who are paying for the half fire hose. Gnip’s customers are cagey about having their names released, but from what I know of the market, most of them are going to be firms re-selling social media monitoring services to large brands. They take in the stream and pull out mentions of the hundreds or thousands of different brands that they’re monitoring. Each end-user may only care about a dozen brands and be willing to pay 10 or 20k a year for it, but the monitoring firms can use the feed they get from Gnip to offer this to as many clients as they can sign up. I have zero idea if they’re actually customers, but think of companies like Radian6. It’s also the same problem Cisco’s new Social Miner is trying to solve: http://www.cisco.com/en/US/products/ps11349/index.html

It totally makes sense that HFT shops *should* be front-running Twitter data; what’s surprised me is how little progress the few folks I’ve had contact with have made. After I attempted a release of public Facebook data to academics ( http://petewarden.typepad.com/searchbrowser/2010/02/how-to-split-up-the-us.html ) I was approached by some of these companies. They were mostly data-driven hedge funds as far as I can tell, using traditional tools like polls, surveys and focus groups to predict consumer trends and then trade on that basis. They were intrigued by the idea of extending it to cover social media, but pretty hesitant. I think this Quora thread captures a lot of the reasons:
http://www.quora.com/Stock-Market/Can-Twitter-sentiment-analysis-guide-stock-market-investment

There’s evidence from the paper mentioned at the beginning that Twitter is currently a good leading indicator, but it’s also trivial to imagine how to game it once people began to rely on it.

Anyway, this comment is far too long already, so I won’t belabor the point, but this is a fascinating area with some non-obvious problems. I’d be happy to geek out about this further off-line if you want to ping me.

Posted by petewarden

@DrWex –

Right, HFT does not rely on any real-world events, but mainly on instantaneous pricing data that allows practitioners to identify block trades and front-run them.

This is completely unethical, whether or not unethical lawyers are able to justify it. If this is your profession, my question would not be, how does it work? My question would be, how do you sleep at night?

Posted by DanHess

@DanHess, I’ve heard several answers to that one.

“I work hard and am paid well. Ethics? What is unethical about working hard and getting paid?”

“HFT more than doubles the trading volume of the major stocks. This provides VALUABLE LIQUIDITY, allowing you to sell 100,000 shares in five seconds. Without HFT, you might have to spread out that order over five minutes. Of course this liquidity disappears the moment the market needs it the most, but heck, nobody’s perfect!”

“The middle-man has always taken a cut of the action. We don’t take any larger a cut per transaction than happened thirty years ago. In fact by doubling the number of transactions, we can claim that our cut per transaction is SMALLER. And the spreads are also tight, which is of critical importance if you want to sell something you just bought ten seconds ago.”

Posted by TFF

You are in fact incorrect in claiming that 5% of the stream is roughly equivalent to 100% of the stream for capturing trends.

Secondly, trends are just one facet of all the interesting things that can be accomplished with Twitter. For example, if you wanted to find out the topics of interest for an arbitrary Twitter id, good luck doing that with 5% of the stream.

Similarly, if you want to build a social media monitoring service (of the kind that Sysomos built and sold successfully last year) and then sell the service to large brands, once again, good luck doing that with 5% of the overall stream.

Lastly, the folks who license the firehose – and that list of companies is easily available via a google search – are inherently uninterested in being a reseller. They are not high-frequency trading shops but are mostly Silicon Valley companies trying to build innovative apps and services on top of this mass volume of data.

Posted by saumil07