Forget Instagram’s billion-dollar payday. Forget IPOs, past and future, from Facebook, Groupon, LinkedIn and the like. And ignore, please, the online ramblings of attention-hungry venture capitalists and narcissistic Silicon Valley journalists with the off-putting habit of making their inside-baseball sound like the World Series. Their stories, to paraphrase Shakespeare, are tales told by idiots, full of sound and fury, but signifying very little about the impact of technology on most of our lives. (Sure, some of their tales are about great fortunes, but those are only for a select few; to summon the Oracle of Omaha rather than the Bard of Avon, only a fool ever equated price with value.) Their one-in-a-million windfalls are just flashes in the pan. Or, actually, they are solitary data points, meaningless when devoid of context.
That context is here. It’s come, in part, because of the cunningly simple social and curatorial tools that media companies like Twitter, Tumblr, Facebook and Pinterest give away to their users. But making sense of our social world is only possible with the tools and technology behind what we call Big Data. The massive information collections spawned by our digital world are too big to address directly, so smart scientists have used fast computers to carve the data into real knowledge. This is how Big Data is already changing the way the world works.
But Big Data is young; though there are hundreds of accessible data sets already, there are still many more chaotic stores of information its tools can tame. Take, for example, social media: Yesterday, social media API company Gnip announced that it is providing customers with all of Tumblr’s data, what in techspeak is called the firehose. What Gnip and competitors like DataSift are providing to customers are Social Big Data firehoses that can be filtered into gently babbling brooks lined with digital gold nuggets. When the tech media wonder out loud how social companies will ever make a buck, sifting the gold out of their user-generated content is a huge piece of the answer.
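For readers who want to see what "filtering a firehose" can look like in practice, here is a minimal Python sketch. It is not Gnip's or DataSift's actual API; the function name `filter_firehose`, the post fields (`id`, `source`, `text`) and the sample posts are all assumptions made up for illustration. The idea is simply to read posts one at a time and keep only those that mention a keyword of interest.

```python
from typing import Dict, Iterable, Iterator

# A post is modeled as a plain dictionary; the field names are hypothetical.
Post = Dict[str, object]


def filter_firehose(posts: Iterable[Post], keywords: Iterable[str]) -> Iterator[Post]:
    """Yield only the posts whose text contains at least one keyword."""
    wanted = {k.lower() for k in keywords}
    for post in posts:
        text = str(post.get("text", "")).lower()
        # Simple case-insensitive substring match; real services offer richer rules.
        if any(k in text for k in wanted):
            yield post


# Invented sample data standing in for a live stream.
sample_firehose = [
    {"id": 1, "source": "tumblr", "text": "Wildfire smoke visible from Boulder"},
    {"id": 2, "source": "twitter", "text": "Lunch was great today"},
    {"id": 3, "source": "tumblr", "text": "Road closed near the FIRE line"},
]

matches = list(filter_firehose(sample_firehose, ["fire"]))
```

Because the function is a generator, it never needs to hold the whole stream in memory, which is the point when the stream is every post on a network.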
At Gnip, Tumblr joins Twitter, WordPress, Disqus and the Chinese microblogging service Sina Weibo as the latest tree in a forest of Social Big Data accessible via API. A well-written API can transform a jumble of numbers into a perfectly organized multiplication table – on the order of millions or even billions of complex data pieces. (See this recent Economist visualization of the data record of a single tweet for more context.)
The data pieces are valuable, but not solely because they help advertisers sell more widgets. In an email, Gnip Chief Operating Officer Chris Moody explained that one of the coolest uses of data his company has enabled may have helped firefighters do their jobs better: “During the 4 Mile Canyon Fire in Boulder in 2010, [Gnip customer] VisionLink was able to provide fire crews and managers a realtime view into what was happening on the ground by layering geo-tagged Tweets and Flickr images onto a Google map of the area.”
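The layering Moody describes can be sketched in a few lines of Python. This is only an illustration of the general idea, not VisionLink's method: the bounding box coordinates are a rough guess at the Boulder area, and the function name `posts_in_box`, the post fields (`text`, `geo`) and the sample posts are hypothetical. The sketch keeps posts that carry coordinates inside the box and turns them into marker records that a mapping tool could plot.

```python
from typing import Dict, Iterable, List, Tuple

# (south, west, north, east) in decimal degrees.
# Rough, illustrative box around the Boulder area; not an official boundary.
BOULDER_AREA_BOX: Tuple[float, float, float, float] = (39.95, -105.50, 40.15, -105.15)


def posts_in_box(posts: Iterable[Dict], box: Tuple[float, float, float, float]) -> List[Dict]:
    """Return map markers for geo-tagged posts that fall inside the box."""
    south, west, north, east = box
    markers = []
    for post in posts:
        geo = post.get("geo")
        if geo is None:
            continue  # Skip posts that carry no location.
        lat, lon = geo
        if south <= lat <= north and west <= lon <= east:
            markers.append({"lat": lat, "lon": lon, "label": post["text"]})
    return markers


# Invented sample posts: one inside the box, one outside, one without a location.
sample_posts = [
    {"text": "Flames on the ridge", "geo": (40.05, -105.38)},
    {"text": "Sunny in Denver", "geo": (39.74, -104.99)},
    {"text": "Thinking of everyone", "geo": None},
]

markers = posts_in_box(sample_posts, BOULDER_AREA_BOX)
```

Each marker is just a latitude, a longitude and a label, which is the minimum most mapping libraries need to drop a pin.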