Peter Gloor does not own a crystal ball. Nor does he read tea leaves or the lines on your hand, or claim to speak to people “from the other side”. Yet the research scientist at the Massachusetts Institute of Technology (MIT) claims he can predict the future.
Take the ongoing Republican nomination race. Late last year, Newt Gingrich was surging in the polls and some pundits thought he might well overtake frontrunner Mitt Romney. But Gloor predicted he would not.
It was true that chatter on Twitter seemed to be giving an edge to the former House speaker, but after analysing edits to Wikipedia, Gloor predicted that Romney would beat him. Gloor ended up being right: Romney beat Gingrich by a wide margin on the night.
Of course, you could argue his prediction was just a fluke. But Gloor says his forecasts have worked time and again. In some cases, his analysis can even help predict electoral results where polls fail. For example, in 2009 Switzerland voted to bar the construction of new minarets on mosques. The polls wrongly indicated that people would reject the measure, while Gloor’s forecast, based on an analysis of social media, correctly predicted it would pass. “People lied to the pollsters, because they didn’t want to appear racist,” says Gloor.
His research is part of a growing body of scientific work, often called predictive analytics, which involves using software and computer algorithms to mine emails, social media and other public websites to help make predictions about the future. It is a field that has caught the attention of everyone from movie moguls, who want help identifying which films will be popular, to captains of industry, who want advance information on which stocks are winners. And for those interested in politics, it is being used to forecast who will win in upcoming elections.
‘Enter the Swarm’
The idea of mining publicly available data to help predict the future is nothing new, but past efforts focused primarily on the news media. In the run-up to World War II, the US and British governments monitored world media to help predict the course of hostilities, and during the Cold War, the US government sponsored academics to come up with mathematical models that would analyse media reports to help predict the actions of the Soviet Union.
Over the past decade, however, the advent of social media sites like Facebook and Twitter has created a real-time flood of news about the world. Kalev Leetaru, a computer scientist at the University of Illinois, describes social media as something akin to “a gateway for humanity” because of the sheer abundance of data.
“There are three billion items posted to Facebook every day and 200 million tweets,” he says. “One of my favourite figures is that right now, every day, there are more words posted to Twitter than were in the New York Times over the last 60 years.”
The trick lies in understanding what that data means, and which data is important for which topics.
Gloor says, for example, that Twitter is very good for predicting behaviour that is influenced by the crowd – the general public. When it comes to seeing a new film, people are heavily influenced by what the crowd seems to say: is it a good or bad movie?
It is a finding that has been put to the test by computer scientist Hsinchun Chen at the University of Arizona. Chen and his group have looked at data from hundreds of Hollywood movies and have attempted to correlate ticket sales with online data generated by social media users. The model worked “beautifully”, says Chen.