With the spiraling interest in analyzing social media content, academics are applying their quantitative skills to discover patterns in Twitter that are predictive of financial markets.
A University of California, Riverside professor and several other researchers have developed a model that uses data from Twitter to help predict the traded volume and value of a stock the following day, according to the university’s announcement.
The trading strategy is based on a model created by Vagelis Hristidis, an associate professor at the Bourns College of Engineering, one of his graduate students and three researchers at Yahoo! in Spain. It has outperformed other baseline strategies by between 1.4 percent and nearly 11 percent and also did better than the Dow Jones Industrial Average during a four-month simulation, according to a release summarizing the results.
“These findings have the potential to have a big impact on market investors,” commented Hristidis, who specializes in data mining research, which focuses on discovering patterns in large data sets. “With so much data available from social media, many investors are looking to sort it out and profit from it.”
In fact, Wall Street’s interest in analyzing Big Data for predicting financial markets is already a hot trend. Some of the largest quant hedge funds, including Renaissance Technologies and D.E. Shaw, are said to be spending millions (if not billions) on building tools for analyzing unstructured data found on Twitter and Facebook. Big data companies like Thomson Reuters and Dow Jones have built products and entire business units around interpreting sentiment to produce trading signals.
Last spring, UK hedge fund Derwent Capital received heavy media coverage for revealing it was about to wager $40 million on a strategy using Twitter to predict which way the stock market was going to move. According to an article in Advanced Trading, Derwent was using an algorithm developed by researchers at Indiana University and the University of Manchester to perform sentiment analysis on whether equities were likely to rise or fall. But since Derwent went live with its Twitter-based algorithm, there haven't been any press releases touting its returns, noted David Leinweber, author of "Nerds on Wall Street", at a recent trade show.
Some market practitioners have expressed skepticism toward using tweets as a predictor for stock prices, because it’s difficult to get the context from only 140 characters. They questioned whether tweets are reliable as a sole forecaster of stock market movements. John Bates, founder and chief technology officer of Progress Software, told Advanced Trading that while Twitter can effectively measure the public mood, real-time events are simply too unpredictable. He also questioned whether trends that lasted three or four days in a row could end up being more of a lagging indicator than a leading one.
However, the strategy developed by UC Riverside’s Hristidis and his research associates goes beyond public sentiment analysis and appears to break new ground on several fronts.
Hristidis and his co-authors set out to study how activity in Twitter is correlated to stock prices and traded volume. While past research has looked at the sentiment, positive or negative, of tweets to predict stock price, little research has focused on the volume of tweets and the ways that tweets are linked to other tweets, topics or users. Further, past work has mostly studied the overall stock market indexes, and not individual stocks.
They obtained the daily closing price and the number of trades from Yahoo! Finance for 150 randomly selected companies in the S&P 500 Index for the first half of 2010. Then, they developed filters to select only relevant tweets for those companies during that time period. For example, if they were looking at Apple, they needed to exclude tweets that focused on the fruit.
They expected to find that the number of trades was correlated with the number of tweets. Surprisingly, the number of trades was slightly more correlated with the number of what they call “connected components.” That is, roughly, the number of separate networks of posts, each about a distinct topic related to one company. For example, using Apple again, there might be separate networks of posts regarding Apple’s new CEO, a new product it released and its latest earnings report.
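To make the idea of “connected components” concrete, here is a minimal sketch in Python. It is not the researchers' code: the tweet IDs, the links between them, and the function name `count_connected_components` are all made up for illustration, and what counts as a link between two tweets (a shared topic, a reply, a common user) is an assumption here rather than the paper's definition.

```python
def count_connected_components(nodes, edges):
    """Count groups of nodes that are linked to each other, directly or
    indirectly, using a simple union-find structure."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk up to the group's representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    return len({find(n) for n in nodes})


# Hypothetical day of tweets about one company (illustrative data only).
tweets = ["t1", "t2", "t3", "t4", "t5", "t6"]
# Hypothetical links: t1-t2 discuss the new CEO; t3-t4-t5 discuss earnings;
# t6 stands alone (say, a product mention nobody else picked up).
links = [("t1", "t2"), ("t3", "t4"), ("t4", "t5")]

num_components = count_connected_components(tweets, links)
print(num_components)
```

In this toy example six tweets collapse into three separate conversations, so a day with many tweets about a single topic would still count as only one component.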
During the period in which they simulated investments, from March 1, 2010 to June 30, 2010, the model the researchers developed using Twitter data lost 2.4 percent on average, while the Dow Jones Industrial Average fell 4.2 percent over the same period. Other strategies they tested, including a random model that bought stocks every day and a fixed model, had average losses of 5.5 and 3.8 percent, respectively.
While it remains to be seen whether Twitter-based models can generate profits over time, and market gurus may understandably be skeptical about the reliability of real-time tweets, analyzing social media is starting to bear fruit.