Researchers have developed a model that uses data from Twitter to help predict the volume and value of a stock in an upcoming trading session.

The model was created by three researchers at Yahoo! in Spain; Vagelis Hristidis, an associate professor at the Bourns College of Engineering at the University of California, Riverside; and one of his graduate students. A trading strategy based on the model outperformed other baseline strategies by between 1.4 percent and nearly 11 percent, the researchers said.

The model produced results that were superior to the Dow Jones Industrial Average during a four-month simulation, the creators also said.

“These findings have the potential to have a big impact on market investors,” said Hristidis, a specialist in data mining research, which focuses on discovering patterns in large data sets.


Hristidis, graduate student Eduardo J. Ruiz, and Carlos Castillo, Aristides Gionis and Alejandro Jaimes, all of whom work for Yahoo! Research Barcelona, presented the findings at the Fifth ACM International Conference on Web Search & Data Mining in Seattle.

Hristidis and his co-authors studied how activity on Twitter is correlated with stock prices and traded volume. Their work tried to quantify the volume of interest in a stock, whereas past research looked at the sentiment, positive or negative, of tweets to predict stock prices. They also tried to link that volume to individual stocks, not indices.
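As a rough illustration of what measuring such a correlation involves, the sketch below computes a Pearson correlation between a daily Twitter feature and a daily market figure, optionally shifted by one trading day. It is not the researchers' code: the function names, the choice of Pearson correlation and the one-day lag are assumptions made for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def lagged_correlation(feature, target, lag=1):
    """Correlate the feature on day t with the target on day t + lag.

    The lag is a hypothetical parameter: lag=1 asks whether today's
    Twitter activity moves together with tomorrow's market figure.
    """
    if len(feature) != len(target):
        raise ValueError("series must have the same length")
    if lag < 0 or lag > len(feature) - 2:
        raise ValueError("lag leaves too few overlapping days")
    return pearson(feature[:len(feature) - lag], target[lag:])


# Made-up toy values, for illustration only (not data from the study).
toy_tweet_feature = [3, 5, 4, 8, 6, 9]
toy_trade_counts = [110, 120, 150, 130, 190, 160]
same_day = lagged_correlation(toy_tweet_feature, toy_trade_counts, lag=0)
next_day = lagged_correlation(toy_tweet_feature, toy_trade_counts, lag=1)
```

A value near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates little or none.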

The research focused on the volume of tweets and the ways that tweets are linked to other tweets, topics or users.

They obtained the daily closing price and the number of trades from Yahoo! Finance for 150 randomly selected companies in the S&P 500 Index for the first half of 2010.

Then, they developed filters to select only relevant tweets for those companies during that time period. For example, if they were looking at Apple, they needed to exclude tweets that focused on the fruit.
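The article does not describe how the filters were built. As a minimal sketch of the general idea, the example below accepts a tweet if it uses the company's ticker with a dollar sign, or if it mentions the company name alongside at least one business-related word. The function name, the ticker convention and the word list are all hypothetical choices for this example.

```python
import re

# Hypothetical context words that suggest the company, not the fruit.
APPLE_CONTEXT = {"iphone", "ipad", "mac", "stock", "shares", "earnings", "ceo"}


def is_relevant(tweet, ticker, company, context_words):
    """Return True if the tweet plausibly refers to the company."""
    text = tweet.lower()
    # A ticker written as $AAPL is treated as an unambiguous signal.
    if re.search(r"\$" + re.escape(ticker.lower()) + r"\b", text):
        return True
    # Otherwise require the company name plus at least one context word.
    words = set(re.findall(r"[a-z0-9]+", text))
    return company.lower() in words and bool(words & context_words)


examples = [
    "$AAPL is up today",
    "Apple earnings beat expectations",
    "I ate an apple pie",
]
kept = [t for t in examples if is_relevant(t, "AAPL", "apple", APPLE_CONTEXT)]
```

Here `kept` holds the first two tweets and drops the one about pie. A real filter would need to be far more careful than this keyword check.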

The number of trades was not directly correlated with the number of tweets about a given company. Instead, the number of trades rose when the company was involved in a number of different “connected components,” as the researchers put it.

That is to say, trading picked up when multiple distinct events or topics concerning the firm were being discussed at the same time.

In the Apple case, that might mean separate series of posts on Apple’s new CEO, a new product and an earnings report.
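To make the idea of connected components concrete, the sketch below treats each tweet as a node and joins two tweets whenever they share a label, such as a hashtag or a mentioned user. It then counts the separate groups that result. The article does not say how the researchers built their graph, so the linking rule and the names used here are assumptions for illustration.

```python
def count_components(tweets):
    """Count groups of tweets linked by shared labels.

    tweets: list of sets of labels (hashtags, mentioned users, topics).
    Two tweets are in the same group if a chain of shared labels
    connects them. This linking rule is an assumption of the sketch.
    """
    parent = list(range(len(tweets)))

    def find(i):
        # Follow parent links to the group's root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    first_seen = {}  # label -> index of the first tweet carrying it
    for index, labels in enumerate(tweets):
        for label in labels:
            if label in first_seen:
                union(first_seen[label], index)
            else:
                first_seen[label] = index

    return len({find(i) for i in range(len(tweets))})


# Hypothetical tweets about one company, reduced to their labels.
toy_tweets = [
    {"#ceo", "@user1"},
    {"#ceo"},
    {"#iphone"},
    {"#iphone", "#launch"},
    {"#earnings"},
]
components = count_components(toy_tweets)  # 3 separate discussions
```

In this toy input the five tweets fall into three groups: one about the CEO, one about a product and one about earnings.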

For the study, the researchers simulated a series of investments between March 1, 2010, and June 30, 2010, and analyzed performance using several investment strategies. During that time frame, the Dow Jones Industrial Average fell 4.2 percent.

