In a previous post, I discussed the difference between predicting and forecasting, the latter being what I am most interested in. I showed that by using Twitter as an additional source of data it was possible to forecast polls in the June 12 Ontario election 24-48 hours before they are available. The forecasts derived from the Twitter-based model accurately tracked the aggregate polls.
One question left unanswered in my previous post was whether the inclusion of Twitter-based information was necessary for forecasting the poll results. Could other forecasting techniques that don’t need an outside social media data source like Twitter do just as well? In this post I will answer that question by comparing the Twitter model to a very common (and quite successful) forecasting technique: the autoregressive moving average (ARMA) model.
I analyzed the same data used previously. However, I studied the evidence a posteriori, that is, after the fact. Comparing forecast methods on live data introduces variability in the information available. For example, not all of the past polls are always available on a given day. This means that a method like ARMA, which relies solely on historical results, may be at a disadvantage on a day when past data is unavailable. Also, as new poll information comes in each day, the historical data may change, making it difficult to tell whether a forecast improvement is due to the methodology or to more historical results being available.
To do this comparison I analyzed the methods a week after the election, using all available poll data. For this reason, the forecasts based on the Twitter data in this study will be different from those of the previous post, which were made using incomplete poll data.
A few technical details for those who are computationally inclined: To keep the comparison like for like, the ARMA model and the Vector Autoregression with eXogenous Variable (VARX) model both consider polls from the previous four days in making their forecasts (a time lag of four periods). The lagged data was fully weighted in both models. For this reason, the VARX and ARMA models should track each other well in the absence of any real information from the Twitter variable. When there is a substantial difference between the methods, this can be attributed solely to information from Twitter.
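For readers who want to experiment with the idea, here is a minimal sketch of the comparison. It is not the code behind this post. It assumes a single poll series rather than a vector of parties, drops the moving-average term (so it is a plain autoregression fitted by least squares), and assumes the Twitter signal for the day being forecast is already known. The function names `lagged_design` and `fit_and_forecast` are hypothetical, and the data is synthetic, not the Ontario polls.

```python
import numpy as np

def lagged_design(y, p, exog=None):
    """Regression matrix: intercept, previous p values of y, optional exog at time t."""
    y = np.asarray(y, dtype=float)
    rows = []
    for t in range(p, len(y)):
        row = [1.0] + [y[t - k] for k in range(1, p + 1)]
        if exog is not None:
            row.append(float(exog[t]))  # assumption: same-day outside signal
        rows.append(row)
    return np.array(rows), y[p:]

def fit_and_forecast(y, p=4, exog=None, next_exog=None):
    """Fit by ordinary least squares and forecast one step past the end of y."""
    y = np.asarray(y, dtype=float)
    X, target = lagged_design(y, p, exog)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    last = [1.0] + [y[-k] for k in range(1, p + 1)]
    if exog is not None:
        last.append(float(next_exog))
    return float(np.dot(last, coef))

# Synthetic stand-in data (NOT real polls or tweets): a poll share that
# reverts toward 35 and is nudged by an outside "Twitter" signal.
rng = np.random.default_rng(0)
n_days = 40
twitter = rng.normal(size=n_days)
polls = np.full(n_days, 35.0)
for t in range(1, n_days):
    polls[t] = 35.0 + 0.6 * (polls[t - 1] - 35.0) + 1.5 * twitter[t] + rng.normal(scale=0.3)

# Hold out the last day and forecast it both ways, each with a lag of four.
history, signal = polls[:-1], twitter[:-1]
ar_only = fit_and_forecast(history, p=4)
with_exog = fit_and_forecast(history, p=4, exog=signal, next_exog=twitter[-1])
print("actual:", polls[-1], "history only:", ar_only, "with outside signal:", with_exog)
```

Because both fits see the same four lags, any gap between the two forecasts comes from the extra column, which mirrors the reasoning above about attributing differences to the Twitter variable.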
Figure 1 summarizes both forecasting techniques against the aggregate polls. Both forecasting models track the actual polls quite well. The largest deviation between the ARMA model and the VARX model (Twitter forecast) is around June 3, the day of the Ontario leaders debate. The debate resulted in a popularity increase for the Progressive Conservatives (PC) and a drop for the Liberals. Prior to the debate, the Liberals had been steadily climbing in the opinion polls, while the PCs were declining.
The ARMA model forecasts that the Liberals and the PCs stay essentially the same. This is expected, since the ARMA model has only historical information and cannot “foresee” a change related to an event such as the debate. This is where a correction from an outside variable, such as Twitter, might have the most effect. Sure enough, we see that the Twitter model adjusts the ARMA forecast down for the Liberals and up for the PCs by almost exactly the amount seen in the polls.
We can compare the actual error of each forecasting method for each day (Figure 2). On most days, the error in the two methods is roughly the same, demonstrating that most of the time the Twitter-infused model mirrors the standard ARMA forecast. Around the June 3 leaders debate, the error in the Twitter forecast is much lower than in the ARMA forecast. This shows that when an event significantly changes poll values, the Twitter forecast is much more accurate than the standard ARMA forecast.
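A day-by-day error comparison of this kind can be sketched in a few lines. The numbers below are invented purely for illustration and are not the values behind Figure 2. The helper names `daily_abs_error` and `relative_improvement` are hypothetical, and the improvement formula is only one possible definition, since the post does not state how its percentages were computed.

```python
import numpy as np

def daily_abs_error(forecast, actual):
    """Absolute forecast error for each day."""
    return np.abs(np.asarray(forecast, dtype=float) - np.asarray(actual, dtype=float))

def relative_improvement(err_base, err_new):
    """Percent by which the baseline error exceeds the new error, relative to the new error.
    Assumption: one of several reasonable definitions, not necessarily the one used in the post."""
    return 100.0 * (err_base - err_new) / err_new

# Made-up poll values for three days; the third day stands in for an event-driven shift.
actual = [36.0, 37.0, 35.0]
arma = [36.5, 37.5, 37.0]   # history-only forecast misses the shift
varx = [36.4, 37.4, 35.5]   # forecast with an outside signal follows it more closely

err_arma = daily_abs_error(arma, actual)
err_varx = daily_abs_error(varx, actual)
print("daily error, history only:", err_arma)
print("daily error, with outside signal:", err_varx)
print("overall improvement (%):", relative_improvement(err_arma.mean(), err_varx.mean()))
print("improvement on the event day (%):", relative_improvement(err_arma[-1], err_varx[-1]))
```

Comparing the overall figure with the event-day figure shows how a model can look only modestly better on average while being far better on the days that matter most.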
ARMA forecasting is a standard, accepted and accurate technique for extrapolating poll results into the future. It is most often used to estimate a poll value before the actual data is available. However, ARMA works only on historical data. Events happening in real time, like a debate, can significantly change poll values in ways that are undetectable by an ARMA forecast.
Adding information from a social media source (such as Twitter) to an ARMA forecast can significantly improve the forecast. Using Twitter, the forecast can track changes in the polls that are related to external events, such as a debate. Overall, using Twitter as an exogenous variable for an autoregression model results in a 20% improvement in forecasting accuracy. When polls are changing due to external events, such as debates, the improvement can be as much as 350%. This improvement in accuracy demonstrates that social streams should be included in poll forecasting to ensure the most accurate results.
Featured Image courtesy of Wikimedia Commons, the free media repository