Google Trends has become an accepted tool for academic research. A researcher can enter a keyword, such as “iPhone”, and see the search volume for that keyword over time, normalized by the total search volume. Similar social media tools, like Topsy, report the number of tweets mentioning the keyword over time, without normalizing. These raw counts – or impressions – are great for advertisers and marketers. But how do raw, unnormalized counts fare for serious research?
The research community has not yet embraced social media analytics the way it has Google Trends. This could simply be a case of academia being slow to adopt new technology (Google’s Internet search engine is now 16 years old, while social media has only taken off in the last five years). A first step towards acceptance is demonstrating that the same results can be achieved with both tools. In this post, I compare trends for the keyword “iPhone” on Google and Twitter, from 2011 through 2014, in the months when new iPhones were released: Sept. 2011 (iPhone 4s), Oct. 2012 (iPhone 5), Oct. 2013 (iPhone 5s & 5c), and Oct. 2014 (iPhone 6 & 6+). I chose to study the iPhone release since it has a strong, isolated peak in both Google search volume and Twitter mentions over these four years.
Google shows an interesting pattern
First, let’s look at Google Trends for the keyword “iPhone” (Figure 1). This graph has some striking aspects. The peaks (corresponding to the weeks when each new iPhone was released) are not steadily increasing over time. This is surprising, since first-weekend iPhone sales have increased steadily year over year: iPhone 4s (4M), iPhone 5 (5M), iPhone 5s and 5c (9M), and iPhone 6 and 6+ (10M). This shows very concretely that interest, as expressed by search queries, is not a predictor of consumer demand.
Figure 1: Google Trends for keyword “iPhone” from September 2011 to October 2014.
Next, let’s look at the magnitude of the peaks. The highest peaks correspond to the iPhone 5 and iPhone 6 releases, while the lowest peaks occur for the iPhone 4s, 5s and 5c. Qualitatively, this shows there is less search interest for the incremental releases (the iPhone 4s, 5s and 5c) than for the new platforms. This makes sense, since the incremental releases typically are minor hardware improvements, lacking the new design and features that stoke public curiosity.
Perhaps we need to treat the full releases and the incremental releases separately: compare the iPhone 5 peak with the iPhone 6 and 6+ peak, and the iPhone 4s peak with the iPhone 5s and 5c peak. But even within these groups, search volume is a poor predictor of consumer demand: whether you compare the major releases or the incremental ones, interest in the iPhone appears to decrease over time.
Twitter contradicts Google
Let’s focus on the raw Twitter data. I produced a single sample of Twitter users and collected their tweets for the two months around the iPhone releases: 200,972 tweets from Sept. 15, 2011 to Nov. 15, 2011; 416,273 tweets from Aug. 15, 2012 to Oct. 15, 2012; 582,281 tweets from Aug. 15, 2013 to Oct. 15, 2013; and 993,576 tweets from Aug. 15, 2014, to Oct. 15, 2014.
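The counting step for a sample like this can be sketched as follows. The tweet record layout (user id, date, text) is an assumption for illustration, not the original pipeline:

```python
from collections import Counter
from datetime import date, timedelta

def weekly_mentions(tweets, keyword="iphone"):
    """Count tweets mentioning `keyword`, bucketed by week.

    `tweets` is an iterable of (user_id, date, text) tuples --
    a hypothetical stand-in for the sampled Twitter data.
    """
    counts = Counter()
    for user_id, day, text in tweets:
        if keyword in text.lower():
            # Bucket by the Monday that starts the tweet's week.
            week_start = day - timedelta(days=day.weekday())
            counts[week_start] += 1
    return counts

# Toy sample: three tweets in the iPhone 5 launch window.
sample = [
    (1, date(2012, 9, 12), "Watching the iPhone 5 keynote"),
    (2, date(2012, 9, 12), "New maps app looks rough"),
    (3, date(2012, 9, 21), "Got my iPhone 5 today!"),
]
print(weekly_mentions(sample))
```

Bucketing by the Monday of each week keeps the counts directly comparable to Google Trends’ weekly resolution.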
Figure 2 shows the counts of Twitter users mentioning the iPhone in these periods compared with the Google search volume. The difference is obvious. While both Twitter and Google show increased interest for the new platform releases (iPhone 5, 6 and 6+), Twitter shows increasing interest over time while Google shows a decrease. According to Google, fewer people queried iPhone for the iPhone 6 and 6+ launch than for the iPhone 5 launch. In contrast, Twitter says that more people were discussing the iPhone 6 and 6+ than when the iPhone 5 was released.
Figure 2: Twitter mentions (orange) compared to Google search volume (grey) for the keyword iPhone. 95% sampling confidence for Twitter mentions is shown as the yellow ribbon.
Does this mean that Google is wrong and Twitter is correct? Before jumping to conclusions, consider that Twitter more than doubled its active users between Sept. 2011 and Oct. 2014, growing from 101M to 284M. This growth alone could account for the rising counts. The only way to tell is to normalize the mentions by the total number of active users.
Rectifying Twitter and Google
To normalize the Twitter data, I counted (from the sample) the number of unique users posting each week. I then divided the number of people mentioning iPhone each week by this active users count. Figure 3 shows the comparison of normalized Twitter data to Google data.
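This normalization can be sketched like so. The per-tweet record layout is again an assumption; the idea is simply to divide, per week, the users mentioning the keyword by all users seen posting:

```python
from collections import defaultdict

def normalized_mentions(tweets, keyword="iphone"):
    """Fraction of active users mentioning `keyword`, per week.

    `tweets` is an iterable of (user_id, week, text) tuples, where
    `week` is any hashable week label (e.g. "2012-W38").
    """
    active = defaultdict(set)      # week -> all users seen that week
    mentioning = defaultdict(set)  # week -> users mentioning keyword
    for user_id, week, text in tweets:
        active[week].add(user_id)
        if keyword in text.lower():
            mentioning[week].add(user_id)
    return {week: len(mentioning[week]) / len(active[week])
            for week in active}

sample = [
    (1, "2012-W38", "iPhone 5 keynote today"),
    (2, "2012-W38", "nothing about phones"),
    (1, "2012-W38", "still talking iPhone"),
    (3, "2012-W39", "queue for the iPhone"),
]
print(normalized_mentions(sample))
```

Counting unique users (sets, not raw tweet counts) on both sides of the division keeps a single prolific user from inflating a week’s fraction.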
Figure 3: Normalized Twitter mentions (orange) compared to Google search volume (grey) for the keyword iPhone. 95% sampling confidence for Twitter mentions is shown as the yellow ribbon.
Normalizing the data brings Twitter and Google into line. The agreement between the two data sets is very good, with the Google data falling well within the 95% confidence interval of the Twitter sample. This shows that Twitter data, when properly normalized, gives the same result as Google! When properly measured, social media should be as trusted a tool for research as Google Trends.
This demonstrates that unnormalized counts produce misleading results. Using Google Trends as ground truth, raw Twitter mention counts disagree qualitatively with Google search volume, while normalized counts agree with it quantitatively.
Unfortunately, most social media tools do not normalize their data, reporting raw counts instead. And none of these tools (or Twitter) will release total daily, weekly or monthly users by which to normalize the data, correcting for the steady increase over time that is inherent in the Twitter data. To get accurate, normalized data, you need to create your own, properly sampled data set.
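For a sampled data set like this, the 95% sampling ribbon in Figures 2 and 3 can be approximated with the standard normal-approximation interval for a proportion. This is a sketch of the usual textbook formula, not necessarily the exact method used for the figures:

```python
import math

def proportion_ci(mentions, active_users, z=1.96):
    """95% normal-approximation confidence interval for the
    fraction of sampled active users mentioning the keyword
    in one week (z = 1.96 for 95% coverage)."""
    p = mentions / active_users
    half_width = z * math.sqrt(p * (1 - p) / active_users)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# E.g. 1,200 of 40,000 sampled users mention the keyword.
low, high = proportion_ci(1200, 40000)
print(f"{low:.4f} - {high:.4f}")
```

The interval narrows as the weekly sample of active users grows, which is why a larger user sample gives a tighter ribbon around the Twitter curve.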
- Preis, Tobias, Helen Susannah Moat, H. Eugene Stanley, and Steven R. Bishop. “Quantifying the advantage of looking forward.” Scientific Reports 2 (2012).
- Preis, Tobias, Helen Susannah Moat, and H. Eugene Stanley. “Quantifying trading behavior in financial markets using Google Trends.” Scientific Reports 3 (2013).
- The 95% confidence interval is the estimated range the value would take 19 out of 20 times if the experiment were repeated.