Official Blog
Insights from Googlers into our products, technology, and the Google culture
Mining patterns in search data with Google Correlate
May 25, 2011
It all started with the flu. In 2008, we found that the activity of certain search terms is a good indicator of actual flu activity. Based on this finding, we launched Google Flu Trends to provide timely estimates of flu activity in 28 countries. Since then, we’ve seen a number of other researchers, including our very own, use search activity data to estimate other real-world activities.
However, tools that provide access to search data, such as Google Trends or Google Insights for Search, weren’t designed with this type of research in mind. Those systems let you enter a search term and see its trend, but researchers told us they wanted the opposite: to enter the trend of some real-world activity and see which search terms best match it. In other words, they wanted a system like Google Trends, but in reverse.
This is now possible with Google Correlate, which we’re launching today on Google Labs. Using Correlate, you can upload your own data series and see a list of search terms whose popularity best corresponds with that real-world trend. For example, we uploaded official flu activity data from the U.S. CDC over the last several years and found that people search for terms like [cold or flu] in a pattern similar to actual flu rates. Finding these correlated terms is how we built Google Flu Trends.
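To make the "Trends in reverse" idea concrete, here is a minimal sketch of ranking search terms against an uploaded series. It assumes the Pearson correlation coefficient as the similarity measure, which this post does not specify, and it compares every term by brute force. The function names `pearson` and `rank_terms` and all of the numbers are hypothetical and made up for illustration; they are not real CDC or search data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    if denom == 0:
        # A constant series has no defined correlation; treat it as 0 here.
        return 0.0
    return sum(a * b for a, b in zip(dx, dy)) / denom


def rank_terms(target, term_series, top_n=3):
    """Return the top_n (term, correlation) pairs, best match first."""
    scored = [(term, pearson(target, series))
              for term, series in term_series.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up weekly values, for illustration only.
flu_activity = [1.0, 2.0, 4.0, 7.0, 5.0, 2.0, 1.0]
term_series = {
    "cold or flu": [10, 19, 42, 68, 51, 22, 9],
    "sunscreen": [70, 60, 30, 10, 25, 55, 72],
    "pizza": [50, 51, 49, 50, 52, 50, 49],
}

ranked = rank_terms(flu_activity, term_series)
for term, r in ranked:
    print(f"{term}: {r:.3f}")
```

In this toy data, [cold or flu] rises and falls with the uploaded series and ranks first, while [sunscreen] moves in the opposite direction and ranks last. A real system would have to compare against a very large number of terms, so an exhaustive loop like this one is only meant to show the idea.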
You can also enter a search term such as [ribosome] and find other terms whose activity corresponds well over time with the one you’re interested in.
It turns out cell biology isn’t all that popular in the summertime (sorry, biologists!). What’s interesting is that the ups and downs of web search activity for cell biology terms are distinctive enough that searching on Correlate for [ribosome] brings up searches for other biology terms, such as [mitochondria]. Of course, correlation isn’t the same thing as causation, so we can’t tell from the data alone why two terms follow the same pattern. But my guess in this case is that both terms are popular when schools teach these concepts.
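The school-calendar guess can be illustrated with a small sketch: two series that never influence each other still correlate strongly when both follow the same seasonal driver. The monthly `school_in_session` index, the baselines, the scales, and the small per-term wiggles below are all invented for illustration, and Pearson correlation is again an assumed measure.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    denom = sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom


# Hypothetical monthly index, Jan..Dec: 1 while school is in session, 0 in summer.
school_in_session = [1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1]

# Made-up term-specific wiggles, unrelated to each other.
wiggle_a = [3, -2, 1, 0, -1, 2, -3, 1, 0, 2, -1, -2]
wiggle_b = [-4, 1, 2, -3, 0, 1, 2, -2, 3, -1, 0, 1]

# Each term has its own baseline and scale, but both follow the school calendar.
ribosome = [20 + 60 * s + w for s, w in zip(school_in_session, wiggle_a)]
mitochondria = [35 + 90 * s + w for s, w in zip(school_in_session, wiggle_b)]

r_terms = pearson(ribosome, mitochondria)
r_driver = pearson(ribosome, school_in_session)
print(f"ribosome vs mitochondria: {r_terms:.3f}")
print(f"ribosome vs school calendar: {r_driver:.3f}")
```

Neither series is computed from the other, yet their correlation is close to 1 because both depend on the shared calendar. The correlation alone cannot distinguish this from one term driving the other.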
Search activity is an incredible source of data that may lead to advances in economics, health and other fields, but we need to handle that data with privacy controls in mind. With this system, we don’t care what any one person is searching for. In fact, we rely on millions of anonymized search queries issued to Google over time, and the patterns we observe in the data are only meaningful across large populations.
We encourage you to read our white paper describing the methodology behind Google Correlate. Or for lighter reading, check out our comic! We’ve enjoyed uploading different data sets to see fascinating and sometimes perplexing correlations. Plug in your data and let us know what you find.
Posted by Matt Mohebbi, Software Engineer