Social science researchers should get serious about using Google or other search data. Someone has to figure out what we can and cannot get from this amazing data. Here is some material to help motivate work on this issue.
In each of the figures below, I compare real demographic data on state population composition with Google search patterns, using the Google Correlate tool I’ve used before for divorce rates and Obama votes. If, as you can see, the correlation between the percentage of the population that is non-Latino, single-race White and searches for “back in black guitar tab” is .89, what does it mean?
In case you’re prepared to be offended, remember that these terms are not most of what these groups search for, or most of the searches in these areas. Rather, they are the things that are searched for in these states and not in other states. So, people in all groups search for porn and shopping and restaurant reviews and health conditions, but these are the things that differentiate the states.
In each of the cases I’ve selected, I strongly suspect that the searchers using these terms are mostly the people in the demographic. But what good is it, and what are the risks?
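To make the logic concrete, here is a minimal sketch of a state-level correlation, assuming the statistic is an ordinary Pearson correlation between a demographic percentage and a search rate. The five "states" and all the numbers are invented for illustration; they are not Google Correlate data. The sketch shows why a term everyone searches for heavily can have almost no correlation, while a rare term that tracks the demographic can have a very high one.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Invented values for five hypothetical states (not real data).
pct_group = [10.0, 20.0, 30.0, 40.0, 50.0]     # group's share of state population
popular_term = [90.0, 91.0, 89.0, 90.0, 90.0]  # searched heavily everywhere
niche_term = [1.0, 2.1, 2.9, 4.2, 5.0]         # rare, but rises with the share

r_popular = pearson(pct_group, popular_term)
r_niche = pearson(pct_group, niche_term)

print(f"popular term: r = {r_popular:.2f}")
print(f"niche term:   r = {r_niche:.2f}")
```

The popular term has far more searches in every state, but because its rate barely varies across states, it does not differentiate them. The niche term does.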
White-alone, non-Hispanic and “back in black guitar tab” (4th highest correlation):
Many of the terms correlated with the White population were about music, such as “walkin on sunshine,” “end of the world as we know it,” “wayward son,” and even “safety dance.”
Black-alone and “regina belle” (top correlation):
Also on the list are several terms about Black colleges, the Pan-Hellenic system, Essence and Ebony, the Obamas and BET.
Latino and “solo tu lyrics” (top correlation):
Most of these are in Spanish and about pop culture.
Asian and “double eyelid” (top correlation):
(I removed Hawaii, which is an extreme outlier, but it didn’t make much difference.)
Lots of these are Korean words, and things about beauty.
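The kind of check described for Hawaii can be sketched as follows: compute the correlation with every state, drop the outlier, and compute it again. This is only an illustration with invented numbers and a hypothetical state labeled "Outlier"; it is not the actual data behind the figure, and the helper names are my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_state(names, xs, ys, name):
    """Return xs and ys with the observation for one named state removed."""
    kept = [(x, y) for n, x, y in zip(names, xs, ys) if n != name]
    return [x for x, _ in kept], [y for _, y in kept]


# Invented values (not real data): five ordinary states plus one extreme case.
states = ["A", "B", "C", "D", "E", "Outlier"]
pct_group = [1.0, 2.0, 3.0, 4.0, 5.0, 40.0]
search_rate = [1.2, 1.9, 3.2, 3.8, 5.1, 38.0]

r_all = pearson(pct_group, search_rate)
xs, ys = drop_state(states, pct_group, search_rate, "Outlier")
r_without = pearson(xs, ys)

print(f"all states:      r = {r_all:.3f}")
print(f"outlier removed: r = {r_without:.3f}")
```

In this made-up case the relationship holds among the ordinary states too, so removing the outlier changes the correlation very little. If the outlier were driving the result, the second number would drop sharply.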
American Indian alone and “native threads” (top correlation):
The biggest group here is about government agencies, like the Indian Health Service and Bureau of Indian Affairs; also beading, stitching and music.
Population age 65+ and “fosomax” (2nd highest correlation):
Most of these are old-age related health conditions and drugs (e.g., aortic stenosis), as well as Social Security and sympathy-related quotes (e.g., “losing someone quotes”), “the time of my life lyrics,” and “new family guy season.”
Population with BA degree or higher and “passport expiration” (5th highest correlation):
The Economist features in several places on this list, as does iPod stuff, things about travel to Europe (e.g., exchange rates), and “what the dog saw,” “baby jogger” and “index funds.”
See the complete list in a PDF document here.






