Author Archives: Philip N. Cohen

Why Heritage is wrong on the new Census race/ethnicity question

Sorry this is long and rambly. I just want to get the main points down and I’m in the middle of other things. I hope it helps.

Mike Gonzalez, a Bush-era speech writer with no background in demography (not that there’s anything wrong with that), now a PR person for the Heritage Foundation, has written a noxious and divisive op-ed in the Washington Post that spreads some completely wrong information about the U.S. Census Bureau’s attempts to improve data collection on race and ethnicity. It’s also a scary warning of what the far right politicization of the Census Bureau might mean for social science and democracy.

Gonzalez is upset that “the Obama administration is rushing to institute changes in racial classifications,” which include two major changes: combining the Hispanic/Latino Origin question with the Race question, and adding a new category, Middle Eastern or North African (MENA). Gonzalez (who, it must be noted, perhaps with some sympathy, recently wrote one of those useless books about how the Republican party can reach Hispanics, made instantly obsolete by Trump), says that what Obama has in mind “will only aggravate the volatile social frictions that created today’s poisonous political climate in the first place.” Yes, the “poisonous political climate” he is upset about (did I mention he works for the Heritage Foundation?) is the result of the way the government divides people by race and ethnicity. Not actually dividing them, of course (which is a real problem), but dividing them on Census forms. (I hadn’t heard this particular version of why Trump is Obama’s fault — who knew?)

How will the new reforms make the Trump situation he helped create worse? Basically, by measuring race and ethnicity, which Gonzalez would rather not do (as suggested by the title, “Think of America as one people? The census begs to differ,” which could have been written at any time in the past two centuries).

Specifically, Gonzalez claims, completely factually inaccurately, that Census would “eliminate a second question that lets [Hispanics] also choose their race.” By combining Hispanic origin and race into one question — on which, as before, people will be free to mark as many responses as they like — Gonzalez thinks Census would “effectively make ‘Hispanic’ their sole racial identifier.” He is upset that many Latinos will not identify themselves as “White” if they have the option of “Hispanic” on the same question, even if they are free to mark both (which he doesn’t mention). Some will, but that is not because anyone is taking away any of their choices.

The Census Bureau, of course, because they always do, because they are excellent, has done years of research on these questions, including all the major stakeholders in a long interactive process that is scrupulously documented and (for a government bureaucracy) quite transparent. Naturally not everyone is happy, but in the end the trained demographic professionals come down on the side of the best science.

Race that Latino

The most recent report on the research I found was a presentation by Nicholas Jones and Michael Bentley from the Census Bureau. This is my source for the research on the new question.

First, why combine Hispanic with race? You have probably seen the phrase “Hispanics may be of any race” on lots of reports that use Census or other government data. The figure below is from the first edition of my book, using 2010 data, in which I group all 50 million Hispanics, and show the races they chose: about half White, the rest other race or more than one race (usually White and other race). Notice that by this convention Hispanics are removed from the White group anyway, just because we don’t want to have people in the same picture twice (“non-Hispanic Whites” is already a common construction).

family_fig03-02

The “may be of any race” language is the awkward outcome of an approach that treats Hispanic as an “ethnicity” (actually a bunch of national origins, maybe a panethnicity), while White, Black, Asian, Pacific Islander, and American Indian are treated as “races.” The distinction never really made sense. These things have been measured using self-identification for more than half a century, so we’re not talking about genetics and blood tests, we’re talking about how people identify themselves. And there just isn’t a major categorical difference between race and ethnicity for most people: people of any race or ethnicity may identify with a specific national origin (Italian, Pakistani, Mexican), as well as a “race” or panethnic identity such as Asian, or Latino. And now that the government allows people to select multiple races (since 2000), as well as answering the Hispanic question, there really is no good justification for keeping them separate. As you can see from my figure above, when we analyze the data we mostly pull all the Hispanics together regardless of their races. The new approach just encourages them to decide how they want that done, which is usually a better approach.

Of course, Asians and Pacific Islanders have been answering the “race” question with national origin prompts for several decades. There was no “Asian” checkbox in 2000 or 2010 (or on the American Community Survey). So they have been using their ethnicity to answer the race question all along. That’s because for some reason you just can’t get “Asian” immigrants, especially recent immigrants (that is, people from India, Korea, Japan, Vietnam, and so on), to see themselves as part of one panethnic group. Go figure, must be the centuries of considering themselves separate peoples, even “races.” So, a new question that combines the more ethnic categories (Mexican, Pakistani, etc.), with America’s racial identities (Black, White, etc.), just works better, as long as you let people check as many boxes as they want. This is what the “race” question looked like in 2014. Note there is no “Asian” checkbox:

acsrace2014

As a general guide, the questionnaire scheme works best when (a) everyone has a category they like, and (b) few people choose “other.” That is the system that will yield the most scientifically useful data. It also will tend to match the way people interact socially, including how they discriminate against each other, burn crosses on each other’s lawns, and randomly attack each other in public. We want data that helps us understand all that.

Through extensive testing, it has become apparent that, when given a question that offers both race and Hispanic origin together, Latino respondents are much more likely to answer Hispanic/Latino only, rather than cluttering up the race question with “some other race” responses (often writing in “Hispanic” or “Latino” as their “other race”). If I read the presentation right, in round numbers, given the choice of answering the “race” question with “Hispanic,” in the test data about 70% chose Hispanic alone, about 20% chose White along with Hispanic, and 5% chose two races. In fact, the number of Latinos saying their only race is White probably won’t change much; the biggest difference is that you no longer have almost 40% of Latinos saying they are “some other race,” or choosing more than one race (usually White and Other), which usually just means they don’t see a race that fits them on the list.

In the end, the sizes of the major groups (Hispanics and the major races) are not changed much. Here’s the summary:

betterhispanic

In fact, the only major group that will shrink is probably the non-group “multiracial” population, which today is dominated by Hispanics choosing White and “some other race.”

It’s really just better data. It’s not a conspiracy. It’s not eliminating the White race or discouraging assimilation of Hispanics. In short, keep calm and collect better data. We can fight about all that other stuff, too.

I’m sure Gonzalez doesn’t really think this will “eliminate Hispanics’ racial choices.” He’s dog-whistling to people who think the government is trying to reduce the number of Whites by not letting Hispanics be White. His statements are factually incorrect and the Washington Post shouldn’t have printed them. (I don’t know how the Post does Op-Eds; when I wrote one for the NY Times it was pretty thoroughly fact-checked.)

MENA

The Migration Policy Institute estimates there are about 2 million MENAs in the U.S. now, about half of them immigrants. This is a pretty small population, mostly Arabic-speaking immigrants and their descendants, and more Christian (relative to Muslim) than the countries they left. This is especially true of the more recent immigrants, who don’t include a lot of Iranians (who aren’t Arab).

Census could have instead defined them by linguistic origin (Arabic), and captured most, but they instead are going with country of origin, which is consistent with how the other race/ethnic groups are identified (for better or worse). Their testing showed that this measure captures most people with MENA ancestry, encourages them to identify their ancestry, cuts down on them identifying as White, and cuts down on them using “some other race.”

The difference is dramatic for those identifying as White, which fell from 85% to 20% in the test once a MENA category was offered. Would it be better if they just identified as White? I’m really not trying to shrink the count of Whites, I just think this is more accurate. I don’t care about the biology of Whiteness and whether Iranians are part of it, for example (and don’t ever say “Caucasian,” please), I care about the experience and identity of the people we’re talking about — as well as the beliefs of the people who hate them and those who want to protect them from discrimination. Counting them seems better than shoehorning them into a category most of them avoid when given the chance.

Here’s one version of the proposed new combined question, from that Census presentation:

newraceq

Yuck

Why not Mike Gonzalez to run Census? Unbelievably, he probably knows more about it than Trump’s education and HUD department heads know about their new portfolios.

But that’s just one odious possibility. It makes me kind of sick to think of the possible idiots and fanatics Trump might put in charge of the Census Bureau, after all this work on research and testing, designed to get the best data we can out of a very messy and imperfect situation. 

What else would they do? Will they continue to develop ways to identify and count same-sex couples? The Supreme Court says they can get married, but there is no law that says the Census Bureau has to count them. What about multilingual efforts to reach immigrant communities? This has been a focus of Census Bureau development as well. And so on.

It is absolutely in Trump’s interest, and the interests of those he serves (not the people who voted for him), to reduce the quality and quantity of social science data the government produces and enables us to produce.


Filed under Uncategorized

My, what dimorphic parents you have!

Quick note to add the new Disney princess movie Moana to the animated gender series.

As in the case of Hercules, Disney can claim that the giant male Maui is a demigod so it’s normal that he’s many times larger than the princess, Moana. (There are a lot of large-bodied people in some Polynesian societies, but I don’t think this is a sex-specific pattern.) So instead look at Moana’s parents.

moanaparents

His big toe has the same diameter as her wrist. His unflexed bicep is wider than her waist. (Sources say the voice actor for Maui has 20-inch biceps, while a real life-sized Barbie doll would have an 18-inch waist, compared with 31 inches for a typical 19-year-old woman.) Anyway, it’s ridiculous.

But this is not unusual for animated kids-movie parents. Here are the parents from Brave and How to Train Your Dragon:

braveparents

dragonparents

So, extreme dimorphism among parents is common in this genre. Why? I can’t say for sure, but here’s a clue — the parents from Frozen:

frozenparents

My, how similar their bodies are! Sure, her eyes are bigger than his mouth, and his hand is a little engorged, but that’s because there’s a baby in the scene. In the scale of things, they’re practically twins.

If the difference is in racial or ethnic context for the families, then maybe extreme dimorphism among parents helps signify the exoticism of the culture depicted. Of course Black men are often stereotyped as having superhuman bodies, but super petite women don’t usually go along with that particular trope, so I’m not sure how to interpret this. Ideas welcome.


Filed under In the news

How do Black-White parents identify their children?

In 2015 the American Community Survey yields an estimate of 66,913 infants who have one Black parent and one White parent present in the household. (Either parent may be multiracial, too.)

What is the race of those infants? 73% of them were identified as both White and Black by whoever filled out the Census form.

bwinfants

(Note “other” doesn’t mean they specified “other,” it just means they used some other combination of races.)

These are children age 0 living with both parents, so it’s a pretty good bet they’re mostly biological parents, though some are presumably adopted. This is based on a sample of 507 such infants. If you pooled some years of ACS there is plenty to study here. Someone may already have done this – feel free to post in the comments.
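For a rough sense of how much a sample of 507 can tell us, here is a minimal sketch of the usual normal-approximation interval for a proportion. It assumes simple random sampling, which the ACS is not (it has weights and a complex design), so treat the result as a loose lower bound on the uncertainty. The function name `proportion_interval` is just for illustration.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Normal-approximation interval for a proportion.

    Assumes simple random sampling; survey weights and design
    effects (which the ACS has) are ignored here.
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return p_hat - margin, p_hat + margin, se

# 73% identified as both White and Black, from a sample of 507 infants.
low, high, se = proportion_interval(0.73, 507)
print(f"SE = {se:.3f}, 95% interval = {low:.1%} to {high:.1%}")
```

Under those assumptions the interval runs from roughly 69% to 77%, which is part of why pooling several years of ACS would help.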

That’s it, just FYI.


Filed under In the news

Electoral representation by demographic group

I’m told that one point of our electoral system is to ensure representation of small states. That’s why small states get two senators even if they have tiny populations, and why each state gets at least three electors in the electoral college (equal to the size of their Congressional delegation). You could make a case for finding ways to make sure small groups are represented, even over-represented, because otherwise they would be ignored. So you discount California voters to make sure Wyoming voters get to be part of the process.

Regardless of the history, which suggests the electoral college was created to protect the interests of slave owners, it’s now the case that Whites have more power in the electoral college, because they dominate the small states. As Lara Merling and Dean Baker show, Blacks have 5% less representation, Latinos have 9% less, and Asian Americans have 7% less representation than Whites.

So it is unfair in its results by the contemporary race/ethnic distribution, but that’s not a fixed quality of the system (it’s merely very durable). Underlying the premise, though, is the idea that the identities to be represented are geographic in nature. There are some issues that have geographic boundaries, like land use or climate-related questions, but the point of an analysis like Merling and Baker’s (like much of Civil Rights law) is that identities also adhere in demographic groups, by gender, race/ethnicity, and age. So the geographic system creates inequities according to the demographic system. I don’t see why we should prioritize the geographic in our electoral system, now that geography is so much less of a defining feature in our communication systems and popular culture.

What if we redid the electoral system by the demographic categories of gender, race/ethnicity and age, and then let geographic groups complain if they end up underrepresented, instead of the other way around? Before you write to the governor (again) and demand that I be fired: This does not even rise to the level of a suggestion, it’s literally just a thought.

Here’s how it would look, if we divided 435 seats across 40 demographic identity states, using data from the 2015 American Community Survey from IPUMS.org*:

newhor

Compared with the 114th Congress (the one finishing now), this one is more diverse, with 224 instead of 108 women, 56 versus 38 Latinos, 24 versus 13 Asian/Pacific Islanders, and 8 versus 2 American Indians. Only Blacks fare a little worse, dropping from 47 to 44. This also gives us a great improvement in age diversity, as the current average age in the House is 57 and this distribution implies an average age of something like 47.

For comparison, here is the Electoral College we would get under this system, which simply adds two electors to each of these House of Representatives districts, representing their Senate delegations:

newec

Now instead of fighting over New Hampshire or Wyoming, presidential candidates would campaign for swing-groups such as middle-aged American Indians, or young Latinos.
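The arithmetic of this Electoral College is simple enough to sketch. The only inputs are the ones stated above (435 House seats, 40 demographic states, two extra electors each); the function names and the sample seat counts are hypothetical.

```python
def electors(house_seats):
    """Electors per group: its House seats plus two for a Senate delegation."""
    return {group: seats + 2 for group, seats in house_seats.items()}

def college_size(n_groups=40, house_total=435):
    """Total electors and the bare majority needed to win."""
    total = house_total + 2 * n_groups
    majority = total // 2 + 1
    return total, majority

total, majority = college_size()
print(total, majority)

# Hypothetical seat counts for two groups, just to show the mapping.
print(electors({"group A": 30, "group B": 1}))
```

With 40 groups that comes to 515 electors and 258 to win, and a one-seat group ends up with three electors, the same floor small states get now.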

This system would also have a built-in version of term limits, as people who aged out of their districts presumably would have to run in the next age group up. People who changed gender or race/ethnic identity could also switch districts.

Someone could take some voter or opinion data and figure out how our elections would turn out with this (if someone already has done this, please add it in the comments).


* Because they rounded to zero, I added one House seat to old American Indian men and women, and took one away from middle-aged White women, the largest group. Note also that we might have to redistrict this when the race categories change, as they are expected to in 2020, to add Middle Eastern / North Africans (MENAs).
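For anyone who wants to redo the apportionment, here is a minimal sketch of one way to do it. It is not the procedure I used (I just rounded and fixed the zeros by hand, as described in the note above); it swaps in the largest-remainder method so the seats always sum to the target, then guarantees every group at least one seat by taking from whichever group is largest. The group labels and populations are made up.

```python
def apportion(populations, seats=435, minimum=1):
    """Largest-remainder apportionment with a per-group seat floor."""
    total = sum(populations.values())
    quotas = {g: seats * p / total for g, p in populations.items()}
    alloc = {g: int(q) for g, q in quotas.items()}

    # Hand leftover seats to the groups with the largest fractional remainders.
    leftover = seats - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda g: quotas[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1

    # Raise any group below the floor, taking each seat from the largest group.
    for g in alloc:
        while alloc[g] < minimum:
            donor = max(alloc, key=alloc.get)
            alloc[donor] -= 1
            alloc[g] += 1
    return alloc

# Hypothetical populations for four groups and a 10-seat chamber.
demo = apportion({"A": 5_000_000, "B": 3_000_000, "C": 1_990_000, "D": 10_000}, seats=10)
print(demo)
```

In the demo, group D would round to zero, so it gets one seat at the expense of group A, the largest, which mirrors the adjustment in the note.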


Filed under Politics

’16 and Pregnant’ and less so

3419870216_fded1624d2_z

From Flickr/CC: https://flic.kr/p/6dcJgA

Regular readers know I have objections to the framing of teen pregnancy, as a thing generally and as a problem specifically, separate from the rising age at childbearing generally (see also, or follow the teen births tag).

In this debate, one economic analysis of the effect of the popular MTV show 16 and Pregnant has played an outsized role. Melissa Kearney and Phillip Levine showed that there was more decline in teen births in places where the show was popular, and attempted to establish that the relationship was causal: that the show makes people under age 20 want to have babies less. As Kearney put it in a video promoting the study: “the portrayal of teen pregnancy, and teen childbearing, is something they took as a cautionary tale.” (The paper also showed spikes in Twitter and Google activity related to birth control after the show aired.)

This was very big news for the marriage promotion people, because it was taken as evidence that cultural intervention “works” to affect family behavior — which really matters because so far they’ve spent $1 billion+ in welfare money on promoting marriage, with no effect (none), and they want more money.

The 16 and Pregnant paper has been cited to support statements such as:

  • Brad Wilcox: “Campaigns against smoking and teenage and unintended pregnancy have demonstrated that sustained efforts to change behavior can work.”
  • Washington Post: “By working with Hollywood to develop smart story lines on popular shows such as MTV’s ’16 and Pregnant’ and using innovative videos and social media to change norms, the [National Campaign to Prevent Teen and Unplanned Pregnancy] has helped teen pregnancy rates drop by nearly 60 percent since 1991.”
  • Boston Globe: “As evidence of his optimism, [Brad] Wilcox points to teen pregnancy, which has dropped by more than 50 percent since the early 1990s. ‘Most people assumed you couldn’t do much around something related to sex and pregnancy and parenthood,’ he said. ‘Then a consensus emerged across right and left, and that consensus was supported by public policy and social norms. . . . We were able to move the dial.’ A 2014 paper found that the popular MTV reality show ’16 and Pregnant’ alone was responsible for a 5.7 percent decline in teen pregnancy in the 18 months after its debut.”

I think a higher age at first birth is better for women overall, health permitting, but I don’t support that as a policy goal in the U.S. now, although I expect it would be an outcome of things I do support, like better health, education, and job opportunities for people of color and people who are poor.

Anyway, this is all just preamble to a new debate from a reanalysis and critique of the 16 and Pregnant paper. I haven’t worked through it enough to reach my own conclusions, and I’d like to hear from others who have. So I’m just sharing the links in sequence.

The initial paper, posted as a (non-peer reviewed) NBER Working Paper in 2014:

Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing, by Melissa S. Kearney, Phillip B. Levine

This paper explores how specific media images affect adolescent attitudes and outcomes. The specific context examined is the widely viewed MTV franchise, 16 and Pregnant, a series of reality TV shows including the Teen Mom sequels, which follow the lives of pregnant teenagers during the end of their pregnancy and early days of motherhood. We investigate whether the show influenced teens’ interest in contraceptive use or abortion, and whether it ultimately altered teen childbearing outcomes. We use data from Google Trends and Twitter to document changes in searches and tweets resulting from the show, Nielsen ratings data to capture geographic variation in viewership, and Vital Statistics birth data to measure changes in teen birth rates. We find that 16 and Pregnant led to more searches and tweets regarding birth control and abortion, and ultimately led to a 5.7 percent reduction in teen births in the 18 months following its introduction. This accounts for around one-third of the overall decline in teen births in the United States during that period.

A revised version, with the same title but slightly different results, was then published in the top-ranked American Economic Review, which is peer-reviewed:

This paper explores the impact of the introduction of the widely viewed MTV reality show 16 and Pregnant on teen childbearing. Our main analysis relates geographic variation in changes in teen childbearing rates to viewership of the show. We implement an instrumental variables (IV) strategy using local area MTV ratings data from a pre-period to predict local area 16 and Pregnant ratings. The results imply that this show led to a 4.3 percent reduction in teen births. An examination of Google Trends and Twitter data suggest that the show led to increased interest in contraceptive use and abortion.

Then last month David A. Jaeger, Theodore J. Joyce, and Robert Kaestner posted a critique on the Institute for the Study of Labor working paper series, which is not peer-reviewed:

Does Reality TV Induce Real Effects? On the Questionable Association Between 16 and Pregnant and Teenage Childbearing

We reassess recent and widely reported evidence that the MTV program 16 and Pregnant played a major role in reducing teen birth rates in the U.S. since it began broadcasting in 2009 (Kearney and Levine, American Economic Review 2015). We find Kearney and Levine’s identification strategy to be problematic. Through a series of placebo and other tests, we show that the exclusion restriction of their instrumental variables approach is not valid and find that the assumption of common trends in birth rates between low and high MTV-watching areas is not met. We also reassess Kearney and Levine’s evidence from social media and show that it is fragile and highly sensitive to the choice of included periods and to the use of weights. We conclude that Kearney and Levine’s results are uninformative about the effect of 16 and Pregnant on teen birth rates.

And now Kearney and Levine have posted their response on the same site:

Does Reality TV Induce Real Effects? A Response to Jaeger, Joyce, and Kaestner (2016)

This paper presents a response to Jaeger, Joyce, and Kaestner’s (JJK) recent critique (IZA Discussion Paper No. 10317) of our 2015 paper “Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing.” In terms of replication, those authors are able to confirm every result in our paper. In terms of reassessment, the substance of their critique rests on the claim that the parallel trends assumption, necessary to attribute causation to our findings, is not satisfied. We present three main responses: (1) there is no evidence of a parallel trends assumption violation during our sample window of 2005 through 2010; (2) the finding of a false placebo test result during one particular earlier window of time does not invalidate the finding of a discrete break in trend at the time of the show’s introduction; (3) the results of our analysis are robust to virtually all alternative econometric specifications and sample windows that JJK consider. We conclude that this critique does not pose a serious threat to the interpretation of our 2015 findings. We maintain the position that our earlier paper is informative about the causal effect of 16 and Pregnant on teen birth rates.

So?

There are interesting methodological questions here. It’s hard to identify the effects of interventions that are swimming with the tide of change. In fact, the creation of the show, the show’s popularity, the campaign to end teen pregnancy, and the rising age at first birth may all be outcomes of the same general historical trend. So I’m not that invested in the answer to this question, though I am very interested.

There are also questions about the publication process, which I am very invested in. That’s why I work to promote a working paper culture among sociologists (through the SocArXiv project). The original paper was posted on a working paper site without peer review, but NBER is for economists who already are somebody, so that’s a kind of indirect screening. Then it was accepted in a top peer-reviewed journal (somewhat revised), but that was after it had received major attention and accolades, including a New York Times feature before the working paper was even released and a column devoted to it by Nicholas Kristof.

So is this a success story of working paper culture gone right, driving attention to good work faster, and then also drawing the benefits of peer review through the traditional publication process (and now continuing with open debate on non-gated sites)? Or is it a case of political hype driving attention inside and outside of the academy, the kind of thing that scares researchers and makes them want to retreat behind the slower, more process-laden research flow, which they hope will protect them from exposure to embarrassment and protect the public from manipulation by the credulous news media? I think the process was okay even if we do conclude the paper wasn’t all it was made out to be. There were other reputational systems at work (faculty status, NBER membership, New York Times editors and sources) that may be as reliable as traditional peer review, which itself produces plenty of errors.

So, it’s an interesting situation — research methods, research implications, and research process.


Filed under Research reports

Don’t flatter yourself, Trump’s America

So Hillary Clinton apparently said,

My dream is a hemispheric common market, with open trade and open borders, some time in the future with energy that is as green and sustainable as we can get it, powering growth and opportunity for every person in the hemisphere.

What’s the big deal? Anyone who doesn’t occasionally dream of open borders either hasn’t dreamed much or doesn’t have very high ambitions for the unity of the human race. (She says now it was really just about energy policy, but come on.)

Of course, in the context of speaking to a bunch of .01% bankers, that’s not really the point: it’s a signal that she leans in their direction on “free” trade and movement of labor, so it’s probably not quite the unicorn-style dream I have in mind (I already criticized her currently-expressed vision on this, which reflects her adaptation to the political moment, and might last the rest of her career.)

Anyway, my point is about Trump’s reaction (video).

…frankly when you’re working for Hillary, she wants to let people just pour in. You could have 650 million people pour in and we do nothing about it. Think of it, that’s what could happen. You triple the size of our country in one week.

Two points. First, this is idiotic. I can only guess that whoever gave him that number was adding together the entire populations of North and South America, minus the U.S., which is actually almost exactly 650 million. So, in a week, if allowed, every single person in our hemisphere would move into the US.

Now, those of us in the dream-of-open-borders community do dream of these things. I wrote this once, after imagining combining the populations of the US and Central American countries, as well as Israel with the occupied territories:

This simplistic analysis yields a straightforward hypothesis: violence and military force at national borders rises as the income disparity across the border increases. … The demographic solution is obvious: open the borders, release the pressure, and devote resources to improving quality of life and social harmony instead of enforcing inequality. You’re welcome!

I wasn’t really talking about people moving, but rather about borders moving, or being taken down. How many people would actually move is an interesting question, one which I hope will be important one day.

For perspective, you might compare Trump’s fear-mongering in scale to the largest ever migration of people, the movement into the cities of China, during which something like 340 million people moved in about 20 years. It wasn’t pretty! It also wasn’t an “open borders” situation, as most of them weren’t really allowed to move, resulting in a bad situation of second-class citizenship for many of the migrants and their children. Thankfully, it also didn’t take place in a week — although just the annual new year travel in China (which largely results from the separations their great migration has caused) generates some of the most spectacular traffic images ever:

pay-traffic-jam-on-beijinghong-kongmacau-expressway

Second, it’s insulting. Like Trump’s description of African Americans living in “hell,” where they have nothing to lose, “your schools are no good, you have no jobs,” etc.:

Many people have made the point that Trump’s sympathy regarding Black hardship is drowned out by his grotesque stereotyping and dehumanizing dismissal. I haven’t heard the same said about his 650-million-migrants claim, but it’s really the same thing.

I’m sure a lot of people would move to the US if the borders were opened. But I bet at least a few people somewhere between Canada and Chile would find a reason to stay in their homes. And not just because that would keep them away from us.


Related: Must-know demographic facts (it couldn’t hurt!)


Filed under Politics

Racist, sexist, and anti-Semitic jokes in Trump land

This post contains racist language.

Updated: See comment note and data caution at the end.

This is purely observational, not causal. People Google for racist, sexist, and anti-Semitic jokes more in states that are more favorable toward Trump in the presidential election.

The point of the exercise, as suggested by Seth Stephens-Davidowitz in a 2012 paper published here and discussed here, was to look for population traits that might skew votes in ways the polls did not predict. If people were racist, maybe they would not admit they opposed Obama, but they would still Google “nigger jokes” in their spare time. We don’t yet know whether the polls will accurately capture the vote outcome this year, but I’m interested in the underlying patterns anyway.

I use state data from Google Trends, which coughed up relative search frequencies for the past five years by state. Each search term is scaled from 100 in the state with the highest search frequency of the term to zero for the lowest (except they don’t go down to zero). For example, West Virginia scores 100 on searches for “nigger jokes” and Oregon scores 17 (the lowest score). Trends does not report the actual number of searches, and some small states are not reported for some jokes, presumably because the data are too sparse.
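To make the scaling concrete, here is a minimal sketch of an index that sets the top state to 100 and expresses every other state relative to it. Google does not publish the underlying search counts or its exact procedure, so this is an assumption that merely fits the description above; the function name and the per-million rates are made up (chosen so the two named states land on 100 and 17).

```python
def scale_to_index(rates):
    """Rescale raw search rates so the highest-rate place scores 100."""
    top = max(rates.values())
    return {place: round(100 * rate / top) for place, rate in rates.items()}

# Hypothetical searches per million; only the ratios matter.
hypothetical_rates = {"West Virginia": 200, "State X": 100, "Oregon": 34}
index = scale_to_index(hypothetical_rates)
print(index)
```

A scale like this explains why the lowest state does not hit zero: a state only scores zero if it has essentially no searches at all.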

So here I compare search frequencies for three offensive kinds of jokes, “blonde jokes” (N=48), “nigger jokes” (N=38), and “holocaust jokes” (N=29), with controls for two kinds of innocuous jokes “puns” (favored by Clinton supporting-states) and “knock knock jokes” (favored in Trump states). This might capture the general tendency to Google for jokes. I compared these relative search frequencies to the state polling summary from FiveThirtyEight, which has the Clinton lead from +32.8 in Hawaii to -30.4 in Wyoming (DC is not included here).
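The comparison itself is just a Pearson correlation across states between a joke's search index and the Clinton lead. My own code is in Stata; the following is a minimal pure-Python sketch of the same statistic, with made-up numbers standing in for the real state data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for five states: joke search index and Clinton lead.
joke_index = [100, 80, 60, 40, 20]
clinton_lead = [-30, -10, -15, 5, 20]
r = pearson_r(joke_index, clinton_lead)
print(round(r, 2))
```

A negative value means states that search more for the joke lean more toward Trump, which is the direction of all three correlations reported here.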

The bivariate correlations with the Clinton lead are -.67 for “blonde jokes,” -.61 for “nigger jokes,” and -.48 for “holocaust jokes.” Here are the scatters (click to enlarge):

Again, nothing causal claimed here. Just accounting for other joke telling (which is interesting in itself), here are the multivariate results:

jokes-clinton-ols

Blonde provides the best fit but they all are still pretty good with the innocuous jokes controlled.

Incidentally, “puns” has no bivariate correlation with Clinton lead, but with “knock knock” controlled it’s very strong. Go figure!
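"Controlled" here means the search indexes enter one regression together. As a rough illustration (again, the real analysis is in Stata, and I am assuming the Clinton lead is the outcome), here is a small pure-Python ordinary least squares routine that solves the normal equations. The function name and the toy data are hypothetical, and the routine will fail with a division by zero if the predictors are perfectly collinear.

```python
def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Build the augmented system (X'X | X'y).
    A = [[sum(xi[i] * xi[j] for xi in X) for j in range(k)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Toy data built so that y = 1 + 2*x1 - 3*x2 exactly.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
outcome = [1 + 2 * x1 - 3 * x2 for x1, x2 in predictors]
coefs = ols(predictors, outcome)
print([round(c, 6) for c in coefs])
```

Each slope is the association with the outcome holding the other predictor fixed, which is how a term with no bivariate correlation can still show a strong coefficient once another term is in the model.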

OK, there you have it. Deplorable joke behavior is strongly correlated with Trump support. Nothing causal claimed here.


I put the data and Stata code, including code for the figures, on the Open Science Framework here.

For other relevant posts follow the Google tag and the Trump tag.


Update

Thanks to the efforts of University of Wisconsin graduate student Nathan Seltzer (see the comment below), it’s come to my attention that the “past five years” data is unstable. Looking just at the “holocaust jokes” data, s/he found non-trivial noise comparing downloads made just a few hours apart. To check this, I just went and repeated the search: “holocaust jokes” for “past five years,” and this is what I got:

holocaust-change-table

Yuck. Thanks for the free data, Google! I’m thankful for Nathan pointing this out. Good lesson in the benefits of sharing data so we can find problems like this — and the trouble with counting on non-open, private data providers like Google. When they’re good, they’re good, but they’re non-transparent and unaccountable when they’re not. It would be great if Google figured out what’s going on and fixed their public access tool. If anyone else can explain this I would be interested to hear.
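If you want to run the same stability check on your own downloads, here is a minimal sketch that compares two pulls of the same query state by state. The function name and the numbers are hypothetical; plug in whatever two downloads you have.

```python
def compare_pulls(first, second):
    """Per-state change between two downloads of the same Trends query."""
    shared = sorted(set(first) & set(second))
    diffs = {state: second[state] - first[state] for state in shared}
    biggest = max(diffs, key=lambda s: abs(diffs[s]))
    mean_abs = sum(abs(d) for d in diffs.values()) / len(diffs)
    return diffs, biggest, mean_abs

# Hypothetical index values from two downloads a few hours apart.
pull_1 = {"A": 100, "B": 50, "C": 30}
pull_2 = {"A": 90, "B": 55, "C": 30, "D": 10}
diffs, biggest, mean_abs = compare_pulls(pull_1, pull_2)
print(diffs, biggest, mean_abs)
```

States that appear in only one download (like "D" here) are dropped from the comparison, which is itself worth noting, since which states get reported can change between pulls.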


Filed under Politics