Philip N. Cohen criticized the use of generation labels. Generations are one of many analytical lenses researchers use to understand societal change and differences across groups. While there are limitations to generational analysis, it can be a useful tool for understanding demographic trends and shifting public attitudes. For example, a generational look at public opinion on a wide range of social and political issues shows that cohort differences have widened over time on some issues, which could have important implications for the future of American politics.
In addition, looking at how a new generation of young adults experiences key milestones such as educational attainment, marriage or homeownership, compared with previous generations in their youth, can lend important insights into changes in American society.
To be sure, these labels can be misused and lead to stereotyping, and it’s important to stress and highlight diversity within generations. At Pew Research Center, we consistently endeavor to refine and improve our research methods. Therefore, we are having ongoing conversations around the best way to approach generational research. We look forward to engaging with Mr. Cohen and other scholars as we continue to explore this complex and important issue.
Kim Parker, Washington
I was happy to see this, and look forward to what they come up with. I am also glad to see that there has been no substantial defense of the current “generations” research regime. Some people on social media said they kind of like the categories, but no researcher has said they make sense, or pointed to any research justifying the current categories. With regard to her point that generations research is useful, that was in our open letter, and in my op-ed. Cohorts (and, if you want to call a bunch of cohorts a generation, generations) matter a lot, and should be studied. They just shouldn’t be used with imposed fixed categories regardless of the data involved, and given names with stereotyped qualities that are presumed to extend across spheres of social life.
Several people have asked me for suggestions. My basic suggestion is to do like you learned in social science class, and use categories that make sense for a good reason. If you have no reason to use a set of categories, don’t use them. Instead, use an empty measure of time, like years or decades, as a first pass, and look at the data. As I argued here, there is not likely to be a set of birth years that cohere across time and social space into meaningful generational identities.
In the Op-Ed, I wrote this: “Generation labels, although widely adopted by the public, have no basis in social reality. In fact, in one of Pew’s own surveys, most people did not identify the correct generation for themselves — even when they were shown a list of options.” The link was to this 2015 report titled, “Most Millennials Resist the ‘Millennial’ Label” (which of course confirms a stereotype about this supposed generation). I was looking in particular at this graphic, which I have shown often:
It doesn’t exactly show what portion of people “correctly” identify their category, but I eyeballed it and decided that if only 18% of Silents and 40% of Millennials were right, there was no way Gen X and Boomers were bringing the average over 50%. Also, people could choose multiple labels, so those “correct” numbers were presumably inflated to some degree by double-clickers. Anyway, the figure doesn’t exactly answer the question.
The data for that figure come from Pew’s American Trends Panel Wave 10, from 2015. The cool thing is you can download the data here. So I figured I could do a little analysis of who “correctly” identifies their category. Unfortunately, the microdata file they share doesn’t include exact age, just age in four categories that don’t line up with the generations — so you can’t replicate their analysis.
However, they do provide a little more detail in the topline report, here, including reporting the percentage of people in each “generation” who identified with each category. Using those numbers, I figure that 57% selected the correct category, 26% selected an incorrect category, 9% selected “other” (unspecified in the report), and 8% are unaccounted for. So, keeping in mind that people can be in more than one of these groups, I can’t say how many were completely “correct,” but I can say that (according to the report, not the data, which I can’t analyze for this) 57% at least selected the category that matched their birth year, possibly in combination with other categories.
The survey also asked people “how well would you say the term [generation you chose] applies to you?” If you combine “very well” and “fairly well,” you learn, for example, that actual “Silents” are more likely to say “Greatest Generation” applies well to them (32%) than say “Silent” does (14%). Anyway, if I did this right, based on the total sample, 46% of people both “correctly” identified their generation and said the term describes them “well.” I honestly don’t know what to make of this, but thought I’d share it, since it could be read as me misstating the case in the Op-Ed.
Marital Name Change Survey first results and open data release.
Over the last three days, 3,400 ever-married U.S. residents took my Marital Name Change Survey. I distributed the survey link on this blog, Facebook, and Twitter. I don’t know who took it, but based on the education and occupation data, a very large share of the respondents were women (88%) with professional degrees (30%) or PhDs (27%). It’s not a representative sample, but the results may still be interesting.
Here I’ll give a few topline numbers as of 8:00 this morning, and then link to a public version of the data and materials. These results reflect a little data checking and cleaning and of course are subject to change.
Respondents were asked about their most recent marriage. Half were married in the 2010s, but the sample includes more than 400 married in the 1990s and 200 earlier.
The vast majority (84%) were women married to men; 11% were men married to women and 4% (~140) were in same-gender marriages. Here are some observations about the women married to men. The name-change choices are shown below, with “R change” indicating the respondent changed their name, and “Sp change” indicating their spouse changed. The “Other” field included a write-in, and the vast majority of those were variations on hyphenations or changes to middle names.
Because this is a convenience sample, I don’t put much stock in the overall trend (I’ll try to develop a weighting scheme for this, but even then). However, I think the PhD sample is worth looking at. Here is the trend for women with PhDs (now or at the time of marriage) married to men.
By this reckoning, the feminist-name heyday was in the 1980s, followed by a backslide, and now a rebound of women with PhDs keeping their names. The 2010s trend is like that found in the Google Consumer survey reported by Claire Cain Miller and Derek Willis in NYT Upshot.
Note, these no-change rates are higher than those reported by Gretchen Gooding and Rose Kreider from the 2004 American Community Survey, which showed 33% of married women with PhDs had different surnames than their husbands (regardless of when they got married). I show 53% in the 2000s had different names than their husbands, and 57% in the 2010s. Maybe that’s because I have more social science and humanities PhDs, or just a more woke sample.
These results also show a strong age-at-marriage pattern, with PhD women much more likely to keep their names if they married at older ages. Over age 40, 74% of women with PhDs kept their names, compared with 20% who married under age 25. (Note this is based on education at the time of the survey; I also collected education at the time of marriage, which I discuss below.)
I asked people how important various factors were in deciding whether to change their names. Among PhD women marrying men who did not change their names, the most important reasons were feminism (52% “very important”), professional considerations (34%), convenience (33%), and maintaining independence within the marriage (24%). Among those who took their husbands’ names, the most important factors were the interests of their children (48%) and showing commitment to the marriage (25%).
A few other observations: PhD women were most likely to keep their names if they had no religion (53%), were Jewish (46%), or other non-Christian religion (43%); Protestants (27%), Catholics (29%), and other Christians (21%) were less likely to keep their names. Finally, those who had lived together before marriage were most likely to keep their names (51% for those who lived together for three years or more, compared with 27% for those who did not live together at all).
I don’t have time now to analyze this more, but that shouldn’t stop you. Feel free to download the data and documentation here under a CC-BY license (the only requirement is attribution). This includes a Stata data file, and PDFs of the questionnaire and codebook. This will all be revised when I have time.
I am not including in the shared files (yet) the open-ended question responses, which include descriptions of “other” name change patterns, as well as a general notes field, which is full of fascinating comments; given the non-random nature of the survey, this may turn out to be its most valuable contribution.
Here are a few.
I changed my name to my spouses because I HATED my father and it was the easiest way to ditch his name. I kept my married name after divorce. I’m currently pregnant (on my own) and plan to change my name again and now I will take the surname of my step-father, who has been my “dad” since I was 5.
My wife and I had been together 10 years and through several iterations of domestic partnerships prior to marrying. Including before she completed her PhD. I didn’t want to change my name because my name flows really poetically and a change would ruin it (silly but true). She didn’t want to change her name in part because it’s what everyone in her profession know her as. I think we both also feel like our names represent our life histories and although we are a true partnership, that doesn’t negate our family histories or experiences. Which I guess is feminist of us. But we never explicitly discussed feminism as an issue.
This is complicated.
My partner and I both had our own hyphenated names already! We kept our own hyphenated names initially (and our marriage was not legally recognized at the time so there wasn’t a built-in or convenient option to change at that point anyway). When we had kids, we have them a hyphenated name, one of my last names and one of hers. Eventually we both changed to match the kids, so we all share the same hyphenated name now.
In demography, there is a well-known phenomenon called age-heaping, in which people round off their ages, or misremember them, and report them as numbers ending in 0 or 5. We have a measure, known as Whipple’s index, that estimates the extent to which this is occurring in a given dataset. To calculate it, take the people between ages 23 and 62 (inclusive), count those whose reported ages end in 0 or 5 (25, 30, … 60), multiply that count by five (since exactly one age in five in that range ends in 0 or 5), and divide by the total number of people in the range.
Multiplied by 100, an index below 105 is “highly accurate” by the United Nations standard, 105 to 110 is “fairly accurate,” and in the range 110 to 125 the age data should be considered “approximate.”
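Here is a minimal sketch of the calculation in Python, assuming nothing more than an array of single-year reported ages (unweighted, to keep it simple):

```python
import numpy as np

def whipple_index(ages):
    """Whipple's index: ~100 means no heaping; 500 means every
    reported age in the range ends in 0 or 5."""
    ages = np.asarray(ages)
    in_range = ages[(ages >= 23) & (ages <= 62)]
    heaped = np.sum(in_range % 5 == 0)  # ages 25, 30, ..., 60
    return 5 * heaped / len(in_range) * 100

# Uniformly distributed ages give an index near 100, since one
# age in five ends in 0 or 5:
rng = np.random.default_rng(0)
print(whipple_index(rng.integers(23, 63, size=100_000)))  # ~100
```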
I previously showed that the American Community Survey (ACS) public use file has a Whipple index of 104, which is not so good for a major government survey in a rich country. The heaping in the ACS apparently came from people who didn’t respond to the internet or mail questionnaires and had to be interviewed by Census Bureau staff by phone or in person. I’m not sure what you can do about that.
What about marriage?
The ACS has great data on marriage and marital events, which I have used to analyze divorce trends, among other things. Key to the analysis of divorce patterns is the question, “When was this person last married?” (YRMARR). Recorded as a calendar year, this allows the analyst to take into account the duration of marriage preceding divorce or widowhood, the birth of children, and so on. It’s very important and useful information.
Unfortunately, it may also have an accuracy problem.
I used the ACS public use files made available by IPUMS.org, combining all years 2008-2017, the years that include the variable YRMARR. The figure shows the number of people reported to have last married in each year from 1936 to 2015. The decadal years are highlighted in black. (The dropoff at the end is an artifact of pooling: marriages in the most recent years are captured by only the last few surveys in the file.)
Yikes! That looks like some decadal marriage year heaping. Note I didn’t highlight the years ending in 5, because those didn’t seem to be heaped upon.
To describe this phenomenon, I hereby invent the Decadally-Biased Marriage Recall index, or DBMR. This is 10 times the number of people married in years ending in 0, divided by the number of people married in all years (in a span starting with a 6-year and ending with a 5-year, so that exactly one year in ten ends in 0). The ratio is multiplied by 100 to make it comparable to the Whipple index.
The DBMR for this figure (years 1936-2015) is 110.8. So there are 1.108-times as many people in those decadal years as you would expect from a continuous year function.
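For anyone who wants to compute this, here’s a sketch, assuming you’ve already tabulated (weighted) counts of people by year of last marriage:

```python
import pandas as pd

def dbmr(counts):
    """Decadally-Biased Marriage Recall index. counts is a Series
    indexed by year of last marriage, covering a whole span that
    starts with a 6-year and ends with a 5-year (e.g., 1936-2015);
    values are (weighted) person counts. 100 = no decadal heaping."""
    decadal = counts[counts.index % 10 == 0].sum()
    return 10 * decadal / counts.sum() * 100
```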
Maybe people really do get married more in decadal years. I was surprised to see a large heap at 2000, which is very recent, so you might think there was good recall for those weddings. Maybe people got married that year because of the millennium hoopla. When you end the series at 1995, however, the DBMR is still 110.6. So maybe some people who would have gotten married at the end of 1999 waited till New Year’s Day or something, or rushed to marry on New Year’s Eve 2000, but that’s not the issue.
Maybe this has to do with who is answering the survey. Do you know what year your parents got married? If you answered the survey for your household, and someone else lives with you, you might round off. This is worth pursuing. I restricted the sample to just those who were householders (the person in whose name the home is owned or rented), and still got a DBMR of 110.7. But that might not be the best test.
Another possibility is that people who started living together before they were married — which is most Americans these days — don’t answer YRMARR with their legal marriage date, but some rounded-off cohabitation date. I don’t know how to test that.
Updating a 2013 post with the 2016 General Social Survey. Not a lot of interpretation, just some facts.
The GSS has, since 1972, asked Americans:
If you were asked to use one of four names for your social class, which would you say you belong in: the lower class, the working class, the middle class, or the upper class?
The latest data release, for 2016, confirms what I noticed before: a big rise in the percentage of people describing themselves as “lower class” since the 2008 recession, from 5% to 9%. This is striking when you zoom in on it:
Now, looking at the trend in all four classes, it’s clear there has been a decline in the proportion of people calling themselves “middle class” — which hit its lowest level ever in the series, 41%:
Is this important? I don’t know. The most common tendency in sociology these days is to use measures of education (one’s own education, or one’s parents’) to indicate social class, which is generally thought of in material terms, rather than as an identity issue (or as a question of what people actually learn in school). Of course there are sociologists who study class identity issues, but as a survey item I bet it’s more likely to show up in political science research.
Of course the political salience of “working class” was heightened by the election in 2016 (although the phrase itself was more likely used as an adjective than a noun; the noun in American politics remains “working families,” a term I dislike). And by “working class” of course most people meant White working class. A Google search of the New York Times site for [“White working class” 2016] produces 1,050 hits; [“Black working class” 2016] yields 37. But Blacks are considerably more likely to identify as “working class,” and less likely to choose “middle class,” than are Whites. Here is the breakdown for 2016 (at the mean of controls for age and sex):
Of course that doesn’t account for common correlates of class identity; it’s just a description of the groups. I looked a little more closely at income. Here is how people report class identities by family income, this time at the mean of controls for age, sex, race/ethnicity, and marital status (partly to account for family size)*:
This shows that “working class” is most common among those in the $30,000-$50,000 range, and it dominates under $75,000, while “middle class” picks up most people over $75,000. Only people in the top bracket — and only a small proportion of them — identify as “upper class.”
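For anyone who wants to reproduce this kind of figure, here is a rough Python sketch of the sort of model involved; the variable names are hypothetical stand-ins, and the Stata code I actually used is linked in the footnote below.

```python
import statsmodels.formula.api as smf

# df: GSS 2016 respondents, with class identity coded 0-3 (lower,
# working, middle, upper); all variable names here are hypothetical
m = smf.mnlogit("classid ~ C(incgroup) + age + female + C(raceeth) + C(marst)",
                data=df).fit()

# Predicted class probabilities by income group, holding the controls
# at fixed values (the "at the mean of controls" figures), come from
# m.predict() applied to a small grid of income groups.
```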
I did a little checking to see what difference class identity makes on some common political issues. Regressions holding constant sex, race/ethnicity, education, income, marital status, and region showed that working class people were more Democratic than middle class people (on a scale from strong Dem to strong Rep), but middle class people were more pro-choice, and also more likely to think “people can be trusted.” In similar models, class didn’t do much to explain confidence in organized labor, support for same-sex marriage, attitudes toward taxes on the rich, the likelihood of owning a gun, political views (liberal v. conservative), or traditional gender attitudes. Still, I think it’s worth asking.
In summary, it’s interesting that the self-identified class structure may be shifting relatively rapidly, and the implications are to be determined.
* Code for the income regression, using the full GSS dataset through 2016, available here.
The most important thing is that Stephanie Coontz has written another very good, and very important, New York Times essay. It describes a “slippage” in support for gender equality among young people these days, and warns that without improved work-family policies, progress toward egalitarian family arrangements may be imperiled. The piece also announced a package of short papers in a Council on Contemporary Families symposium, which provided the supporting evidence. (This kind of work, incidentally, is why I’m a proud member, and board member, of CCF.) If you haven’t read Stephanie’s essay, I recommend reading it now, and if you forget to come back here that’s fine.
Anyway, an unfortunate confluence of events created some chaos after the piece came out. First, the NYTimes wrote a headline, “Do Millennial Men Want Stay-at-Home Wives?”, that emphasized only one piece of the evidence. It referred to a figure showing General Social Survey data on the trend in very young men and women (ages 18-25) disagreeing with the statement, “It is much better for everyone involved if the man is the achiever outside the home and the woman takes care of the home and family.” (That is the classic FEFAM question, to GSS fans, asked since 1977. I’ve used it myself, and it figures in the key analysis of stalled gender progress by Cotter, Hermsen, and Vanneman.)
This was the figure, showing a marked divergence between men and women:
The second event was the unfortunate timing: between the time Stephanie wrote the piece and the day it appeared, the General Social Survey released its 2016 round of data (it’s been running every two years). The survey is fickle. It’s very good quality and has many great demographic and attitude items running for 40 years, making it the best source for analyzing many social trends. But it’s not that big. In 2014 it had 2,867 respondents, of whom only 141 were ages 18-25. So it wasn’t surprising that the 2016 numbers were different from the 2014 numbers, but the scale of the blip was shocking, as reported independently by Emily Beam and Neal Caren. Here is what the updated trend looks like:
Yikes. As exciting as it is for survey analysts to see such a wild swing, it’s not what anyone wants to see the day after their NYTimes piece drops. We can’t know yet what happened, but on further inspection, at least we can say that it’s not limited to the youngest group and its small sample. Among men ages 26-54, the percentage disagreement with FEFAM also jumped, from 73.7 to 78.3 (women 26-54 were up one point).* In fact, 2014 may have been as big a blip as 2016, you just wouldn’t notice because it continued the trend.
Anyway, back for a minute to the main point. Joanna Pepin, who co-wrote one of the symposium pieces with David Cotter (and who is also an advisee of mine), has pointed out that the divergence between men and women is secondary to the main trend, which is the reversal of progress on FEFAM for both men and women since the mid-1990s. They used the Monitoring the Future survey and found a big drop in FEFAM disagreement among high school seniors, regardless of gender. Here’s their key figure, with the FEFAM trend shown in green (their full paper is available on SocArXiv):
So that is the most important news: a big reversal among young adults on attitudes toward homemaker-breadwinner family arrangements.
Now, if you’ve read Stephanie’s piece, and Joanna’s, and you’re back, here’s a little more on the minor kerfuffle that arose over the new data.
When to call a trend a trend
I don’t think Stephanie was wrong to use the GSS trend, although it might have been better to widen the age range, or pool the data over several years. The bigger problem was the headline selling that divergence as the main story, which it wasn’t in the grand scheme. (The fact that so many jumped on the story shows how good they are at headline writing.) But even that wasn’t really wrong, given the information they had. The Op-Ed staff checked the facts, and the facts were the facts. Until yesterday.
To confirm this, I ran some tests on the gender divergence in the data they used (I started with code that Neal shared; it’s at the bottom). I started at 1994, the last peak of the trend, to look for the divergence after that, which is what Stephanie referred to. First, here is what you get if you run a logistic model that controls for race/ethnicity and individual years of age (two things that changed over the last two decades), and enters the years individually in an interaction with gender (those are 95% confidence intervals).
If you stop at 2014, it looks like men are pulling away from women (in the direction of “traditional” attitudes), but it’s not definitive. And obviously 2016 is an issue. To help with the small samples, I ran a linear test of the year trend, that is, entering year as a continuous variable instead of individual years. I ran it once ending at 2014 and once running through 2016. Here are the results:
In the 1994-2014 model, the Male × Year interaction is statistically significant at conventional levels, which in my opinion means it’s legit to say men were pulling away from women. Of course 2016 ruined that; if you had the 2016 data and didn’t use it, that would be really wrong. There are other ways to slice it, but at some point we have to call a trend a trend and deal with it. It was a reasonable decision. Of course, new data always comes along (until the last trend of all, whatever that is); no trend lasts forever. It’s just a shame when it comes along the next day. In addition, though I’m not showing it because it’s boring, if you didn’t disaggregate the trends by gender, you would also see a significant decline in FEFAM disagreement after 1994, which gets to Joanna’s point.
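For the record, here’s a rough Python sketch of the two models (I worked from Stata code Neal shared; the variable names below are hypothetical stand-ins for the GSS items):

```python
import statsmodels.formula.api as smf

# df: GSS respondents ages 18-25, 1994-2016, with fefam_disagree (0/1),
# male (0/1), survey year, race/ethnicity, and single years of age

# Year-by-year model: survey-year dummies interacted with gender
m1 = smf.logit("fefam_disagree ~ C(year) * male + C(raceeth) + C(age)",
               data=df).fit()

# Linear trend test, stopping at 2014: is the Male x Year slope significant?
m2 = smf.logit("fefam_disagree ~ year * male + C(raceeth) + C(age)",
               data=df[df.year <= 2014]).fit()
print(m2.params["year:male"], m2.pvalues["year:male"])
```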
Anyway, score one for sociology Twitter. People came up with the data, shared code and results, and discussed interpretations. It got back to Stephanie and the NYTimes editors, and within a day they added an addendum to the original piece:
Update: After this article was posted, 2016 data from the General Social Survey became available, adding some nuance to this analysis. The latest numbers show a rebound in young men’s disagreement with the claim that male-breadwinner families are superior. The trend still confirms a rise in traditionalism among high school seniors and 18-to-25-year-olds, but the new data shows that this rise is no longer driven mainly by young men, as it was in the General Social Survey results from 1994 through 2014.
This is pretty much how it’s supposed to work. As the Car Guys used to say, if you never stall you’re wearing out your clutch (sorry, Millennials). If you never overshoot an analysis of trends you’re probably waiting too long to get the information out.
* Note: I originally accidentally described this as “over 25.”
I’ve been putting off writing this post because I wanted to do more justice both to the history of the Black-men-raping-White-women charge and the survey methods questions. Instead I’m just going to lay this here and hope it helps someone who is more engaged than I am at the moment. I’m sorry this post isn’t higher quality.
Obviously, this post includes extremely racist and misogynist content, which I am showing you to explain why it’s bad.
This is about this very racist meme, which is extremely popular among extreme racists.
The modern racist uses statistics, data, and even math. They use citations. And I think it takes actually engaging with this stuff to stop it (this is untested, though, as I have no real evidence that facts help). That means anti-racists need to learn some demography and survey methods, and practice them in public. I was prompted to finally write on this by a David Duke video streamed on Facebook, in which he used exaggerated versions of these numbers, and the good Samaritans arguing with him did not really know how to respond.
For completely inadequate context: For a very long time, Black men raping White women has been White supremacists’ single favorite thing. This was the most common justification for lynching, and for many of the legal executions of Black men throughout the 20th century. From 1930 to 1994 there were 455 people executed for rape in the U.S., and 89% of them were Black (from the 1996 Statistical Abstract):
For some people, this is all they need to know about how bad the problem of Blacks raping Whites is. For better informed people, it’s the basis for a great lesson in how the actions of the justice system are not good measures of the crimes it’s supposed to address.
Good data gone wrong
Which is one reason the government collects the National Crime Victimization Survey (NCVS), a large sample survey of about 90,000 households with 160,000 people. In it they ask about crimes against the people surveyed, and the answers the survey yields are usually pretty different from what’s in the crime report statistics – and even further from the statistics on things like convictions and incarceration. It’s supposed to be a survey of crime as experienced, not as reported or punished.
It’s an important survey that yields a lot of good information. But in this case the Bureau of Justice Statistics is doing a serious disservice in the way they are reporting the results, and they should do something about it. I hope they will consider it.
Like many surveys, the NCVS is weighted to produce estimates that are supposed to reflect the general population. In a nutshell, that means, for example, that they treat each of the 158,000 people (over age 12) covered in 2014 as about 1,700 people. So if one person said, “I was raped,” they would say, “1700 people in the US say they were raped.” This is how sampling works. In fact, they tweak it much more than that, to make the numbers add up according to population distributions of variables like age, sex, race, and region – and non-response, so that if a certain group (say Black women) has a low response rate, their responses get goosed even more. This is reasonable and good, but it requires care in reporting to the general public.
So, how is the Bureau of Justice Statistics’ (BJS) reporting method contributing to the racist meme above? The racists love to cite Table 42 of this report, which last came out for the 2008 survey. This is the source for David Duke’s rant, and the many, many memes about this. The results of a Google image search give you a sense of how many websites are distributing this:
Here is Table 42, with my explanation below:
What this shows is that, based on their sample, BJS extrapolates an estimate of 117,640 White women who say they were sexually assaulted, or threatened with sexual assault, in 2008 (in the red box). Of those, 16.4% described their assailant as Black (the blue highlight). That works out to 19,293 White women sexually assaulted or threatened by Black men in one year – White supremacists do math. In the 2005 version of the table these numbers were 111,490 and 33.6%, for 37,460 White women sexually assaulted or threatened by Black men, or:
Now, go back to the structure of the survey. If each respondent in the survey counts for about 1,700 people, then the survey in 2008 would have found 69 White women who were sexually assaulted or threatened, 11 of whom said their assailant was Black (117,640/1,700). Actually, though, we know it was fewer than 11, because the asterisk on the table takes you to the footnote below, which says it was based on 10 or fewer sample cases.

In comparison, the survey may have found 27 Black women who said they were sexually assaulted or threatened (46,580/1,700), none of whom said their attacker was White, which is why the second blue box shows 0.0. However, it actually looks like the weights are bigger for Black women, because the figure for the percentage assaulted or threatened by Black attackers, 74.8%, has the asterisk that indicates 10 or fewer cases. If there were 27 Black women in this category, then 74.8% of them would be 20. So this whole sample of Black women victims might be as few as 13, with bigger weights applied (because, say, Black women had a lower response rate). If in fact Black women are just as likely to be sexually assaulted or threatened by White men as the reverse, 16%, you might expect only 2 of those 13 to report White attackers, and so finding a sample 0 is not very surprising. The actual weighting scheme is clearly much more complicated, and I don’t know the unweighted counts, as they are not reported here (and I didn’t analyze the individual-level data).
I can’t believe we’re talking about this. The most important bottom line is that the BJS should not report extrapolations to the whole population from samples this small. These population numbers should not be on this table. At best these numbers are estimated with very large standard errors. (Using a standard confidence interval calculator, that 16% figure for White women, based on a sample of 69, carries a confidence interval of +/- 9%.) It’s irresponsible, and it’s inadvertently (I assume) feeding White supremacist propaganda.
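Here’s the arithmetic from the last two paragraphs in one place. This uses a simple normal-approximation interval, which if anything understates the uncertainty, since design-based standard errors from a clustered survey would be larger:

```python
import math

avg_weight = 1_700  # rough average person weight in the 2008 NCVS

# Implied (unweighted) sample sizes behind Table 42
white_victims = 117_640 / avg_weight      # ~69 sample cases
black_assailant = 0.164 * white_victims   # ~11 cases (the asterisk says 10 or fewer)

# 95% confidence interval for 16.4% based on ~69 cases
p, n = 0.164, 69
moe = 1.96 * math.sqrt(p * (1 - p) / n)
print(round(white_victims), round(black_assailant), f"+/- {moe:.0%}")
# -> 69 11 +/- 9%
```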
Rape and sexual assault are very disturbingly common, although not as common as they were a few decades ago, by conventional measures. But it’s a big country, and I don’t doubt that lots of Black men sexually assault or threaten White women, and that White men sexually assault or threaten Black women a lot, too – certainly more than never. If we knew the true numbers, they would be bad. But we don’t.
A couple more issues to consider. Most sexual assault happens within relationships, and interracial relationships are relatively rare. In round numbers (based on marriages), 2% of White women are with Black men, and 5% of Black women are with White men, which – because of population sizes – means there are more than twice as many Black-man/White-woman couples as the reverse. At very small sample sizes, this matters a lot. But we would expect more Black-man/White-woman sexual assault than the reverse based on this pattern alone. Consider further that the NCVS is a household sample, which means that if any Black women are sexually assaulted by White men in prison, it wouldn’t be included. Based on a 2011-2012 survey of prison and jail inmates, 3,500 women per year are the victims of staff sexual misconduct, and Black women inmates were about 50% more likely to report this than White women. So I’m guessing the true number of Black women sexually assaulted by White men is somewhat greater than zero, and that’s just in prisons and jails.
The BJS seems to have stopped releasing this form of the report, with Table 42, maybe because of this kind of problem, which would be great. In that case they just need to put out a statement clarifying and correcting the old reports – which they should still do, because they are out there. (The more recent reports are skimpier, and don’t get into this much detail [e.g., 2014] – and their custom table tool doesn’t allow you to specify the perceived race of the offender).
So, next time you’re arguing with David Duke, the simplest response to this is that the numbers he’s talking about are based on very small samples, and the asterisk means he shouldn’t use the number. The racists won’t take your advice, but it’s good for everyone else to know.
I complained that her dissertation was not made public, despite being awarded the American Sociological Association’s dissertation prize. I proposed a rule change for the association, requiring that the winning dissertation be “publicly available through a suitable academic repository by the time of the ASA meeting at which the award is granted.” (The rule change is moving through the process.)
When her dissertation was released, I complained about the rationale for the delay.
My critique of the survey that was part of her research grew into a formal comment (PDF) submitted to American Sociological Review.
In this post I don’t have anything to add about Alice Goffman’s work. This is about what we can learn from this and other incidents to improve our social science and its contribution to the wider social discourse. As Goffman’s TED Talk passed 1 million views, we have had good conversations about replicability and transparency in research, and about ethics in ethnography. And of course about the impact of the criminal justice system and over-policing on African Americans, the intended target of her work. This post is about how we deal with errors in our scholarly publishing.
My comment was rejected by the American Sociological Review.
You might not realize this, but unlike many scientific journals, ASR has no normal way of acknowledging or correcting errors in research, apart from “errata” notices, which are for typos and editing errors. To my knowledge ASR has never retracted an article or published an editor’s note explaining how an article, or part of an article, is wrong. Instead, they publish Comments (and Replies). The Comments are submitted and reviewed anonymously by peer reviewers just like an article, and then, if the Comment is accepted, the original author responds (maybe followed by a rejoinder). It’s a cumbersome and often combative process, often mixing theoretical with methodological critiques. And it creates a very high hurdle to leap, and a long delay, before the journal can correct itself.
In this post I’ll briefly summarize my comment, then post the ASR editors’ decision letter and reviews.
Comment: Survey and ethnography
I wrote the comment about Goffman’s 2009 ASR article for accountability. The article turned out to be the first step toward a major book, so ASR played a gatekeeping role for a much wider reading audience, which is great. But then it should take responsibility to notify readers about errors in its pages.
My critique boiled down to these points:
The article describes the survey as including all households in the neighborhood, which is not the case, and uses statistics from the survey to describe the neighborhood (its racial composition and rates of government assistance), which is not justified.
The survey includes some number (probably a lot) of men who did not live in the neighborhood, but who were described as “in residence” in the article, despite being “absent because they were in the military, at job training programs (like JobCorp), or away in jail, prison, drug rehab centers, or halfway houses.” There is no information about how or whether such men were contacted, or how the information about them was obtained (or how many in her sample were not actually “in residence”).
The survey results are incongruous with the description of the neighborhood in the text, and — when compared with data from other sources — describe an apparently anomalous social setting. She reported finding more than twice as many men (ages 18-30) per household as the Census Bureau reports from its American Community Survey of Black neighborhoods in Philadelphia (1.42 versus .60 per household). She reported that 39% of these men had warrants for violating probation or parole in the prior three years. Using some numbers from other sources on violation rates, that translates into between 65% and 79% of the young men in the neighborhood being on probation or parole — very high for a neighborhood described as “nice and quiet” and not “particularly dangerous or crime-ridden.” (This arithmetic is sketched below, after the last point.)
None of this can be thoroughly evaluated because the reporting of the data and methodology for the survey were inadequate to replicate or even understand what was reported.
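For those keeping score, the probation/parole arithmetic in the third point works like this. The violation rates here are illustrative stand-ins, not the exact figures from my comment:

```python
# If 39% of young men had warrants for violating probation or parole
# within three years, the implied share on probation or parole depends
# on how often probationers and parolees pick up violation warrants.
warrant_share = 0.39
for violation_rate in (0.60, 0.49):  # hypothetical three-year violation rates
    print(f"{warrant_share / violation_rate:.0%} on probation or parole")
# -> 65% and 80%: the lower the violation rate, the higher (and less
# plausible) the implied share of young men under supervision
```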
You can read my comment here in PDF. Since I aired it out on this blog before submitting it, making it about as anonymous as a lot of other peer-review submissions, I see no reason to shroud the process any further. The editors’ letter I received is signed by the current editors — Omar Lizardo, Rory McVeigh, and Sarah Mustillo — although I submitted the piece before they officially took over (the editors at the time of my submission were Larry W. Isaac and Holly J. McCammon). The reviewers are of course anonymous. My final comment is at the end.
ASR letter and reviews
Dear Prof. Cohen:
The reviews are in on your manuscript, “Survey and ethnography: Comment on Goffman’s ‘On the Run’.” After careful reading and consideration, we have decided not to accept your manuscript for publication in American Sociological Review (ASR). Our decision is based on the reviewers’ comments, our reading of the manuscript, an overall assessment of the significance of the contribution of the manuscript to sociological knowledge, and an estimate of the likelihood of a successful revision.
As you will see, there was a range of opinions among the reviewers of your submission. Reviewer 1 feels strongly that the comment should not be published, reviewer 3 feels strongly that it should be published, and reviewer 2 falls in between. That reviewer sees merit in the criticisms but also suggests that the author’s arguments seem overstated in places and stray at times from discussion that is directly relevant to a critique of the original article’s alleged shortcomings.
As editors of the journal, we feel it is essential that we focus on the comment’s critique of the original ASR article (which was published in 2009), rather than the recently published book or controversy and debate that is not directly related to the submitted comment. We must consider not only the merits of the arguments and evidence in the submitted comment, but also whether the comment is important enough to occupy space that could otherwise be used for publishing new research. With these factors in mind, we feel that the main result that would come from publishing the comment would be that valuable space in the journal would be devoted to making a point that Goffman has already acknowledged elsewhere (that she did not employ probability sampling).
As the author of the comment acknowledges, there is actually very little discussion of, or use of, the survey data in Goffman’s article. We feel that the crux of the argument (about the survey) rests on a single sentence found on page 342 of the original article: “The five blocks known as 6th street are 93 percent Black, according to a survey of residents that Chuck and I conducted in 2007.” The comment author is interpreting that to mean that Goffman is claiming she conducted scientific probability sampling (with all households in the defined space as the sampling frame). It is important to note here that Goffman does not actually make that claim in the article. It is something that some readers might infer. But we are quite sure that many other readers simply assumed that this is based on nonprobability sampling or convenience sampling. Goffman speaks of it as a survey she conducted when she was an undergraduate student with one of the young men from the neighborhood. Given that description of the survey, we expect many readers assumed it was a convenience sample rather than a well-designed probability sample. Would it have been better if Goffman had made that more explicit in the original article? Yes.
In hindsight, it seems safe to say that most scholars (probably including Goffman) would say that the brief mentions of the survey data should have been excluded from the article. In part, this is because the reported survey findings play such a minor role in the contribution that the paper aims to make.
We truly appreciate the opportunity to review your manuscript, and hope that you will continue to think of ASR for your future research.
Omar Lizardo, Rory McVeigh, and Sarah Mustillo
Editors, American Sociological Review
This paper seeks to provide a critique of the survey data employed in Goffman (2009). Drawing on evidence from the American Community Survey, the author argues that data presented in Goffman (2009) about the community in which she conducted her ethnography is suspect. The author draws attention to remarkably high numbers of men living in households (compared with estimates derived from ACS data) and what s/he calls an “extremely high number” of outstanding warrants reported by Goffman. S/he raises the concern that Goffman (2009) did not provide readers with enough information about the survey and its methodology for them to independently evaluate its merits and thus, ultimately, calls into question the generalizability of Goffman’s survey results.
This paper joins a chorus of critiques of Goffman’s (2009) research and subsequent book. This critique is novel in that the critique is focused on the survey aspect of the research rather than on Goffman’s persona or an expressed disbelief of or distaste for her research findings (although that could certainly be an implication of this critique).
I will not comment on the reliability, validity or generalizability of Goffman’s (2009) evidence, but I believe this paper is fundamentally flawed. There are two key problems with this paper. First the core argument of the paper (critique) is inadequately situated in relation to previous research and theory. Second, the argument is insufficiently supported by empirical evidence.
The framing of the paper is not aligned with the core empirical aims of the paper. I’m not exactly sure what to recommend here because it seems as if this is written for a more general audience and not a sociological one. It strikes me as unusual, if not odd, to reference the popularity of a paper as a motivation for its critique. Whether or not Goffman’s work is widely cited in sociological or other circles is irrelevant for this or any other critique of the work. All social science research should be held to the same standards and each piece of scholarship should be evaluated on its own merits.
I would recommend that the author better align the framing of the paper with its empirical punchline. In my reading the core criticism of this paper is that the Goffman (2009) has not provided sufficient information for someone to replicate or validate her results using existing survey data. Although it may be less flashy, it seems more appropriate to frame the paper around how to evaluate social science research. I’d advise the author to tone down the moralizing and discussion of ethics. If one is to levy such a strong (and strongly worded) critique, one needs to root it firmly in established methods of social science.
That leads to the second, and perhaps even more fundamental, flaw. If one is to levy such a strong (and strongly worded) critique, one needs to provide adequate empirical evidence to substantiate her/his claims. Existing survey data from the ACS are not designed to address the kinds of questions Goffman engages in the paper and thus it is not appropriate for evaluating the reliability or validity of her survey research. Numerous studies have established that large scale surveys like the ACS under-enumerate black men living in cities. They fall into the “hard-to-reach” population that evade survey takers and census enumerators. Survey researchers widely acknowledge this problem and Goffman’s research, rather than resolving the issue, raises important questions about the extent to which the criminal justice system may contribute to difficulties for conventional social science research data collection methods. Perhaps the author can adopt a different, more scholarly, less authoritative, approach and turn the inconsistencies between her/his findings with the ACS and Goffman’s survey findings into a puzzle. How can these two surveys generate such inconsistent findings?
Just like any survey, the ACS has many strengths. But, the ACS is not well-suited to construct small area estimates of hard-to-reach populations. The author’s attempt to do so is laudable but the simplicity of her/his analysis trivializes the difficultly in reaching some of the most disadvantaged segments of the population in conventional survey research. It also trivializes one of the key insights of Goffman’s work and one that has been established previously and replicated by others: criminal justice contact fundamentally upends social relationships and living arrangements.
Furthermore, the ACS doesn’t ask any questions about criminal justice contact in a way that can help establish the validity of results for disadvantaged segments of the population who are most at-risk of criminal justice contact. It is impossible to determine using the ACS how many men (or women) in the United States, Pennsylvania, or Philadelphia (or any neighborhood therein), have an outstanding warrant. The ACS doesn’t ask about criminal justice contact, it doesn’t ask about outstanding warrants, and it isn’t designed to tap into the transient experiences of many people who have had criminal justice contact. The author provides no data to evaluate the validity of Goffman’s claims about outstanding warrants. Advancements in social science cannot be established from a “she said”, “he said” debate (e.g., FN 9-10). That kind of argument risks a kind of intellectual policing that is antithetical to established standards of evaluating social science research. That being said, someone should collect this evidence or at a minimum estimate, using indirect estimation methods, what fraction of different socio-demographic groups have outstanding warrants.
Although I believe that this paper is fundamentally flawed both in its framing and provision of evidence, I would like to encourage the author to replicate Goffman’s research. That could involve an extended ethnography in a disadvantaged neighborhood in Philadelphia or another similar city. That could also involve conducting a small area survey of a disadvantaged, predominantly black, neighborhood in a city with similar criminal justice policies and practices as Philadelphia in the period of Goffman’s study. This kind of research is painstaking, time consuming, and sorely needed exactly because surveys like the ACS don’t – and can’t – adequately describe or explain social life among the most disadvantaged who are most likely to be missing from such surveys.
I read this manuscript several times. It is more than a comment, it seems. It is 1) a critique of the description of survey methods in GASR and 2) a request for some action from ASR “to acknowledge errors when they occur.” The errors here have to do with Goffman’s description of survey methods in GASR, which the author describes in detail. This dual focus read as distracting at times. The manuscript would benefit from a more squarely focused critique of the description of survey methods in GASR.
Still, the author’s comment raises some valid concerns. The author’s primary concern is that the survey Goffman references in her 2009 ASR article is not described in enough detail to assess its accuracy or usefulness to a community of scholars. The author argues that some clarification is needed to properly understand the claims made in the book regarding the prevalence of men “on the run” and the degree to which the experience of the small group of men followed closely by Goffman is representative of most poor, Black men in segregated inner city communities. The author also cites a recent publication in which Goffman claims that the description provided in ASR is erroneous. If this is the case, it seems prudent for ASR to not only consider the author’s comments, but also to provide Goffman with an opportunity to correct the record.
I am not an expert in survey methods, but there are moments where the author’s interpretation of Goffman’s description seems overstated, which weakens the critique. For example, the author claims that Goffman is arguing that the entirety of the experience of the 6th Street crew is representative of the entire neighborhood, which is not necessarily what I gather from a close reading of GASR (although it may certainly be what has been taken up in popular discourse on the book). While there is overlap of the experience of being “on the run,” namely, your life is constrained in ways that it isn’t for those not on the run, it does appear that Goffman also uses the survey to describe a population that is distinct in important ways from the young men she followed on 6th street. The latter group has been “charged for more serious offenses like drugs and violent crimes,” she writes (this is the group that Sharkey argues might need to be “on the run”), while the larger group of men, whose information was gathered using survey data, were typically dealing with “more minor infractions”: “In the 6th Street neighborhood, a person was occasionally ‘on the run’ because he was a suspect in a shooting or robbery, but most people around 6th street had warrants out for far more minor infractions [emphasis mine].”
So, as I read it (I’ve also read the book), there are two groups: one “on the run” as a consequence of serious offenses and others “on the run” as a consequence of minor infractions. The consequence of being “on the run” is similar, even if the reason one is “on the run” varies.
The questions that remain are questions of prevalence and generalizability. The author asks: How many men in the neighborhood are “on the run” (for any reason)? How similar is this neighborhood to other neighborhoods? Answers to this question do rely on an accurate description of survey methods and data, as the author suggests.
This leads us to the most pressing and clearly argued question from the author: What is the survey population? Is it 1) “people around 6th Street” who also reside in the 6th Street neighborhood (of which, based on Goffman’s definition of in residence, are distributed across 217 distinct households in the neighborhood, however the neighborhood is defined e.g., 5 blocks or 6 blocks) or 2) the entirety of the neighborhood, which is made up of 217 households. It appears from the explanation from Goffman cited by the author that it is the former (“of the 217 households we interviewed,” which should probably read, of the 308 men we interviewed, all of whom reside in the neighborhood (based on Goffman’s definition of residence), 144 had a warrant…). Either way, the author makes a strong case for the need for clarification of this point.
The author goes on to explain the consequences of not accurately distinguishing among the two possibilities described above (or some other), but it seems like a good first step would be to request a clarification (the author could do this directly) and to allow more space than is allowed in a newspaper article to provide the type of explanation that could address the concerns of the author.
Is this the purpose of the comment? Or is the purpose of the comment merely to place a critique on record? The primary objective is not entirely clear in the present manuscript.
The author’s comment is strong enough to encourage ASR to think through possibilities for correcting the record. As a critique of the survey methods, the comment would benefit from more focus. The comment could also do a better job of contextualizing or comparing/contrasting the use of survey methods in GASR with other ethnographic studies that incorporate survey methods (at the moment such references appear in footnotes).
This comment exposes major errors in the survey methodology for Goffman’s article. One major flaw is that the goffman article describes the survey as inclusive of all households in the neighborhood but later, in press interviews, discloses that it is not representative of all households in the neighborhood. Another flaw that the author exposes is goffman’s data and methodological reporting not being up to par to sociological standards. Finally, the author argues that the data from the survey does not match the ethnographic data.
Overall, I agree with the authors assertions that the survey component is flawed. This is an important point because the article claims a large component of its substance from the survey instrument. The survey helped goffman to bolster generalizability , and arguably, garner worthiness of publication in ASR. If the massive errors in the survey had been exposed early on it is possible that ASR might have held back on publishing this article.
I am in agreement that ASR should correct the error highlighted on page 4 that the data set is not of the entire neighborhood but of random households/individuals given the survey in an informal way and that the sampling strategy should be described. Goffman should aknowledge that this was a non-representative convenience sample, used for bolstering field observations. It would follow then that the survey component of the ASR article would have to be rendered invalid and that only the field data in the article should be taken at face value. Goffman should also be asked to provide a commentary on her survey methodology.
The author points out some compelling anomalies from the goffman survey and general social survey data and other representative data. At best, goffman made serious mistakes with the survey and needs to be asked to show those mistakes and her survey methodology or she made up some of the data in the survey and appropriate action must be taken by ASR. I agree with the authors final assessment, that the survey results be disregarded and the article be republished without mention of such results or with mention of the results albeit showing all of its errors and demonstrating the survey methodology.
Regular readers can probably imagine my long, overblown, hyperventilating response to Reviewer 1, so I’ll just leave that to your imagination. On the bottom line, I disagree with the editors’ decision, but I can’t really blame them. Would it really be worth some number of pages in the journal, plus a reply and rejoinder, to hash this out? Within the constraints of the ASR format, maybe the pages aren’t worth it. And the result would not have been a definitive statement anyway, but rather just another debate among sociologists.
What else could they have done? Maybe it would have been better if the editors could simply append a note to the article advising readers that the survey is not accurately described, and cautioning against interpreting it as representative — with a link to the comment online somewhere explaining the problem. (Even so of course Goffman should have a chance to respond, and so on.)
It’s just wrong: the editors now acknowledge there is something wrong in their journal — although we seem to disagree about how serious the problem is — but no one is going to formally notify future readers of the article. That seems like bad scholarly communication. I’ve said from the beginning that there’s no need for a high-volume conversation about this, or an attack on anyone’s integrity or motives. There are important things in this research, and it’s also highly flawed. Acknowledge the errors — so they don’t compound — and move on.
This incident can teach us lessons with implications up and down the publishing system. Here are a couple. At the level of social science research reporting: don’t publish survey results without sufficient methodological documentation — let’s have the instrument and protocol, the code, and access to the data. At the system level of publishing: why do we still have journals with cost-defined page limits? Because for-profit publishing is more important than scholarly communication. The sooner we get out from under that 19th-century habit, the better.
It would be great to know more about everything, but if you ask just these five questions of enough people, you can learn an awful lot about marriage and divorce.
First the questions, then some data. These are the question wordings from the 2013 American Community Survey (ACS).
1. What is Person X’s age?
We’ll just take the people who are ages 15 to 59, but that’s optional.
2. What is this person’s marital status?
Surprisingly, we don’t want to know if they’re divorced, just if they’re currently married (I include people who are separated and those who live apart from their spouses for other reasons). This is the denominator in your basic “refined divorce rate,” or divorces per 1,000 married people.
3. In the past 12 months, did this person get divorced?
The number of people who got divorced in the last year is the numerator in your refined divorce rate. According to the ACS in 2013 (using population weights to scale the estimates up to the whole population), there were 127,571,069 married people, and 2,268,373 of them got divorced, so the refined divorce rate was 17.8 per 1,000 married people (there’s a sketch of this arithmetic after the list). When I analyze who got divorced, I’m going to mix all the currently-married and just-divorced people together, and then treat the divorces as an event, asking: who just got divorced?
4. In what year did this person last get married?
This is crucial for estimating divorce rates according to marriage duration. When you subtract this from the current year, that’s how long they are (or were) married. When you subtract the marriage duration from age, you get the age at marriage. (For example, a person who is 40 years old in 2013, who last got married in 2003, has a marriage duration of 10 years and an age at marriage of 30; there’s a second sketch of this below the list.)
5. How many times has this person been married?
I use this to narrow our analysis down to women in their first marriages, which is a conventional way of simplifying the analysis, but that’s optional.
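To make the rate mechanics concrete, here’s a minimal sketch of the question-3 arithmetic in Python (my actual code, linked in the footnotes, is in Stata; the counts are the weighted ACS figures quoted above):

```python
# Refined divorce rate from the weighted 2013 ACS counts quoted above.
married = 127_571_069   # currently married people (the denominator)
divorced = 2_268_373    # divorced in the past 12 months (the numerator)

rate = divorced / married * 1000
print(f"Refined divorce rate: {rate:.1f} per 1,000 married people")  # -> 17.8
```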
I restrict the analysis below to women, which is just a sexist convention for simplifying things (since men and women do things at different ages).*
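And here’s the question-4 arithmetic, which yields the two derived measures the table below is built on, using the worked example above:

```python
# Marriage duration and age at marriage, from the worked example above.
survey_year = 2013   # year the ACS was administered
age = 40             # question 1
year_married = 2003  # question 4

duration = survey_year - year_married  # 10 years married
age_at_marriage = age - duration       # married at age 30
print(duration, age_at_marriage)       # -> 10 30
```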
So here are the 375,249 women in the 2013 ACS public use file, ages 16-59, who were in their first marriages, or just divorced from their first marriages, by their age at marriage and marriage duration. Add the two numbers together and you get their current age. The colors let you see the basic distribution (click to enlarge):
The most populous cell on the table is 28-year-olds who got married three years ago, at age 25, with 1068 people. The least populous is 19-year-olds who got married at 15 (just 14 of them). The diagonal edge reflects my arbitrary cutoff at age 59.
Now, in each of these cells there are married people, and (in most of them) people who just got divorced. The ratio between those two frequencies is a divorce rate — one specific to the age at marriage and marriage duration. To make the next figure I used three years of ACS data (2011-2013) so the results would be smoother. (And then I smoothed it more by replacing each cell with an average of itself and the adjoining cells.) These are the divorce rates by age at marriage and years married (click to enlarge):
The overall pattern here is more green, or lower divorce rates, to the right (longer duration of marriage) and down (older age at marriage). So the big red patch is the first dozen years of marriages begun before the woman was age 25. And after about 25 years of marriage it’s pretty much green, for low divorce rates. The high contrast at the bottom left implies an interesting pattern for late marriages: high risk, but a steep decline over the first few years. This matrix adds nuance to the pattern I reported the other day, which featured a little bump up in divorce odds for people who married in their late thirties. From this figure it looks like marriages that start after the woman is about 35 might have less of a honeymoon period than those beginning at about ages 24-33.
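The smoothing step is easy to reproduce. Here’s a sketch, assuming a 3×3 moving window for the “average of itself and the adjoining cells”; the `rates` array is just placeholder data standing in for the real age-at-marriage-by-duration matrix:

```python
import numpy as np
from scipy.ndimage import uniform_filter

# Placeholder matrix: rows = age at marriage, columns = marriage duration.
rng = np.random.default_rng(0)
rates = rng.uniform(0, 40, size=(45, 44))

# Replace each cell with the mean of itself and its adjoining cells
# (a 3x3 window); mode='nearest' repeats border cells at the edges.
smoothed = uniform_filter(rates, size=3, mode='nearest')
```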
To learn more, I go beyond those five great questions, and use a regression model (same as the other day), with a (collapsed) marriage-age–by–marriage-duration matrix. So these are predicted divorce rates per 1000, holding education, race/ethnicity, and nativity constant (click to enlarge)**:
The controls cut down the late-thirties bump and isolate it mostly to the first year. This also shows that the punishing first year is an issue for all marriage ages over 35. The late thirties just showed the bump because that group doesn’t have the big drop in divorce after the first year that the older marriage ages do. Interesting!
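For the curious, here’s a rough Python sketch of that model and the margins step (the real analysis is the Stata code linked in the footnote; `df` and the variable names here are hypothetical stand-ins):

```python
import statsmodels.formula.api as smf

# df: one row per currently-married or just-divorced woman, with a 0/1
# `divorced` outcome, collapsed age-at-marriage (agemar_grp) and
# duration (dur_grp) groups, and the control variables.
model = smf.logit(
    "divorced ~ C(agemar_grp):C(dur_grp) + C(educ) + C(raceth) + C(nativity)",
    data=df,
).fit()

# Margins-style predictions: assign every observation to each cell, predict,
# and average, so the controls are held at their observed values.
rates = {}
for cell, _ in df.groupby(["agemar_grp", "dur_grp"]):
    counterfactual = df.assign(agemar_grp=cell[0], dur_grp=cell[1])
    rates[cell] = model.predict(counterfactual).mean() * 1000  # per 1,000
```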
Here’s where the awesome data let us down. This is a very powerful data set, the best contemporary big data set we have for analyzing divorce. It has taken us this far, but it can’t explain a pattern like this.
We can control for education, but that’s just the education level at the time of the most recent survey. We can’t know when she got her education relative to the dates of her marriage. Further, from the ACS we can’t tell how many children a person has had, with whom, and when — we only know about children who happen to be living in the household in 2013, so a 50-year-old could be childfree or have raised and released four kids already. And about couples, although we can say things about the other spouse from looking around in the household (such as his age, race, and income), if someone has divorced the spouse is gone and there is no information about that person (even their sex). So we can’t use that information to build a model of divorce predictors.
Here’s an example of what we can only hint at. Remarriages are more likely to end in divorce, for a variety of reasons, which is why we simplify these things by only looking at first marriages. But what about the spouse? Some of these women are married to men who’ve been married before. I can’t tell how much that contributes to their likelihood of divorce, but it almost certainly does. Think about the bump up in the divorce rate for women who got married in their late thirties. On the way from high divorce rates for women who marry early to low rates for women who marry late, the overall downward slope reflects increasing maturity and independence for women, but it’s running against the pressure of their increasingly complicated relationship situations. That late-thirties bump may have to do with the likelihood that their husbands have been married before. Here’s the circumstantial evidence:
See that big jump from early-thirties to late-thirties? All of a sudden 37.5% of women marrying in their late-thirties are marrying men who are remarrying. That’s a substantial risk factor for divorce, and one I can’t account for in my analysis (because we don’t have spouse information for divorced women).
Divorce is complicated and inherently longitudinal. Marriages arise out of specific contexts and thrive or decay in many different ways. Yesterday’s crucial influence may disappear today. So how can we say anything about divorce using a single, cross-sectional survey sample? The unsatisfying answer is that all analysis is partial. But these five questions give us a lot to go on, because knowing when a person got married allows us to develop a multidimensional image of the events, as I’ve demonstrated here.
But, you ask, what can we learn from, say, the divorce propensity of today’s 40-year-olds when we know that just last year a whole bunch of 39-year-olds divorced, skewing today’s sample? This is a real issue. And demography provides an answer that is at once partial and powerful. Simple: we use today’s 39-year-olds, too. In the purest form, this approach gives us the life table, in which one year’s mortality rates — at every age — lead to a projection of life expectancy. Another common application is the total fertility rate (watch the video!), which sums birth rates by age to project total births for a generation. In this case I have not produced a complete divorce life table (which I promised a while ago — it’s coming). But the approach is similar.
These are all synthetic cohort approaches (described nicely in the Week 6 lecture slides from this excellent Steven Ruggles course). In this case, the cohorts are age-at-marriage groups. Look at the table above and follow the row for, say, marriages that started at age 28, to see that synthetic cohort’s divorce experience from marriage until age 59. It’s neither a perfect depiction of the past, nor a foolproof prediction of the future. Rather, it tells us what’s happening now in cohort terms that are readily interpretable.
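A toy example of the synthetic-cohort logic (the rates below are invented, not estimates): chain one year’s duration-specific divorce rates for a single age-at-marriage group into a projected share ever divorced, just as a life table chains mortality rates into life expectancy.

```python
# Invented duration-specific divorce rates (per 1,000 married women)
# for one age-at-marriage group, durations 0 through 9.
rates_per_1000 = [30, 28, 25, 22, 20, 17, 15, 12, 10, 8]

surviving = 1.0
for r in rates_per_1000:
    surviving *= 1 - r / 1000  # chance the marriage survives this year

print(f"Projected share divorced within 10 years: {1 - surviving:.1%}")
```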
The ACS is the best thing we have for understanding the basic contours of divorce trends and patterns. Those five questions are invaluable.
* For this I also tossed the people who were reported to have married in the current year, because I wasn’t sure about the timing of their marriages and divorces, but I put them back in for the regressions.
** The codebook for my IPUMS data extraction is here, and my Stata code is here. The heat-map model isn’t in that code file, but these are the commands (the margins command took a very long time, so please don’t tell me there’s something wrong with it):
[SKIP TO THE END for a mystery-partly-solved addendum]
Normally when we teach demography we use population pyramids, which show how much of a population is found at each age. They’re great tools for visualizing population distributions and discussing projections of growth and decline. For example, consider this contrast between Niger and Japan, about as different as we get on earth these days (from this cool site):
It’s pretty easy to see the potential for population growth versus decline in these patterns. Finding good pyramids these days is easy, but it’s still good to make some yourself to get a feel for how they work.
So, thinking I might make a video lesson to follow up my blockbuster total fertility rate performance, I gathered some data from the U.S., using the 2013 American Community Survey (ACS) from IPUMS.org. I started with 10-year bins and the total population (not broken out by sex), which looks like this:
There’s the late Baby Boom, still bulging out at ages 50-59 (born 1954-1963), and their kids, ages 20-29. So far so good. But why not use single years of age and show something more precise? Here’s the same data, but showing single years of age:
That’s more fine-grained. Not as much as if you had data by months or days of birth, but still. Except, wait: is that just sample noise causing that ragged edge between 20 and about 70? The ACS sample is a few million people, with tens of thousands of people at each age (up to age 75, at least), so you wouldn’t expect too much of that. No, it’s definitely age heaping, the tendency of people to skew their age reporting according to some collective cognitive scheme. The most common form is piling up on ages ending in 0 and 5, but it could be anything. For example, some people might want to be 18, a socially significant milestone in this country. Here’s the same data, with suspect ages highlighted — 0’s and 5’s from 20 to 80, and 18:
You might think age heaping results from some old people not remembering how old they are. In the old days, rounding off was more common at older ages. In 1900, for example, the most implausible number of people was found at age 60 — 1.6 times as many as you’d get by averaging the number of people at ages 59 and 61. Is that still the case? Here it is again, but with the red/green highlights just showing the difference between the number of people reported and the number you’d get by averaging the numbers just above and below:
Proportionately, the 70-year-olds are most suspicious, at 10.8% more than you’d expect. But 40 is next, at 9.2%. And that green line shows extra 18-year-olds at 8.6% more than expected.
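That red/green comparison is simple to compute: divide the count at each age by the average of the counts just above and below it. A sketch, with made-up counts:

```python
import numpy as np

# counts[i] = people reporting age ages[i]; made-up numbers for illustration.
ages = np.arange(15, 81)
rng = np.random.default_rng(1)
counts = rng.integers(30_000, 40_000, size=ages.size).astype(float)

# Ratio of each age's count to the average of its two neighbors;
# values well above 1 suggest heaping at that age.
neighbor_avg = (counts[:-2] + counts[2:]) / 2
excess = counts[1:-1] / neighbor_avg

for a, e in zip(ages[1:-1], excess):
    if e > 1.05:  # flag anything 5% above the neighbor average
        print(a, round(e, 2))
```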
Unfortunately, it’s pretty hard to correct. Interestingly, the American Community Survey apparently asks for both an age and a birth date:
If you’re the kind of person who rounds off to 70, or promotes yourself to 18, it might not be worth the trouble to actually enter a fake birth date. I’m sure the Census Bureau does something with that, like correct obvious errors, but I don’t think they attempt to correct age-heaping in the ACS (the birth dates aren’t on the public use files). Anyway, we can see a little of the social process by looking at different groups of people.
Up till now I’ve been using the full public use data, with population weights, and including those people who left age blank or entered something implausible enough that the Census Bureau gave them an age (an “allocated” value, in survey parlance). For this I just used the unweighted counts of people whose answers were accepted “as written” (or typed, or spoken over the phone, depending on how it was administered to them). Here are the patterns for people who didn’t finish high school versus those with a bachelor’s degree or higher, highlighting the 5’s and 0’s (click to enlarge):
Clearly, the age heaping is more common among those with less education. Whether it’s really people forgetting their age, rounding up or down for aspirational reasons, or having trouble with the survey administration, I don’t know.
Is this bad? As much as we all hate inaccuracy, this isn’t so bad. Fortunately, demographers have methods for assessing the damage caused by humans and their survey-taking foibles. In this case we can use Whipple’s index. This measure (defined in this handy United Nations slideshow) takes the number of people whose alleged ages end in 0 or 5, multiplies it by 5, and compares the result to the total population. Normally people use ages 23 to 62 (inclusive), for an even 40 years: take the people reporting ages 25, 30, 35, 40, 45, 50, 55, and 60 as a share of the population ages 23-62, multiply by 5, and then by 100, and that’s your Whipple’s index. A score of 100 is perfect, and a score of 500 means everyone’s heaped. The U.N. considers scores under 105 to be “very accurate data.” The 2013 ACS, using the public use file and the weights, gives me a score of 104.3. (Those unweighted distributions by education yield scores of 104.0 for high school dropouts and 101.7 for college graduates.) In contrast, the Decennial Census in 2010 had a score of just 101.5 by my calculation (using table QT-P2 from Summary File 1). With the size of the ACS, this difference shouldn’t be due to sampling variation. Rather, it’s something about the administration of the survey.
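Here’s that definition as a function, a minimal sketch (`counts_by_age` maps single years of age to population counts):

```python
def whipple(counts_by_age):
    """Whipple's index: 100 = no heaping on 0s and 5s; 500 = total heaping."""
    total = sum(n for age, n in counts_by_age.items() if 23 <= age <= 62)
    heaped = sum(n for age, n in counts_by_age.items()
                 if 23 <= age <= 62 and age % 5 == 0)
    return 5 * heaped / total * 100
```

A perfectly flat distribution scores exactly 100: `whipple({a: 1000 for a in range(23, 63)})` returns 100.0.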
Why don’t they just tell us how old they really are? There must be a reason.
The age 18 pattern is interesting — I don’t find any research on desirable young-adult ages skewing sample surveys.
This is all very different from birth timing issues, such as the Chinese affinity for births in dragon years (every twelfth year: 1976, 1988…). I don’t see anything in the U.S. pattern that fits fluctuations in birth rates.
I focused on education above, but another explanation was staring me in the face. I said “it’s something about the administration of the survey,” but didn’t think to check the form of survey people took. The public use files for the ACS include an indicator of whether the household respondent took the survey through the mail (28%), on the web (39%), through a bureaucrat at the institution where they live (group quarters; 5%), or in an interview with a Census worker (28%). This last method, which is either a computer-assisted telephone interview (CATI) or computer-assisted personal interview (CAPI), is used when people don’t respond to the mailed survey.
It turns out that the entire Whipple problem in the 2013 ACS is due to the CATI/CAPI interviews. The age distributions for all of the other three methods have Whipple index scores below 100, while the CATI/CAPI folks clock in at a whopping 108.3. Here is that distribution, again using unweighted cases:
There they are, your Whipple participants. Who are they, and why does this happen? Here is the Bureau’s description of the survey data collection:
The data collection operation for housing units (HUs) consists of four modes: Internet, mail, telephone, and personal visit. For most HUs, the first phase includes a mailed request to respond via Internet, followed later by an option to complete a paper questionnaire and return it by mail. If no response is received by mail or Internet, the Census Bureau follows up with computer assisted telephone interviewing (CATI) when a telephone number is available. If the Census Bureau is unable to reach an occupant using CATI, or if the household refuses to participate, the address may be selected for computer-assisted personal interviewing (CAPI).
So the CATI/CAPI people are those who were either difficult to reach or uncooperative when contacted. This group, incidentally, has low average education: 63% have a high school education or less (compared with 55% of the total), which may explain the association with education. Maybe they have less accurate recall, or maybe they are less cooperative, which makes sense if they didn’t want to do the survey in the first place (which they are legally mandated — i.e., coerced — to do). So when their date of birth and age conflict, and the Census worker tries to elicit a correction, maybe all hell breaks loose in the interview and they can’t work it out. Or maybe the CATI/CAPI households have more people who don’t know each other’s exact ages (one person answers for the household). I don’t know. But this narrows it down considerably.
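For what it’s worth, given the `whipple()` function above, the by-mode comparison is a short groupby; a sketch assuming a person-level data frame with hypothetical `age` and `resp_mode` columns:

```python
# Unweighted Whipple scores within each response mode (the column names
# are hypothetical stand-ins for the ACS public-use variables).
scores = {
    mode: whipple(group["age"].value_counts().to_dict())
    for mode, group in df.groupby("resp_mode")
}
print(scores)  # the CATI/CAPI group should stand out, as described above
```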
Jesse Singal at New York Magazine‘s Science of Us has a piece in which he tracks down and interviews a number of Alice Goffman’s respondents. This settles the question — which never should have been a real question — about whether she actually did all that deeply embedded ethnography in Philadelphia. It leaves completely unresolved, however, the issue of the errors and possible errors in the research. This reaffirms for me the conclusion in my original review that we should take the volume down in this discussion, identify errors in the research without trying to attack Goffman personally or delegitimize her career — and then learn from the affair ways that we can improve sociology (for example, by requiring that winners of the American Sociological Association dissertation award make their work publicly available).
That said, I want to comment on a couple of issues raised in Singal’s piece, and share my draft of a formal comment on the survey research Goffman reported in American Sociological Review.
First, I want to distance myself from the description by Singal of “lawyers and journalists and rival academics who all stand to benefit in various ways if they can show that On the Run doesn’t fully hold up.” I don’t see how I (or any other sociologists) benefit if Goffman’s research does not hold up. In fact, although some people think this is worth pursuing, I am also annoying some friends and colleagues by doing this.
More importantly, although it’s a small part of the article, Singal did ask Goffman about the critique of her survey, and her response (as he paraphrased it, anyway) was not satisfying to me:
Philip Cohen, a sociologist at the University of Maryland, published a blog post in which he puzzles over the strange results of a door-to-door survey Goffman says she conducted with Chuck in 2007 in On the Run. The results are implausible in a number of ways. But Goffman explained to me that this wasn’t a regular survey; it was an ethnographic survey, which involves different sampling methods and different definitions of who is and isn’t in a household. The whole point, she said, was to capture people who are rendered invisible by traditional survey methods. (Goffman said an error in the American Sociological Review paper that became On the Run is causing some of the confusion — a reference to “the 217 households that make up the 6th Street neighborhood” that should have read “the 217 households that we interviewed … ” [emphasis mine]. It’s a fix that addresses some of Cohen’s concerns, like an implied and very unlikely 100 percent response rate, but not all of them.) “I should have included a second appendix on the survey in the book,” said Goffman. “If I could do it over again, I would.”
My responses are several. First, the error of describing the 217 households as the whole neighborhood, as well as the error in the book of saying she interviewed all 308 men (when in the ASR article she reports some unknown number were absent), both go in the direction of inflating the value and quality of the survey. Maybe they are random errors, but they didn’t have a random effect.
Second, I don’t see a difference between a “regular survey” and an “ethnographic survey.” There are different survey techniques for different applications, and the techniques used determine the data and conclusions that follow. For example, in the ASR article Goffman uses the survey (rather than Census data) to report the racial composition of the neighborhood, which is not something you can do with a convenience sample, regardless of whether you are engaged in an ethnography or not.
Finally, there are no people “rendered invisible by traditional survey methods” (presumably Singal’s phrase). There are surveys that are better or worse at including people in different situations. There are “traditional” surveys — of varying quality — of homeless people, prisoners, rape victims, and illiterate peasants. I don’t know what an “ethnographic survey” is, but I don’t see why it shouldn’t include a sampling strategy, a response rate, a survey instrument, a data sharing arrangement, and thorough documentation of procedures. That second methodological appendix can be published at any time.
ASR Comment (revised June 22)
I wrote up my relatively narrow, but serious, concerns about the survey, and posted them on SocArXiv, here.
It strikes me that Goffman’s book (either the University of Chicago Press version or the trade version) may not have been subject to the level of scrutiny that her article in ASR should have received. In fact, the book publishers presumably took her publication in ASR as evidence of the work’s quality, and their interests are different from those of a scientific journal run by an academic society. If ASR is going to play that gatekeeping role, and it should, then ASR (and by extension ASA) should take responsibility in print for errors in its publications.