
Black men raping White women: BJS’s Table 42 problem

I’ve been putting off writing this post because I wanted to do more justice both to the history of the Black-men-raping-White-women charge and the survey methods questions. Instead I’m just going to lay this here and hope it helps someone who is more engaged than I am at the moment. I’m sorry this post isn’t higher quality.

Obviously, this post includes extremely racist and misogynist content, which I am showing you to explain why it’s bad.

This is about this very racist meme, which is extremely popular among extreme racists.


The modern racist uses statistics, data, and even math. They use citations. And I think it takes actually engaging with this stuff to stop it (this is untested, though, as I have no real evidence that facts help). That means anti-racists need to learn some demography and survey methods, and practice them in public. I was prompted to finally write on this by a David Duke video streamed on Facebook, in which he used exaggerated versions of these numbers, and the good Samaritans arguing with him did not really know how to respond.

For completely inadequate context: For a very long time, Black men raping White women has been White supremacists’ single favorite thing. This was the most common justification for lynching, and for many of the legal executions of Black men throughout the 20th century. From 1930 to 1994 there were 455 people executed for rape in the U.S., and 89% of them were Black (from the 1996 Statistical Abstract):


For some people, this is all they need to know about how bad the problem of Blacks raping Whites is. For better informed people, it’s the basis for a great lesson in how the actions of the justice system are not good measures of the crimes it’s supposed to address.

Good data gone wrong

This is one reason the government conducts the National Crime Victimization Survey (NCVS), a large sample survey of about 90,000 households including 160,000 people. It asks about crimes against the people surveyed, and the answers it yields are usually pretty different from what’s in the crime report statistics – and even further from the statistics on things like convictions and incarceration. It’s supposed to be a survey of crime as experienced, not as reported or punished.

It’s an important survey that yields a lot of good information. But in this case the Bureau of Justice Statistics is doing a serious disservice in the way they are reporting the results, and they should do something about it. I hope they will consider it.

Like many surveys, the NCVS is weighted to produce estimates that are supposed to reflect the general population. In a nutshell, that means, for example, that they treat each of the 158,000 people (over age 12) covered in 2014 as about 1,700 people. So if one person said, “I was raped,” they would say, “1700 people in the US say they were raped.” This is how sampling works. In fact, they tweak it much more than that, to make the numbers add up according to population distributions of variables like age, sex, race, and region – and non-response, so that if a certain group (say Black women) has a low response rate, their responses get goosed even more. This is reasonable and good, but it requires care in reporting to the general public.
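As a sketch of that arithmetic (the real NCVS weighting is far more elaborate, and the population total here is an assumed round number, not the official control total):

```python
# Rough sketch of survey weighting: each sampled person "stands in" for
# population / sample_size people. Real NCVS weights also adjust for age,
# sex, race, region, and non-response.

sample_size = 158_000        # NCVS persons age 12+ interviewed in 2014
population = 265_000_000     # assumed round U.S. population age 12+

base_weight = population / sample_size
print(round(base_weight))    # ~1,700 people per respondent

# So one respondent reporting a rape becomes ~1,700 estimated victims:
print(round(1 * base_weight))
```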

So, how is the Bureau of Justice Statistics’ (BJS) reporting method contributing to the racist meme above? The racists love to cite Table 42 of this report, which last came out for the 2008 survey. This is the source for David Duke’s rant, and the many, many memes about this. The results of a Google image search give you a sense of how many websites are distributing this:


Here is Table 42, with my explanation below:


What this shows is that, based on their sample, BJS extrapolates an estimate of 117,640 White women who say they were sexually assaulted, or threatened with sexual assault, in 2008 (in the red box). Of those, 16.4% described their assailant as Black (the blue highlight). That works out to 19,293 White women sexually assaulted or threatened by Black men in one year – White supremacists do math. In the 2005 version of the table these numbers were 111,490 and 33.6%, for 37,460 White women sexually assaulted or threatened by Black men, or:
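The meme arithmetic is nothing more than multiplying the published extrapolation by the published percentage. A quick check, using the figures from the two tables:

```python
# Reproducing the numbers the memes cite from Table 42.
est_2008 = 117_640       # estimated White women sexually assaulted/threatened, 2008
pct_black_2008 = 0.164   # share describing their assailant as Black

est_2005 = 111_490
pct_black_2005 = 0.336

print(round(est_2008 * pct_black_2008))  # 19293
print(round(est_2005 * pct_black_2005))  # ~37,460 as reported (37461 with rounding)
```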


Now, go back to the structure of the survey. If each respondent in the survey counts for about 1,700 people, then the survey in 2008 would have found 69 White women who were sexually assaulted or threatened (117,640/1,700), 11 of whom said their assailant was Black. Actually, though, we know it was fewer than 11, because the asterisk on the table takes you to the footnote below, which says it was based on 10 or fewer sample cases. In comparison, the survey may have found 27 Black women who said they were sexually assaulted or threatened (46,580/1,700), none of whom said their attacker was White, which is why the second blue box shows 0.0. However, it actually looks like the weights are bigger for Black women, because the figure for the percentage assaulted or threatened by Black attackers, 74.8%, has the asterisk that indicates 10 or fewer cases. If there were 27 Black women in this category, then 74.8% of them would be 20. So this whole Black-women victim sample might be as few as 13, with bigger weights applied (because, say, Black women had a lower response rate). If in fact Black women are just as likely as White women to be assaulted or threatened by a man of the other race – 16% – you might expect only 2 of those 13 to report a White attacker, so finding 0 in the sample is not very surprising. The actual weighting scheme is clearly much more complicated, and I don’t know the unweighted counts, as they are not reported here (and I didn’t analyze the individual-level data).
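Working backward with a flat weight of 1,700 per respondent (a simplification – actual weights vary from person to person) recovers the approximate unweighted counts behind the table:

```python
WEIGHT = 1_700  # rough persons-per-respondent; actual NCVS weights vary

white_est = 117_640                    # weighted estimate, White women victims
white_n = white_est / WEIGHT           # ~69 sample cases
black_assailant_n = 0.164 * white_n    # ~11 (footnote says 10 or fewer actual cases)

black_est = 46_580                     # weighted estimate, Black women victims
black_n = black_est / WEIGHT           # ~27 sample cases

print(round(white_n), round(black_assailant_n), round(black_n))  # 69 11 27
```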

I can’t believe we’re talking about this. The most important bottom line is that the BJS should not report extrapolations to the whole population from samples this small. These population numbers should not be on this table. At best these numbers are estimated with very large standard errors. (Using a standard confidence interval calculator, that 16% of White women, based on a sample of 69, yields a confidence interval of +/- 9%.) It’s irresponsible, and it’s inadvertently (I assume) feeding White supremacist propaganda.
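For the record, that margin-of-error arithmetic is just the standard (Wald) interval for a proportion. This sketch also ignores the survey’s complex design, which would make the true interval even wider:

```python
import math

p = 0.164   # proportion of White victims reporting a Black assailant
n = 69      # approximate unweighted sample size
z = 1.96    # critical value for 95% confidence

# Wald interval: p +/- z * sqrt(p(1-p)/n)
margin = z * math.sqrt(p * (1 - p) / n)
print(round(margin, 2))   # 0.09, i.e. roughly +/- 9 percentage points
```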

Rape and sexual assault are very disturbingly common, although not as common as they were a few decades ago, by conventional measures. But it’s a big country, and I don’t doubt that lots of Black men sexually assault or threaten White women, and that White men sexually assault or threaten Black women a lot, too – certainly more than never. If we knew the true numbers, they would be bad. But we don’t.

A couple more issues to consider. Most sexual assault happens within relationships, and interracial relationships are relatively rare. In round numbers (based on marriages), 2% of White women are with Black men, and 5% of Black women are with White men, which – because of population sizes – means there are more than twice as many Black-man/White-woman couples as the reverse. At very small sample sizes, this matters a lot. But we would expect there to be more Black-man/White-woman rape than the reverse based on this pattern alone. Consider further that the NCVS is a household sample, which means that if any Black women are sexually assaulted by White men in prison, it wouldn’t be included. Based on a 2011-2012 survey of prison and jail inmates, 3,500 women per year are the victims of staff sexual misconduct, and Black women inmates were about 50% more likely to report this than White women. So I’m guessing the true number of Black women sexually assaulted by White men is somewhat greater than zero, and that’s just in prisons and jails.
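To see the couples arithmetic, apply those percentages to counts of married women by race. The 50 million and 7 million figures below are my round assumptions for scale, not census values:

```python
white_married_women = 50_000_000   # assumed round figure, not a census value
black_married_women = 7_000_000    # assumed round figure, not a census value

bm_ww = 0.02 * white_married_women   # Black-man/White-woman couples: 1,000,000
wm_bw = 0.05 * black_married_women   # White-man/Black-woman couples: 350,000

# The lower percentage applied to the much larger group still wins:
print(round(bm_ww / wm_bw, 1))   # ~2.9, more than twice as many
```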

The BJS seems to have stopped releasing this form of the report, with Table 42, maybe because of this kind of problem – which would be great. In that case they just need to put out a statement clarifying and correcting the old reports, which they should still do, because those reports are still out there. (The more recent reports are skimpier and don’t get into this much detail [e.g., 2014] – and their custom table tool doesn’t allow you to specify the perceived race of the offender.)

So, next time you’re arguing with David Duke, the simplest response to this is that the numbers he’s talking about are based on very small samples, and the asterisk means he shouldn’t use the number. The racists won’t take your advice, but it’s good for everyone else to know.


Filed under In the news

Comment on Goffman’s survey, American Sociological Review rejection edition


Peer Review, by Gary Night. https://flic.kr/p/c2WH2E


  • I reviewed Alice Goffman’s book, On The Run.
  • I complained that her dissertation was not made public, despite being awarded the American Sociological Association’s dissertation prize. I proposed a rule change for the association, requiring that the winning dissertation be “publicly available through a suitable academic repository by the time of the ASA meeting at which the award is granted.” (The rule change is moving through the process.)
  • When her dissertation was released, I complained about the rationale for the delay.
  • My critique of the survey that was part of her research grew into a formal comment (PDF) submitted to American Sociological Review.

In this post I don’t have anything to add about Alice Goffman’s work. This is about what we can learn from this and other incidents to improve our social science and its contribution to the wider social discourse. As Goffman’s TED Talk passed 1 million views, we have had good conversations about replicability and transparency in research, and about ethics in ethnography. And of course about the impact of the criminal justice system and over-policing on African Americans, the intended target of her work. This post is about how we deal with errors in our scholarly publishing.

My comment was rejected by the American Sociological Review.

You might not realize this, but unlike many scientific journals, except for “errata” notices, which are for typos and editing errors, ASR has no normal way of acknowledging or correcting errors in research. To my knowledge ASR has never retracted an article or published an editor’s note explaining how an article, or part of an article, is wrong. Instead, they publish Comments (and Replies). The Comments are submitted and reviewed anonymously by peer reviewers just like an article, and then if the Comment is accepted the original author responds (maybe followed by a rejoinder). It’s a cumbersome and often combative process, one that mixes theoretical with methodological critiques. And it creates a very high hurdle to leap, and a long delay, before the journal can correct itself.

In this post I’ll briefly summarize my comment, then post the ASR editors’ decision letter and reviews.

Comment: Survey and ethnography

I wrote the comment about Goffman’s 2009 ASR article for accountability. The article turned out to be the first step toward a major book, so ASR played a gatekeeping role for a much wider reading audience, which is great. But then it should take responsibility to notify readers about errors in its pages.

My critique boiled down to these points:

  • The article describes the survey as including all households in the neighborhood, which is not the case, and used statistics from the survey to describe the neighborhood (its racial composition and rates of government assistance), which is not justified.
  • The survey includes some number (probably a lot) of men who did not live in the neighborhood, but who were described as “in residence” in the article, despite being “absent because they were in the military, at job training programs (like JobCorp), or away in jail, prison, drug rehab centers, or halfway houses.” There is no information about how or whether such men were contacted, or how the information about them was obtained (or how many in her sample were not actually “in residence”).
  • The survey results are incongruous with the description of the neighborhood in the text, and — when compared with data from other sources — describe an apparently anomalous social setting. She reported finding more than twice as many men (ages 18-30) per household as the Census Bureau reports from their American Community Survey of Black neighborhoods in Philadelphia (1.42 versus .60 per household). She reported that 39% of these men had warrants for violating probation or parole in the prior three years. Using some numbers from other sources on violation rates, that translates into between 65% and 79% of the young men in the neighborhood being on probation or parole — very high for a neighborhood described as “nice and quiet” and not “particularly dangerous or crime-ridden.”
  • None of this can be thoroughly evaluated because the reporting of the data and methodology for the survey were inadequate to replicate or even understand what was reported.

You can read my comment here in PDF. Since I aired it out on this blog before submitting it, making it about as anonymous as a lot of other peer-review submissions, I see no reason to shroud the process any further. The editors’ letter I received is signed by the current editors — Omar Lizardo, Rory McVeigh, and Sarah Mustillo — although I submitted the piece before they officially took over (the editors at the time of my submission were Larry W. Isaac and Holly J. McCammon). The reviewers are of course anonymous. My final comment is at the end.

ASR letter and reviews

Editors’ letter:


Dear Prof. Cohen:

The reviews are in on your manuscript, “Survey and ethnography: Comment on Goffman’s ‘On the Run’.” After careful reading and consideration, we have decided not to accept your manuscript for publication in American Sociological Review (ASR).  Our decision is based on the reviewers’ comments, our reading of the manuscript, an overall assessment of the significance of the contribution of the manuscript to sociological knowledge, and an estimate of the likelihood of a successful revision.

As you will see, there was a range of opinions among the reviewers of your submission.  Reviewer 1 feels strongly that the comment should not be published, reviewer 3 feels strongly that it should be published, and reviewer 2 falls in between.  That reviewer sees merit in the criticisms but also suggests that the author’s arguments seem overstated in places and stray at times from discussion that is directly relevant to a critique of the original article’s alleged shortcomings.

As editors of the journal, we feel it is essential that we focus on the comment’s critique of the original ASR article (which was published in 2009), rather than the recently published book or controversy and debate that is not directly related to the submitted comment.  We must consider not only the merits of the arguments and evidence in the submitted comment, but also whether the comment is important enough to occupy space that could otherwise be used for publishing new research.  With these factors in mind, we feel that the main result that would come from publishing the comment would be that valuable space in the journal would be devoted to making a point that Goffman has already acknowledged elsewhere (that she did not employ probability sampling).

As the author of the comment acknowledges, there is actually very little discussion of, or use of, the survey data in Goffman’s article.   We feel that the crux of the argument (about the survey) rests on a single sentence found on page 342 of the original article:  “The five blocks known as 6th street are 93 percent Black, according to a survey of residents that Chuck and I conducted in 2007.”  The comment author is interpreting that to mean that Goffman is claiming she conducted scientific probability sampling (with all households in the defined space as the sampling frame).  It is important to note here that Goffman does not actually make that claim in the article.  It is something that some readers might infer.  But we are quite sure that many other readers simply assumed that this is based on nonprobability sampling or convenience sampling.  Goffman speaks of it as a survey she conducted when she was an undergraduate student with one of the young men from the neighborhood.  Given that description of the survey, we expect many readers assumed it was a convenience sample rather than a well-designed probability sample.  Would it have been better if Goffman had made that more explicit in the original article?  Yes.

In hindsight, it seems safe to say that most scholars (probably including Goffman) would say that the brief mentions of the survey data should have been excluded from the article.  In part, this is because the reported survey findings play such a minor role in the contribution that the paper aims to make.

We truly appreciate the opportunity to review your manuscript, and hope that you will continue to think of ASR for your future research.


Omar Lizardo, Rory McVeigh, and Sarah Mustillo

Editors, American Sociological Review

Reviewer: 1

This paper seeks to provide a critique of the survey data employed in Goffman (2009).  Drawing on evidence from the American Community Survey, the author argues that data presented in Goffman (2009) about the community in which she conducted her ethnography is suspect.  The author draws attention to remarkably high numbers of men living in households (compared with estimates derived from ACS data) and what s/he calls an “extremely high number” of outstanding warrants reported by Goffman.  S/he raises the concern that Goffman (2009) did not provide readers with enough information about the survey and its methodology for them to independently evaluate its merits and thus, ultimately, calls into question the generalizability of Goffman’s survey results.

This paper joins a chorus of critiques of Goffman’s (2009) research and subsequent book.  This critique is novel in that the critique is focused on the survey aspect of the research rather than on Goffman’s persona or an expressed disbelief of or distaste for her research findings (although that could certainly be an implication of this critique).

I will not comment on the reliability, validity or generalizability of Goffman’s (2009) evidence, but I believe this paper is fundamentally flawed.  There are two key problems with this paper.  First the core argument of the paper (critique) is inadequately situated in relation to previous research and theory.  Second, the argument is insufficiently supported by empirical evidence.

The framing of the paper is not aligned with the core empirical aims of the paper.  I’m not exactly sure what to recommend here because it seems as if this is written for a more general audience and not a sociological one.  It strikes me as unusual, if not odd, to reference the popularity of a paper as a motivation for its critique.  Whether or not Goffman’s work is widely cited in sociological or other circles is irrelevant for this or any other critique of the work.  All social science research should be held to the same standards and each piece of scholarship should be evaluated on its own merits.

I would recommend that the author better align the framing of the paper with its empirical punchline.  In my reading the core criticism of this paper is that the Goffman (2009) has not provided sufficient information for someone to replicate or validate her results using existing survey data.  Although it may be less flashy, it seems more appropriate to frame the paper around how to evaluate social science research.  I’d advise the author to tone down the moralizing and discussion of ethics.  If one is to levy such a strong (and strongly worded) critique, one needs to root it firmly in established methods of social science.

That leads to the second, and perhaps even more fundamental, flaw.  If one is to levy such a strong (and strongly worded) critique, one needs to provide adequate empirical evidence to substantiate her/his claims.  Existing survey data from the ACS are not designed to address the kinds of questions Goffman engages in the paper and thus it is not appropriate for evaluating the reliability or validity of her survey research.  Numerous studies have established that large scale surveys like the ACS under-enumerate black men living in cities.  They fall into the “hard-to-reach” population that evade survey takers and census enumerators.  Survey researchers widely acknowledge this problem and Goffman’s research, rather than resolving the issue, raises important questions about the extent to which the criminal justice system may contribute to difficulties for conventional social science research data collection methods.  Perhaps the author can adopt a different, more scholarly, less authoritative, approach and turn the inconsistencies between her/his findings with the ACS and Goffman’s survey findings into a puzzle.  How can these two surveys generate such inconsistent findings?

Just like any survey, the ACS has many strengths.  But, the ACS is not well-suited to construct small area estimates of hard-to-reach populations.  The author’s attempt to do so is laudable but the simplicity of her/his analysis trivializes the difficulty in reaching some of the most disadvantaged segments of the population in conventional survey research.  It also trivializes one of the key insights of Goffman’s work and one that has been established previously and replicated by others: criminal justice contact fundamentally upends social relationships and living arrangements.

Furthermore, the ACS doesn’t ask any questions about criminal justice contact in a way that can help establish the validity of results for disadvantaged segments of the population who are most at-risk of criminal justice contact.  It is impossible to determine using the ACS how many men (or women) in the United States, Pennsylvania, or Philadelphia (or any neighborhood therein), have an outstanding warrant.  The ACS doesn’t ask about criminal justice contact, it doesn’t ask about outstanding warrants, and it isn’t designed to tap into the transient experiences of many people who have had criminal justice contact.  The author provides no data to evaluate the validity of Goffman’s claims about outstanding warrants.  Advancements in social science cannot be established from a “she said”, “he said” debate (e.g., FN 9-10).  That kind of argument risks a kind of intellectual policing that is antithetical to established standards of evaluating social science research.  That being said, someone should collect this evidence or at a minimum estimate, using indirect estimation methods, what fraction of different socio-demographic groups have outstanding warrants.

Although I believe that this paper is fundamentally flawed both in its framing and provision of evidence, I would like to encourage the author to replicate Goffman’s research.  That could involve an extended ethnography in a disadvantaged neighborhood in Philadelphia or another similar city.  That could also involve conducting a small area survey of a disadvantaged, predominantly black, neighborhood in a city with similar criminal justice policies and practices as Philadelphia in the period of Goffman’s study.  This kind of research is painstaking, time consuming, and sorely needed exactly because surveys like the ACS don’t – and can’t – adequately describe or explain social life among the most disadvantaged who are most likely to be missing from such surveys.

Reviewer: 2

I read this manuscript several times. It is more than a comment, it seems. It is 1) a critique of the description of survey methods in GASR and 2) a request for some action from ASR “to acknowledge errors when they occur.” The errors here have to do with Goffman’s description of survey methods in GASR, which the author describes in detail. This dual focus read as distracting at times. The manuscript would benefit from a more squarely focused critique of the description of survey methods in GASR.

Still, the author’s comment raises some valid concerns. The author’s primary concern is that the survey Goffman references in her 2009 ASR article is not described in enough detail to assess its accuracy or usefulness to a community of scholars. The author argues that some clarification is needed to properly understand the claims made in the book regarding the prevalence of men “on the run” and the degree to which the experience of the small group of men followed closely by Goffman is representative of most poor, Black men in segregated inner city communities. The author also cites a recent publication in which Goffman claims that the description provided in ASR is erroneous. If this is the case, it seems prudent for ASR to not only consider the author’s comments, but also to provide Goffman with an opportunity to correct the record.

I am not an expert in survey methods, but there are moments where the author’s interpretation of Goffman’s description seems overstated, which weakens the critique. For example, the author claims that Goffman is arguing that the entirety of the experience of the 6th Street crew is representative of the entire neighborhood, which is not necessarily what I gather from a close reading of GASR (although it may certainly be what has been taken up in popular discourse on the book). While there is overlap of the experience of being “on the run,” namely, your life is constrained in ways that it isn’t for those not on the run, it does appear that Goffman also uses the survey to describe a population that is distinct in important ways from the young men she followed on 6th street. The latter group has been “charged for more serious offenses like drugs and violent crimes,” she writes (this is the group that Sharkey argues might need to be “on the run”), while the larger group of men, whose information was gathered using survey data, were typically dealing with “more minor infractions”: “In the 6th Street neighborhood, a person was occasionally ‘on the run’ because he was a suspect in a shooting or robbery, but most people around 6th street had warrants out for far more minor infractions [emphasis mine].”

So, as I read it (I’ve also read the book), there are two groups: one “on the run” as a consequence of serious offenses and others “on the run” as a consequence of minor infractions. The consequence of being “on the run” is similar, even if the reason one is “on the run” varies.

The questions that remain are questions of prevalence and generalizability. The author asks: How many men in the neighborhood are “on the run” (for any reason)? How similar is this neighborhood to other neighborhoods? Answers to this question do rely on an accurate description of survey methods and data, as the author suggests.

This leads us to the most pressing and clearly argued question from the author: What is the survey population? Is it 1) “people around 6th Street” who also reside in the 6th Street neighborhood (of which, based on Goffman’s definition of in residence, are distributed across 217 distinct households in the neighborhood, however the neighborhood is defined e.g., 5 blocks or 6 blocks) or 2) the entirety of the neighborhood, which is made up of 217 households. It appears from the explanation from Goffman cited by the author that it is the former (“of the 217 households we interviewed,” which should probably read, of the 308 men we interviewed, all of whom reside in the neighborhood (based on Goffman’s definition of residence), 144 had a warrant…). Either way, the author makes a strong case for the need for clarification of this point.

The author goes on to explain the consequences of not accurately distinguishing among the two possibilities described above (or some other), but it seems like a good first step would be to request a clarification (the author could do this directly) and to allow more space than is allowed in a newspaper article to provide the type of explanation that could address the concerns of the author.

Is this the purpose of the comment? Or is the purpose of the comment merely to place a critique on record?  The primary objective is not entirely clear in the present manuscript.

The author’s comment is strong enough to encourage ASR to think through possibilities for correcting the record. As a critique of the survey methods, the comment would benefit from more focus. The comment could also do a better job of contextualizing or comparing/contrasting the use of survey methods in GASR with other ethnographic studies that incorporate survey methods (at the moment such references appear in footnotes).

Reviewer: 3

This comment exposes major errors in the survey methodology for Goffman’s article.  One major flaw is that the Goffman article describes the survey as inclusive of all households in the neighborhood but later, in press interviews, discloses that it is not representative of all households in the neighborhood.  Another flaw that the author exposes is Goffman’s data and methodological reporting not being up to par with sociological standards.  Finally, the author argues that the data from the survey does not match the ethnographic data.

Overall, I agree with the author’s assertions that the survey component is flawed.  This is an important point because the article claims a large component of its substance from the survey instrument.  The survey helped Goffman to bolster generalizability, and arguably, garner worthiness of publication in ASR.  If the massive errors in the survey had been exposed early on it is possible that ASR might have held back on publishing this article.

I am in agreement that ASR should correct the error highlighted on page 4 that the data set is not of the entire neighborhood but of random households/individuals given the survey in an informal way and that the sampling strategy should be described.  Goffman should acknowledge that this was a non-representative convenience sample, used for bolstering field observations.  It would follow then that the survey component of the ASR article would have to be rendered invalid and that only the field data in the article should be taken at face value.  Goffman should also be asked to provide a commentary on her survey methodology.

The author points out some compelling anomalies from the Goffman survey and general social survey data and other representative data.  At best, Goffman made serious mistakes with the survey and needs to be asked to show those mistakes and her survey methodology, or she made up some of the data in the survey and appropriate action must be taken by ASR.  I agree with the author’s final assessment, that the survey results be disregarded and the article be republished without mention of such results or with mention of the results albeit showing all of its errors and demonstrating the survey methodology.

My response

Regular readers can probably imagine my long, overblown, hyperventilating response to Reviewer 1, so I’ll just leave that to your imagination. On the bottom line, I disagree with the editors’ decision, but I can’t really blame them. Would it really be worth some number of pages in the journal, plus a reply and rejoinder, to hash this out? Within the constraints of the ASR format, maybe the pages aren’t worth it. And the result would not have been a definitive statement anyway, but rather just another debate among sociologists.

What else could they have done? Maybe it would have been better if the editors could simply append a note to the article advising readers that the survey is not accurately described, and cautioning against interpreting it as representative — with a link to the comment online somewhere explaining the problem. (Even so of course Goffman should have a chance to respond, and so on.)

It’s just wrong that the editors now acknowledge there is something wrong in their journal — although we seem to disagree about how serious the problem is — but no one is going to formally notify the future readers of the article. That seems like bad scholarly communication. I’ve said from the beginning that there’s no need for a high-volume conversation about this, or attacks on anyone’s integrity or motives. There are important things in this research, and it’s also highly flawed. Acknowledge the errors — so they don’t compound — and move on.

This incident can help us learn lessons with implications up and down the publishing system. Here are a couple. At the level of social science research reporting: don’t publish survey results without sufficient methodological documentation — let’s have the instrument and protocol, the code, and access to the data. At the system level of publishing, why do we still have journals with cost-defined page limits? Because for-profit publishing is more important than scholarly communication. The sooner we get out from under that 19th-century habit the better.


Filed under Me @ work

How we really can study divorce using just five questions and a giant sample

It would be great to know more about everything, but if you ask just these five questions of enough people, you can learn an awful lot about marriage and divorce.


First the questions, then some data. These are the question wordings from the 2013 American Community Survey (ACS).

1. What is Person X’s age?

We’ll just take the people who are ages 15 to 59, but that’s optional.

2. What is this person’s marital status?

Surprisingly, we don’t want to know if they’re divorced, just if they’re currently married (I include people who are separated and those who live apart from their spouses for other reasons). This is the denominator in your basic “refined divorce rate,” or divorces per 1000 married people.

3. In the past 12 months, did this person get divorced?

The number of people who got divorced in the last year is the numerator in your refined divorce rate. According to the ACS in 2013 (using population weights to scale the estimates up to the whole population), there were 127,571,069 married people, and 2,268,373 of them got divorced, so the refined divorce rate was 17.8 per 1,000 married people. When I analyze who got divorced, I’m going to mix all the currently-married and just-divorced people together, and then treat the divorces as an event, asking, who just got divorced?
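That arithmetic is a one-liner; here it is as a sketch, using the weighted 2013 ACS counts quoted above:

```python
# Refined divorce rate: divorces in the past year per 1,000 currently married people.
# Counts are the weighted 2013 ACS estimates quoted in the text.
married = 127_571_069   # currently married (the just-divorced are counted here too)
divorced = 2_268_373    # of those, how many divorced in the past 12 months

refined_divorce_rate = divorced / married * 1000
print(round(refined_divorce_rate, 1))  # 17.8
```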

4. In what year did this person last get married?

This is crucial for estimating divorce rates according to marriage duration. When you subtract this from the current year, that’s how long they are (or were) married. When you subtract the marriage duration from age, you get the age at marriage. (For example, a person who is 40 years old in 2013, who last got married in 2003, has a marriage duration of 10 years, and an age at marriage of 30.)

5. How many times has this person been married?

I use this to narrow our analysis down to women in their first marriages, which is a conventional way of simplifying the analysis, but that’s optional.


I restrict the analysis below to women, which is just a sexist convention for simplifying things (since men and women do things at different ages).*

So here are the 375,249 women in the 2013 ACS public use file, ages 16-59, who were in their first marriages, or just divorced from their first marriages, by their age at marriage and marriage duration. Add the two numbers together and you get their current age. The colors let you see the basic distribution (click to enlarge):

[Figure: women by age at marriage and marriage duration, 2013 ACS, cells color-coded by count]

The most populous cell on the table is 28-year-olds who got married three years ago, at age 25, with 1068 people. The least populous is 19-year-olds who got married at 15 (just 14 of them). The diagonal edge reflects my arbitrary cutoff at age 59.

Divorce results

Now, in each of these cells there are married people, and (in most of them) people who just got divorced. The ratio between those two frequencies is a divorce rate — one specific to the age at marriage and marriage duration. To make the next figure I used three years of ACS data (2011-2013) so the results would be smoother. (And then I smoothed it more by replacing each cell with an average of itself and the adjoining cells.) These are the divorce rates by age at marriage and years married (click to enlarge):
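That neighbor-averaging step can be sketched as follows. The grid below is toy data standing in for the ACS rate matrix, and I’m assuming “adjoining cells” means the orthogonal neighbors (edge cells just average over the neighbors they have):

```python
# Smooth a rate matrix by replacing each cell with the average of itself
# and its adjoining (up/down/left/right) cells. Toy data, not the ACS rates.
rates = [
    [30.0, 28.0, 20.0],
    [26.0, 24.0, 18.0],
    [16.0, 14.0, 10.0],
]

def smooth(grid):
    nrow, ncol = len(grid), len(grid[0])
    out = [[0.0] * ncol for _ in range(nrow)]
    for i in range(nrow):
        for j in range(ncol):
            cells = [grid[i][j]]
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                if 0 <= i + di < nrow and 0 <= j + dj < ncol:
                    cells.append(grid[i + di][j + dj])
            out[i][j] = sum(cells) / len(cells)
    return out

smoothed = smooth(rates)
print(smoothed[1][1])  # center cell: mean of 24, 28, 14, 26, 18 = 22.0
```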

[Figure: divorce rates by age at marriage and years married, 2011-2013 ACS, smoothed]

The overall pattern here is more green, or lower divorce rates, to the right (longer duration of marriage) and down (older age at marriage). So the big red patch is the first 12 years for marriages begun before the woman was age 25. And after about 25 years of marriage it’s pretty much green, for low divorce rates. The high contrast at the bottom left implies an interesting high risk but steep decline in the first few years after marriage for these late marriages. This matrix adds nuance to the pattern I reported the other day, which featured a little bump up in divorce odds for people who married in their late thirties. From this figure it looks like marriages that start after the woman is about 35 might have less of a honeymoon period than those beginning about age 24-33.

To learn more, I go beyond those five great questions, and use a regression model (same as the other day), with a (collapsed) marriage-age–by–marriage-duration matrix. So these are predicted divorce rates per 1000, holding education, race/ethnicity, and nativity constant (click to enlarge)**:

[Figure: predicted divorce rates per 1,000 by age at marriage and years married, adjusted for education, race/ethnicity, and nativity]

The controls cut down the late-thirties bump and isolate it mostly to the first year. This also shows that the punishing first year is an issue for all ages over 35. The late thirties just showed the bump because that group doesn’t have the big drop in divorce after the first year that the later years do. Interesting!


Here’s where the awesome data let us down. This data is very powerful. It’s the best contemporary big data set we have for analyzing divorce. It has taken us this far, but it can’t explain a pattern like this.

We can control for education, but that’s just the education level at the time of the most recent survey. We can’t know when she got her education relative to the dates of her marriage. Further, from the ACS we can’t tell how many children a person has had, with whom, and when — we only know about children who happen to be living in the household in 2013, so a 50-year-old could be childfree or have raised and released four kids already. And about couples, although we can say things about the other spouse from looking around in the household (such as his age, race, and income), if someone has divorced the spouse is gone and there is no information about that person (even their sex). So we can’t use that information to build a model of divorce predictors.

Here’s an example of what we can only hint at. Remarriages are more likely to end in divorce, for a variety of reasons, which is why we simplify these things by only looking at first marriages. But what about the spouse? Some of these women are married to men who’ve been married before. I can’t say how much that contributes to their likelihood of divorce, but it almost certainly does. Think about the bump up in the divorce rate for women who got married in their late thirties. On the way from high divorce rates for women who marry early to low rates for women who marry late, the overall downward slope reflects increasing maturity and independence for women, but it’s running against the pressure of their increasingly complicated relationship situations. That late-thirties bump may have to do with the likelihood that their husbands have been married before. Here’s the circumstantial evidence:

[Figure: percent of women whose husbands had been married before, by wife’s age at first marriage]

See that big jump from the early thirties to the late thirties? All of a sudden 37.5% of women marrying in their late thirties are marrying men who are remarrying. That’s a substantial risk factor for divorce, and one I can’t account for in my analysis (because we don’t have spouse information for divorced women).

On method

Divorce is complicated and inherently longitudinal. Marriages arise out of specific contexts and thrive or decay in many different ways. Yesterday’s crucial influence may disappear today. So how can we say anything about divorce using a single, cross-sectional survey sample? The unsatisfying answer is that all analysis is partial. But these five questions give us a lot to go on, because knowing when a person got married allows us to develop a multidimensional image of the events, as I’ve demonstrated here.

But, you ask, what can we learn from, say, the divorce propensity of today’s 40-year-olds when we know that just last year a whole bunch of 39-year-olds divorced, skewing today’s sample? This is a real issue. And demography provides an answer that is at once partial and powerful: Simple — we use today’s 39-year-olds, too. In the purest form, this approach gives us the life table, in which one year’s mortality rates — at every age — lead to a projection of life expectancy. Another common application is the total fertility rate (watch the video!), which sums birth rates by age to project total births for a generation. In this case I have not produced a complete divorce life table (which I promised a while ago — it’s coming). But the approach is similar.

These are all synthetic cohort approaches (described nicely in the Week 6 lecture slides from this excellent Steven Ruggles course). In this case, the cohorts are age-at-marriage groups. Look at the table above and follow the row for, say, marriages that started at age 28, to see that synthetic cohort’s divorce experience from marriage until age 59. It’s neither a perfect depiction of the past, nor a foolproof prediction of the future. Rather, it tells us what’s happening now in cohort terms that are readily interpretable.
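The synthetic-cohort logic can be sketched in a few lines: chain one year’s duration-specific divorce rates together as if a single cohort experienced them all. The rates below are illustrative, not the ACS estimates:

```python
# Synthetic cohort: apply one period's duration-specific divorce rates
# (per 1,000 married) in sequence to get the share of a hypothetical
# marriage cohort ever divorced by duration 10. Illustrative rates only.
rates_per_1000 = [25, 30, 28, 22, 18, 15, 12, 10, 8, 6]  # durations 0-9

surviving = 1.0  # proportion of marriages still intact
for r in rates_per_1000:
    surviving *= 1 - r / 1000

ever_divorced = 1 - surviving
print(f"{ever_divorced:.1%}")  # share of the synthetic cohort ever divorced
```

The same chaining is how a life table turns age-specific death rates into life expectancy, and how the total fertility rate turns age-specific birth rates into a generation’s births.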


The ACS is the best thing we have for understanding the basic contours of divorce trends and patterns. Those five questions are invaluable.

* For this I also tossed the people who were reported to have married in the current year, because I wasn’t sure about the timing of their marriages and divorces, but I put them back in for the regressions.

** The codebook for my IPUMS data extraction is here, my Stata code is here. The heat-map model here isn’t in that code file, but these are the commands (and the margins command took a very long time, so please don’t tell me there’s something wrong with it):

logistic divorce i.agemarc#i.mardurc i.degree i.race i.hispan i.citizen
margins i.agemarc#i.mardurc


Filed under Me @ work

The U.S. government asked 2 million Americans one simple question, and their answers will shock you

What is your age?

[SKIP TO THE END for a mystery-partly-solved addendum]

Normally when we teach demography we use population pyramids, which show how much of a population is found at each age. They’re great tools for visualizing population distributions and discussing projections of growth and decline. For example, consider this contrast between Niger and Japan, about as different as we get on earth these days (from this cool site):


It’s pretty easy to see the potential for population growth versus decline in these patterns. Finding good pyramids these days is easy, but it’s still good to make some yourself to get a feel for how they work.

So, thinking I might make a video lesson to follow up my blockbuster total fertility rate performance, I gathered some data from the U.S., using the 2013 American Community Survey (ACS) from IPUMS.org. I started with 10-year bins and the total population (not broken out by sex), which looks like this:


There’s the late Baby Boom, still bulging out at ages 50-59 (born 1954-1963), and their kids, ages 20-29. So far so good. But why not use single years of age and show something more precise? Here’s the same data, but showing single years of age:


That’s more fine-grained. Not as much as if you had data by months or days of birth, but still. Except, wait: is that just sample noise causing that ragged edge between 20 and about 70? The ACS sample is a few million people, with tens of thousands of people at each age (up to age 75, at least), so you wouldn’t expect too much of that. No, it’s definitely age heaping, the tendency of people to skew their age reporting according to some collective cognitive scheme. The most common form is piling up on the ages ending with 0 and 5, but it could be anything. For example, some people might want to be 18, a socially significant milestone in this country. Here’s the same data, with suspect ages highlighted — 0’s and 5’s from 20 to 80, and 18:


You might think age heaping results from some old people not remembering how old they are. In the old days rounding off was more common at older ages. In 1900, for example, the most implausible number of people was found at age 60 — 1.6 times as many as you’d get by averaging the number of people at ages 59 and 61. Is that still the case? Here it is again, but with the red/green highlights just showing the difference between the number of people reported and the number you’d get by averaging the numbers just above and below:

Proportionately, the 70-year-olds are most suspicious, at 10.8% more than you’d expect. But 40 is next, at 9.2%. And that green line shows extra 18-year-olds at 8.6% more than expected.
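That neighbor-average comparison is simple arithmetic; here’s a sketch with illustrative counts (not the actual ACS numbers):

```python
# "Excess" at a suspect age: how much the reported count exceeds the
# average of the two neighboring ages. Illustrative counts, not ACS data.
counts = {69: 30_000, 70: 34_500, 71: 32_000}

expected = (counts[69] + counts[71]) / 2   # what age 70 "should" hold
excess = counts[70] / expected - 1
print(f"{excess:.1%}")  # 11.3% more 70-year-olds than expected
```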

Unfortunately, it’s pretty hard to correct. Interestingly, the American Community Survey apparently asks for both an age and a birth date:


If you’re the kind of person who rounds off to 70, or promotes yourself to 18, it might not be worth the trouble to actually enter a fake birth date. I’m sure the Census Bureau does something with that, like correct obvious errors, but I don’t think they attempt to correct age-heaping in the ACS (the birth dates aren’t on the public use files). Anyway, we can see a little of the social process by looking at different groups of people.

Up till now I’ve been using the full public use data, with population weights, and including those people who left age blank or entered something implausible enough that the Census Bureau gave them an age (an “allocated” value, in survey parlance). For this I just used the unweighted counts of people whose answers were accepted “as written” (or typed, or spoken over the phone, depending on how it was administered to them). Here are the patterns for people who didn’t finish high school versus those with a bachelor’s degree or higher, highlighting the 5’s and 0’s (click to enlarge):


Clearly, the age heaping is more common among those with less education. Whether it’s really people forgetting their age, rounding up or down for aspirational reasons, or having trouble with the survey administration, I don’t know.

Is this bad? As much as we all hate inaccuracy, this isn’t so bad. Fortunately, demographers have methods for assessing the damage caused by humans and their survey-taking foibles. In this case we can use Whipple’s index. This measure (defined in this handy United Nations slideshow) takes the number of people whose alleged ages end in 0 or 5 and multiplies that by 5, then compares it to the total population. Normally people use ages 23 to 62 (inclusive), for an even 40 years. Five times the percentage of the population ages 23-62 who report ages 25, 30, 35, 40, 45, 50, 55, and 60 — so that an unheaped one-fifth yields exactly 100 — is your Whipple’s index. A score of 100 is perfect, and a score of 500 means everyone’s heaped. The U.N. considers scores under 105 to be “very accurate data.” The 2013 ACS, using the public use file and the weights, gives me a score of 104.3. (Those unweighted distributions by education yield scores of 104.0 for high school dropouts and 101.7 for college graduates.) In contrast, the Decennial Census in 2010 had a score of just 101.5 by my calculation (using table QT-P2 from Summary File 1). With the size of the ACS, this difference shouldn’t have to do with sampling variation. Rather, it’s something about the administration of the survey.
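Here’s a minimal sketch of that calculation, using a made-up flat age distribution to show the no-heaping benchmark:

```python
# Whipple's index: 5 x (people reporting ages ending in 0 or 5 within 23-62)
# / (total population ages 23-62) x 100. 100 = no heaping; 500 = everyone heaped.
def whipple(counts_by_age):
    heaped = sum(n for age, n in counts_by_age.items()
                 if 23 <= age <= 62 and age % 5 == 0)  # ages 25, 30, ..., 60
    total = sum(n for age, n in counts_by_age.items() if 23 <= age <= 62)
    return 5 * heaped / total * 100

# A perfectly flat age distribution scores exactly 100:
flat = {age: 1000 for age in range(23, 63)}
print(whipple(flat))  # 100.0
```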

Why don’t they just tell us how old they really are? There must be a reason.

Two asides:

  • The age 18 pattern is interesting — I don’t find any research on desirable young-adult ages skewing sample surveys.
  • This is all very different from birth timing issues, such as the Chinese affinity for births in dragon years (every twelfth year: 1976, 1988…). I don’t see anything in the U.S. pattern that fits fluctuations in birth rates.

Mystery-partly-solved addendum

I focused on education above, but another explanation was staring me in the face. I said “it’s something about the administration of the survey,” but didn’t think to check for the form of survey people took. The public use files for ACS include an indicator of whether the household respondent took the survey through the mail (28%), on the web (39%), through a bureaucrat at the institution where they live (group quarters; 5%), or in an interview with a Census worker (28%). This last method, which is either a computer-assisted telephone interview (CATI) or computer-assisted personal interview (CAPI), is used when people don’t respond to the mailed survey.

It turns out that the entire Whipple problem in the 2013 ACS is due to the CATI/CAPI interviews. The age distributions for all of the other three methods have Whipple index scores below 100, while the CATI/CAPI folks clock in at a whopping 108.3. Here is that distribution, again using unweighted cases:


There they are, your Whipple participants. Who are they, and why does this happen? Here is the Bureau’s description of the survey data collection:

The data collection operation for housing units (HUs) consists of four modes: Internet, mail, telephone, and personal visit. For most HUs, the first phase includes a mailed request to respond via Internet, followed later by an option to complete a paper questionnaire and return it by mail. If no response is received by mail or Internet, the Census Bureau follows up with computer assisted telephone interviewing (CATI) when a telephone number is available. If the Census Bureau is unable to reach an occupant using CATI, or if the household refuses to participate, the address may be selected for computer-assisted personal interviewing (CAPI).

So the CATI/CAPI people are those who were either difficult to reach or were uncooperative when contacted. This group, incidentally, has low average education, as 63% have high school education or less (compared with 55% of the total) — which may explain the association with education. Maybe they have less accurate recall, or maybe they are less cooperative, which makes sense if they didn’t want to do the survey in the first place (which they are legally mandated — i.e., coerced — to do). So when their date of birth and age conflict, and the Census worker tries to elicit a correction, maybe all hell breaks loose in the interview and they can’t work it out. Or maybe the CATI/CAPI households have more people who don’t know each other’s exact ages (one person answers for the household). I don’t know. But this narrows it down considerably.


Filed under Research reports

On Goffman’s survey

Survey methods.

Jesse Singal at New York Magazine‘s Science of Us has a piece in which he tracks down and interviews a number of Alice Goffman’s respondents. This settles the question — which never should have been a real question — about whether she actually did all that deeply embedded ethnography in Philadelphia. It leaves completely unresolved, however, the issue of the errors and possible errors in the research. This reaffirms for me the conclusion in my original review that we should take the volume down in this discussion, identify errors in the research without trying to attack Goffman personally or delegitimize her career — and then learn from the affair ways that we can improve sociology (for example, by requiring that winners of the American Sociological Association dissertation award make their work publicly available).

That said, I want to comment on a couple of issues raised in Singal’s piece, and share my draft of a formal comment on the survey research Goffman reported in American Sociological Review.

First, I want to distance myself from the description by Singal of “lawyers and journalists and rival academics who all stand to benefit in various ways if they can show that On the Run doesn’t fully hold up.” I don’t see how I (or any other sociologists) benefit if Goffman’s research does not hold up. In fact, although some people think this is worth pursuing, I am also annoying some friends and colleagues by doing this.

More importantly, although it’s a small part of the article, Singal did ask Goffman about the critique of her survey, and her response (as he paraphrased it, anyway) was not satisfying to me:

Philip Cohen, a sociologist at the University of Maryland, published a blog post in which he puzzles over the strange results of a door-to-door survey Goffman says she conducted with Chuck in 2007 in On the Run. The results are implausible in a number of ways. But Goffman explained to me that this wasn’t a regular survey; it was an ethnographic survey, which involves different sampling methods and different definitions of who is and isn’t in a household. The whole point, she said, was to capture people who are rendered invisible by traditional survey methods. (Goffman said an error in the American Sociological Review paper that became On the Run is causing some of the confusion — a reference to “the 217 households that make up the 6th Street neighborhood” that should have read “the 217 households that we interviewed … ” [emphasis mine]. It’s a fix that addresses some of Cohen’s concerns, like an implied and very unlikely 100 percent response rate, but not all of them.) “I should have included a second appendix on the survey in the book,” said Goffman. “If I could do it over again, I would.”

My responses are several. First, the error of describing the 217 households as the whole neighborhood, as well as the error in the book of saying she interviewed all 308 men (when in the ASR article she reports some unknown number were absent), both go in the direction of inflating the value and quality of the survey. Maybe they are random errors, but they didn’t have a random effect.

Second, I don’t see a difference between a “regular survey” and an “ethnographic survey.” There are different survey techniques for different applications, and the techniques used determine the data and conclusions that follow. For example, in the ASR article Goffman uses the survey (rather than Census data) to report the racial composition of the neighborhood, which is not something you can do with a convenience sample, regardless of whether you are engaged in an ethnography or not.

Finally, there are no people “rendered invisible by traditional survey methods” (presumably Singal’s phrase). There are surveys that are better or worse at including people in different situations. There are “traditional” surveys — of varying quality — of homeless people, prisoners, rape victims, and illiterate peasants. I don’t know what an “ethnographic survey” is, but I don’t see why it shouldn’t include a sampling strategy, a response rate, a survey instrument, a data sharing arrangement, and thorough documentation of procedures. That second methodological appendix can be published at any time.

ASR Comment (revised June 22)

I wrote up my relatively narrow, but serious, concerns about the survey, and posted them on my website here.

It strikes me that Goffman’s book (either the University of Chicago Press version or the trade book version) may not be subject to the same level of scrutiny that her article in ASR should have been. In fact, presumably, the book publishers took her publication in ASR as evidence of the work’s quality. And their interests are different from those of a scientific journal run by an academic society. If ASR is going to play that gatekeeping role, and it should, then ASR (and by extension ASA) should take responsibility in print for errors in its publications.


Filed under Research reports