Tag Archives: survey methods

Black men raping White women: BJS’s Table 42 problem

I’ve been putting off writing this post because I wanted to do more justice both to the history of the Black-men-raping-White-women charge and the survey methods questions. Instead I’m just going to lay this here and hope it helps someone who is more engaged than I am at the moment. I’m sorry this post isn’t higher quality.

Obviously, this post includes extremely racist and misogynist content, which I am showing you to explain why it’s bad.

This is about this very racist meme, which is extremely popular among extreme racists.

[Image: the racist meme in question]

Modern racists use statistics, data, and even math. They use citations. And I think it takes actually engaging with this stuff to stop it (this is untested, though, as I have no real evidence that facts help). That means anti-racists need to learn some demography and survey methods, and practice them in public. I was prompted to finally write on this by a David Duke video streamed on Facebook, in which he used exaggerated versions of these numbers, and the good Samaritans arguing with him did not really know how to respond.

For completely inadequate context: For a very long time, Black men raping White women has been White supremacists’ single favorite thing. This was the most common justification for lynching, and for many of the legal executions of Black men throughout the 20th century. From 1930 to 1994 there were 455 people executed for rape in the U.S., and 89% of them were Black (from the 1996 Statistical Abstract):

[Table: executions for rape in the U.S., 1930-1994, from the 1996 Statistical Abstract]

For some people, this is all they need to know about how bad the problem of Blacks raping Whites is. For better informed people, it’s the basis for a great lesson in how the actions of the justice system are not good measures of the crimes it’s supposed to address.

Good data gone wrong

Which is one reason the government conducts the National Crime Victimization Survey (NCVS), a large sample survey of about 90,000 households with 160,000 people. In it they ask about crimes against the people surveyed, and the answers the survey yields are usually pretty different from what’s in the crime-report statistics – and even further from the statistics on things like convictions and incarceration. It’s supposed to be a survey of crime as experienced, not as reported or punished.

It’s an important survey that yields a lot of good information. But in this case the Bureau of Justice Statistics is doing a serious disservice in the way they are reporting the results, and they should do something about it. I hope they will consider it.

Like many surveys, the NCVS is weighted to produce estimates that are supposed to reflect the general population. In a nutshell, that means, for example, that they treat each of the 158,000 people (over age 12) covered in 2014 as about 1,700 people. So if one person said, “I was raped,” they would say, “1700 people in the US say they were raped.” This is how sampling works. In fact, they tweak it much more than that, to make the numbers add up according to population distributions of variables like age, sex, race, and region – and non-response, so that if a certain group (say Black women) has a low response rate, their responses get goosed even more. This is reasonable and good, but it requires care in reporting to the general public.
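
To make the mechanics concrete, here is a toy sketch of that weighting logic in Python. All the numbers and adjustment factors are invented for illustration; the real NCVS weighting is far more elaborate:

# Toy illustration of survey weighting (invented numbers, not real NCVS figures).
base_weight = 1_700   # each respondent stands in for roughly 1,700 people

respondents = [
    {"assaulted": True,  "group": "low_response"},
    {"assaulted": True,  "group": "high_response"},
    {"assaulted": False, "group": "high_response"},
]

# Hypothetical non-response adjustments: a group with a low response rate gets
# its answers counted for more people.
adjustment = {"low_response": 1.3, "high_response": 0.95}

estimate = sum(base_weight * adjustment[r["group"]]
               for r in respondents if r["assaulted"])
print(f"Estimated population total: {estimate:,.0f}")   # 3,825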

So, how is the Bureau of Justice Statistics’ (BJS) reporting method contributing to the racist meme above? The racists love to cite Table 42 of this report, which last came out for the 2008 survey. This is the source for David Duke’s rant, and the many, many memes about this. The results of a Google image search give you a sense of how many websites are distributing this:

[Image: Google image search results for the Table 42 meme]

Here is Table 42, with my explanation below:

[Image: Table 42, with the relevant figures highlighted]

What this shows is that, based on their sample, BJS extrapolates an estimate of 117,640 White women who say they were sexually assaulted, or threatened with sexual assault, in 2008 (in the red box). Of those, 16.4% described their assailant as Black (the blue highlight). That works out to 19,293 White women sexually assaulted or threatened by Black men in one year – White supremacists do math. In the 2005 version of the table these numbers were 111,490 and 33.6%, for 37,460 White women sexually assaulted or threatened by Black men, or:

[Image: a meme built on these numbers]

Now, go back to the structure of the survey. If each respondent in the survey counts for about 1,700 people, then the survey in 2008 would have found about 69 White women who were sexually assaulted or threatened (117,640/1,700), 11 of whom said their assailant was Black. Actually, though, we know it was fewer than 11, because the asterisk on the table takes you to the footnote below, which says the estimate is based on 10 or fewer sample cases.

In comparison, the survey may have found 27 Black women who said they were sexually assaulted or threatened (46,580/1,700), none of whom said their attacker was White, which is why the second blue box shows 0.0. However, it actually looks like the weights are bigger for Black women, because the figure for the percentage assaulted or threatened by Black attackers, 74.8%, also has the asterisk indicating 10 or fewer cases. If there were 27 Black women in this category, then 74.8% of them would be 20. So this whole sample of Black women victims might be as few as 13, with bigger weights applied (because, say, Black women had a lower response rate). If Black women were in fact just as likely to be assaulted or threatened by White men as the reverse (16%), you would expect only about 2 of those 13 to report a White attacker, so finding 0 in the sample is not very surprising. The actual weighting scheme is clearly much more complicated, and I don’t know the unweighted counts, as they are not reported here (and I didn’t analyze the individual-level data).
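
Here is that back-of-the-envelope arithmetic spelled out in Python, so you can check it yourself (these use the rough 1,700-per-person average weight; as noted, the real weights vary, so these are ballpark figures):

avg_weight = 1_700
white_victims = 117_640 / avg_weight          # ~69 White women in the sample
black_assailants = white_victims * 0.164      # ~11 (flagged as 10 or fewer cases)
black_victims = 46_580 / avg_weight           # ~27 Black women in the sample
expected_white = 13 * 0.16                    # ~2, if attack rates were symmetrical
print(round(white_victims), round(black_assailants),
      round(black_victims), round(expected_white))
# prints: 69 11 27 2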

I can’t believe we’re talking about this. The most important bottom line is that the BJS should not report extrapolations to the whole population from samples this small. These population numbers should not be on this table. At best these numbers are estimated with very large standard errors. (Using a standard confidence interval calculator, that 16% figure for White women, based on a sample of 69, yields a confidence interval of +/- 9 percentage points.) It’s irresponsible, and it’s inadvertently (I assume) feeding White supremacist propaganda.
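
That interval is just the usual normal approximation for a proportion; a quick check of the figure (and with a clustered design and a sample this small, the true uncertainty is even larger):

import math
p, n = 0.164, 69
margin = 1.96 * math.sqrt(p * (1 - p) / n)   # normal-approximation 95% CI
print(f"{p:.1%} +/- {margin:.1%}")           # 16.4% +/- 8.7%, roughly 9 points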

Rape and sexual assault are disturbingly common, although not as common as they were a few decades ago, by conventional measures. But it’s a big country, and I don’t doubt that lots of Black men sexually assault or threaten White women, and that White men sexually assault or threaten Black women a lot, too – certainly more than never. If we knew the true numbers, they would be bad. But we don’t.

A couple more issues to consider. Most sexual assault happens within relationships, and Black women have interracial relationships at very low rates. In round numbers (based on marriages), 2% of White women are with Black men, and 5% of Black women are with White men, which, because of population sizes, means there are more than twice as many Black-man/White-woman couples as the reverse. At very small sample sizes, this matters a lot, and we would expect more Black-man/White-woman rape than the reverse based on this pattern alone. Consider further that the NCVS is a household sample, which means that if any Black women are sexually assaulted by White men in prison, those assaults aren’t included. Based on a 2011-2012 survey of prison and jail inmates, 3,500 women per year are the victims of staff sexual misconduct, and Black women inmates were about 50% more likely to report this than White women. So I’m guessing the true number of Black women sexually assaulted by White men is somewhat greater than zero, and that’s just in prisons and jails.

The BJS seems to have stopped releasing this form of the report, with Table 42; if that’s because of this kind of problem, great. In that case they just need to put out a statement clarifying and correcting the old reports, which they should still do, because those reports are out there. (The more recent reports are skimpier and don’t get into this much detail [e.g., 2014], and their custom table tool doesn’t allow you to specify the perceived race of the offender.)

So, next time you’re arguing with David Duke, the simplest response to this is that the numbers he’s talking about are based on very small samples, and the asterisk means he shouldn’t use the number. The racists won’t take your advice, but it’s good for everyone else to know.


Comment on Goffman’s survey, American Sociological Review rejection edition

[Photo: Peer Review, by Gary Night. https://flic.kr/p/c2WH2E]

Background:

  • I reviewed Alice Goffman’s book, On The Run.
  • I complained that her dissertation was not made public, despite being awarded the American Sociological Association’s dissertation prize. I proposed a rule change for the association, requiring that the winning dissertation be “publicly available through a suitable academic repository by the time of the ASA meeting at which the award is granted.” (The rule change is moving through the process.)
  • When her dissertation was released, I complained about the rationale for the delay.
  • My critique of the survey that was part of her research grew into a formal comment (PDF) submitted to American Sociological Review.

In this post I don’t have anything to add about Alice Goffman’s work. This is about what we can learn from this and other incidents to improve our social science and its contribution to the wider social discourse. As Goffman’s TED Talk passed 1 million views, we have had good conversations about replicability and transparency in research, and about ethics in ethnography. And of course about the impact of the criminal justice system and over-policing on African Americans, the intended target of her work. This post is about how we deal with errors in our scholarly publishing.

My comment was rejected by the American Sociological Review.

You might not realize this, but unlike many scientific journals, ASR has no normal way of acknowledging or correcting errors in research, apart from “errata” notices, which are for typos and editing errors. To my knowledge ASR has never retracted an article or published an editor’s note explaining how an article, or part of an article, is wrong. Instead, they publish Comments (and Replies). The Comments are submitted and reviewed anonymously by peer reviewers just like an article, and then if the Comment is accepted the original author responds (maybe followed by a rejoinder). It’s a cumbersome and often combative process, mixing theoretical with methodological critiques. And it creates a very high hurdle to leap, and a long delay, before the journal can correct itself.

In this post I’ll briefly summarize my comment, then post the ASR editors’ decision letter and reviews.

Comment: Survey and ethnography

I wrote the comment about Goffman’s 2009 ASR article for accountability. The article turned out to be the first step toward a major book, so ASR played a gatekeeping role for a much wider reading audience, which is great. But then it should take responsibility to notify readers about errors in its pages.

My critique boiled down to these points:

  • The article describes the survey as including all households in the neighborhood, which is not the case, and used statistics from the survey to describe the neighborhood (its racial composition and rates of government assistance), which is not justified.
  • The survey includes some number (probably a lot) of men who did not live in the neighborhood, but who were described as “in residence” in the article, despite being “absent because they were in the military, at job training programs (like JobCorp), or away in jail, prison, drug rehab centers, or halfway houses.” There is no information about how or whether such men were contacted, or how the information about them was obtained (or how many in her sample were not actually “in residence”).
  • The survey results are incongruous with the description of the neighborhood in the text, and — when compared with data from other sources — describe an apparently anomalous social setting. She reported finding more than twice as many men (ages 18-30) per household as the Census Bureau reports from their American Community Survey of Black neighborhoods in Philadelphia (1.42 versus .60 per household). She reported that 39% of these men had warrants for violating probation or parole in the prior three years. Using some numbers from other sources on violation rates, that translates into between 65% and 79% of the young men in the neighborhood being on probation or parole — very high for a neighborhood described as “nice and quiet” and not “particularly dangerous or crime-ridden.” (The arithmetic behind those percentages is sketched just after this list.)
  • None of this can be thoroughly evaluated because the reporting of the data and methodology for the survey were inadequate to replicate or even understand what was reported.
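
To see where the 65% and 79% figures come from: if 39% of the men had warrants for probation or parole violations within the prior three years, then the share on probation or parole is 39% divided by the violation rate. Working backward, the implied violation rates are roughly 60% and 49%; those two rates are my reconstruction for illustration, not numbers taken from the comment itself:

# Reconstructed arithmetic for the probation/parole share (illustrative only).
warrant_share = 0.39                  # men with violation warrants, prior 3 years
for violation_rate in (0.60, 0.49):   # assumed violation rates (my reconstruction)
    print(f"{warrant_share / violation_rate:.1%} on probation or parole")
# prints 65.0% and 79.6%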

You can read my comment here in PDF. Since I aired it out on this blog before submitting it, making it about as anonymous as a lot of other peer-review submissions, I see no reason to shroud the process any further. The editors’ letter I received is signed by the current editors — Omar Lizardo, Rory McVeigh, and Sarah Mustillo — although I submitted the piece before they officially took over (the editors at the time of my submission were Larry W. Isaac and Holly J. McCammon). The reviewers are of course anonymous. My final comment is at the end.

ASR letter and reviews

Editors’ letter:

25-Aug-2015

Dear Prof. Cohen:

The reviews are in on your manuscript, “Survey and ethnography: Comment on Goffman’s ‘On the Run’.” After careful reading and consideration, we have decided not to accept your manuscript for publication in American Sociological Review (ASR).  Our decision is based on the reviewers’ comments, our reading of the manuscript, an overall assessment of the significance of the contribution of the manuscript to sociological knowledge, and an estimate of the likelihood of a successful revision.

As you will see, there was a range of opinions among the reviewers of your submission.  Reviewer 1 feels strongly that the comment should not be published, reviewer 3 feels strongly that it should be published, and reviewer 2 falls in between.  That reviewer sees merit in the criticisms but also suggests that the author’s arguments seem overstated in places and stray at times from discussion that is directly relevant to a critique of the original article’s alleged shortcomings.

As editors of the journal, we feel it is essential that we focus on the comment’s critique of the original ASR article (which was published in 2009), rather than the recently published book or controversy and debate that is not directly related to the submitted comment.  We must consider not only the merits of the arguments and evidence in the submitted comment, but also whether the comment is important enough to occupy space that could otherwise be used for publishing new research.  With these factors in mind, we feel that the main result that would come from publishing the comment would be that valuable space in the journal would be devoted to making a point that Goffman has already acknowledged elsewhere (that she did not employ probability sampling).

As the author of the comment acknowledges, there is actually very little discussion of, or use of, the survey data in Goffman’s article.   We feel that the crux of the argument (about the survey) rests on a single sentence found on page 342 of the original article:  “The five blocks known as 6th street are 93 percent Black, according to a survey of residents that Chuck and I conducted in 2007.”  The comment author is interpreting that to mean that Goffman is claiming she conducted scientific probability sampling (with all households in the defined space as the sampling frame).  It is important to note here that Goffman does not actually make that claim in the article.  It is something that some readers might infer.  But we are quite sure that many other readers simply assumed that this is based on nonprobability sampling or convenience sampling.  Goffman speaks of it as a survey she conducted when she was an undergraduate student with one of the young men from the neighborhood.  Given that description of the survey, we expect many readers assumed it was a convenience sample rather than a well-designed probability sample.  Would it have been better if Goffman had made that more explicit in the original article?  Yes.

In hindsight, it seems safe to say that most scholars (probably including Goffman) would say that the brief mentions of the survey data should have been excluded from the article.  In part, this is because the reported survey findings play such a minor role in the contribution that the paper aims to make.

We truly appreciate the opportunity to review your manuscript, and hope that you will continue to think of ASR for your future research.

Sincerely,

Omar Lizardo, Rory McVeigh, and Sarah Mustillo

Editors, American Sociological Review

Reviewer: 1

This paper seeks to provide a critique of the survey data employed in Goffman (2009).  Drawing on evidence from the American Community Survey, the author argues that data presented in Goffman (2009) about the community in which she conducted her ethnography is suspect.  The author draws attention to remarkably high numbers of men living in households (compared with estimates derived from ACS data) and what s/he calls an “extremely high number” of outstanding warrants reported by Goffman.  S/he raises the concern that Goffman (2009) did not provide readers with enough information about the survey and its methodology for them to independently evaluate its merits and thus, ultimately, calls into question the generalizability of Goffman’s survey results.

This paper joins a chorus of critiques of Goffman’s (2009) research and subsequent book.  This critique is novel in that the critique is focused on the survey aspect of the research rather than on Goffman’s persona or an expressed disbelief of or distaste for her research findings (although that could certainly be an implication of this critique).

I will not comment on the reliability, validity or generalizability of Goffman’s (2009) evidence, but I believe this paper is fundamentally flawed.  There are two key problems with this paper.  First the core argument of the paper (critique) is inadequately situated in relation to previous research and theory.  Second, the argument is insufficiently supported by empirical evidence.

The framing of the paper is not aligned with the core empirical aims of the paper.  I’m not exactly sure what to recommend here because it seems as if this is written for a more general audience and not a sociological one.  It strikes me as unusual, if not odd, to reference the popularity of a paper as a motivation for its critique.  Whether or not Goffman’s work is widely cited in sociological or other circles is irrelevant for this or any other critique of the work.  All social science research should be held to the same standards and each piece of scholarship should be evaluated on its own merits.

I would recommend that the author better align the framing of the paper with its empirical punchline.  In my reading the core criticism of this paper is that the Goffman (2009) has not provided sufficient information for someone to replicate or validate her results using existing survey data.  Although it may be less flashy, it seems more appropriate to frame the paper around how to evaluate social science research.  I’d advise the author to tone down the moralizing and discussion of ethics.  If one is to levy such a strong (and strongly worded) critique, one needs to root it firmly in established methods of social science.

That leads to the second, and perhaps even more fundamental, flaw.  If one is to levy such a strong (and strongly worded) critique, one needs to provide adequate empirical evidence to substantiate her/his claims.  Existing survey data from the ACS are not designed to address the kinds of questions Goffman engages in the paper and thus it is not appropriate for evaluating the reliability or validity of her survey research.  Numerous studies have established that large scale surveys like the ACS under-enumerate black men living in cities.  They fall into the “hard-to-reach” population that evade survey takers and census enumerators.  Survey researchers widely acknowledge this problem and Goffman’s research, rather than resolving the issue, raises important questions about the extent to which the criminal justice system may contribute to difficulties for conventional social science research data collection methods.  Perhaps the author can adopt a different, more scholarly, less authoritative, approach and turn the inconsistencies between her/his findings with the ACS and Goffman’s survey findings into a puzzle.  How can these two surveys generate such inconsistent findings?

Just like any survey, the ACS has many strengths.  But, the ACS is not well-suited to construct small area estimates of hard-to-reach populations.  The author’s attempt to do so is laudable but the simplicity of her/his analysis trivializes the difficultly in reaching some of the most disadvantaged segments of the population in conventional survey research.  It also trivializes one of the key insights of Goffman’s work and one that has been established previously and replicated by others: criminal justice contact fundamentally upends social relationships and living arrangements.

Furthermore, the ACS doesn’t ask any questions about criminal justice contact in a way that can help establish the validity of results for disadvantaged segments of the population who are most at-risk of criminal justice contact.  It is impossible to determine using the ACS how many men (or women) in the United States, Pennsylvania, or Philadelphia (or any neighborhood therein), have an outstanding warrant.  The ACS doesn’t ask about criminal justice contact, it doesn’t ask about outstanding warrants, and it isn’t designed to tap into the transient experiences of many people who have had criminal justice contact.  The author provides no data to evaluate the validity of Goffman’s claims about outstanding warrants.  Advancements in social science cannot be established from a “she said”, “he said” debate (e.g., FN 9-10).  That kind of argument risks a kind of intellectual policing that is antithetical to established standards of evaluating social science research.  That being said, someone should collect this evidence or at a minimum estimate, using indirect estimation methods, what fraction of different socio-demographic groups have outstanding warrants.

Although I believe that this paper is fundamentally flawed both in its framing and provision of evidence, I would like to encourage the author to replicate Goffman’s research.  That could involve an extended ethnography in a disadvantaged neighborhood in Philadelphia or another similar city.  That could also involve conducting a small area survey of a disadvantaged, predominantly black, neighborhood in a city with similar criminal justice policies and practices as Philadelphia in the period of Goffman’s study.  This kind of research is painstaking, time consuming, and sorely needed exactly because surveys like the ACS don’t – and can’t – adequately describe or explain social life among the most disadvantaged who are most likely to be missing from such surveys.

Reviewer: 2

I read this manuscript several times. It is more than a comment, it seems. It is 1) a critique of the description of survey methods in GASR and 2) a request for some action from ASR “to acknowledge errors when they occur.” The errors here have to do with Goffman’s description of survey methods in GASR, which the author describes in detail. This dual focus read as distracting at times. The manuscript would benefit from a more squarely focused critique of the description of survey methods in GASR.

Still, the author’s comment raises some valid concerns. The author’s primary concern is that the survey Goffman references in her 2009 ASR article is not described in enough detail to assess its accuracy or usefulness to a community of scholars. The author argues that some clarification is needed to properly understand the claims made in the book regarding the prevalence of men “on the run” and the degree to which the experience of the small group of men followed closely by Goffman is representative of most poor, Black men in segregated inner city communities. The author also cites a recent publication in which Goffman claims that the description provided in ASR is erroneous. If this is the case, it seems prudent for ASR to not only consider the author’s comments, but also to provide Goffman with an opportunity to correct the record.

I am not an expert in survey methods, but there are moments where the author’s interpretation of Goffman’s description seems overstated, which weakens the critique. For example, the author claims that Goffman is arguing that the entirety of the experience of the 6th Street crew is representative of the entire neighborhood, which is not necessarily what I gather from a close reading of GASR (although it may certainly be what has been taken up in popular discourse on the book). While there is overlap of the experience of being “on the run,” namely, your life is constrained in ways that it isn’t for those not on the run, it does appear that Goffman also uses the survey to describe a population that is distinct in important ways from the young men she followed on 6th street. The latter group has been “charged for more serious offenses like drugs and violent crimes,” she writes (this is the group that Sharkey argues might need to be “on the run”), while the larger group of men, whose information was gathered using survey data, were typically dealing with “more minor infractions”: “In the 6th Street neighborhood, a person was occasionally ‘on the run’ because he was a suspect in a shooting or robbery, but most people around 6th street had warrants out for far more minor infractions [emphasis mine].”

So, as I read it (I’ve also read the book), there are two groups: one “on the run” as a consequence of serious offenses and others “on the run” as a consequence of minor infractions. The consequence of being “on the run” is similar, even if the reason one is “on the run” varies.

The questions that remain are questions of prevalence and generalizability. The author asks: How many men in the neighborhood are “on the run” (for any reason)? How similar is this neighborhood to other neighborhoods? Answers to this question do rely on an accurate description of survey methods and data, as the author suggests.

This leads us to the most pressing and clearly argued question from the author: What is the survey population? Is it 1) “people around 6th Street” who also reside in the 6th Street neighborhood (of which, based on Goffman’s definition of in residence, are distributed across 217 distinct households in the neighborhood, however the neighborhood is defined e.g., 5 blocks or 6 blocks) or 2) the entirety of the neighborhood, which is made up of 217 households. It appears from the explanation from Goffman cited by the author that it is the former (“of the 217 households we interviewed,” which should probably read, of the 308 men we interviewed, all of whom reside in the neighborhood (based on Goffman’s definition of residence), 144 had a warrant…). Either way, the author makes a strong case for the need for clarification of this point.

The author goes on to explain the consequences of not accurately distinguishing among the two possibilities described above (or some other), but it seems like a good first step would be to request a clarification (the author could do this directly) and to allow more space than is allowed in a newspaper article to provide the type of explanation that could address the concerns of the author.

Is this the purpose of the comment? Or is the purpose of the comment merely to place a critique on record?  The primary objective is not entirely clear in the present manuscript.

The author’s comment is strong enough to encourage ASR to think through possibilities for correcting the record. As a critique of the survey methods, the comment would benefit from more focus. The comment could also do a better job of contextualizing or comparing/contrasting the use of survey methods in GASR with other ethnographic studies that incorporate survey methods (at the moment such references appear in footnotes).

Reviewer: 3

This comment exposes major errors in the survey methodology for Goffman’s article.  One major flaw is that the goffman article describes the survey as inclusive of all households in the neighborhood but later, in press interviews, discloses that it is not representative of all households in the neighborhood.  Another flaw that the author exposes is goffman’s data and methodological reporting not being up to par to sociological standards.  Finally, the author argues that the data from the survey does not match the ethnographic data.

Overall, I agree with the authors assertions that the survey component is flawed.  This is an important point because the article claims a large component of its substance from the survey instrument.  The survey helped goffman to bolster generalizability , and arguably, garner worthiness of publication in ASR.  If the massive errors in the survey had been exposed early on it is possible that ASR might have held back on publishing this article.

I am in agreement that ASR should correct the error highlighted on page 4 that the data set is not of the entire neighborhood but of random households/individuals given the survey in an informal way and that the sampling strategy should be described.  Goffman should aknowledge that this was a non-representative convenience sample, used for bolstering field observations.  It would follow then that the survey component of the ASR article would have to be rendered invalid and that only the field data in the article should be taken at face value.  Goffman should also be asked to provide a commentary on her survey methodology.

The author points out some compelling anomalies from the goffman survey and general social survey data and other representative data.  At best, goffman made serious mistakes with the survey and needs to be asked to show those mistakes and her survey methodology or she made up some of the data in the survey and appropriate action must be taken by ASR.  I agree with the authors final assessment, that the survey results be disregarded and the article be republished without mention of such results or with mention of the results albeit showing all of its errors and demonstrating the survey methodology.

My response

Regular readers can probably imagine my long, overblown, hyperventilating response to Reviewer 1, so I’ll just leave that to your imagination. On the bottom line, I disagree with the editors’ decision, but I can’t really blame them. Would it really be worth some number of pages in the journal, plus a reply and rejoinder, to hash this out? Within the constraints of the ASR format, maybe the pages aren’t worth it. And the result would not have been a definitive statement anyway, but rather just another debate among sociologists.

What else could they have done? Maybe it would have been better if the editors could simply append a note to the article advising readers that the survey is not accurately described, and cautioning against interpreting it as representative — with a link to the comment online somewhere explaining the problem. (Even so of course Goffman should have a chance to respond, and so on.)

It’s just wrong that the editors now acknowledge there is something wrong in their journal (although we seem to disagree about how serious the problem is) while no one is going to formally notify the future readers of the article. That seems like bad scholarly communication. I’ve said from the beginning that there’s no need for a high-volume conversation about this, or attacks on anyone’s integrity or motives. There are important things in this research, and it’s also highly flawed. Acknowledge the errors, so they don’t compound, and move on.

This incident can help us learn lessons with implications up and down the publishing system. Here are a couple. At the level of social science research reporting: don’t publish survey results without sufficient methodological documentation; let’s have the instrument and protocol, the code, and access to the data. At the system level of publishing: why do we still have journals with cost-defined page limits? Because for-profit publishing is more important than scholarly communication. The sooner we get out from under that 19th-century habit the better.


How we really can study divorce using just five questions and a giant sample

It would be great to know more about everything, but if you ask just these five questions of enough people, you can learn an awful lot about marriage and divorce.

Questions

First the questions, then some data. These are the question wordings from the 2013 American Community Survey (ACS).

1. What is Person X’s age?

We’ll just take the people who are ages 15 to 59, but that’s optional.

2. What is this person’s marital status?

Surprisingly, we don’t want to know if they’re divorced, just if they’re currently married (I include people who are separated and those who live apart from their spouses for other reasons). This is the denominator in your basic “refined divorce rate,” or divorces per 1,000 married people.

3. In the past 12 months, did this person get divorced?

The number of people who got divorced in the last year is the numerator in your refined divorce rate. According to the ACS in 2013 (using population weights to scale the estimates up to the whole population), there were 127,571,069 married people, and 2,268,373 of them got divorced, so the refined divorce rate was 17.8 per 1,000 married people. When I analyze who got divorced, I’m going to mix all the currently-married and just-divorced people together, and then treat the divorces as an event, asking, who just got divorced?
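
That calculation in one line, as a check on the rate quoted above:

married, divorced = 127_571_069, 2_268_373           # weighted 2013 ACS counts
print(f"{divorced / married * 1000:.1f} per 1,000")  # 17.8 per 1,000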

4. In what year did this person last get married?

This is crucial for estimating divorce rates according to marriage duration. When you subtract this from the current year, that’s how long they are (or were) married. When you subtract the marriage duration from age, you get the age at marriage. (For example, a person who is 40 years old in 2013, who last got married in 2003, has a marriage duration of 10 years, and an age at marriage of 30.)
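
In code, the two derived measures are one subtraction each; a trivial sketch mirroring the example above:

def marriage_vars(age, year_married, survey_year=2013):
    # Marriage duration and age at marriage, derived from the ACS items.
    duration = survey_year - year_married
    return duration, age - duration

print(marriage_vars(40, 2003))   # (10, 30): married 10 years, at age 30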

5. How many times has this person been married?

I use this to narrow our analysis down to women in their first marriages, which is a conventional way of simplifying the analysis, but that’s optional.

Data

I restrict the analysis below to women, which is just a sexist convention for simplifying things (since men and women do things at different ages).*

So here are the 375,249 women in the 2013 ACS public use file, ages 16-59, who were in their first marriages, or just divorced from their first marriages, by their age at marriage and marriage duration. Add the two numbers together and you get their current age. The colors let you see the basic distribution (click to enlarge):

[Table: women in first marriages by age at marriage and marriage duration, 2013 ACS]

The most populous cell on the table is 28-year-olds who got married three years ago, at age 25, with 1068 people. The least populous is 19-year-olds who got married at 15 (just 14 of them). The diagonal edge reflects my arbitrary cutoff at age 59.

Divorce results

Now, in each of these cells there are married people, and (in most of them) people who just got divorced. The ratio between those two frequencies is a divorce rate — one specific to the age at marriage and marriage duration. To make the next figure I used three years of ACS data (2011-2013) so the results would be smoother. (And then I smoothed it more by replacing each cell with an average of itself and the adjoining cells.) These are the divorce rates by age at marriage and years married (click to enlarge):
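
Here is a sketch of that extra smoothing step, my reconstruction of “averaging each cell with the adjoining cells,” treating “adjoining” as the surrounding 3×3 block (not the exact code behind the figure):

def smooth(grid):
    # Replace each cell with the mean of itself and its in-bounds neighbors.
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            cells = [grid[x][y]
                     for x in (i - 1, i, i + 1)
                     for y in (j - 1, j, j + 1)
                     if 0 <= x < rows and 0 <= y < cols]
            out[i][j] = sum(cells) / len(cells)
    return out

print(smooth([[10, 20], [30, 40]]))   # every cell becomes 25.0 in this tiny grid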

[Figure: divorce rates by age at marriage and years married, 2011-2013 ACS]

The overall pattern here is more green, or lower divorce rates, to the right (longer duration of marriage) and down (older age at marriage). So the big red patch is the first 12 years for marriages begun before the woman was age 25. And after about 25 years of marriage it’s pretty much green, for low divorce rates. The high contrast at the bottom left implies an interesting pattern for these late marriages: high risk in the first few years, then a steep decline. This matrix adds nuance to the pattern I reported the other day, which featured a little bump up in divorce odds for people who married in their late thirties. From this figure it looks like marriages that start after the woman is about 35 might have less of a honeymoon period than those beginning about age 24-33.

To learn more, I go beyond those five great questions, and use a regression model (same as the other day), with a (collapsed) marriage-age–by–marriage-duration matrix. So these are predicted divorce rates per 1000, holding education, race/ethnicity, and nativity constant (click to enlarge)**:

[Figure: predicted divorce rates by age at marriage and years married, with controls]

The controls cut down the late-thirties bump and isolate it mostly to the first year. This also shows that the punishing first year is an issue at all ages over 35. The late thirties just showed the bump because that group doesn’t have the big drop in divorce after the first year that the older marriage ages do. Interesting!

Sigh

Here’s where the awesome data let us down. This data is very powerful. It’s the best contemporary big data set we have for analyzing divorce. It has taken us this far, but it can’t explain a pattern like this.

We can control for education, but that’s just the education level at the time of the most recent survey. We can’t know when she got her education relative to the dates of her marriage. Further, from the ACS we can’t tell how many children a person has had, with whom, and when — we only know about children who happen to be living in the household in 2013, so a 50-year-old could be childfree or have raised and released four kids already. And about couples, although we can say things about the other spouse from looking around in the household (such as his age, race, and income), if someone has divorced the spouse is gone and there is no information about that person (even their sex). So we can’t use that information to build a model of divorce predictors.

Here’s an example of what we can only hint at. Remarriages are more likely to end in divorce, for a variety of reasons, which is why we simplify these things by only looking at first marriages. But what about the spouse? Some of these women are married to men who’ve been married before. I can’t tell how much that contributes to their likelihood of divorce, but it almost certainly does. Think about the bump up in the divorce rate for women who got married in their late thirties. On the way from high divorce rates for women who marry early to low rates for women who marry late, the overall downward slope reflects increasing maturity and independence for women, but it’s running against the pressure of their increasingly complicated relationship situations. That late-thirties bump may have to do with the likelihood that their husbands have been married before. Here’s the circumstantial evidence:

[Figure: share of women marrying previously-married men, by age at marriage]

See that big jump from early-thirties to late-thirties? All of a sudden 37.5% of women marrying in their late-thirties are marrying men who are remarrying. That’s a substantial risk factor for divorce, and one I can’t account for in my analysis (because we don’t have spouse information for divorced women).

On method

Divorce is complicated and inherently longitudinal. Marriages arise out of specific contexts and thrive or decay in many different ways. Yesterday’s crucial influence may disappear today. So how can we say anything about divorce using a single, cross-sectional survey sample? The unsatisfying answer is that all analysis is partial. But these five questions give us a lot to go on, because knowing when a person got married allows us to develop a multidimensional image of the events, as I’ve demonstrated here.

But, you ask, what can we learn from, say, the divorce propensity of today’s 40-year-olds when we know that just last year a whole bunch of 39-year-olds divorced, skewing today’s sample? This is a real issue. And demography provides an answer that is at once partial and powerful: Simple, we use today’s 39-year-olds, too. In the purest form, this approach gives us the life table, in which one year’s mortality rates — at every age — lead to a projection of life expectancy. Another common application is the total fertility rate (watch the video!), which sums birth rates by age to project total births for a generation. In this case I have not produced a complete divorce life table (which I promised a while ago — it’s coming). But the approach is similar.

These are all synthetic cohort approaches (described nicely in the Week 6 lecture slides from this excellent Steven Ruggles course). In this case, the cohorts are age-at-marriage groups. Look at the table above and follow the row for, say, marriages that started at age 28, to see that synthetic cohort’s divorce experience from marriage until age 59. It’s neither a perfect depiction of the past, nor a foolproof prediction of the future. Rather, it tells us what’s happening now in cohort terms that are readily interpretable.
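
The mechanics are the same as a life table: take one year’s duration-specific divorce rates and chain them together as if a single cohort experienced them all. A minimal sketch with made-up rates (per 1,000 married women per year of marriage duration; not the ACS estimates):

# Hypothetical duration-specific divorce rates per 1,000 (illustration only).
rates = [30, 28, 25, 22, 20, 18, 15, 12, 10, 8]

surviving = 1.0   # share of the synthetic cohort still in a first marriage
for rate in rates:
    surviving *= 1 - rate / 1000
print(f"Ever divorced within {len(rates)} years: {1 - surviving:.1%}")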

Conclusion

The ACS is the best thing we have for understanding the basic contours of divorce trends and patterns. Those five questions are invaluable.


* For this I also tossed the people who were reported to have married in the current year, because I wasn’t sure about the timing of their marriages and divorces, but I put them back in for the regressions.

** The codebook for my IPUMS data extraction is here, my Stata code is here. The heat-map model here isn’t in that code file, but these are the commands (and the margins command took a very long time, so please don’t tell me there’s something wrong with it):

* Logit of divorce on the age-at-marriage by marriage-duration interaction,
* controlling for education, race/ethnicity, and citizenship:
logistic divorce i.agemarc#i.mardurc i.degree i.race i.hispan i.citizen

* Predicted divorce probabilities for each age-by-duration cell:
margins i.agemarc#i.mardurc


The U.S. government asked 2 million Americans one simple question, and their answers will shock you

What is your age?

[SKIP TO THE END for a mystery-partly-solved addendum]

Normally when we teach demography we use population pyramids, which show how much of a population is found at each age. They’re great tools for visualizing population distributions and discussing projections of growth and decline. For example, consider this contrast between Niger and Japan, about as different as we get on earth these days (from this cool site):

[Image: population pyramids of Japan and Niger]

It’s pretty easy to see the potential for population growth versus decline in these patterns. Finding good pyramids these days is easy, but it’s still good to make some yourself to get a feel for how they work.

So, thinking I might make a video lesson to follow up my blockbuster total fertility rate performance, I gathered some data from the U.S., using the 2013 American Community Survey (ACS) from IPUMS.org. I started with 10-year bins and the total population (not broken out by sex), which looks like this:

[Figure: U.S. population in 10-year age bins, 2013 ACS]

There’s the late Baby Boom, still bulging out at ages 50-59 (born 1954-1963), and their kids, ages 20-29. So far so good. But why not use single years of age and show something more precise? Here’s the same data, but showing single years of age:

[Figure: U.S. population by single years of age, 2013 ACS]

That’s more fine-grained. Not as much as if you had data by months or days of birth, but still. Except, wait: is that just sample noise causing that ragged edge between 20 and about 70? The ACS sample is a few million people, with tens of thousands of people at each age (up to age 75, at least), so you wouldn’t expect too much of that. No, it’s definitely age heaping, the tendency of people to skew their age reporting according to some collective cognitive scheme. The most common form is piling up on ages ending in 0 and 5, but it could be anything. For example, some people might want to be 18, a socially significant milestone in this country. Here’s the same data, with suspect ages highlighted (0’s and 5’s from 20 to 80, plus 18):

[Figure: single years of age, with 0’s and 5’s (20-80) and age 18 highlighted]

You might think age heaping results from some old people not remembering how old they are. In the old days, rounding off was more common at older ages. In 1900, for example, the most implausible number of people was found at age 60: 1.6 times as many as you’d get by averaging the numbers at ages 59 and 61. Is that still the case? Here it is again, but with the red/green highlights showing the difference between the number of people reported at each age and the number you’d get by averaging the numbers just above and below:

[Figure: single years of age, highlighting deviations from the average of adjacent ages]

Proportionately, the 70-year-olds are most suspicious, at 10.8% more than you’d expect. But 40 is next, at 9.2%. And that green line shows extra 18-year-olds at 8.6% more than expected.
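
That “more than you’d expect” comparison is easy to compute. A sketch with invented counts (the 10.8% figure above comes from the ACS, not from these numbers):

def heaping_ratio(counts, age):
    # Count at an age, relative to the average of its two neighbors.
    expected = (counts[age - 1] + counts[age + 1]) / 2
    return counts[age] / expected

counts = {69: 30_000, 70: 34_000, 71: 31_300}   # hypothetical pile-up at 70
print(f"{heaping_ratio(counts, 70):.2f}")       # 1.11: ~11% more than expected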

Unfortunately, it’s pretty hard to correct. Interestingly, the American Community Survey apparently asks for both an age and a birth date:

[Image: ACS questionnaire items asking both age and date of birth]

If you’re the kind of person who rounds off to 70, or promotes yourself to 18, it might not be worth the trouble to actually enter a fake birth date. I’m sure the Census Bureau does something with that, like correct obvious errors, but I don’t think they attempt to correct age-heaping in the ACS (the birth dates aren’t on the public use files). Anyway, we can see a little of the social process by looking at different groups of people.

Up till now I’ve been using the full public use data, with population weights, and including those people who left age blank or entered something implausible enough that the Census Bureau gave them an age (an “allocated” value, in survey parlance). For this I just used the unweighted counts of people whose answers were accepted “as written” (or typed, or spoken over the phone, depending on how it was administered to them). Here are the patterns for people who didn’t finish high school versus those with a bachelor’s degree or higher, highlighting the 5’s and 0’s (click to enlarge):

[Figure: age distributions by education, with 0’s and 5’s highlighted (click to enlarge)]

Clearly, the age heaping is more common among those with less education. Whether it’s really people forgetting their age, rounding up or down for aspirational reasons, or having trouble with the survey administration, I don’t know.

Is this bad? As much as we all hate inaccuracy, this isn’t so bad. Fortunately, demographers have methods for assessing the damage caused by humans and their survey-taking foibles. In this case we can use Whipple’s index. This measure (defined in this handy United Nations slideshow) takes the number of people whose alleged ages end in 0 or 5, multiplies that by 5, and compares it to the total population. Normally people use ages 23 to 62 (inclusive), for an even 40 years, of which 8 (25, 30, 35, 40, 45, 50, 55, and 60) end in 0 or 5. The ratio of that multiplied count to the total population ages 23-62, times 100, is your Whipple’s index. A score of 100 is perfect, and a score of 500 means everyone’s heaped. The U.N. considers scores under 105 to be “very accurate data.” The 2013 ACS, using the public use file and the weights, gives me a score of 104.3. (Those unweighted distributions by education yield scores of 104.0 for high school dropouts and 101.7 for college graduates.) In contrast, the Decennial Census in 2010 had a score of just 101.5 by my calculation (using table QT-P2 from Summary File 1). With the size of the ACS, this difference shouldn’t have to do with sampling variation. Rather, it’s something about the administration of the survey.
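
In code, with the age distribution as a dict of single-year counts, the standard definition looks like this (a sketch of the unweighted index, not the weighting-aware calculation behind the 104.3):

def whipple(counts):
    # Whipple's index: 100 = no heaping on 0's and 5's, 500 = complete heaping.
    total = sum(counts[a] for a in range(23, 63))             # ages 23-62 inclusive
    ending_0_or_5 = sum(counts[a] for a in range(25, 61, 5))  # 25, 30, ..., 60
    return 5 * ending_0_or_5 / total * 100

print(whipple({a: 1_000 for a in range(23, 63)}))   # flat distribution scores 100.0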

Why don’t they just tell us how old they really are? There must be a reason.

Two asides:

  • The age 18 pattern is interesting — I don’t find any research on desirable young-adult ages skewing sample surveys.
  • This is all very different from birth timing issues, such as the Chinese affinity for births in dragon years (every twelfth year: 1976, 1988…). I don’t see anything in the U.S. pattern that fits fluctuations in birth rates.

Mystery-partly-solved addendum

I focused on education above, but another explanation was staring me in the face. I said “it’s something about the administration of the survey,” but didn’t think to check the form of survey people took. The public use files for the ACS include an indicator of whether the household respondent took the survey through the mail (28%), on the web (39%), through a bureaucrat at the institution where they live (group quarters; 5%), or in an interview with a Census worker (28%). This last method, which is either a computer-assisted telephone interview (CATI) or computer-assisted personal interview (CAPI), is used when people don’t respond to the mailed survey.

It turns out that the entire Whipple problem in the 2013 ACS is due to the CATI/CAPI interviews. The age distributions for all of the other three methods have Whipple index scores below 100, while the CATI/CAPI folks clock in at a whopping 108.3. Here is that distribution, again using unweighted cases:

[Figure: age distribution of CATI/CAPI respondents]

There they are, your Whipple participants. Who are they, and why does this happen? Here is the Bureau’s description of the survey data collection:

The data collection operation for housing units (HUs) consists of four modes: Internet, mail, telephone, and personal visit. For most HUs, the first phase includes a mailed request to respond via Internet, followed later by an option to complete a paper questionnaire and return it by mail. If no response is received by mail or Internet, the Census Bureau follows up with computer assisted telephone interviewing (CATI) when a telephone number is available. If the Census Bureau is unable to reach an occupant using CATI, or if the household refuses to participate, the address may be selected for computer-assisted personal interviewing (CAPI).

So the CATI/CAPI people are those who were either difficult to reach or uncooperative when contacted. This group, incidentally, has low average education, as 63% have a high school education or less (compared with 55% of the total), which may explain the association with education. Maybe they have less accurate recall, or maybe they are less cooperative, which makes sense if they didn’t want to do the survey in the first place (which they are legally mandated — i.e., coerced — to do). So when their date of birth and age conflict, and the Census worker tries to elicit a correction, maybe all hell breaks loose in the interview and they can’t work it out. Or maybe the CATI/CAPI households have more people who don’t know each other’s exact ages (one person answers for the household). I don’t know. But this narrows it down considerably.


On Goffman’s survey

Survey methods.

Jesse Singal at New York Magazine‘s Science of Us has a piece in which he tracks down and interviews a number of Alice Goffman’s respondents. This settles the question — which never should have been a real question — about whether she actually did all that deeply embedded ethnography in Philadelphia. It leaves completely unresolved, however, the issue of the errors and possible errors in the research. This reaffirms for me the conclusion in my original review that we should take the volume down in this discussion, identify errors in the research without trying to attack Goffman personally or delegitimize her career — and then learn from the affair ways that we can improve sociology (for example, by requiring that winners of the American Sociological Association dissertation award make their work publicly available).

That said, I want to comment on a couple of issues raised in Singal’s piece, and share my draft of a formal comment on the survey research Goffman reported in American Sociological Review.

First, I want to distance myself from the description by Singal of “lawyers and journalists and rival academics who all stand to benefit in various ways if they can show that On the Run doesn’t fully hold up.” I don’t see how I (or any other sociologists) benefit if Goffman’s research does not hold up. In fact, although some people think this is worth pursuing, I am also annoying some friends and colleagues by doing this.

More importantly, although it’s a small part of the article, Singal did ask Goffman about the critique of her survey, and her response (as he paraphrased it, anyway) was not satisfying to me:

Philip Cohen, a sociologist at the University of Maryland, published a blog post in which he puzzles over the strange results of a door-to-door survey Goffman says she conducted with Chuck in 2007 in On the Run. The results are implausible in a number of ways. But Goffman explained to me that this wasn’t a regular survey; it was an ethnographic survey, which involves different sampling methods and different definitions of who is and isn’t in a household. The whole point, she said, was to capture people who are rendered invisible by traditional survey methods. (Goffman said an error in the American Sociological Review paper that became On the Run is causing some of the confusion — a reference to “the 217 households that make up the 6th Street neighborhood” that should have read “the 217 households that we interviewed … ” [emphasis mine]. It’s a fix that addresses some of Cohen’s concerns, like an implied and very unlikely 100 percent response rate, but not all of them.) “I should have included a second appendix on the survey in the book,” said Goffman. “If I could do it over again, I would.”

My responses are several. First, the error of describing the 217 households as the whole neighborhood, as well as the error in the book of saying she interviewed all 308 men (when in the ASR article she reports some unknown number were absent), both go in the direction of inflating the value and quality of the survey. Maybe they are random errors, but they didn’t have a random effect.

Second, I don’t see a difference between a “regular survey” and an “ethnographic survey.” There are different survey techniques for different applications, and the techniques used determine the data and conclusions that follow. For example, in the ASR article Goffman uses the survey (rather than Census data) to report the racial composition of the neighborhood, which is not something you can do with a convenience sample, regardless of whether you are engaged in an ethnography or not.

Finally, there are no people “rendered invisible by traditional survey methods” (presumably Singal’s phrase). There are surveys that are better or worse at including people in different situations. There are “traditional” surveys — of varying quality — of homeless people, prisoners, rape victims, and illiterate peasants. I don’t know what an “ethnographic survey” is, but I don’t see why it shouldn’t include a sampling strategy, a response rate, a survey instrument, a data sharing arrangement, and thorough documentation of procedures. That second methodological appendix can be published at any time.

ASR Comment (revised June 22)

I wrote up my relatively narrow, but serious, concerns about the survey, and posted them on my website here.

It strikes me that Goffman’s book (either the University of Chicago Press version or the trade book version) may not have been subject to the same level of scrutiny that her article in ASR should have received. In fact, presumably, the book publishers took her publication in ASR as evidence of the work’s quality. And their interests are different from those of a scientific journal run by an academic society. If ASR is going to play that gatekeeping role, and it should, then ASR (and by extension ASA) should take responsibility in print for errors in its publications.


On the ropes (Goffman review)

[Image: West Philadelphia]

First a short overview, then my comments.

Alice Goffman’s book On the Run: Fugitive Life in an American City, is one of the major events in sociology of the last few years. How unusual is it for a book based on a sociology dissertation to get this treatment?

Cornel West endorsed it as “the best treatment I know of the wretched underside of neoliberal capitalist America.”  Writing in the New York Times Book Review, Alex Kotlowitz said it was “a remarkable feat of reporting” with an “astonishing” level of detail and honesty.  The New Yorker’s Malcolm Gladwell called it “extraordinary,” and Christopher Jencks, in the New York Review of Books, predicted that it would “become an ethnographic classic.”  Tim Newburn, a highly regarded criminologist at the London School of Economics, hailed On the Run as “gloriously readable” and “sociology at its best.”

On the other hand, the book has lots of critics — here’s James Forman in the Atlantic — who think Goffman’s research subjects aren’t representative of the poor Black communities she wants to describe. And then there’s the exploitation/privilege/outsider argument, summarized by Claude Fischer (I added the links):

A typical line of criticism charges that outsiders cannot accurately describe their subjects of study. For example, one highly circulated review of Goffman’s book, alarmed at her white privilege, describes the study as “theft,” abetting “fantasies of black pathology,” and possibly causing harm by revealing to police the tricks of hiding. “Inner city Philadelphia isn’t Alice Goffman’s home,” another reviewer writes, “and it’s not her job to turn it into a jungle that needs interpreting.” A Buzzfeed writer simply tweeted, “Ban outsider ethnographies.”

Comments

On the morals and ethics, I’m not going to draw conclusions, except to say that I agree Goffman was wrong to help try to find and kill the guy who killed her friend, if what she says is true.

On the social science, I also have a limited perspective, because I don’t really read the book as social science; I think it’s much more a sociological memoir — and I don’t mean that as a criticism of ethnography, to which I would not apply that label in general.

In fact, the book is least persuasive when she tries to be most dispassionate. On the issue of representativeness, for example, Goffman clearly is wrong when she writes:

Initially I assumed that Chuck, Mike, and their friends represented an outlying group of delinquents: the bad apples of the neighborhood. After all, some of them occasionally sold marijuana and crack cocaine to local customers, and sometimes they even got into violent gun battles. I grew to understand that many young men from 6th Street were at least intermittently earning money by selling drugs, and the criminal justice entanglements of Chuck and his friends were on a par with what many other unemployed young men in the neighborhood were experiencing.

That’s a non sequitur, with the typical slippery use of “many” (which I have probably fallen into myself). The fact is her entire project was shaped by the guy she initially fell in with, and his friends, and they were by any measure an “outlying group.” If you don’t see people who have multiple running gun battles as unusual, your perspective may be a little skewed.

Patrick Sharkey’s review puts their outlier status in the context of declining violent crime — including in Philadelphia’s most violent neighborhoods:

The decision to engage in violence can be thought of as a rational response to the pressures and threats that young men perceive in this environment. But the decision to fire a gun in public space is one that is now universally rejected by every segment of American society, and it is a decision that comes with clear, long-term consequences that are understood by all of the young men in the neighborhood that Goffman studies. … As callous as it sounds given the hardships faced by this young man and the lack of choices available to him in this situation, my sense is that most residents of the block where he chose to fire his weapon would consider attempted murder to be an appropriate charge and would have been happy if he were located by the police and sentenced to prison.

In a number of passages it's impossible to differentiate what Goffman knows from what she was told. When the facts are wrong, that's unfortunate. For example, she writes, "By the time Chuck entered his senior year of high school in 2002, young women outnumbered young men in his classes by more than 2:1." Was that his perspective? People are reliably bad at estimating group sizes. The actual ratio of women to men among Black 17-year-olds living below the poverty line and enrolled in school in Philadelphia was skewed, but only 1.65-to-1.

Nevertheless, her description of the many ways the incarceration empire impinges on the daily lives of poor Black people in Philadelphia is sometimes insightful and useful. I mostly agree with Goffman's political description and conclusions about the injustices here, and if the book does some harm it also will do some good. I would be happy to see the volume come down in this discussion and for us to treat this as a regular book, despite the inordinate attention it gets outside of academia. It's deeply flawed, but it's worth reading. Its systematic evidence is weak, but it's thought-provoking and offers plenty for research and policy debates. It's well-written and its topic is important. I have nothing against her and look forward to what she will do next.

That said…

My expertise in qualitative research is extremely shallow, and my knowledge of survey research methods is not vast either. But surveys are my relative strength here, and on that ground I'd like to register an objection to Goffman's study — both the book and her 2009 article in American Sociological Review (official link here, Google for a free copy).

There is an understandable problem with reproducibility even in the best ethnographies. As evidence for her conclusions, I simply discount her ostensibly meticulous counting of events, like this:

In that same eighteen-month period, I watched the police break down doors, search houses, and question, arrest, or chase people through houses fifty-two times. Nine times, police helicopters circled overhead and beamed searchlights onto local streets. I noted blocks taped off and traffic redirected as police searched for evidence— or, in police language, secured a crime scene— seventeen times. Fourteen times during my first eighteen months of near daily observation, I watched the police punch, choke, kick, stomp on, or beat young men with their nightsticks.

To me this just means "a lot." There is no denominator: no way to gauge how prevalent these events were compared with anything else. "A lot" is a fine metric for this kind of observation, because all the insights come from the details that follow, not from the recitation of frequencies. (There also is a frustrating lack of precision in these passages. Consider: "I watched the police break down doors, search houses, and question, arrest, or chase people through houses fifty-two times." What exactly happened 52 times? What do "and" and "or" mean in that sentence?) However, if you think of it as sociological memoir, those numbers mean something, because they tell you about her evolving perspective and experience. Wow, I think, if I witnessed 14 police beatings first-hand, that would really affect me.

That part of the study is not reproducible. But if you do a survey while you’re doing an ethnography, you’re still doing a survey. The survey is not an ethnography. That means you should (my opinion, not a rule) make available your instrument, methods employed, and the data. Goffman has said she destroyed her field notes so they couldn’t be subpoenaed, but I don’t see why that should apply to her survey data, which could have been stripped of identifying information and processed like any other survey with sensitive material. Unlike the recitation of event counts, the survey is potentially reproducible. She reports (a few) percentages. Other researchers could conceivably conduct similar research in a different place or time and draw useful comparisons; someone might even attempt to reproduce her survey just to see if they get the same result.

Let me back up. I can, without violating the copyright rules regarding quotation length, reproduce everything she wrote about the survey portion of her study. (The much remarked upon 50-page “methodological note” at the end of the book never mentions the survey.) Here I will compare what she said in ASR versus the book, not as a gotcha exercise but because there is so little information in either that the variation between the two may be informative.

In ASR (passages excerpted across several pages of the article):

The five blocks known as 6th Street are 93 percent Black, according to a survey of residents that Chuck and I conducted in 2007. … Of the 217 households surveyed, roughly one fourth received housing vouchers. In all but two households, members reported receiving some type of government assistance in the past three years. … In the survey that Chuck and I conducted in 2007, of the 217 households that make up the 6th Street neighborhood, we found 308 men between the ages of 18 and 30 in residence. Of these men, 144 reported that they had a warrant issued for their arrest because of either delinquencies with court fines and fees or for failure to appear for a court date within the past three years. Also within the past three years, warrants had been issued to 119 men for technical violations of their probation or parole (e.g., drinking or breaking curfew).

The footnote to that last passage clarifies that “in residence” doesn’t really mean living there:

I counted men who lived in a house for three days a week or more (by their own estimates and in some cases, my knowledge) as members of the household. I included men who were absent because they were in the military, at job training programs (like JobCorp), or away in jail, prison, drug rehab centers, or halfway houses, if they expected to return to the house and had been living in the house before they went away.

From this we learn that the survey included 217 households — and also that those 217 households make up the entire 6th Street neighborhood (not its real name). That seems to imply a 100% response rate, because the 217 households surveyed correspond to the 217 that "make up" the neighborhood. (That is an excellent response rate.) That is also important because of what comes next. If there were 217 households in the neighborhood but she had interviewed only, say, half of them, it would have been very surprising to find 308 men ages 18-30 living there. As it is, it's extremely unlikely there were 308 men ages 18-30 living in all 217 households. The footnote says she included men who only lived there part time, or who were away for prison or other institutional spells. Frustratingly, it doesn't say how many men in the sample this applied to.

Here's what tripped me up about that: 308 men ages 18-30 in 217 households is 1.4 per household — too many, I thought. So I looked at the 45 census tracts in West Philadelphia that are 75% Black or more (based on this map) in the 2005-2009 American Community Survey, and calculated the average number of men in that age range as .57 per household. (Actually I did the estimate for ages 18-29, because of how the data are published.) There is one tract with 1.04 men per household, but that's an outlier. If she's getting almost 2.5 times the average in her sample (1.4 versus .57), there must be a lot of men living in the neighborhood who are not counted by the Census Bureau. Besides wondering whether that's accurate, you also ought to wonder who answered the questions about the men who weren't actually residing in their residences.
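Since this arithmetic is easy to mangle, here it is as a minimal Python sketch. It uses only the figures quoted above (the 217 households and 308 men from the ASR article, and the .57 benchmark from my ACS tract calculation); nothing else is assumed.

```python
# Back-of-the-envelope check of the survey's household counts,
# using only the figures quoted in the text above.

surveyed_households = 217   # "the 217 households that make up the 6th Street neighborhood"
men_18_30 = 308             # men ages 18-30 reported "in residence"

men_per_household = men_18_30 / surveyed_households
print(f"Survey: {men_per_household:.2f} men ages 18-30 per household")  # ~1.42

# Average across the 45 West Philadelphia tracts that are 75%+ Black
# (2005-2009 ACS; estimated for ages 18-29 because of how the data are published).
acs_men_per_household = 0.57

print(f"Survey vs. ACS benchmark: {men_per_household / acs_men_per_household:.1f}x")  # ~2.5x
```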

The book’s version, unfortunately, doesn’t help clear that up:

In 2007 Chuck and I went door to door and conducted a household survey of the 6th Street neighborhood. We interviewed 308 men between the ages of eighteen and thirty. Of these young men, 144 reported that they had a warrant issued for their arrest because of either delinquencies with court fines and fees or failure to appear for a court date within the previous three years. For that same period, 119 men reported that they had been issued warrants for technical violations of their probation or parole (for example, drinking or breaking curfew).

This says they interviewed all 308 men, which seems like just an editing mistake. But then of the 144 who “reported that they had a warrant,” was that out of 308 or out of some smaller number of men whom they actually interviewed? Did someone else answer for the men who weren’t there?

Later in the book she describes the women she interviewed in the same survey:

In our 2007 household survey of the 6th Street neighborhood, 139 of 146 women reported that in the past three years, a partner, neighbor, or close male relative was either wanted by the police, serving a probation or parole sentence, going through a trial, living in a halfway house, or on house arrest. Of the women we interviewed, 67 percent said that during that same period, the police had pressured them to provide information about that person.

If she had limited the sample of women to the same age range as the men (and there's no indication she did), then you would expect, based on the Black West Philadelphia census tracts, about 1.2 men for every woman, or about 250 women in the sample. If she included women of all ages, there should have been even more. So why are there only 146 women? Without more information — and there is no more information in the book — it's impossible to figure this out.
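Again, the expectation is simple arithmetic, sketched below. The 1.2 men-per-woman figure is my estimate from the same ACS tracts, not anything reported in the book or article.

```python
# Expected number of women ages 18-30 in the sample, given the 308 men
# and a sex ratio of about 1.2 men per woman (my ACS tract estimate).
men_in_sample = 308
men_per_woman = 1.2

expected_women = men_in_sample / men_per_woman
print(f"Expected women: {expected_women:.0f}; reported: 146")  # ~257 vs. 146
```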

The last thing you might like to know, if you wanted to pursue these very interesting and potentially important results, is what the survey instrument was like and how it was administered. This was more than 450 people, a decent-sized survey with good potential to yield useful results. We know they asked about race, housing and other government assistance, warrant status (including the type of warrant) for the men, and the criminal justice status of the partners, neighbors, or relatives (asked separately?) of the women. What strikes me as challenging here is asking such sensitive questions and getting such a high response rate. Especially given all we learn from the book, knocking on doors and asking people whether they have any warrants seems like it wouldn't always be welcome. How long were the interviews? Did they make multiple visits when people weren't home? Knowing how they did it would be very helpful for future work.

In the end, besides what we learn from On the Run itself, I hope we learn from the debate over it how we can better balance the need for protecting research subjects — while learning a lot from them — with the imperative to conduct research that is transparent, verifiable, and (as much as possible) reproducible.

ADDENDUM: The importance of the survey

The survey plays a very small part in the book and the ASR article, mentioned only a few times, but it is important because it offers a hint of generalizability to her research. James Forman wrote in the Atlantic:

The fact that Goffman’s subjects have serious criminal histories impairs our ability to generalize from some of her findings. For example, her central characters are all wanted on warrants at one time or another, some of them repeatedly (Mike has 10 warrants altogether). To Goffman, this indicates that Philadelphia’s criminal-justice system issues too many warrants. But it may simply indicate that Mike and his friends are unusually criminally active.

Perhaps anticipating this challenge, Goffman extends her inquiry beyond the most criminally active members of the community. When she conducts a door-to-door survey of Sixth Street, she finds that about half the men there were wanted on warrants over a three-year period. This is astounding; no previous researcher has reported such a high concentration of fugitives living in one community. This raises questions that Goffman doesn’t answer with precision, but that I hope she and others will explore in the future: How many of these warrants were for failure to pay court costs—which should rarely if ever be imposed on poor people in the first place—versus something more serious, such as skipping a court date? Does fugitive status affect the lives of less criminally involved young men in the same ways it affects the lives of Mike and his friends? If it does, and if other communities harbor equally large proportions of fugitives, Goffman has discovered a profound social problem that deserves further research and a policy response.

The survey thus represents an important avenue for the book’s impact on future research.

ADDITIONAL ADDENDUM: Goffman’s response

In a statement on her website, Goffman responds to several of the recent criticisms. The first is that she participated in a conspiracy to commit murder. Her description there doesn’t change my impression of the scene she describes. But it is interesting for what it reveals about her sense of the men at the core of her story: Chuck and Mike. She writes:

In the months before he died, Chuck was actively working to preserve a precarious peace between his friends and a rival group living nearby. His sudden death was a devastating blow not only to his friends and family, but to the whole neighborhood. After Chuck was shot and killed, people in the neighborhood were putting a lot of pressure on Mike and on Chuck’s other friends to avenge his murder. It seemed that Chuck’s friends were expected to fulfill the neighborhood’s collective desire for retribution. Many of the residents were emphatic that justice should be served, and the man who killed Chuck must pay.

This surprises me, because I did not get the sense from the book that the gang rivalry Goffman described had motivated a collective conflict between entire neighborhoods. It also contradicts the passage from Patrick Sharkey quoted above, which speculates that most people in the neighborhood would have been glad to see the police arrest and incarcerate the people who conduct running gun battles in the street. Maybe Sharkey and I are victims of do-gooder liberal attitudes, and really there are neighborhoods in Philadelphia that see themselves at war with whole other neighborhoods and clamor for more shooting.

As an aside, I recently read Ghettoside: A True Story of Murder in America, in which journalist Jill Leovy argues that poor Black communities are at once over-policed — as innocent people are harassed, violated, prosecuted, and incarcerated for minor (or no) offenses — and under-policed such that murderers in Black communities are much less likely to be arrested and convicted than are those who kill Whites. This latter neglect by police contributes to the real problem of violence because it encourages informal (violent) means of addressing conflict in the community. By Leovy’s reasoning, and Sharkey might agree, majorities of people in violent Black neighborhoods would like fewer drug users incarcerated, but more of the people like Mike and whoever killed Chuck incarcerated.

Of course, I don’t know the situation in the “6th Street” neighborhood. But it’s one thing for an activist or community leader to say, “Our neighborhood stands united!”, and another for a sociologist to speak of “the neighborhood’s collective desire.” The former may be effective politics, but the latter is likely an overly-simplified research conclusion.

Which brings me finally back to the issue of Goffman's survey. In her response, Goffman says Chuck was actively working on neighborhood peace in the months before his death, which occurred in the summer of 2007. Remarkably, that is the same time that she and Chuck conducted their survey of all 217 households in the neighborhood, with a reported 454 interviews. The timing of the survey is not completely clear, but she wrote that it was in 2007, and in an interview she said it was in the summer ("Yeah, so we did this survey, Chuck and I, one summer. We interviewed the households in this four-block radius…"). So it's either a head-scratching story or an incredibly impressive image: Chuck and Goffman conducting a door-to-door household survey at the same time that he's negotiating a delicate gang truce, interviewing every single household in the neighborhood in the weeks leading up to Chuck's death, gathering the information that would allow Goffman to in fact accurately speak of their "collective desire."


What’s in a ratio? Teen birth and marriage edition

Even in our post-apocalypse world, births and marriages are still related, somehow.

Some teenage women get married, and some have babies. Are they the same women? First the relationship between the two across states, then a puzzle.

In the years 2008-2012 combined, 2.5 percent of women ages 15-19 per year had a baby, and 1 percent got married. That is, they were reported in the American Community Survey (IPUMS) to have given birth, or gotten married, in the 12 months before they were surveyed. Here’s the relationship between those two rates across states:

teenbirthmarriage1

The teen birth rate ranges from a low of 1.2 percent in New Hampshire to 4.4 percent in New Mexico. The teen marriage rate ranges from .13 percent in Vermont to 2.3 percent in Idaho.
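For anyone who wants to reproduce these rates, here is a sketch of the calculation under stated assumptions: it supposes an IPUMS USA person-level extract saved as acs_2008_2012_extract.csv (a hypothetical file name), and it uses the IPUMS variables FERTYR and MARRINYR for a birth or marriage in the past 12 months. Verify the variable codes against your extract's codebook before trusting the recodes.

```python
import pandas as pd

# Hypothetical IPUMS ACS extract, person records, 2008-2012 combined.
df = pd.read_csv("acs_2008_2012_extract.csv")

# Women ages 15-19. In IPUMS coding, SEX == 2 is female; FERTYR == 2 and
# MARRINYR == 2 indicate a birth or marriage in the past 12 months
# (assumed codes; check the codebook).
teens = df[(df["SEX"] == 2) & (df["AGE"].between(15, 19))].copy()
teens["birth"] = (teens["FERTYR"] == 2).astype(int)
teens["married"] = (teens["MARRINYR"] == 2).astype(int)

# Person-weighted rates by state, in percent.
def wrate(g, col):
    return 100 * (g[col] * g["PERWT"]).sum() / g["PERWT"].sum()

rates = teens.groupby("STATEFIP").apply(
    lambda g: pd.Series({
        "birth_rate": wrate(g, "birth"),
        "marriage_rate": wrate(g, "married"),
    })
)
print(rates.sort_values("birth_rate"))
```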

But how many of these weddings are "shotgun weddings" — those where the marriage takes place after the pregnancy begins? And how many of these marriages are "gung-ho marriages" — those where the pregnancy follows immediately after the marriage? (OK, I made that term up.) The ACS, which is wonderful for having these questions, is somewhat maddening in not nailing down the timing more precisely. "In the past 12 months" is all you get.

Here is the relationship between two ratios. The x-axis shows the percentage of teens who got married who also had a birth (birth/marriage). The y-axis shows the percentage of teens who had a birth who also got married (marriage/birth).

teenbirthmarriage

If you can figure out how to interpret these numbers, and the difference between them within states, please post your answer in the comments.
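One way to see what the ratios are made of: within a state, both have the same numerator (teens who both married and gave birth in the past year) but different denominators. A sketch, continuing from the hypothetical extract above:

```python
import pandas as pd

# Same hypothetical extract and recodes as in the earlier sketch.
df = pd.read_csv("acs_2008_2012_extract.csv")
teens = df[(df["SEX"] == 2) & (df["AGE"].between(15, 19))].copy()
teens["birth"] = (teens["FERTYR"] == 2).astype(int)
teens["married"] = (teens["MARRINYR"] == 2).astype(int)

# Weighted share of married teens who also gave birth (x-axis), and of
# teens who gave birth who also married (y-axis). Same numerator, the
# teens who did both; different denominators.
def wshare(g, num, denom):
    d = g[g[denom] == 1]
    return 100 * (d[num] * d["PERWT"]).sum() / d["PERWT"].sum()

ratios = teens.groupby("STATEFIP").apply(
    lambda g: pd.Series({
        "births_among_married": wshare(g, "birth", "married"),
        "married_among_births": wshare(g, "married", "birth"),
    })
)
print(ratios)
```

Because the numerators are identical while the denominators differ, the two measures diverge most where marriage is rare relative to births, which is part of the puzzle.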
