Category Archives: Research reports

No, you should get married in your late 40s (just kidding)

Please don’t give (or take) stupid advice from analyses like this.

Since yesterday, Nick Wolfinger and Brad Wilcox have gotten their marriage age analysis into the Washington Post Wonkblog (“The best age to get married if you don’t want to get divorced”) and Slate (“The Goldilocks Theory of Marriage”). The marriage-promotion point of this is: don’t delay marriage. The credulous blogosphere can’t resist the clickbait, but the basis for this is very weak.

Yesterday I complained about Wolfinger pumping up the figure he first posted (left) into the one on the right:

wolfboth

Today I spent a few minutes analyzing the American Community Survey (ACS) to check this out. Wolfinger has not shared his code, data, models, or tables, so it’s hard to know what he really did. However, he lists a number of variables he says he controlled for using the National Survey of Family Growth: “sex, race, family structure of origin, age at the time of the survey, education, religious tradition, religious attendance, and sexual history, as well as the size of the metropolitan area.”

The ACS seems better for this. It’s very big, so I can analyze just the one-year incidence of divorce (did you get divorced in the last year?), according to the age at which people married. I don’t have family structure of origin, religion, or sexual history, but he says those don’t influence the age-at-marriage effect much. He did not control for duration of marriage, which is messed up in his data anyway because of the age limits in the NSFG.

So, in my model I used women in their first marriages only, and controlled for marriage duration, education, race, Hispanic ethnicity, and nativity/citizenship. This is similar to models I used in this (shock) peer-reviewed paper. Here are the predicted probabilities of divorce, in one year, holding those control variables constant.

agemar-divorce

Yes, there is a little bump up for the late 30s compared with the early 30s, but it’s very small.

Closer analysis (added to the post 7/19), generated from a model with age-at-marriage–x–marital duration interactions, shows that the late-30s bump is concentrated in the first five years of marriage:

newheatmap

This doesn’t much undermine the “conventional wisdom” that early marriage increases the risk of divorce. Of course, this should not be the basis for advice to people who are, say, dating a person they’re thinking of marrying and hoping to minimize their chance of divorce.

If you want to give advice to, say, a 15-year-old woman, however, the bottom line is still: Get a bachelor’s degree. You’ll likely earn more, marry later, and have fewer kids. If you or your spouse decide to get divorced after all that, it won’t hurt that you’re more independent. For what it’s worth, here are the education effects from this same model:

educ-div

(The codebook for my IPUMS data extraction is here, my Stata code is here.)

Anyway, it’s disappointing to see this in the Wonkblog piece:

But the important thing, for Wolfinger, is that “we do know beyond a shadow of a doubt that people who marry in their thirties are now at greater risk of divorce than are people who wed in their late twenties. This is a new development.”

That’s just not true. I wouldn’t swear by this quick model I did today. But I would swear that it’s too early to change the “conventional wisdom” based only on a blog post on a Brad-Wilcox-branded site.

Aside

One interesting issue is the problem of age at marriage and education. They are clearly endogenous — that is, they influence each other. Women delay marriage to get more education, they stop their education when they have kids, they go back to school when they get divorced — or think they might get divorced. And so on. And, for the regression models, there are no highly-educated people getting married at really young ages, because they haven’t finished school yet. On the other hand, though, there are lots of less-educated people getting married for the first time at older ages. Using the same ACS data, here are two looks at the women who just married for the first time, by age and education.

First, the total number per year:

age-ed-mar-count

Then, the percent distribution of that same data:

age-ed-mar-dist

The interesting thing here is that college graduates are the majority of women getting married for the first time only in the age range 27-33. Before and after that, most women have less than a BA when they marry for the first time. This is also complicated because the things that select people into early marriage are sometimes but not always different from those that select people into higher education. Whew.

It really may not be reasonable to try to isolate the age-at-marriage effect after all.


Filed under Research reports

The latest get-married-young thing tells you all you need to know

Just a quick note for people wondering about this new thing by Nicholas Wolfinger on Brad Wilcox’s blog. He says it used to be (before 1995) that getting married young increased the odds of divorce. Since then, however, he says getting married either before or after age 32 raises the odds of divorce.

Why is that? His explanation — in his very own words, from his very own post: “my money is on a selection effect.” In other words, do not follow the advice in the headline, which is: “Want to Avoid Divorce? Wait to Get Married, But Not Too Long.” Because if the mechanism is selection, then changing your behavior to ride that curve will not work.

I’m not getting into the methods, which are not revealed, despite a link for “more information” — there is no paper, no tables, no code or data. However, something is off, and the post is off-gassing a discernible essence of Wilcox’s influence. In the new blog post, they show this graph:

wolfinger1

Wow, that’s a pretty big boomerang effect. If it weren’t a selection effect, it might really be relevant for personal decision-making. But when you follow the link for “more information” you see this graph:

wolfinger2

The upward swing here is hardly enough to get your marriage promotion lather up. Clearly, something had to be improved between Wolfinger’s post from April and his post for Wilcox’s site in July. That’s the kind of data leadership we expect from this site. (Also, get rid of those dots, which show you all those people with really low divorce odds at higher ages.)


The U.S. government asked 2 million Americans one simple question, and their answers will shock you

What is your age?

[SKIP TO THE END for a mystery-partly-solved addendum]

Normally when we teach demography we use population pyramids, which show how much of a population is found at each age. They’re great tools for visualizing population distributions and discussing projections of growth and decline. For example, consider this contrast between Niger and Japan, about as different as we get on earth these days (from this cool site):

japan-niger-pyramids

It’s pretty easy to see the potential for population growth versus decline in these patterns. Finding good pyramids these days is easy, but it’s still good to make some yourself to get a feel for how they work.

So, thinking I might make a video lesson to follow up my blockbuster total fertility rate performance, I gathered some data from the U.S., using the 2013 American Community Survey (ACS) from IPUMS.org. I started with 10-year bins and the total population (not broken out by sex), which looks like this:

totalbinned

There’s the late Baby Boom, still bulging out at ages 50-59 (born 1954-1963), and their kids, ages 20-29. So far so good. But why not use single years of age and show something more precise? Here’s the same data, but showing single years of age:

totalsingleyears

That’s more fine-grained. Not as much as if you had data by months or days of birth, but still. Except, wait: is that just sample noise causing that ragged edge between 20 and about 70? The ACS sample is a few million people, with tens of thousands of people at each age (up to age 75, at least), so you wouldn’t expect too much of that. No, it’s definitely age heaping, the tendency of people to skew their age reporting according to some collective cognitive scheme. The most common form is piling up on the ages ending with 0 and 5, but it could be anything. For example, some people might want to be 18, a socially significant milestone in this country. Here’s the same data, with suspect ages highlighted — 0’s and 5’s from 20 to 80, and 18:

totalsingleyearsflagged

You might think age heaping results from some old people not remembering how old they are. In the old days rounding off was more common at older ages. In 1900, for example, the most implausible number of people was found at age 60 — 1.6-times as many as you’d get by averaging the number of people at ages 59 and 61. Is that still the case? Here it is again, but with the red/green highlights just showing the difference between the number of people reported and the number you’d get by averaging the numbers just above and below:

totalsingleyearsflaggedhighlight

Proportionately, the 70-year-olds are most suspicious, at 10.8% more than you’d expect. But 40 is next, at 9.2%. And that green line shows extra 18-year-olds at 8.6% more than expected.
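The "more than you'd expect" figures come from comparing the count at one age with the average of the counts at the ages just below and just above it. Here is a minimal sketch of that arithmetic. The function name and the example counts are made up for illustration (they are not the ACS tabulations), chosen only so the result lands on a 10.8% excess.

```python
def heaping_excess(counts, age):
    """Proportion by which the count at `age` exceeds the mean of its two neighbors.

    `counts` maps single years of age to the number of people reporting that age.
    """
    expected = (counts[age - 1] + counts[age + 1]) / 2
    return counts[age] / expected - 1


# Hypothetical counts, invented for illustration only (not ACS values).
example_counts = {69: 1000, 70: 1108, 71: 1000}

print(f"Excess at age 70: {heaping_excess(example_counts, 70):.1%}")
```

By this measure, the 1900 figure mentioned above (1.6 times the neighbor average at age 60) would be an excess of 60%.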

Unfortunately, it’s pretty hard to correct. Interestingly, the American Community Survey apparently asks for both an age and a birth date:

acs-age

If you’re the kind of person who rounds off to 70, or promotes yourself to 18, it might not be worth the trouble to actually enter a fake birth date. I’m sure the Census Bureau does something with that, like correct obvious errors, but I don’t think they attempt to correct age-heaping in the ACS (the birth dates aren’t on the public use files). Anyway, we can see a little of the social process by looking at different groups of people.

Up till now I’ve been using the full public use data, with population weights, and including those people who left age blank or entered something implausible enough that the Census Bureau gave them an age (an “allocated” value, in survey parlance). For this I just used the unweighted counts of people whose answers were accepted “as written” (or typed, or spoken over the phone, depending on how it was administered to them). Here are the patterns for people who didn’t finish high school versus those with a bachelor’s degree or higher, highlighting the 5’s and 0’s (click to enlarge):

heapingbyeduc

Clearly, the age heaping is more common among those with less education. Whether it’s really people forgetting their age, rounding up or down for aspirational reasons, or having trouble with the survey administration, I don’t know.

Is this bad? As much as we all hate inaccuracy, this isn’t so bad. Fortunately, demographers have methods for assessing the damage caused by humans and their survey-taking foibles. In this case we can use Whipple’s index. This measure (defined in this handy United Nations slideshow) takes the number of people whose alleged ages end in 0 or 5 and multiplies that by 5, then compares it to the total population. Normally people use ages 23 to 62 (inclusive), for an even 40 years. The number of people reporting ages 25, 30, 35, 40, 45, 50, 55, and 60, divided by one-fifth of the population ages 23-62 and multiplied by 100, is your Whipple’s index. A score of 100 is perfect, and a score of 500 means everyone’s heaped. The U.N. considers scores under 105 to be “very accurate data.” The 2013 ACS, using the public use file and the weights, gives me a score of 104.3. (Those unweighted distributions by education yield scores of 104.0 for high school dropouts and 101.7 for college graduates.) In contrast, the Decennial Census in 2010 had a score of just 101.5 by my calculation (using table QT-P2 from Summary File 1). With the size of the ACS, this difference shouldn’t have to do with sampling variation. Rather, it’s something about the administration of the survey.
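For anyone who wants to try this on their own tabulation, here is a minimal sketch of Whipple's index as described above. It assumes you already have a dictionary of counts by single year of age (weighted or unweighted). The function name and the two test populations are illustrative, not taken from the ACS.

```python
def whipple_index(counts):
    """Whipple's index for ages 23-62, given counts by single year of age.

    100 means no heaping on ages ending in 0 or 5; 500 means everyone
    in the 23-62 range reported an age ending in 0 or 5.
    """
    heaped = sum(counts[age] for age in range(25, 61, 5))  # 25, 30, ..., 60
    total = sum(counts[age] for age in range(23, 63))      # 23 through 62
    return 100 * 5 * heaped / total


# Two made-up populations that bracket the possible scores.
uniform = {age: 1000 for age in range(23, 63)}
all_heaped = {age: (1000 if age % 5 == 0 else 0) for age in range(23, 63)}

print(whipple_index(uniform))     # no heaping
print(whipple_index(all_heaped))  # complete heaping
```

The same function can be run separately on any subgroup, such as an education category, by passing in that subgroup's counts.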

Why don’t they just tell us how old they really are? There must be a reason.

Two asides:

  • The age 18 pattern is interesting — I don’t find any research on desirable young-adult ages skewing sample surveys.
  • This is all very different from birth timing issues, such as the Chinese affinity for births in dragon years (every twelfth year: 1976, 1988…). I don’t see anything in the U.S. pattern that fits fluctuations in birth rates.

Mystery-partly-solved addendum

I focused on education above, but another explanation was staring me in the face. I said “it’s something about the administration of the survey,” but didn’t think to check for the form of survey people took. The public use files for ACS include an indicator of whether the household respondent took the survey through the mail (28%), on the web (39%), through a bureaucrat at the institution where they live (group quarters; 5%), or in an interview with a Census worker (28%). This last method, which is either a computer-assisted telephone interview (CATI) or computer-assisted personal interview (CAPI), is used when people don’t respond to the mailed survey.

It turns out that the entire Whipple problem in the 2013 ACS is due to the CATI/CAPI interviews. The age distributions for all of the other three methods have Whipple index scores below 100, while the CATI/CAPI folks clock in at a whopping 108.3. Here is that distribution, again using unweighted cases:

caticapiacs

There they are, your Whipple participants. Who are they, and why does this happen? Here is the Bureau’s description of the survey data collection:

The data collection operation for housing units (HUs) consists of four modes: Internet, mail, telephone, and personal visit. For most HUs, the first phase includes a mailed request to respond via Internet, followed later by an option to complete a paper questionnaire and return it by mail. If no response is received by mail or Internet, the Census Bureau follows up with computer assisted telephone interviewing (CATI) when a telephone number is available. If the Census Bureau is unable to reach an occupant using CATI, or if the household refuses to participate, the address may be selected for computer-assisted personal interviewing (CAPI).

So the CATI/CAPI people are those who were either difficult to reach or were uncooperative when contacted. This group, incidentally, has low average education, as 63% have high school education or less (compared with 55% of the total) — which may explain the association with education. Maybe they have less accurate recall, or maybe they are less cooperative, which makes sense if they didn’t want to do the survey in the first place (which they are legally mandated — i.e., coerced — to do). So when their date of birth and age conflict, and the Census worker tries to elicit a correction, maybe all hell breaks loose in the interview and they can’t work it out. Or maybe the CATI/CAPI households have more people who don’t know each other’s exact ages (one person answers for the household). I don’t know. But this narrows it down considerably.


The total fertility rate, with instructions, in 9 minutes

Maybe because I haven’t had a classroom full of students since December, I made an instructional video.

In 9 minutes I explain what the total fertility rate is and then illustrate how to get the data you need to calculate it using IPUMS’s American Community Survey analysis tool. In the dramatic last five minutes we calculate the TFR for the United States in 2013, and match the official number. Wow. And you thought your holiday weekend was going to be fun already.

I want more people to have a hands-on feel for basic demography, and to realize how easy it is, and how accessible, with the tools we have nowadays. So, this is for students, non-demographic researchers, and journalists.
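For readers who prefer code to video, the calculation itself is short: the total fertility rate is the sum of age-specific birth rates (births per woman per year at each age), multiplied by the width of the age groups. The sketch below assumes five-year age groups and uses invented rates. They are not the 2013 U.S. figures, and the function name is only illustrative.

```python
def total_fertility_rate(rates_per_woman, years_per_group=5):
    """Sum age-specific birth rates (births per woman per year) across age groups.

    Each rate applies for `years_per_group` years, so the sum is scaled by
    the width of the age groups.
    """
    return years_per_group * sum(rates_per_woman.values())


# Hypothetical age-specific rates, invented for illustration only.
example_rates = {
    "15-19": 0.020,
    "20-24": 0.080,
    "25-29": 0.100,
    "30-34": 0.100,
    "35-39": 0.050,
    "40-44": 0.010,
}

print(f"TFR: {total_fertility_rate(example_rates):.2f} births per woman")
```

With single-year age data the same function works with `years_per_group=1`.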

The video:

And here’s the end product (a little touched up):

tfr2013

Check it out if you’re having trouble sleeping.


On Goffman’s survey

Survey methods.


Jesse Singal at New York Magazine’s Science of Us has a piece in which he tracks down and interviews a number of Alice Goffman’s respondents. This settles the question — which never should have been a real question — about whether she actually did all that deeply embedded ethnography in Philadelphia. It leaves completely unresolved, however, the issue of the errors and possible errors in the research. This reaffirms for me the conclusion in my original review that we should take the volume down in this discussion, identify errors in the research without trying to attack Goffman personally or delegitimize her career — and then learn from the affair ways that we can improve sociology (for example, by requiring that winners of the American Sociological Association dissertation award make their work publicly available).

That said, I want to comment on a couple of issues raised in Singal’s piece, and share my draft of a formal comment on the survey research Goffman reported in American Sociological Review.

First, I want to distance myself from the description by Singal of “lawyers and journalists and rival academics who all stand to benefit in various ways if they can show that On the Run doesn’t fully hold up.” I don’t see how I (or any other sociologists) benefit if Goffman’s research does not hold up. In fact, although some people think this is worth pursuing, I am also annoying some friends and colleagues by doing this.

More importantly, although it’s a small part of the article, Singal did ask Goffman about the critique of her survey, and her response (as he paraphrased it, anyway) was not satisfying to me:

Philip Cohen, a sociologist at the University of Maryland, published a blog post in which he puzzles over the strange results of a door-to-door survey Goffman says she conducted with Chuck in 2007 in On the Run. The results are implausible in a number of ways. But Goffman explained to me that this wasn’t a regular survey; it was an ethnographic survey, which involves different sampling methods and different definitions of who is and isn’t in a household. The whole point, she said, was to capture people who are rendered invisible by traditional survey methods. (Goffman said an error in the American Sociological Review paper that became On the Run is causing some of the confusion — a reference to “the 217 households that make up the 6th Street neighborhood” that should have read “the 217 households that we interviewed … ” [emphasis mine]. It’s a fix that addresses some of Cohen’s concerns, like an implied and very unlikely 100 percent response rate, but not all of them.) “I should have included a second appendix on the survey in the book,” said Goffman. “If I could do it over again, I would.”

My responses are several. First, the error of describing the 217 households as the whole neighborhood, as well as the error in the book of saying she interviewed all 308 men (when in the ASR article she reports some unknown number were absent), both go in the direction of inflating the value and quality of the survey. Maybe they are random errors, but they didn’t have a random effect.

Second, I don’t see a difference between a “regular survey” and an “ethnographic survey.” There are different survey techniques for different applications, and the techniques used determine the data and conclusions that follow. For example, in the ASR article Goffman uses the survey (rather than Census data) to report the racial composition of the neighborhood, which is not something you can do with a convenience sample, regardless of whether you are engaged in an ethnography or not.

Finally, there are no people “rendered invisible by traditional survey methods” (presumably Singal’s phrase). There are surveys that are better or worse at including people in different situations. There are “traditional” surveys — of varying quality — of homeless people, prisoners, rape victims, and illiterate peasants. I don’t know what an “ethnographic survey” is, but I don’t see why it shouldn’t include a sampling strategy, a response rate, a survey instrument, a data sharing arrangement, and thorough documentation of procedures. That second methodological appendix can be published at any time.

ASR Comment (revised June 22)

I wrote up my relatively narrow, but serious, concerns about the survey, and posted them on my website here.

It strikes me that Goffman’s book (either the University of Chicago Press version or the trade book version) may not be subject to the same level of scrutiny that her article in ASR should have been. In fact, presumably, the book publishers took her publication in ASR as evidence of the work’s quality. And their interests are different from those of a scientific journal run by an academic society. If ASR is going to play that gatekeeping role, and it should, then ASR (and by extension ASA) should take responsibility in print for errors in its publications.


More fathers married when their first child is born? Probably not

A startling data brief from the National Center for Health Statistics reports that the percentage of fathers who weren’t married at the time of their first births fell from the 1980s to the 2000s. Here is the first “key finding”: “The percentage of fathers aged 15–44 whose first births were nonmarital was lower in the 2000s (36%) than in the previous 2 decades.”

That is shocking. How could we have a falling percentage of fathers not married at the time of their first births? The author, Gladys Martinez, writes:

Results from this study indicate that in the 2000s, the percentage of fathers with nonmarital first births declined. However, the percentage of fathers whose nonmarital first births occurred within a cohabiting union increased. This pattern differs from that for the mother. Data for women showed that the share of all births that occurred to unmarried women has doubled between 1988 and 2009–2013, and that the increase was driven by an increase in the share of births to cohabiting women.

Here is the main figure, showing the decline in nonmarital first births for fathers:

nchs-men-1

But I think this is not correct (this concern was first raised to me by Pew researcher Gretchen Livingston). Here’s why. As the figure shows, the source for these three decades of data is the National Survey of Family Growth. The earliest this survey captured men’s births (awkward phrase, but you know what I mean) was in 2002. And the ages included in the survey were 15-44. But the figure has information about births in the years 1980-1989. By my math, the oldest a 15-44-year-old in 2002 could have been in 1989 is 31. So that 2002 survey is only returning data on the marital status of men ages 15-31 in the 1980s.

I always have to do one of these to make sure I’m not crazy when I’m trying to work something like this out. This is how old 15-44 year-olds in 2002 were in the 1980s, excluding those under 15 (click to enlarge):

age-in-80s

They’re all 15-31 (or younger) in the 1980s. In contrast, if they combine the 2006-2010 survey (collected over 5 years) with the 2011-2013 survey (collected over 3 years), they have men ages 15-42 in the 1990s and 15-44 in the 2000s. So, as the age of the men in the sample rose, the proportion married when they had their first birth rose, too. This is what we would expect: younger first-time parents are much less likely to be married.
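The age arithmetic for the 2002 survey can be double-checked with a few lines of code. This is only a sketch of the subtraction described above: the function name is hypothetical, and it treats the survey as a single point in time (2002) rather than modeling the actual interview dates.

```python
def age_range_in_year(survey_year, min_age, max_age, target_year, floor_age=15):
    """Ages that respondents aged min_age..max_age in survey_year had in target_year.

    Ages below `floor_age` are dropped, matching the chart's exclusion of
    people under 15. Returns None if nobody was old enough yet.
    """
    years_back = survey_year - target_year
    youngest = max(min_age - years_back, floor_age)
    oldest = max_age - years_back
    return (youngest, oldest) if oldest >= youngest else None


# Men ages 15-44 in 2002, traced back through each year of the 1980s.
for year in range(1980, 1990):
    print(year, age_range_in_year(2002, 15, 44, year))
```

The oldest age reaches 31 only in 1989, and for earlier years in the decade the ceiling is lower still.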

Consider, then, the followup finding from the brief: for men of every age the proportion unmarried at the time of their first birth has increased:

nchs-men-2

How can it be that the overall proportion unmarried is falling, while it’s rising for each age group? The answer in the data brief is that first-time unmarried fathers are getting older. But remember — the samples are getting older across these decades, because of the timing of the surveys: they age from 15-31 to 15-44. That explains the next figure perfectly. Look at that increase in the proportion of unmarried first-time fathers who are 25-44:

nchs-men-3

In the 1980s, just 8% of first-time unmarried fathers were age 25-44, compared with a whopping 33% in the 2000s. But doesn’t it seem likely that you’ll have fewer men ages 25-44 in a group that only goes up to age 31, versus a group that goes all the way up to age 44?
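A toy calculation shows how a changing age mix alone can produce a falling overall rate even when the rate rises within every age group. All of the numbers below are invented for illustration. They are not from the NCHS brief, and the two-group split is an assumption made to keep the example small.

```python
def overall_share(age_shares, unmarried_rates):
    """Overall proportion unmarried: a weighted average of group rates.

    `age_shares` gives each age group's share of first-time fathers;
    `unmarried_rates` gives the proportion unmarried within each group.
    """
    return sum(age_shares[g] * unmarried_rates[g] for g in age_shares)


# Hypothetical numbers, invented for illustration only.
# Earlier decade: sample truncated at younger ages, so mostly young fathers.
shares_early = {"under 25": 0.7, "25 and over": 0.3}
rates_early = {"under 25": 0.50, "25 and over": 0.15}

# Later decade: full age range, so more older fathers; rates are higher in BOTH groups.
shares_late = {"under 25": 0.3, "25 and over": 0.7}
rates_late = {"under 25": 0.60, "25 and over": 0.20}

print(f"Early decade overall: {overall_share(shares_early, rates_early):.1%}")
print(f"Later decade overall: {overall_share(shares_late, rates_late):.1%}")
```

In this made-up case the overall proportion drops even though both group rates went up, purely because the later sample includes more older fathers.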

This stuff gets confusing, but I’m pretty sure this is right. That is, wrong. I do not believe that there is a falling percentage of fathers having first births when they’re not married. What looked like a weird, complicated demographic problem — falling unmarried first-fatherhood along with rising unmarried first-motherhood — is probably an artifact of a weird, complicated problem in the analysis.

There is nothing in the data brief to suggest there was an adjustment for the changing age composition of the data for these decades, but maybe they did something I don’t understand. If not, I think NCHS should correct or retract this report.


Social class divides the futures of high school students

There is new research from the National Center for Education Statistics (NCES), written up by Susan Dynarski at the New York Times Upshot. The striking finding is that poor children in the top quartile on high school math scores have a 41% chance of finishing a BA degree by their late twenties — the same chance as children from the second-lowest quartile in math scores who are high-socioeconomic status (SES). Poor children from the third-highest quartile in high school math have graduation rates about equal to the worst-scoring children from the richest group. Here’s the figure:

upshot-math-ba

The headline on the figure is misleading, actually, since SES is not measured by wealth, but by a combination of parental education, occupation, and income. (Low here means the bottom quartile of SES, Middle is the 25th to 75th percentile, and High is 75th and up.)

One possible mechanism for the disparity in college completion rates is education expectations. Dynarski mentions expectations measured in the sophomore year of high school, which was 2002 for this cohort. What she doesn’t mention is how much those expectations changed by senior year. Going to the NCES source for that data (here) I found this chart, which I annotated in red:

Print

Between sophomore and senior year, the percentage expecting to finish a BA degree or more decreased and the percentage expecting to go to two-year college increased, across SES levels. But the change was much greater for lower SES students. So the gap in expecting to go to two-year college between high- and low-SES students grew from 6 to 17 percentage points; that is, from 9% versus 3% in the sophomore year to 22% versus 6% in the senior year. That’s a big crushing of expectations that happened in the formative years at the end of high school.
