Post-summer reading list: The Family, gender, race, economics, gayborhoods, insecurity and overwhelmed

I was extremely fortunate to have a real vacation this summer — two whole weeks. I feel like half a European. In that time I read, almost read, or thought about reading, a number of things I might have blogged about if I’d been working instead of at the beach:


The Family: Diversity, Inequality, and Social Change

Yes, my own book came out. I never worked on one thing so much. I really hope you like it. Look for it at the Norton booth at the American Sociological Association meetings in San Francisco this week. Info on ordering exam copies here.

About that gender stall

The Council on Contemporary Families, on whose board I serve, published an online symposium titled, After a Puzzling Pause, the Gender Revolution Continues. It features work by the team of David Cotter, Joan Hermsen, and Reeve Vanneman on a rebound in gender attitudes; new research on sex (by Sharon Sassler) and divorce (by Christine Schwartz) in egalitarian marriages; and how overwork contributes to the gender gap (by Youngjoo Cha). For additional commentary, see this piece by Virginia Rutter at Girl w/ Pen!, and an important caution from Joanna Pepin (who finds no rebound in attitudes in the trends for high school students). If I had written a whole post about this I would have found a way to link to my essay on the gender stall in the NYTimes, too.

Gender and Piketty

“How Gender Changes Piketty’s ‘Capital in the Twenty-First Century’.” A discussion hosted by The Nation blog between Kathleen Geier, Kate Bahn, Joelle Gamble, Zillah Eisenstein, and Heather Boushey.

Scientists strike back at Nicholas Wade

“Geneticists decry book on race and evolution.” More than 100 scientists signed a letter to the New York Times disavowing Wade’s use of population genetics. This story quotes Sarah Tishkoff, whose work Wade specifically misrepresented (as I described in my review in Boston Review). The article in Science also includes Wade’s weak response, in which he repeats the claim, which I do not find credible, that their objections are “driven by politics, not science.” He repeats this no matter how scientific the objections to his work.

Here comes There Goes the Gayborhood?

Amin Ghaziani’s new book has gotten a lot of well-deserved attention in the last few weeks. Here’s one good article in the New Yorker.

Cut Adrift: Families in Insecure Times

Marianne Cooper’s book is out now. From the publisher: “Through poignant case studies, she reveals what families are concerned about, how they manage their anxiety, whose job it is to worry, and how social class shapes all of these dynamics, including what is even worth worrying about in the first place.” Cooper led the research for Sheryl Sandberg’s book Lean In, and the book is from her sociology dissertation.

Overwhelmed: Work, Love, And Play When No One Has The Time

Brigid Schulte, a Washington Post journalist, has written a really good book about gender, work, and family. (I was happy to listen to it during the drive to our vacation, because it helped me let go and ignore work more.) I’ll write a longer review, but let me just say here it is very well written and researched on the issues of time use, the household division of labor, and work-family policy and politics, featuring many of your favorite social scientists in this area. Well worth considering for an undergrad family course. (Also, helps explain why there are so many Europeans on American beaches.)


What a recovery looks like (with population growth by age)

If you don’t account for population growth, I don’t get what you’re saying with these employment numbers. I’ll show a simple example, but first a little rundown on Friday’s jobs report.

Here is how CNN Money played the jobs report:


What does it mean, this loss and gain of jobs, returning finally to where we started? Four paragraphs under that happy headline, CNN did point out:

Given population growth over the last four years, the economy still needs more jobs to truly return to a healthy place. How many more? A whopping 7 million, calculates Heidi Shierholz, an economist with the Economic Policy Institute.

So what does “Finally!” mean? The Wall Street Journal ran the headline, “Jobs Return to Peak, but Quality Lags.” On 538 it was, “Women returned to prerecession levels of employment in 2013. Men remain hundreds of thousands of jobs in the hole:”


The Center on Budget and Policy Priorities showed how much better the previous recoveries were:


That’s a good comparison. And CBPP mentioned population growth, too:

…payroll employment has finally topped its level at the start of the recession. Still, with essentially no net job growth since December 2007 but a growing working-age population, many more people today want to work but don’t have a job.

It’s not that no one mentions population growth, it’s that they still lead with the “top line” number. And they all have that horizontal line at the raw number of jobs when the recession started as the benchmark. I don’t know why.

Maybe in some crazy economics world the absolute number of jobs is what really matters for evaluating a recovery, and that explains the fixation on that horizontal line. From a social perspective what matters is the proportion of people with jobs. I could see the logic if you had a finite number of employers that never change, where you could literally count up the jobs at two points in time, and see who added and who subtracted from their payrolls (this is why retail chains report “same-store” trends, so the statistics aren’t confounded by the changing number of stores). But we have zillions of employers, constantly changing and moving around — largely in response to population changes. So that static image seems pointless.

In perspective

So here are some charts to put the recession and recovery in slightly better perspective. These all use the Bureau of Labor Statistics’ Current Population Survey from March 2003 to March 2013 (from IPUMS), the household survey used to track the labor force. I use ages 15 and older, and combine people in school (up to age 24) with those employed (not how most people do it, but a lot of people went to school, or stayed in school, because of the bad job market, and it doesn’t make sense to count them as simply not employed). The survey excludes people in institutions, like prisons, and on-base military personnel.

To show the basic issue, here are the changes in the non-institutionalized population, age 15+, along with the number of them employed or in school — showing absolute changes relative to 2008, the peak employment year.


The 15+ population increased almost 12 million from 2008 to 2013. People employed or in school was not yet back to 2008 levels in March 2013. So a basic population adjustment is the least you can ask for (and we get that from the BLS with the employment-population ratio, which as of May was up less than one percent in the last 3.5 years, but it’s not the headline number).
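To make the basic population adjustment concrete, here is a minimal sketch in Python. The counts are hypothetical round numbers chosen for illustration (they are not the CPS estimates in the chart), and the variable names are mine.

```python
# Hypothetical counts in millions (NOT the CPS figures used in this post).
population = {2008: 238.0, 2013: 250.0}
employed_or_school = {2008: 160.0, 2013: 160.0}


def rate(year):
    """Share of the population that is employed or in school."""
    return employed_or_school[year] / population[year]


# How many people would be in work/school in 2013 if the 2008 rate still held?
expected_2013 = rate(2008) * population[2013]

# The raw count is "back to 2008 levels," but the population grew,
# so the population-adjusted shortfall is well above zero.
shortfall = expected_2013 - employed_or_school[2013]
print(f"2008 rate: {rate(2008):.3f}, 2013 rate: {rate(2013):.3f}")
print(f"Shortfall relative to 2008 rate: {shortfall:.1f} million")
```

In this made-up case the horizontal-line benchmark says the recovery is complete, while the rate-based benchmark says millions of people are still missing from work or school.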

What about age shifts? You don’t expect extreme age composition changes in 5 years, but there are different employment trends at different ages, so those affect how many employed people we are short. Here are the trends in work/school, by age and sex:


This makes it look like it’s the 30-49s that are getting crushed. The 50+ community’s gains, however, are deceptive, because their population is increasing. In fact, the population of people 30-49 declined 5% during this decade, while the population 50+ increased almost 30%. The younger people have increased their schooling rates, but their population has also grown. If you look at the employment/school rates, you see that among men, it is the younger groups that have done worst:


Women clearly are doing better (partly because in the 20-29 range they’re going to school more). It is amazing that employment rates didn’t fall at all over age 60. This could be because the population increase in that group is all in Baby Boomers just hitting their sixties, but I reckon it’s also people delaying retirement compensating for unemployment.

Now that we have age-specific work/school rates, and population changes, we can easily calculate how many people in each age group would have to be in work/school to get to the number implied by applying the peak-year 2008 rates to the population in each year. Sorry this one is so ugly: I made the last bar for each group pink to show the bottom line, where each group stands in 2013 relative to 2008:


Worst off are the 20-something men, down more than a million worker/students in 2013. Interestingly, women are only better off in the 20-something and 50+ ranges.

Finally, if you sum these figures you get the total, age-adjusted losses, by sex since 2008, as of March 2013:


And that’s your bottom line. The job/school losses stood at 3.3 million for men and 2.4 million for women as of March 2013.*
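The age adjustment described above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the age groups are coarser than the ones I used, the numbers are invented, and the function name `age_adjusted_gap` is mine. The idea is just to apply each group's peak-year rate to its current population and compare that with the current count.

```python
# age group -> (population, number in work/school), in millions.
# Hypothetical values, not the CPS estimates behind the charts.
peak_year = {"20-29": (40.0, 32.0), "30-49": (85.0, 70.0), "50+": (90.0, 40.0)}
this_year = {"20-29": (43.0, 33.0), "30-49": (82.0, 66.0), "50+": (100.0, 45.0)}


def age_adjusted_gap(base, current):
    """Actual minus expected work/school count, by age group.

    Expected = the base-year rate for the group applied to the group's
    current population.
    """
    gaps = {}
    for group, (pop_now, in_now) in current.items():
        pop_base, in_base = base[group]
        expected = (in_base / pop_base) * pop_now
        gaps[group] = in_now - expected
    return gaps


gaps = age_adjusted_gap(peak_year, this_year)
total_gap = sum(gaps.values())
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f} million")
print(f"Total: {total_gap:+.2f} million")
```

Note that the 30-49 group loses population in this example, so its raw decline overstates its losses, while the growing 50+ group can gain workers in absolute terms and still barely keep up with its own peak-year rate.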

Really, there are no huge surprises here. In fact, the total population change is not a bad rough adjustment, especially for the short term. But there are some interesting nuances here. And with all the data and computers we have these days, let’s adjust for age and sex.

*I don’t say that’s how many “jobs” we need, because I don’t think “jobs” exist — are created, destroyed, shipped overseas, etc. I think there are employed people, people getting jobs, losing jobs, etc. I don’t see how a “job” exists absent a worker in it (and no, a listing is not a job until they fill it). So we don’t need to “create jobs” after a recession, what we need to do is “hire people.” Pet peeve.


Education, not income, drives Piketty searches

Proving once again that effort is not always correlated with income, I present this critique of a Justin Wolfers blog post…

A lot of people have written reviews of Piketty. The first few pages of a Google search revealed all these (I added Heather Boushey, who wrote a good one)*:


I believe that is diversity, because every human being is different.

Anyway, where to begin? Justin Wolfers wrote a little post, not a review, but it caught my attention. The headline was, “Piketty’s Book on Wealth and Inequality Is More Popular in Richer States.” Distractable, that’s where I began.

Wolfers’ culminating line, “Vive la révolution!”, suited Scott Winship, who looked over Wolfers’ figures before sniping, “the buzz around the book has come mostly from rich liberal states along the Boston-to-Washington corridor.” But I think they’re both misinterpreting.

According to the Google search data Wolfers used, these were the top 10 states for “piketty” searches (Washington, D.C. excluded): Massachusetts, New York, Connecticut, Maryland, New Jersey, Illinois, Pennsylvania, Wisconsin, Oregon, California.

It looks to me that it’s actually education driving the search data. And that is a big difference. Let me explain.

Do data?

Microsoft Word tells me that the reading grade level of the publisher’s excerpt is 16.3, so it takes a 16th-grade education to read it. (Note that the “Boston-to-Washington corridor,” which was supposed to sound like a small sliver of the country, has 26% of the country’s college graduates.) So consider income versus college completion, which we can now take as a proxy for being able to read Piketty.

Wolfers writes, “I can’t tell you where Piketty has been least popular, because below a certain level of search activity, Google doesn’t release the actual numbers.” So he proceeds to leave 24 states out of his analysis (this will become important). Using per-capita income (converted to z-scores), and dropping 24 states plus the ridiculous outlier of DC, this is Wolfers’ income result (my calculations; he just showed scatter plots):


OK, leaving out the bottom half of the Piketty distribution, there is a strong positive relationship between per capita income and Piketty Google searches. Congratulations, you can have three jobs as an economist!

I kid Wolfers. But, come on! I don’t know what kind of data operation they’re running over there at the Upshot, but I would expect Wolfers to take it up a notch. First, control for college completion (percent of folks ages 25+ with a BA or more, also z-scored). See how it shows… oops:


The income effect is reduced but the education effect isn’t significant. (See how I showed you that instead of just going right to the results that support my argument?)

But go back to Wolfers leaving out the bottom half of the Piketty distribution. What’s wrong with that? I’m sure there’s some statistical way of explaining that, but just eyeballing it you’d have to say dropping those cases could cause trouble. The censored cases all have values of -.64 on the search variable. The relationship with income is weaker when the censored cases are included (shown in the red line) versus when he limits it to the top half of Piketty states (blue line):


What to do about this? An easy thing is just to include the censored cases at their values of -.64, just pretending -.64 is a legitimate value. That gives:


Now the income effect is reduced by about three-quarters, and the college completion effect is three times as large (with t-stats to match).

But that’s not the best way to handle this. If only economists had invented a way of modeling data with censored dependent variables! Just kidding: there’s Tobin’s Tobit. This kind of model says, I see your censored dependent variable, and I crash it through the bottom of the distribution as a function of its linear relationship to your independent variables. So instead of all being -.64, it lets the censored cases be as low as they want to be, with values predicted by income and college completion. Sort of. Anyway, here’s that result:


Now income is crushed, reduced to literal insignificance. What matters is the percentage of the population that has completed college. It’s not that rich people like Piketty, it’s that college graduates do. Maybe because that’s who can read it. (I don’t know, I haven’t tried.)
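For readers who want to see the mechanics, here is a rough sketch of a left-censored Tobit fit by maximum likelihood, in Python with NumPy and SciPy. This is not the code behind the tables above. The data are simulated under assumptions I chose (500 cases rather than 50 states, searches driven by college completion only, half the cases censored at a floor), and `tobit_fit` is a hypothetical helper, not a library function.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_fit(X, y, floor):
    """Left-censored Tobit by maximum likelihood. X must include a constant."""
    censored = y <= floor

    def neg_loglik(params):
        beta, log_sigma = params[:-1], params[-1]
        sigma = np.exp(log_sigma)  # keeps sigma positive
        xb = X @ beta
        # Observed cases contribute the normal density of their residual.
        ll_obs = norm.logpdf((y[~censored] - xb[~censored]) / sigma) - log_sigma
        # Censored cases contribute the probability of falling at or below the floor.
        ll_cen = norm.logcdf((floor - xb[censored]) / sigma)
        return -(ll_obs.sum() + ll_cen.sum())

    start = np.append(np.linalg.lstsq(X, y, rcond=None)[0], 0.0)
    result = minimize(neg_loglik, start, method="BFGS")
    return result.x[:-1], float(np.exp(result.x[-1]))


# Simulated z-scores: income is correlated with college completion,
# but only college completion affects the (latent) search interest.
rng = np.random.default_rng(0)
n = 500
college = rng.standard_normal(n)
income = 0.8 * college + 0.6 * rng.standard_normal(n)
latent = 1.0 * college + 0.5 * rng.standard_normal(n)

floor = np.median(latent)             # bottom half is censored
searches = np.maximum(latent, floor)  # censored cases all sit at the floor

X = np.column_stack([np.ones(n), income, college])
ols_floor = np.linalg.lstsq(X, searches, rcond=None)[0]  # pretend the floor is real
tobit_beta, tobit_sigma = tobit_fit(X, searches, floor)

print("OLS with floor values (const, income, college):", ols_floor.round(2))
print("Tobit                 (const, income, college):", tobit_beta.round(2))
```

In a simulation like this, treating the floor as a legitimate value shrinks the college coefficient, while the Tobit estimate should land near the value used to generate the data. That is the same logic as the progression of models above, though of course simulated data cannot confirm anything about the actual Google results.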

What do economists read?

Of course, mine and Wolfers’ are both pretty crude analyses. There are only two reasons his was published on a major news site and mine was buried over here on an obscure sociology blog: (a) he writes for a major news site, and (b) his weak analysis lends itself to an emerging snarky narrative in which rich leftists are seen to whine about inequality but real people can’t be bothered (the main point of Winship’s review) — just reinforcing the echo-chamber model of knowledge consumption that people who are into “data-driven” news like to appear to have risen above.

For a real explanation, Wolfers (and Winship) need look no further than the rest of the Google Correlate results page to see the obvious fact that searches for Piketty are simply correlated with interest in economics. Here’s the search that is most highly correlated with searches for “piketty” across U.S. states: “world bank gdp” (r=.98):


Here are some other searches correlated with “piketty” at .94 or higher:

economic consulting firms
eu data protection
exchange rate data
gdp by sector
inflation target
journal of labor economics
london school economics
nber working paper
oecd statistics
oxford economics
panel data stata
stock market capitalization
the economist intelligence unit
us current account deficit
world bank statistics

Well, there goes your rich, liberal, “American left” theory of who’s driving the Piketty phenomenon. It might be true, but it’s not confirmed by the Google search data. My hot new theory: college educated people who are also interested in economics are disproportionately interested in Piketty.

* The reviewer pool: Mervyn King (The Telegraph), Paul Krugman (New York Review of Books), Tyler Cowen (Foreign Affairs), James K. Galbraith (Dissent), Daniel Schuchman (Wall Street Journal), Justin Fox (Harvard Business Review), Michael Tanner (National Review), John Cassidy (New Yorker), Martin Wolf (Financial Times), Jordan Weissmann (Slate), Steven Pearlstein (Washington Post), Scott Winship (National Review), Heather Boushey (Challenge)


How well do teen test scores predict adult income?

Now with new figures and notes added at the end — and a new, real life headline and graph illustrating the problem in the middle!

The short answer is, pretty well. But that’s not really the point.

In a previous post I complained about various ways of collapsing data before plotting it. Although this is useful at times, and inevitable to varying degrees, the main danger is the risk of inflating how strong an effect seems. So that’s the point about teen test scores and adult income.

If someone told you that the test scores people get in their late teens were highly correlated with their incomes later in life, you probably wouldn’t be surprised. If I said the correlation was .35, on a scale of 0 to 1, that would seem like a strong relationship. And it is. That’s what I got using the National Longitudinal Survey of Youth. I compared the Armed Forces Qualifying Test scores, taken in 1999, when the respondents were ages 15-19, with their household income in 2011, when they were 27-31.*

Here is the linear fit between these two measures, with the 95% confidence interval shaded, showing just how confident we can be in this incredibly strong relationship:


That’s definitely enough for a screaming headline, “How your kids’ test scores tell you whether they will be rich or poor.”

In fact, since I originally wrote this, the Washington Post Wonkblog published a post with the headline, “Here’s how much your high school grades predict your future salary,” with this incredibly tidy graph:


No doubt these are strong relationships. My correlation of .35 means AFQT explains 12% of the variation in household income. But take heart, ye parents in the age of uncertainty: 12% of the variation leaves a lot left over. This variable can’t account for how creative your children are, how sociable, how attractive, how driven, how entitled, how connected, or how White they may be. To get a sense of all the other things that matter, here is the same data, with the same regression line, but now with all 5,248 individual points plotted as well (which means we have to rescale the y-axis):


Each dot is a person’s life — or two aspects of it, anyway — with the virtually infinite sources of variability that make up the wonder of social existence. All of a sudden that strong relationship doesn’t feel like something you can bank on with any given individual. Yes, there are very few people from the bottom of the test-score distribution who are now in the richest households (those clipped by the survey’s topcode and pegged at 3 on my scale), and hardly anyone from the top of the test-score distribution who is now completely broke.

But I would guess that for most kids a better predictor of future income would be spending an hour interviewing their parents and high school teachers, or spending a day getting to know them as a teenager. But that’s just a guess (and that’s an inefficient way to capture large-scale patterns).

I’m not here to argue about how much various measures matter for future income, or whether there is such a thing as general intelligence, or how heritable it is (my opinion is that a test such as this, at this age, measures what people have learned much more than a disposition toward learning inherent at birth). I just want to give a visual example of how even a very strong relationship in social science usually represents a very messy reality.

Post-publication addendums

1. Prediction intervals

I probably first wrote about this difference between the slope and the variation around the slope two years ago, in a futile argument against the use of second-person headlines such as “Homophobic? Maybe You’re Gay.” Those headlines always try to turn research into personal advice, and are almost always wrong.

Carter Butts, in personal correspondence, offered an explanation that helps make this clear. The “you” type headline presents a situation in which you, the reader, are offered the chance to add yourself to the study. In that case, your outcome (the “new response” in his note) is determined by both the line and the variation around the line. Carter writes:

the prediction interval for a new response has to take into account not only the (predicted) expectation, but also the (predicted) variation around that expectation. A typical example is attached; I generated simulated data (N=1000) via the indicated formula, and then just regressed y on x. As you’d expect, the confidence bands (red) are quite narrow, but the prediction bands (green) are large – in the true model, they would have a total width of approximately 1, and the estimated model is quite close to that. Your post nicely illustrated that the precision with which we can estimate a mean effect is not equivalent to the variation accounted for by that mean effect; a complementary observation is that the precision with which we can estimate a mean effect is not equivalent to the accuracy with which we can predict a new observation. Nothing deep about that … just the practical points that (1) when people are looking at an interval, they need to be wary of whether it is a confidence interval or a prediction interval; and (2) prediction interval can (and often should be) wide, even if the model is “good” in the sense of being well-estimated.

And here is his figure. “You” are very likely to be between the green lines, but not so likely to be between the red ones.
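Carter’s formula isn’t reproduced here, so the following is only a sketch under an assumed data-generating process (y = 0.5x plus normal noise with a standard deviation of 0.25, which gives prediction bands about 1 wide, as in his note). It uses only the Python standard library and the usual textbook formulas for simple regression; the names are mine.

```python
import math
import random

random.seed(1)
n = 1000
x = [random.random() for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 0.25) for xi in x]  # assumed model, not Carter's

# Simple OLS of y on x.
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
intercept = mean_y - slope * mean_x
resid_sd = math.sqrt(
    sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
)


def half_widths(x0, z=1.96):
    """Approximate 95% half-widths at x0: (confidence band, prediction band)."""
    leverage = 1 / n + (x0 - mean_x) ** 2 / sxx
    ci = z * resid_sd * math.sqrt(leverage)      # uncertainty about the mean
    pi = z * resid_sd * math.sqrt(1 + leverage)  # uncertainty about a new "you"
    return ci, pi


ci_half, pi_half = half_widths(0.5)
print(f"confidence band width: {2 * ci_half:.3f}")
print(f"prediction band width: {2 * pi_half:.3f}")
```

The only difference between the two formulas is the extra 1 under the square root, which is the variation of individuals around the line. With a large sample the confidence band shrinks toward zero, but the prediction band never gets narrower than that individual variation.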


2. Random other variables

I didn’t get into the substantive issues, which are outside my expertise. However, one suggestion I got was interesting: What about happiness? Without endorsing the concept of “life satisfaction” as measured by a single question, I still think this is a nice addition because it underscores the point of wide variation in how this relationship between test scores and income might be experienced.

So here is the same figure, but with the individuals coded according to how they answered the following question in 2008, when they were age 24-28, “All things considered, how satisfied are you with your life as a whole these days? Please give me an answer from 1 to 10, where 1 means extremely dissatisfied and 10 means extremely satisfied.” In the figure, Blue is least satisfied (1-6; 21%), Orange is moderately satisfied (7-8; 46%), and Green is most satisfied (9-10; 32%).


Even if you squint you probably can’t discern the pattern. Life satisfaction is positively correlated with income at .16, and less so with test scores (.07). Again, significant correlation — not helpful for planning your life.

* I actually used something similar to AFQT: the variable ASVAB, which combines tests of mathematical knowledge, arithmetic reasoning, word knowledge, and paragraph comprehension, and scales them from 0 to 100. For household income, I used a measure of household income relative to the poverty line (adjusted for household size), plus one, and transformed by natural log. I used household income because some good test-takers might marry someone with a high income, or have fewer people in their households — good decisions if your goal is maximizing household income per person.


How to illustrate a .61 relationship with a .93 figure: Chetty and Wilcox edition

Yesterday I wondered about the treatment of race in the blockbuster Chetty et al. paper on economic mobility trends and variation. Today, graphics and representation.

If you read Brad Wilcox’s triumphalist Slate post, “Family Matters” (as if he needed “an important new Harvard study” to write that), you saw this figure:


David Leonhardt tweeted that figure as “A reminder, via [Wilcox], of how important marriage is for social mobility.” But what does the figure show? Neither said anything more than what is printed on the figure. Of course, the figure is not the analysis. But it is what a lot of people remember about the analysis.

But the analysis on which it is based uses 741 commuting zones (metropolitan or rural areas defined by commuting patterns). So what are those 20 dots lying so perfectly along that line? In fact, that correlation printed on the graph, -.764, is much weaker than what you see plotted on the graph. The relationship you’re looking at is -.93! (thanks Bill Bielby for pointing that out).

In the paper, which presumably few of the people tweeting about it read, the authors explain that these figures are “binned scatter plots.” They broke the commuting zones into equally-sized groups and plotted the means of the x and y variables. They say they did percentiles, which would be 100 dots, but this one only has 20 dots, so let’s call them vigintiles.

In the process of analysis, this might be a reasonable way to eyeball a relationship and look for nonlinearities. But for presentation it’s wrong wrong wrong.* The dots compress the variation, and the line compresses it more. The dots give the misleading impression that you’re displaying the variance around the line. What, are you trying to save ink?
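To see how binning inflates the apparent strength of a relationship, here is a small simulation in Python (standard library only). It is a sketch under assumptions of my own: 700 simulated points with a true correlation of -.76 and a strictly linear relationship, sorted on x and cut into 20 equal-sized bins. It does not use the Chetty et al. data, and the helper names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def bin_means(xs, ys, n_bins):
    """Sort on x, cut into equal-sized bins, return the mean x and y per bin."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    bx, by = [], []
    for i in range(n_bins):
        chunk = pairs[i * size:(i + 1) * size]
        bx.append(sum(p[0] for p in chunk) / len(chunk))
        by.append(sum(p[1] for p in chunk) / len(chunk))
    return bx, by


random.seed(0)
n, rho = 700, -0.76
x = [random.gauss(0, 1) for _ in range(n)]
y = [rho * xi + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for xi in x]

raw_r = pearson(x, y)
binned_r = pearson(*bin_means(x, y, 20))
print(f"all {n} points: r = {raw_r:.2f}")
print(f"20 bin means:   r = {binned_r:.2f}")
```

Averaging within bins throws away the variation among the points in each bin, so the 20 dots hug the line far more tightly than the underlying cases do. Because this simulated relationship is perfectly linear, the binned correlation here should come out even closer to -1 than in the real figure.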

Since the data are available, we can look at this for realz. Here is the relationship with all the points, showing a much messier relationship, the actual -.76 (the range of the Chetty et al. figure, which was compressed by the binning, is shown by the blue box):

That’s 709 dots, one for each of the commuting zones for which they had sufficient data. With today’s powerful computers and high resolution screens, there is no excuse for reducing this down to 20 dots for display purposes.

But wait, there’s more. What about population differences? In the 2000 Census, these 709 commuting zones ranged in population from 5,000 (Southwest Jackson, Utah) to 16,000,000 (Los Angeles). Do you want to count Southwest Jackson as much as Los Angeles in your analysis of the relationship between these variables? Chetty et al. do in their figure. But if you weight them by population size, so each person in the population contributes equally to the relationship, that correlation that was -.76 (which they displayed as -.93) is reduced to -.61. Yikes.
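For anyone unsure what weighting a correlation means in practice, here is a minimal Python sketch. The five “zones” and their populations are invented for illustration, and `weighted_corr` is a hypothetical helper that uses weighted means, variances, and covariance. I can’t promise it matches exactly what Stata does with its weights.

```python
import math


def weighted_corr(xs, ys, ws):
    """Pearson correlation in which each case counts in proportion to its weight."""
    total = sum(ws)
    mx = sum(w * a for w, a in zip(ws, xs)) / total
    my = sum(w * b for w, b in zip(ws, ys)) / total
    cov = sum(w * (a - mx) * (b - my) for w, a, b in zip(ws, xs, ys)) / total
    var_x = sum(w * (a - mx) ** 2 for w, a in zip(ws, xs)) / total
    var_y = sum(w * (b - my) ** 2 for w, b in zip(ws, ys)) / total
    return cov / math.sqrt(var_x * var_y)


# Hypothetical zones: share of single-mother families, upward mobility,
# and population in thousands (not the Chetty et al. data).
single_mothers = [0.15, 0.20, 0.25, 0.30, 0.35]
mobility = [48.0, 45.0, 44.0, 38.0, 43.0]
population = [5, 50, 200, 900, 16000]

unweighted = weighted_corr(single_mothers, mobility, [1] * 5)
weighted = weighted_corr(single_mothers, mobility, population)
print(f"each zone counts once:     r = {unweighted:.2f}")
print(f"each person counts once:   r = {weighted:.2f}")
```

With integer weights this is the same as duplicating each zone once per person (or per thousand people) and computing the ordinary correlation, which is the sense in which each person contributes equally.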

Here is what the plot looks like if you scale the commuting zones according to population size (more or less, not quite sure how Stata does this):


Now it’s messier, and the slope is much less steep. And you can see that gargantuan outlier, which turns out to be the New York commuting zone, with 12 million people and a lot more upward mobility than you would expect based on its family structure composition.

Finally, while we’re at it, we may as well attend to that nonlinearity that has been apparent since the opening figure. We can increase the variance explained from .38 to .42 by adding a quadratic term, to get this:


I hate to go beyond what the data can really tell. But — what the heck — it does appear that after 33% single-mother families, the effect hits its minimum and turns positive. These single mother figures are pretty old (when Chetty et al.’s sample were kids). Now that the country has surpassed 40% unmarried births, I think it’s safe to say we’re out of the woods. But that’s just speculation.**

*OK, OK: “wrong wrong wrong” is going too far. Absolute rules in data visualization are often wrong wrong wrong. Binning 709 groups down to 20 is extreme. Sometimes you have a zillion points. Sometimes the plot obscures the pattern. Sometimes binning is an inherent part of measurement (we usually measure age in years, for example, not seconds). None of that is an excuse in this case. However, Carter Butts sent along an example that makes the point well:


On the other hand, the Chetty et al. case is more similar to the following extreme example:

If you were interested in the relationship between age and earnings for a sample of 1,400 full-time, year-round women, you might start with this, which is a little frustrating:


The linear relationship is hard to see, but it’s about +$500 per year of age. However, the correlation is only .13, and the variance explained by linear-age alone is only 1.7%. But if you plotted the mean wage over ages, the correlation jumps to .68:


That’s a different question. It’s not, “how does age affect earnings,” it’s, “how does age affect mean earnings.” And if you binned the women into 10-year age intervals (25-34, 35-44, 45-54), and plotted the mean wage for each group, the correlation is .86.


Chetty et al. didn’t report the final correlation, but they showed it, even adding the regression line, so that Wilcox could call it the “bivariate relationship.”

**This paragraph was a joke that several people missed, so I’m clarifying. I would never draw a conclusion like that from the scraggly tail of a loose correlation like this.


Where is race in the Chetty et al. mobility paper?

What does race have to do with mobility? The words “race,” “black,” or “African American” don’t appear in David Leonhardt’s report on the new Chetty et al. paper on intergenerational mobility that hit the news yesterday. Or in Jim Tankersley’s report in the Washington Post, which is amazing, because it included this figure:

That’s not exactly a map of Black America, which the Census Bureau has produced, but it’s not that far off:

But even if you don’t look at the map, what if you read the paper? Describing the series of maps of intergenerational mobility, the authors write:

Perhaps the most obvious pattern from the maps in Figure VI is that intergenerational mobility is lower in areas with larger African-American populations, such as the Southeast. … Figure IXa confirms that areas with larger African-American populations do in fact have substantially lower rates of upward mobility. The correlation between upward mobility and fraction black is -0.585. In areas that have small black populations, children born to parents at the 25th percentile can expect to reach the median of the national income distribution on average (y25;c = 50); in areas with large African-American populations, y25;c is only 35.

Here is that Figure IXa, which plots Black population composition and mobility levels for groups of commuting zones:

Yes, race is an important part of the story. In a nice part of the paper, the authors test whether Black population size is related to upward mobility for Whites (or, people in zip codes that are probably White, since race isn’t in their tax records), and find that it is. It’s not just Blacks driving the effect. I’m thinking about the historical patterns of industrial development, land ownership, the backwardness of racist elites in the South, and so on. But they’re not. For some reason, not explained at all, Chetty et al. offer this pivot:

The main lesson of the analysis in this section is that both blacks and whites living in areas with large African-American populations have lower rates of upward income mobility. One potential mechanism for this pattern is the historical legacy of greater segregation in areas with more blacks. Such segregation could potentially affect both low-income whites and blacks, as racial segregation is often associated with income segregation. We turn to the relationship between segregation and upward mobility in the next section.

And that’s it, they don’t discuss Black population size again, instead only focusing on racial segregation. They don’t pursue this “potential mechanism” in the analysis that follows. Instead, they drop percent Black for racial segregation. I have no idea why, especially considering this Table VII, which shows unadjusted (and normalized) correlations (more or less) between each variable and absolute upward mobility (the variable mapped above):

In these normalized correlations, fraction Black has a stronger relationship to mobility than racial segregation or economic segregation! In fact, it’s just about the strongest relationship on the whole long table (except for single mothers, with which it is of course highly correlated). So why do they not use it in their main models? Maybe someone else can explain this to me. (Full disclosure, my whole dissertation was about this variable.)

This is especially unfortunate because they do an analysis of the association between commuting zone family structure (using macro-level variables) and individual-level mobility, controlling for marital status — but not race — at the individual level. From this they conclude, “Children of married parents also have higher rates of upward mobility if they live in communities with fewer single parents.” I am quite suspicious that this effect is inflated by the omission of race at either level. So they write the following, which goes way beyond what they can find in the data:

Hence, family structure correlates with upward mobility not just at the individual level but also at the community level, perhaps because the stability of the social environment affects children’s outcomes more broadly.

Or maybe, race.

I explored the percent Black versus single mother question in a post a few weeks ago using the Chetty et al. data. I did two very simple OLS regression models using only the 100 largest commuting zones, weighted for population size: the first with just single motherhood, and then a model with proportion Black added. The results show that the association between single motherhood rates and immobility is reduced by two-thirds, and is no longer significant at conventional levels, when percent Black is added to the model. That is: Percent Black statistically explains the relationship between single motherhood and intergenerational immobility across U.S. labor markets.

That’s not an analysis, it’s just an argument for keeping percent Black in the more complex models. Substantively, the level of racial segregation is just one part of the complex race story. It measures one kind of inequality in a local area, but not the relative size of the Black population, which matters a lot (I won’t go into it all, but here are three old papers: one, two, three).
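For anyone who wants to see the mechanics of that two-model comparison, here is a minimal sketch in Python. It does not use the Chetty et al. data. The 100 "commuting zones" are synthetic, and the variable names, coefficients, and noise levels are all assumptions made up for illustration. The data are built so that percent Black drives both single motherhood and mobility, while single motherhood has no direct effect of its own, which is the scenario the argument above worries about.

```python
import random


def weighted_ols(y, columns, weights):
    """Weighted least squares via the normal equations (X'WX) b = X'Wy.

    `columns` is a list of predictor lists; an intercept is added.
    Returns [intercept, slope_1, slope_2, ...].
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented matrix [X'WX | X'Wy].
    A = [
        [sum(weights[i] * X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
        + [sum(weights[i] * X[i][r] * y[i] for i in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(A[r][p]))
        A[p], A[pivot] = A[pivot], A[p]
        for r in range(k):
            if r != p:
                f = A[r][p] / A[p][p]
                A[r] = [a - f * b for a, b in zip(A[r], A[p])]
    return [A[r][k] / A[r][r] for r in range(k)]


# Synthetic data (hypothetical, NOT the Chetty et al. file).
random.seed(0)
n = 100
pct_black = [random.uniform(0.0, 0.4) for _ in range(n)]
population = [random.uniform(0.5, 10.0) for _ in range(n)]  # regression weights
# By construction, percent Black drives both of the other variables.
single_mom = [0.15 + 0.5 * b + random.gauss(0, 0.03) for b in pct_black]
mobility = [45.0 - 20.0 * b + random.gauss(0, 1.0) for b in pct_black]

# Model 1: mobility on single motherhood only.
m1 = weighted_ols(mobility, [single_mom], population)
# Model 2: add percent Black.
m2 = weighted_ols(mobility, [single_mom, pct_black], population)

print(f"Model 1 single-motherhood slope: {m1[1]:.2f}")
print(f"Model 2 single-motherhood slope: {m2[1]:.2f}")
print(f"Model 2 percent-Black slope:     {m2[2]:.2f}")
```

In this made-up setup the single-motherhood slope shrinks sharply once percent Black enters the model, because the first model was picking up an omitted variable. The sketch shows only the logic of the comparison; it says nothing about the size of the effect in the real data.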

The burgeoning elite conversation about economic mobility, poverty, and inequality is good news. Its avoidance of race is not.


Sociology citing Becker

Which comes first, the Nobel prize or the citations in sociology journals?

Neal Caren produced a list of the 52 works most cited in sociology journals in 2013, which included two Nobel prize winning economists:

  • Heckman, James J. “Sample selection bias as a specification error.” Econometrica: Journal of the econometric society (1979): 153-161.
  • Gary S. Becker. A Treatise on the Family. Harvard university press, 1981.

I assume those Heckman citations are the result of sadistic journal reviewers or dissertation committee members impressing their colleagues by requiring people to add selection corrections to their regressions.

The Becker citations were applauded by economists. I assumed they were usually cursory mentions in the literature review, representing neoclassical economics in the study of families. And that is basically right. In the 10 most recent citations to Treatise in top-three sociology journals, the book is always mentioned only once. See for yourself. Here are the passages out of context (citations at the end):

  1. A great deal of work in sociological theory addresses the determinants of marriage and the bases of divorce. Some of this work posits marriage as a form of social exchange, whereby internal benefits (sex) and costs (time) are calculated and weighed relative to external costs (money) and benefits (social approval) (Becker 1991).

  2. According to the negotiation framework known as intra-household bargaining (Agarwal 1997), rather than households behaving as cohesive units (Becker 1991), household members’ bargaining and decision making over the allocation of resources (e.g., income, health, education, time use) are conditioned by gender-based power differentials.

  3. In the classic economic and game theoretic models of partner matching and mate selection (Becker 1991; Gale and Shapley 1962), the relative value of every potential mate is assumed to be already known or can easily be determined (Todd and Miller 1999).

  4. Generally used to explain behavior during the waking hours, the time availability perspective suggests that because men spend more time in paid work, they have less time to do caregiving; the related specialization hypothesis suggests that women have the time and incentive to specialize in caregiving and unpaid work (Becker 1991[1981]).

  5. A second means by which household wealth is accrued is by means of family transfers. Economic assets, whether financial or real, are transferred from family members to others, both within and across generations (Becker 1991; Mulligan 1997; Wahl 2002).

  6. The compensating differentials argument suggests that mothers are more willing than non-mothers to trade wages for family-friendly employment. For example, Becker (1991) suggests that mothers may choose jobs that require less energy or that have parent-friendly characteristics, such as flexible hours, few demands for travel or nonstandard shifts, or on-site daycare.

  7. Differences in life course patterns between men and women may reflect the influences of traditional gender roles in the family and corresponding intermittent labor force attachment among women relative to men, particularly during childbearing years (Becker 1991; Bianchi 1995; Mincer and Polachek 1974).

  8. One of the primary ways in which education leads to lower fertility is by changing the calculation of the costs and benefits of childbearing and rearing (Becker 1991).

  9. As has long been recognized in both economics and sociology, an adequate explanation of gender inequality in the labor force therefore requires the researcher to go beyond discrimination and productivity-related attributes (i.e., human capital) and to consider the role of the family (Becker 1973, 1974, 1991; Mincer and Polachek 1974; many others). … First, it is assumed that economic resources are a family-level utility that is shared equally between the spouses (Becker 1973, 1974, 1991; Lundberg and Pollak 1993; Mincer and Polachek 1974).

  10. Fathers’ economic contributions are an important resource for children in all types of families (Becker 1991; Coleman 1988).

I noticed, incidentally, that we may have hit Peak Becker. The Web of Science citation count for his work in journals coded as Sociology peaked in 2011. Maybe the 2012 data just aren’t complete yet.


Out of curiosity, I also checked the citations in major economics journals to the most highly-cited sociology article on the household division of labor known for a theoretical argument, Julie Brines’s 1994 article in the American Journal of Sociology. Just kidding; there aren’t any.

No, that’s not true. The article has been cited once in the top 40 economics journals, in Transportation Research Part A: Policy and Practice:

The higher wage earner enjoys a superior bargaining position, and thus can use that power to demand less household responsibility – a proposition that has been the focus of substantial empirical research among sociologists (Heer, 1963, Brines, 1994, Greenstein, 2000, Bittman et al., 2003, Parkman, 2004 and Gupta, 2007).


  1. Rose McDermott, James H. Fowler, and Nicholas A. Christakis. “Breaking Up Is Hard to Do, Unless Everyone Else Is Doing It Too: Social Network Effects on Divorce in a Longitudinal Sample.” Social Forces 92.2 (2013): 491-519.
  2. Greta Friedemann-Sánchez and Rodrigo Lovatón. “Intimate Partner Violence in Colombia: Who Is at Risk?” Social Forces 91.2 (2012): 663-688.
  3. Michael J. Rosenfeld and Reuben J. Thomas. 2012. Searching for a Mate: The Rise of the Internet as a Social Intermediary. American Sociological Review August 2012 77: 523-547. doi:10.1177/0003122412448050
  4. Sarah A. Burgard. “The Needs of Others: Gender and Sleep Interruptions for Caregivers.” Social Forces 89.4 (2011): 1189-1215.
  5. Moshe Semyonov and Noah Lewin-Epstein. “Wealth Inequality: Ethnic Disparities in Israeli Society.” Social Forces 89.3 (2011): 935-959.
  6. Michelle J. Budig and Melissa J. Hodges. 2010. Differences in Disadvantage: Variation in the Motherhood Penalty across White Women’s Earnings Distribution. American Sociological Review October 2010 75: 705-728, doi:10.1177/0003122410381593.
  7. Jennie E. Brand and Yu Xie. 2010. Who Benefits Most from College?: Evidence for Negative Selection in Heterogeneous Economic Returns to Higher Education. American Sociological Review April 2010 75: 273-302, doi:10.1177/0003122410363567.
  8. Brienna Perelli-Harris. “Family Formation in Post-Soviet Ukraine: Changing Effects of Education in a Period of Rapid Social Change.” Social Forces 87.2 (2008): 767-794.
  9. Emily Greenman and Yu Xie. “Double Jeopardy?: The Interaction of Gender and Race on Earnings in the United States.” Social Forces 86.3 (2008): 1217-1244.
  10. Daniel N. Hawkins, Paul R. Amato, and Valarie King. 2007. Nonresident Father Involvement and Adolescent Well-Being: Father Effects or Child Effects? American Sociological Review December 72: 990-1010, doi:10.1177/000312240707200607.
  11. Sirui Liu, Pamela Murray-Tuite, Lisa Schweitzer. 2012. Analysis of child pick-up during daily routines and for daytime no-notice evacuations, Transportation Research Part A: Policy and Practice, Volume 46, Issue 1, Pages 48-67.


