Tag Archives: inequality

Sex ratios as if not everyone is a college graduate

Quick: What percentage of 22-to-29-year-old, never-married Americans are college graduates? Not sure? Just look around at your friends and colleagues.

Actually, unlike among your friends and colleagues, the figure is only 27.5% (as of 2010). Yep, barely more than a quarter of singles in their 20s have finished college. Or, as the headlines for the last few days would have it: basically everyone.

The tweeted version of this Washington Post Wonkblog story was, “Why dating in America is completely unfair,” and the figure was titled “Best U.S. cities for dating” (subtitle: “based on college graduates ages 22-29”). This local news version listed “best U.S. cities for dating,” but never even said they were talking about college graduates only. The empirical point is simple: there are more women than men among young college graduates, so those women have a small pool to choose from, so we presume it’s hard for them to date.* (Also, in these stories everyone is straight.) In his Washington Post excerpt the author behind this, Jon Birger, talks all about college women. The headline is, “Hookup culture isn’t the real problem facing singles today. It’s math.” You have to get to the sixth paragraph before you find out that singles means college and post-college women.

In his Post interview the subject of less educated people did come up briefly — if they’re men:

Q: Some of these descriptions make it sound like the social progress and education that women have obtained has been a lose-lose situation: In the past women weren’t able to get college educations, today they can, but now they’re losing in this other realm [dating]. Is it implying that less educated men are still winning – they don’t go to college but they still get the pick of all these educated, more promiscuous women?

A: Actually, it’s the opposite. Less educated men are actually facing as challenging a dating and marriage market as the educated women. So for example, among non-college educated men in the U.S. age 22 to 29, there are 9.4 million single men versus 7.1 million single women. So the lesser-educated men face an extremely challenging dating market. They do not have it easy at all.

It’s almost as if the non-college-educated woman is inconceivable. She’s certainly invisible. The people having trouble finding dates are college-educated women and non-college-educated men. By this simple sex-ratio logic, it should be raining men for the non-college women. Too bad no one thought to think of them.

Yes, the education-specific sex ratio is much better for women who haven’t been to college. That is, they are outnumbered by non-college men. But it’s not working out that well for them in mating-market terms.

I can’t show dating patterns with Census data (and neither can Birger), but I can show first-marriage rates — that is, the rate at which never-married people get married. Here are the education-specific sex ratios, and first-marriage rates, for 18-34-year-old never-married women in 279 metropolitan areas, from the 2009-2011 American Community Survey.** Blue circles for women with high school education or less, orange for BA-holders (click to enlarge):


Note that for both groups marriage rates are lower for women when there are more of them relative to men — the downward sloping lines (which are weighted by population size). Fewer men for women to choose from, plus men eschew marriage when they’re surrounded by desperate women, so lower marriage rates for women. But wait: the sex ratios are so much better for non-college women — they are outnumbered by male peers in almost every market, and usually by a lot. Yet their marriage rates are still much lower than the college graduates’. Who cares?

I don’t have time to get into the reasons for this pattern; this post is media commentary more than social analysis. But let’s just agree to remember that non-college-educated women exist, and acknowledge that the marriage market is even more unfair for them. Imagine that.***

* I once argued that this could help explain why gender segregation has dropped so much faster for college graduates.

** It was 296 metro areas but I dropped the extreme ones: over 70% female and marriage rates over 0.3.

*** Remember, if we want to use marriage to solve poverty for poor single mothers, we have enough rich single men to go around, as I showed.

A little code:

I generated the figure using Stata. I got the data through a series of clunky Windows steps that aren’t easily shared, but here at least is the code for making a graph with two sets of weighted circles, each with its own weighted linear fit line, in case it helps you:

twoway (scatter Y1 X1 [w=count1], mc(none) mlc(blue) mlwidth(vthin)) ///
    (scatter Y2 X2 [w=count2], mc(none) mlc(orange_red) mlwidth(vthin)) ///
    (lfit Y1 X1 [w=count1], lc(blue)) ///
    (lfit Y2 X2 [w=count2], lc(orange_red)) , ///
    xlabel(30(10)70) ylabel(0(.1).3)
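If you don't have Stata, here is a rough Python sketch of what a weighted linear fit line is doing. This is not the code I used for the figure: it assumes the weights behave like analytic weights in a weighted least-squares fit, and the function name and the three metro areas are made up for illustration.

```python
def weighted_linear_fit(x, y, w):
    """Return (intercept, slope) of the weighted least-squares line."""
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical metro areas (made-up numbers, not the ACS data):
# percent female, first-marriage rate, and number of women.
pct_female = [40.0, 50.0, 60.0]
marriage_rate = [0.20, 0.10, 0.15]
count = [1000, 1000, 10]

weighted = weighted_linear_fit(pct_female, marriage_rate, count)
unweighted = weighted_linear_fit(pct_female, marriage_rate, [1, 1, 1])
print("weighted slope:  ", weighted[1])
print("unweighted slope:", unweighted[1])
```

In this toy example the tiny third metro area barely moves the weighted line, which is the point of weighting by population size.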


Filed under In the news

How (and how much) academics talk about inequality, in one chart

Reader advisory: When I say “in one chart,” I never really mean it.

Updated with new chart at the end.

Because someone asked, here is the article count from Web of Science (an academic journal database with emphasis on science), showing the frequency of articles (of all types) according to the inequality-related phrases in their titles. This is obviously not an exhaustive list of work on these subjects, but I did want to show all combinations of race, class, and gender (click to enlarge).


  • “Social inequality” now completely dominates, but it once was second to “social stratification.”
  • The most common of the three-word combinations is “race, class, and gender.”
  • “Gender, race, and class” has almost always been second.
  • “Gender, class, and race” made a run in the late 1990s, but has since faded.

I’ve written a little more about language and intersectional concerns here.


Don Tomaskovic-Devey sent along this figure, which shows newspaper articles using inequality related terms. The dotted line shows articles with rich, wealthy, top 1%, top one %, while the solid line shows income inequality. He suggests the dotted line may reflect an Occupy Wall Street effect, while the solid line shows the Thomas Piketty framing process:




Filed under Research reports

Global inequality, within and between countries

Most of the talk about income inequality is about inequality within countries – between rich and poor Americans, versus between rich and poor Swedes, for example. The new special issue of Science magazine about inequality focuses that way as well, for example with this nice figure showing inequality within countries around the world.

But what if there were no income inequality within countries? If everyone within each country had the same income, but we still had rich and poor countries, how unequal would our world be? It turns out that’s an easy question to answer.

Using data from the World Bank on income for 131 countries, comprising 91% of the world population, here is the Lorenz curve showing the distribution of gross national income (GNI) by population, with each person in each country assumed to have the same income (using the purchasing power parity currency conversion). I’ve marked the place of the three largest countries: China, India, and the USA:


The Gini index value for this distribution is .48, which means the area between the Lorenz curve and the blue line (representing equality) is 48% of the lower-right triangle. (Going all the way to 1.0 would mean one person had all the money.)
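For anyone who wants to try the calculation, here is a minimal Python sketch of a Gini index for grouped data, where everyone within a group is assumed to have the same income. My own calculations were done in Excel, so this is an illustration rather than my worksheet; the function name and the three example countries are hypothetical.

```python
def gini_grouped(populations, incomes_per_person):
    """Gini index when everyone within a group has the same income."""
    # Sort groups from poorest to richest, as the Lorenz curve requires.
    groups = sorted(zip(incomes_per_person, populations))
    total_pop = sum(pop for _, pop in groups)
    total_income = sum(inc * pop for inc, pop in groups)

    area_under_lorenz = 0.0
    cum_income_share = 0.0
    for inc, pop in groups:
        pop_share = pop / total_pop
        new_cum = cum_income_share + inc * pop / total_income
        # Trapezoid under the Lorenz curve for this group.
        area_under_lorenz += pop_share * (cum_income_share + new_cum) / 2
        cum_income_share = new_cum

    # The area between the equality line and the Lorenz curve,
    # as a share of the triangle (which has area 1/2).
    return 1 - 2 * area_under_lorenz


# Hypothetical countries: population (millions) and income per person.
populations = [1000, 300, 50]
incomes = [2000, 10000, 40000]
print(round(gini_grouped(populations, incomes), 3))
```

Two equal-sized groups with incomes of 1 and 3 give a Gini of .25, which is an easy case to check by hand.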

But there is inequality within countries. In that Science figure the within-country Ginis range from .24 in Belarus to .67 in South Africa. (And that’s using after-tax household income, which assumes each person within each household has the same income. So there’s that, too.)

The World Bank data I’m using includes within-country income distributions broken into 7 quantiles: 5 quintiles (20% of the population each), with the top and bottom further broken in half. If I assume that the income is shared equally within each of these quantiles, I can take those 131 countries and turn them into 917 quantiles (just assigning each group its share of the country’s GNI). These groups range in average income from $0 (due to rounding) in the bottom 10th of Bolivia and Guyana, or $43 per person in the bottom 10th of the Democratic Rep. of Congo, up to $305,800 per person in the top 10th of Macao.

To illustrate this, here are India, China, and the USA, showing average incomes for the quantiles and the countries as a whole:


This shows that the average income of China’s top 10th is between the second and third quintiles of the US income distribution, and the top 10th of India has an average income comparable to the US 10-19th percentile range. Obviously, this breakdown shows a lot more inequality.

So here I add the new Lorenz curve to the first figure, counting each of those 917 quantiles as a separate group with its own income:


Now the Gini index has risen a neat 25%, to an even .60. Is that a big difference? Clearly, between-country inequality (the red line) is vast. If every country were a household, the world would be almost as unequal as Nigeria. In this comparison, you could say you get 80% of the income inequality to show up just looking at whole countries (.48 out of .60). But of course even that obscures much more, especially at the high end, where there is no limit.
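The quantile expansion described above can be sketched in Python like this. It assumes the seven quantiles hold 10%, 10%, 20%, 20%, 20%, 10%, and 10% of each country's population, and that income is shared equally within each one. The two countries and their income shares are made up, and the function names are mine, not anything from the World Bank data.

```python
# Population shares of the 7 quantiles: bottom two 10ths, three middle
# quintiles, top two 10ths.
POP_SHARES = [0.1, 0.1, 0.2, 0.2, 0.2, 0.1, 0.1]


def gini_grouped(populations, incomes_per_person):
    """Gini index when everyone within a group has the same income."""
    groups = sorted(zip(incomes_per_person, populations))
    total_pop = sum(pop for _, pop in groups)
    total_income = sum(inc * pop for inc, pop in groups)
    area, cum = 0.0, 0.0
    for inc, pop in groups:
        new_cum = cum + inc * pop / total_income
        area += (pop / total_pop) * (cum + new_cum) / 2
        cum = new_cum
    return 1 - 2 * area


def expand_to_quantiles(population, gni_per_person, income_shares):
    """Split one country into (population, income per person) groups."""
    total_income = population * gni_per_person
    groups = []
    for pop_share, inc_share in zip(POP_SHARES, income_shares):
        group_pop = population * pop_share
        groups.append((group_pop, total_income * inc_share / group_pop))
    return groups


# Hypothetical countries: population, GNI per person, quantile income shares.
countries = [
    (100, 2000, [0.02, 0.04, 0.10, 0.15, 0.21, 0.18, 0.30]),
    (50, 20000, [0.03, 0.05, 0.12, 0.17, 0.23, 0.15, 0.25]),
]

between = gini_grouped([c[0] for c in countries], [c[1] for c in countries])
quantiles = [g for c in countries for g in expand_to_quantiles(*c)]
overall = gini_grouped([g[0] for g in quantiles], [g[1] for g in quantiles])
print("between-country Gini:", round(between, 3))
print("Gini with quantiles: ", round(overall, 3))
print("share visible at country level:", round(between / overall, 2))
```

Splitting countries into quantiles can only hold the Gini steady or raise it, because total income is preserved and it is just spread less equally.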

Years ago I followed the academic debate over how to measure inequality within and between countries. If I were to catch up with it again, I would start with this article, by my friends Tim Moran and Patricio Korzeniewicz. That provoked a debate over methods and theory, and they eventually published this book, which argues: “within-country analyses alone have not adequately illuminated our understanding of global stratification.” There is a lot more to read, but their work, and the critiques they’ve received, is a good place to start.

Note: I have put my Excel worksheet for this post here. It has the original data and my calculations, but not the figures.


Filed under Uncategorized

Education, not income, drives Piketty searches

Proving once again that effort is not always correlated with income, I present this critique of a Justin Wolfers blog post…

A lot of people have written reviews of Piketty. The first few pages of a Google search revealed all these (I added Heather Boushey, who wrote a good one)*:


I believe that is diversity, because every human being is different.

Anyway, where to begin? Justin Wolfers wrote a little post, not a review, but it caught my attention. The headline was, “Piketty’s Book on Wealth and Inequality Is More Popular in Richer States.” Distractable, that’s where I began.

Wolfers’ culminating line, “Vive la révolution!”, suited Scott Winship, who looked over Wolfers’ figures before sniping, “the buzz around the book has come mostly from rich liberal states along the Boston-to-Washington corridor.” But I think they’re both misinterpreting.

According to the Google search data Wolfers used, these were the top 10 states for “piketty” searches (Washington, D.C. excluded): Massachusetts, New York, Connecticut, Maryland, New Jersey, Illinois, Pennsylvania, Wisconsin, Oregon, California.

It looks to me that it’s actually education driving the search data. And that is a big difference. Let me explain.

Do data?

Microsoft Word tells me that the reading grade level of the publisher’s excerpt is 16.3, so it takes a 16th-grade education to read it. (Note that the “Boston-to-Washington corridor,” which was supposed to sound like a small sliver of the country, has 26% of the country’s college graduates.) So consider income versus college completion, which we can now take as a proxy for being able to read Piketty.

Wolfers writes, “I can’t tell you where Piketty has been least popular, because below a certain level of search activity, Google doesn’t release the actual numbers.” So he proceeds to leave 24 states out of his analysis (this will become important). Using per-capita income (converted to z-scores), and dropping 24 states plus the ridiculous outlier of DC, this is Wolfers’ income result (my calculations; he just showed scatter plots):


OK, leaving out the bottom half of the Piketty distribution, there is a strong positive relationship between per capita income and Piketty Google searches. Congratulations, you can have three jobs as an economist!

I kid Wolfers. But, come on! I don’t know what kind of data operation they’re running over there at the Upshot, but I would expect Wolfers to take it up a notch. First, control for college completion (percent of folks ages 25+ with a BA or more, also z-scored). See how it shows… oops:


The income effect is reduced but the education effect isn’t significant. (See how I showed you that instead of just going right to the results that support my argument?)

But go back to Wolfers leaving out the bottom half of the Piketty distribution. What’s wrong with that? I’m sure there’s some statistical way of explaining that, but just eyeballing it you’d have to say dropping those cases could cause trouble. The censored cases all have values of -.64 on the search variable. The relationship with income is weaker when the censored cases are included (shown in the red line) versus when he limits it to the top half of Piketty states (blue line):


What to do about this? An easy thing is just to include the censored cases at their values of -.64, just pretending -.64 is a legitimate value. That gives:


Now the income effect is reduced by about three-quarters, and the college completion effect is three times as large (with a t-statistic to match).

But that’s not the best way to handle this. If only economists had invented a way of modeling data with censored dependent variables! Just kidding: there’s Tobin’s Tobit. This kind of model says, I see your censored dependent variable, and I crash it through the bottom of the distribution as a function of its linear relationship to your independent variables. So instead of all being -.64, it lets the censored cases be as low as they want to be, with values predicted by income and college completion. Sort of. Anyway, here’s that result:


Now income is crushed, reduced to literal insignificance. What matters is the percentage of the population that has completed college. It’s not that rich people like Piketty, it’s that college graduates do. Maybe because that’s who can read it. (I don’t know, I haven’t tried.)
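For the curious, here is a rough Python sketch of the log-likelihood a Tobit model maximizes when the dependent variable is censored from below. It is not the routine I ran: the function names are hypothetical and it only evaluates the likelihood rather than estimating anything. Uncensored cases contribute the usual normal density, and censored cases contribute the probability of falling at or below the censoring point.

```python
import math


def normal_logpdf(z):
    """Log density of a standard normal at z."""
    return -0.5 * z * z - 0.5 * math.log(2 * math.pi)


def normal_logcdf(z):
    """Log of the standard normal CDF at z (erfc keeps the left tail stable)."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2)))


def tobit_loglik(y, X, beta, sigma, lower):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    y: outcomes; X: rows of predictors (include a 1.0 for the intercept);
    beta: coefficients; sigma: residual standard deviation.
    """
    total = 0.0
    for yi, xi in zip(y, X):
        mu = sum(b * v for b, v in zip(beta, xi))
        if yi <= lower:
            # Censored case: all we know is that the latent value is <= lower.
            total += normal_logcdf((lower - mu) / sigma)
        else:
            # Observed case: ordinary normal regression contribution.
            total += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total


# Hypothetical states: z-scored searches (censored at -.64), with an
# intercept, z-scored income, and z-scored college completion.
searches = [-0.64, -0.64, 0.3, 1.8]
predictors = [[1.0, -1.0, -1.2], [1.0, 0.2, -0.5], [1.0, 0.1, 0.4], [1.0, 1.5, 1.9]]
print(tobit_loglik(searches, predictors, [0.0, 0.1, 0.8], 0.5, lower=-0.64))
```

An estimator would search for the coefficients and sigma that make this number as large as possible, which is what lets the censored cases be "as low as they want to be."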

What do economists read?

Of course, mine and Wolfers’ are both pretty crude analyses. There are only two reasons his was published on a major news site and mine was buried over here on an obscure sociology blog: (a) he writes for a major news site, and (b) his weak analysis lends itself to an emerging snarky narrative in which rich leftists are seen to whine about inequality but real people can’t be bothered (the main point of Winship’s review) — just reinforcing the echo-chamber model of knowledge consumption that people who are into “data-driven” news like to appear to have risen above.

For a real explanation, Wolfers (and Winship) need look no further than the rest of the Google Correlate results page to see the obvious fact that searches for Piketty are simply correlated with interest in economics. Here’s the search that is most highly correlated with searches for “piketty” across U.S. states: “world bank gdp” (r=.98):


Here are some other searches correlated with “piketty” at .94 or higher:

economic consulting firms
eu data protection
exchange rate data
gdp by sector
inflation target
journal of labor economics
london school economics
nber working paper
oecd statistics
oxford economics
panel data stata
stock market capitalization
the economist intelligence unit
us current account deficit
world bank statistics

Well, there goes your rich, liberal, “American left” theory of who’s driving the Piketty phenomenon. It might be true, but it’s not confirmed by the Google search data. My hot new theory: college educated people who are also interested in economics are disproportionately interested in Piketty.

* The reviewer pool: Mervyn King (The Telegraph), Paul Krugman (New York Review of Books), Tyler Cowen (Foreign Affairs), James K. Galbraith (Dissent), Daniel Schuchman (Wall Street Journal), Justin Fox (Harvard Business Review), Michael Tanner (National Review), John Cassidy (New Yorker), Martin Wolf (Financial Times), Jordan Weissmann (Slate), Steven Pearlstein (Washington Post), Scott Winship (National Review), Heather Boushey (Challenge)


Filed under Uncategorized

How well do teen test scores predict adult income?

Now with new figures and notes added at the end — and a new, real life headline and graph illustrating the problem in the middle!

The short answer is, pretty well. But that’s not really the point.

In a previous post I complained about various ways of collapsing data before plotting it. Although this is useful at times, and inevitable to varying degrees, the main danger is the risk of inflating how strong an effect seems. So that’s the point about teen test scores and adult income.

If someone told you that the test scores people get in their late teens were highly correlated with their incomes later in life, you probably wouldn’t be surprised. If I said the correlation was .35, on a scale of 0 to 1, that would seem like a strong relationship. And it is. That’s what I got using the National Longitudinal Survey of Youth. I compared Armed Forces Qualification Test scores, taken in 1999 when the respondents were ages 15-19, with their household income in 2011, when they were 27-31.*

Here is the linear fit between these two measures, with the 95% confidence interval shaded, showing just how confident we can be in this incredibly strong relationship:


That’s definitely enough for a screaming headline, “How your kids’ test scores tell you whether they will be rich or poor.”

In fact, since I originally wrote this, the Washington Post Wonkblog published a post with the headline, “Here’s how much your high school grades predict your future salary,” with this incredibly tidy graph:


No doubt these are strong relationships. My correlation of .35 means AFQT explains 12% of the variation in household income. But take heart, ye parents in the age of uncertainty: 12% of the variation leaves a lot left over. This variable can’t account for how creative your children are, how sociable, how attractive, how driven, how entitled, how connected, or how White they may be. To get a sense of all the other things that matter, here is the same data, with the same regression line, but now with all 5,248 individual points plotted as well (which means we have to rescale the y-axis):


Each dot is a person’s life — or two aspects of it, anyway — with the virtually infinite sources of variability that make up the wonder of social existence. All of a sudden that strong relationship doesn’t feel like something you can bank on with any given individual. Yes, there are very few people from the bottom of the test-score distribution who are now in the richest households (those clipped by the survey’s topcode and pegged at 3 on my scale), and hardly anyone from the top of the test-score distribution who is now completely broke.
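The step from a correlation of .35 to 12% of the variation is just squaring. Here is a small Python sketch, with made-up numbers rather than the NLSY data, showing that the squared correlation matches the share of variance explained by a one-variable linear fit. The function names are mine.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def r_squared_from_fit(x, y):
    """Share of variance in y explained by the least-squares line on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sum(
        (a - x_bar) ** 2 for a in x
    )
    intercept = y_bar - slope * x_bar
    ss_resid = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - y_bar) ** 2 for b in y)
    return 1 - ss_resid / ss_total


# The correlation from the post, squared.
r = 0.35
variance_explained = r ** 2
variance_left_over = 1 - variance_explained
print(round(variance_explained, 4), round(variance_left_over, 4))

# Hypothetical test scores and logged income-to-poverty ratios.
scores = [10, 30, 50, 70, 90]
income = [1.0, 0.5, 1.8, 1.2, 2.0]
print(pearson_r(scores, income) ** 2, r_squared_from_fit(scores, income))
```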

But I would guess that for most kids a better predictor of future income would be spending an hour interviewing their parents and high school teachers, or spending a day getting to know them as a teenager. But that’s just a guess (and that’s an inefficient way to capture large-scale patterns).

I’m not here to argue about how much various measures matter for future income, or whether there is such a thing as general intelligence, or how heritable it is (my opinion is that a test such as this, at this age, measures what people have learned much more than a disposition toward learning inherent at birth). I just want to give a visual example of how even a very strong relationship in social science usually represents a very messy reality.

Post-publication addendums

1. Prediction intervals

I probably first wrote about this difference between the slope and the variation around the slope two years ago, in a futile argument against the use of second-person headlines such as “Homophobic? Maybe You’re Gay.” Those headlines always try to turn research into personal advice, and are almost always wrong.

Carter Butts, in personal correspondence, offered an explanation that helps make this clear. The “you” type headline presents a situation in which you, the reader, are offered the chance to add yourself to the study. In that case, your outcome (the “new response” in his note) is determined by both the line and the variation around the line. Carter writes:

the prediction interval for a new response has to take into account not only the (predicted) expectation, but also the (predicted) variation around that expectation. A typical example is attached; I generated simulated data (N=1000) via the indicated formula, and then just regressed y on x. As you’d expect, the confidence bands (red) are quite narrow, but the prediction bands (green) are large – in the true model, they would have a total width of approximately 1, and the estimated model is quite close to that. Your post nicely illustrated that the precision with which we can estimate a mean effect is not equivalent to the variation accounted for by that mean effect; a complementary observation is that the precision with which we can estimate a mean effect is not equivalent to the accuracy with which we can predict a new observation. Nothing deep about that … just the practical points that (1) when people are looking at an interval, they need to be wary of whether it is a confidence interval or a prediction interval; and (2) prediction interval can (and often should be) wide, even if the model is “good” in the sense of being well-estimated.

And here is his figure. “You” are very likely to be between the green lines, but not so likely to be between the red ones.
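To make the distinction concrete, here is a minimal Python sketch of the two interval widths for a simple linear regression. It is not Carter's simulation: the data are made up, the function name is hypothetical, and it uses 1.96 as a large-sample stand-in for the exact t critical value.

```python
import math


def interval_half_widths(x, y, x0, crit=1.96):
    """Half-widths of the confidence and prediction intervals at x0.

    Returns (ci_half_width, pi_half_width, residual_variance).
    `crit` is a normal approximation to the t critical value.
    """
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    intercept = y_bar - slope * x_bar
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ss_resid / (n - 2)

    # Uncertainty about the line itself shrinks as n grows...
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci = crit * math.sqrt(s2 * leverage)
    # ...but a new observation also carries the residual variation.
    pi = crit * math.sqrt(s2 * (1 + leverage))
    return ci, pi, s2


# Made-up data: a clear slope with a lot of scatter around it.
x = [i / 10 for i in range(100)]
y = [0.5 * xi + (0.25 if i % 2 == 0 else -0.25) for i, xi in enumerate(x)]

ci, pi, s2 = interval_half_widths(x, y, x0=5.0)
print("confidence half-width:", round(ci, 3))
print("prediction half-width:", round(pi, 3))
```

More data narrows the confidence band toward zero, while the prediction band never gets narrower than the scatter around the line.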


2. Random other variables

I didn’t get into the substantive issues, which are outside my expertise. However, one suggestion I got was interesting: What about happiness? Without endorsing the concept of “life satisfaction” as measured by a single question, I still think this is a nice addition because it underscores the point of wide variation in how this relationship between test scores and income might be experienced.

So here is the same figure, but with the individuals coded according to how they answered the following question in 2008, when they were age 24-28, “All things considered, how satisfied are you with your life as a whole these days? Please give me an answer from 1 to 10, where 1 means extremely dissatisfied and 10 means extremely satisfied.” In the figure, Blue is least satisfied (1-6; 21%), Orange is moderately satisfied (7-8; 46%), and Green is most satisfied (9-10; 32%).


Even if you squint you probably can’t discern the pattern. Life satisfaction is positively correlated with income at .16, and less so with test scores (.07). Again, significant correlation — not helpful for planning your life.

* I actually used something similar to AFQT: the variable ASVAB, which combines tests of mathematical knowledge, arithmetic reasoning, word knowledge, and paragraph comprehension, and scales them from 0 to 100. For household income, I used a measure of household income relative to the poverty line (adjusted for household size), plus one, and transformed by natural log. I used household income because some good test-takers might marry someone with a high income, or have fewer people in their households — good decisions if your goal is maximizing household income per person.


Filed under Me @ work

How to illustrate a .61 relationship with a .93 figure: Chetty and Wilcox edition

Yesterday I wondered about the treatment of race in the blockbuster Chetty et al. paper on economic mobility trends and variation. Today, graphics and representation.

If you read Brad Wilcox’s triumphalist Slate post, “Family Matters” (as if he needed “an important new Harvard study” to write that), you saw this figure:


David Leonhardt tweeted that figure as “A reminder, via [Wilcox], of how important marriage is for social mobility.” But what does the figure show? Neither said anything more than what is printed on the figure. Of course, the figure is not the analysis. But it is what a lot of people remember about the analysis.

But the analysis on which it is based uses 741 commuting zones (metropolitan or rural areas defined by commuting patterns). So what are those 20 dots lying so perfectly along that line? In fact, that correlation printed on the graph, -.764, is much weaker than what you see plotted on the graph. The relationship you’re looking at is -.93! (thanks Bill Bielby for pointing that out).

In the paper, which presumably few of the people tweeting about it read, the authors explain that these figures are “binned scatter plots.” They broke the commuting zones into equally-sized groups and plotted the means of the x and y variables. They say they did percentiles, which would be 100 dots, but this one only has 20 dots, so let’s call them vigintiles.

In the process of analysis, this might be a reasonable way to eyeball a relationship and look for nonlinearities. But for presentation it’s wrong wrong wrong.* The dots compress the variation, and the line compresses it more. The dots give the misleading impression that you’re displaying the variance around the line. What, are you trying to save ink?
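Here is a small Python sketch of how binning inflates a correlation. It uses made-up data (100 points collapsed into 20 bins of 5), not the Chetty et al. commuting zones, and the function names are mine.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def bin_means(x, y, n_bins):
    """Sort by x, cut into equal-sized bins, and return the bin means."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    x_means, y_means = [], []
    for b in range(n_bins):
        chunk = pairs[b * size:(b + 1) * size]
        x_means.append(sum(p[0] for p in chunk) / len(chunk))
        y_means.append(sum(p[1] for p in chunk) / len(chunk))
    return x_means, y_means


# Made-up data: a real slope plus a lot of point-to-point scatter.
x = list(range(100))
y = [xi + (40 if xi % 2 == 0 else -40) for xi in x]

r_raw = pearson_r(x, y)
x_binned, y_binned = bin_means(x, y, n_bins=20)
r_binned = pearson_r(x_binned, y_binned)
print("all 100 points:", round(r_raw, 2))
print("20 bin means:  ", round(r_binned, 2))
```

Averaging within bins cancels most of the scatter, so the bin means hug the line much more tightly than the underlying points do.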

Since the data are available, we can look at this for realz. Here is the relationship with all the points, showing a much messier relationship, the actual -.76 (the range of the Chetty et al. figure, which was compressed by the binning, is shown by the blue box):

That’s 709 dots, one for each of the commuting zones for which they had sufficient data. With today’s powerful computers and high resolution screens, there is no excuse for reducing this down to 20 dots for display purposes.

But wait, there’s more. What about population differences? In the 2000 Census, these 709 commuting zones ranged in population from 5,000 (Southwest Jackson, Utah) to 16,000,000 (Los Angeles). Do you want to count Southwest Jackson as much as Los Angeles in your analysis of the relationship between these variables? Chetty et al. do in their figure. But if you weight them by population size, so each person in the population contributes equally to the relationship, that correlation that was -.76 (which they displayed as -.93) is reduced to -.61. Yikes.
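A population-weighted correlation can be sketched in Python as follows. This is an illustration rather than my Stata commands: the function name and the five commuting zones are made up, and the weights are treated as counts of people.

```python
import math


def weighted_pearson_r(x, y, w):
    """Pearson correlation with each observation counted w times."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_yy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical commuting zones: percent single mothers, a mobility score,
# and population in thousands.
single_mothers = [15, 20, 25, 30, 35]
mobility = [48, 44, 40, 36, 45]
population = [5, 50, 200, 800, 12000]

unweighted = weighted_pearson_r(single_mothers, mobility, [1] * 5)
weighted = weighted_pearson_r(single_mothers, mobility, population)
print("each zone counts equally:  ", round(unweighted, 2))
print("each person counts equally:", round(weighted, 2))
```

Weighting can move the correlation in either direction; it depends on where the big places sit relative to the line.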

Here is what the plot looks like if you scale the commuting zones according to population size (more or less, not quite sure how Stata does this):


Now it’s messier, and the slope is much less steep. And you can see that gargantuan outlier, which turns out to be the New York commuting zone, which has 12 million people and a lot more upward mobility than you would expect based on its family structure composition.

Finally, while we’re at it, we may as well attend to that nonlinearity that has been apparent since the opening figure. We can increase the variance explained from .38 to .42 by adding a quadratic term, to get this:


I hate to go beyond what the data can really tell. But — what the heck — it does appear that after 33% single-mother families, the effect hits its minimum and turns positive. These single mother figures are pretty old (when Chetty et al.’s sample were kids). Now that the country has surpassed 40% unmarried births, I think it’s safe to say we’re out of the woods. But that’s just speculation.**

*OK, OK: “wrong wrong wrong” is going too far. Absolute rules in data visualization are often wrong wrong wrong. Binning 709 groups down to 20 is extreme. Sometimes you have a zillion points. Sometimes the plot obscures the pattern. Sometimes binning is an inherent part of measurement (we usually measure age in years, for example, not seconds). None of that is an excuse in this case. However, Carter Butts sent along an example that makes the point well:


On the other hand, the Chetty et al. case is more similar to the following extreme example:

If you were interested in the relationship between age and earnings for a sample of 1,400 full-time, year-round women, you might start with this, which is a little frustrating:


The linear relationship is hard to see, but it’s about +$500 per year of age. However, the correlation is only .13, and the variance explained by linear-age alone is only 1.7%. But if you plotted the mean wage over ages, the correlation jumps to .68:


That’s a different question. It’s not, “how does age affect earnings,” it’s, “how does age affect mean earnings.” And if you binned the women into 10-year age intervals (25-34, 35-44, 45-54), and plotted the mean wage for each group, the correlation is .86.


Chetty et al. didn’t report the final correlation, but they showed it, even adding the regression line, so that Wilcox could call it the “bivariate relationship.”

**This paragraph was a joke that several people missed, so I’m clarifying. I would never draw a conclusion like that from the scraggly tail of a loose correlation like this.


Filed under Research reports

Where is race in the Chetty et al. mobility paper?

What does race have to do with mobility? The words “race,” “black,” or “African American” don’t appear in David Leonhardt’s report on the new Chetty et al. paper on intergenerational mobility that hit the news yesterday. Or in Jim Tankersley’s report in the Washington Post, which is amazing, because it included this figure:

That’s not exactly a map of Black America, which the Census Bureau has produced, but it’s not that far off:

But even if you don’t look at the map, what if you read the paper? Describing the series of maps of intergenerational mobility, the authors write:

Perhaps the most obvious pattern from the maps in Figure VI is that intergenerational mobility is lower in areas with larger African-American populations, such as the Southeast. … Figure IXa confirms that areas with larger African-American populations do in fact have substantially lower rates of upward mobility. The correlation between upward mobility and fraction black is -0.585. In areas that have small black populations, children born to parents at the 25th percentile can expect to reach the median of the national income distribution on average (y25;c = 50); in areas with large African-American populations, y25;c is only 35.

Here is that Figure IXa, which plots Black population composition and mobility levels for groups of commuting zones:

Yes, race is an important part of the story. In a nice part of the paper, the authors test whether Black population size is related to upward mobility for Whites (or, people in zip codes that are probably White, since race isn’t in their tax records), and find that it is. It’s not just Blacks driving the effect. I’m thinking about the historical patterns of industrial development, land ownership, the backwardness of racist elites in the South, and so on. But they’re not. For some reason, not explained at all, Chetty et al. offer this pivot:

The main lesson of the analysis in this section is that both blacks and whites living in areas with large African-American populations have lower rates of upward income mobility. One potential mechanism for this pattern is the historical legacy of greater segregation in areas with more blacks. Such segregation could potentially affect both low-income whites and blacks, as racial segregation is often associated with income segregation. We turn to the relationship between segregation and upward mobility in the next section.

And that’s it, they don’t discuss Black population size again, instead only focusing on racial segregation. They don’t pursue this “potential mechanism” in the analysis that follows. Instead, they drop percent Black for racial segregation. I have no idea why, especially considering this Table VII, which shows unadjusted (and normalized) correlations (more or less) between each variable and absolute upward mobility (the variable mapped above):

In these normalized correlations, fraction Black has a stronger relationship to mobility than racial segregation or economic segregation! In fact, it’s just about the strongest relationship on the whole long table (except for single mothers, with which it is of course highly correlated). So why do they not use it in their main models? Maybe someone else can explain this to me. (Full disclosure, my whole dissertation was about this variable.)

This is especially unfortunate because they do an analysis of the association between commuting zone family structure (using macro-level variables) and individual-level mobility, controlling for marital status — but not race — at the individual level. From this they conclude, “Children of married parents also have higher rates of upward mobility if they live in communities with fewer single parents.” I am quite suspicious that this effect is inflated by the omission of race at either level. So they write the following, which goes way beyond what they can find in the data:

Hence, family structure correlates with upward mobility not just at the individual level but also at the community level, perhaps because the stability of the social environment affects children’s outcomes more broadly.

Or maybe, race.

I explored the percent Black versus single mother question in a post a few weeks ago using the Chetty et al. data. I did two very simple OLS regression models using only the 100 largest commuting zones, weighted for population size, the first with just single motherhood, and then a model with proportion Black added:

This shows that the association between single motherhood rates and immobility is reduced by two-thirds, and is no longer significant at conventional levels, when percent Black is added to the model. That is: Percent Black statistically explains the relationship between single motherhood and intergenerational immobility across U.S. labor markets. That’s not an analysis, it’s just an argument for keeping percent Black in the more complex models.

Substantively, the level of racial segregation is just one part of the complex race story: it measures one kind of inequality in a local area, but not the relative size of the Black population, which matters a lot (I won’t go into it all, but here are three old papers: one, two, three).

The burgeoning elite conversation about economic mobility, poverty, and inequality is good news. Its avoidance of race is not.


Filed under Research reports