Intermarriage rates relative to diversity

Addendum: Metro-area analysis added at the end.

The Pew Research Center has a new report out on race/ethnic intermarriage, which I recommend, by Gretchen Livingston and Anna Brown. This is mostly a methodological note, which also nods at some other issues.

How do you judge the amount of intermarriage? For example, in the U.S., smaller groups (Asians and American Indians) marry exogamously at higher rates. Is that because they have fewer same-race people to choose from? Or is it because Whites shun them less than they do Blacks, who are also a larger group? To answer this, you can look at the intermarriage rates relative to group size in various ways.

The Pew report gives some detail about different groups marrying each other, but the topline number is the total intermarriage rate:

In 2015, 17% of all U.S. newlyweds had a spouse of a different race or ethnicity, marking more than a fivefold increase since 1967, when 3% of newlyweds were intermarried, according to a new Pew Research Center analysis of U.S. Census Bureau data.

Here’s one way to assess that topline number, which I’ll do by state just to illustrate the variation in the U.S. (and then I repeat this by metro area below, by popular request).*

The American Community Survey identified people who married within the previous 12 months, whom I’ll call newlyweds. I use the 2011-2015 combined data file to increase the sample size in small states. I define intermarriage a little differently than Pew does (for convenience, not because it’s better). I call a couple intermarried if they don’t match each other in a five-category scheme: White, Black, Asian/Pacific Islander, American Indian, Hispanic. I discard those newlyweds (about 2%) who are multiracial or specified other race and not Hispanic. I only include different-sex couples.

The Herfindahl index is used by economists to measure market concentration. It looks like this:

H =\sum_{i=1}^N s_i^2

where s_i is the market share of firm i in the market, and N is the number of firms. It’s the sum of the squared proportions held by each firm (or race/ethnicity). The higher the score, the greater the concentration. In race/ethnic terms, if you subtract the Herfindahl index from 1, you get the probability that two randomly selected people are in different race/ethnic groups, which I call diversity.
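To make the diversity measure concrete, here is a minimal Python sketch of one minus the Herfindahl index. The function name and the example shares are made up for illustration; they are not taken from the ACS data or from the shared code.

```python
def diversity(shares):
    """Return 1 minus the Herfindahl index for a list of group proportions.

    `shares` are the proportions of the population in each race/ethnic
    group and must sum to 1.
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    herfindahl = sum(s ** 2 for s in shares)  # sum of squared proportions
    return 1.0 - herfindahl


# Hypothetical populations, for illustration only.
two_equal_groups = diversity([0.5, 0.5])    # maximum diversity for two groups
one_dominant_group = diversity([0.9, 0.1])  # mostly one group, low diversity

print(round(two_equal_groups, 3), round(one_dominant_group, 3))
```

With two equal groups, two randomly drawn people differ half the time; when one group is 90% of the population, they differ much less often.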

Consider Maine. In my analysis of newlyweds in 2011-2015, 4.55% were intermarried as defined above. The diversity calculation for Maine looks like this (ignore the scale):


So in Maine two newlyweds have a 5.2% chance of being intermarried if you scramble up the marriage applications, compared with 4.6% who are actually intermarried. (A very important decision here is to use the newlywed population to calculate diversity, instead of the single population or the total population; it’s easy to change that.) Taking the ratio of these, I calculate that Maine is operating at 87% of its intermarriage potential (4.55 / 5.23). Maybe call it a diversity-adjusted intermarriage propensity. So here are all the states (and D.C.), showing diversity and intermarriage. (The diagonal line shows what you’d get if people married at random; the two illegible clusters are DC+NY and WA+KS.)
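The Maine arithmetic can be checked with a few lines of Python. This is only a sketch of the ratio described above; the function name is my own label, not something from the shared code.

```python
def adjusted_propensity(intermarried_pct, diversity_pct):
    """Observed intermarriage divided by the rate expected under random pairing."""
    if diversity_pct <= 0:
        raise ValueError("diversity must be positive")
    return intermarried_pct / diversity_pct


# Maine figures quoted in the text: 4.55% intermarried, 5.23% diversity.
maine = adjusted_propensity(4.55, 5.23)
print(f"{maine:.2f}")  # 0.87
```

A value of 1 would mean people marry across groups exactly as often as random pairing predicts; values below 1 mean less often.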

State intermarriage

How far each state is off the line is the diversity-adjusted intermarriage propensity (intermarriage divided by diversity). Here it is in map form (using maptile):


And here are the same calculations for the 50 metro areas with the most newlyweds in the sample; the smallest is Tucson, with a sample of 478. First, the figure:

State intermarriage

And here’s the list of metro areas, sorted by diversity-adjusted intermarriage propensity:

Diversity-adjusted intermarriage propensity
Birmingham-Hoover, AL .083
Memphis, TN-MS-AR .127
Richmond, VA .133
Atlanta-Sandy Springs-Roswell, GA .147
Detroit-Warren-Dearborn, MI .155
Philadelphia-Camden-Wilmington, PA-NJ-D .157
Louisville/Jefferson County, KY-IN .170
Columbus, OH .188
Baltimore-Columbia-Towson, MD .197
St. Louis, MO-IL .204
Nashville-Davidson–Murfreesboro–Frank .206
Cleveland-Elyria, OH .213
Pittsburgh, PA .215
Dallas-Fort Worth-Arlington, TX .219
New York-Newark-Jersey City, NY-NJ-PA .220
Virginia Beach-Norfolk-Newport News, VA .224
Washington-Arlington-Alexandria, DC-VA- .224
New Orleans-Metairie, LA .229
Jacksonville, FL .234
Houston-The Woodlands-Sugar Land, TX .235
Los Angeles-Long Beach-Anaheim, CA .239
Indianapolis-Carmel-Anderson, IN .246
Chicago-Naperville-Elgin, IL-IN-WI .249
Charlotte-Concord-Gastonia, NC-SC .253
Raleigh, NC .264
Cincinnati, OH-KY-IN .266
Providence-Warwick, RI-MA .278
Milwaukee-Waukesha-West Allis, WI .284
Tampa-St. Petersburg-Clearwater, FL .286
San Francisco-Oakland-Hayward, CA .287
Orlando-Kissimmee-Sanford, FL .295
Boston-Cambridge-Newton, MA-NH .305
Buffalo-Cheektowaga-Niagara Falls, NY .305
Riverside-San Bernardino-Ontario, CA .311
Miami-Fort Lauderdale-West Palm Beach, .312
San Jose-Sunnyvale-Santa Clara, CA .316
Austin-Round Rock, TX .318
Kansas City, MO-KS .342
San Diego-Carlsbad, CA .343
Sacramento–Roseville–Arden-Arcade, CA .345
Minneapolis-St. Paul-Bloomington, MN-WI .345
Seattle-Tacoma-Bellevue, WA .346
Phoenix-Mesa-Scottsdale, AZ .362
Tucson, AZ .363
Portland-Vancouver-Hillsboro, OR-WA .378
San Antonio-New Braunfels, TX .388
Denver-Aurora-Lakewood, CO .396
Las Vegas-Henderson-Paradise, NV .406
Provo-Orem, UT .421
Salt Lake City, UT .473

At a glance no big surprises compared to the state list. Feel free to draw your own conclusions in the comments.

* I put the data, codebook, code, and spreadsheet files on the Open Science Framework here, for both states and metro areas.


Filed under In the news, Me @ work

Births to 40-year-olds are less common but a greater share than in 1960

Never before has such a high proportion of all births been to women over 40; they are now 2.8% of all births in the US. And yet a 40-year-old woman today is one-third less likely to have a baby than she was in 1947.

From 1960 to 1980, birth rates to women over 40* fell, as the Baby Boom ended and people were having fewer children by stopping earlier. Since 1980 birth rates to women over 40 have almost tripled as people started “starting” their families at later ages, but they’re still lower than they were back when total fertility was much higher.


Sources: Birth rates 1940-1969, 1970-2010, 2011, 2012-2013, 2014-2015, 2016; Percent of births 1960-1980, 1980-2008.

Put another way, a child born to a mother over 40 before 1965 was very likely the youngest of several (or many) siblings. Today they are probably the youngest of 2 or an only child. A crude way to show this is to use the Current Population Survey to look at how many children are present in the households of women ages 40-49 who have a child age 0 (the survey doesn’t record births as events, but the presence of a child age 0 is pretty close). Here is that trend:


In the 1970s about 60 percent of children age 0 had three or more siblings present, and only 1 in 20 was an only child. Now more than a quarter are the only child present and another 30 percent only have one sibling present. (Note this doesn’t show however many siblings no longer live in the household, and I don’t know how that might have changed over the years).
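The tabulation behind that trend can be sketched in Python. The record layout here (a dict with `mother_age` and `child_ages`) is a hypothetical stand-in, not the actual Current Population Survey file structure, and the sample households are invented.

```python
from collections import Counter


def sibling_distribution(households):
    """Share of children age 0, in households of women ages 40-49,
    by number of siblings present (0, 1, 2, or 3+)."""
    counts = Counter()
    for hh in households:
        if not 40 <= hh["mother_age"] <= 49:
            continue  # keep only women ages 40-49
        infants = sum(1 for age in hh["child_ages"] if age == 0)
        if infants == 0:
            continue  # no child age 0 present
        siblings = len(hh["child_ages"]) - 1  # other children present
        label = "3+" if siblings >= 3 else str(siblings)
        counts[label] += infants
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


# Invented households, for illustration only.
sample = [
    {"mother_age": 42, "child_ages": [0]},
    {"mother_age": 45, "child_ages": [0, 3]},
    {"mother_age": 41, "child_ages": [0, 5, 8, 12]},
    {"mother_age": 30, "child_ages": [0]},       # mother too young, skipped
    {"mother_age": 47, "child_ages": [10, 15]},  # no infant, skipped
]
shares = sibling_distribution(sample)
print(shares)
```

As in the text, this only counts children present in the household, so older siblings who have moved out are invisible to it.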

This updates an old post that focused on the health consequences of births to older parents. The point from that post remains: there are fewer children (per woman) being born to 40-plus mothers today than there were in the past; it just looks like there are more because they’re a larger share of all children.
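Here is a small Python sketch of how a rate can fall while a share rises. All the numbers are invented to show the arithmetic; they are not the actual birth statistics.

```python
def rate_per_1000(births, women):
    """Births per 1,000 women in the group."""
    return 1000 * births / women


def share_of_births(group_births, total_births):
    """The group's births as a proportion of all births."""
    return group_births / total_births


# Hypothetical periods: same number of women over 40, fewer births overall.
earlier = {"women_40plus": 10_000, "births_40plus": 100, "births_total": 5_000}
later = {"women_40plus": 10_000, "births_40plus": 70, "births_total": 2_000}

for label, p in (("earlier", earlier), ("later", later)):
    rate = rate_per_1000(p["births_40plus"], p["women_40plus"])
    share = share_of_births(p["births_40plus"], p["births_total"])
    print(label, rate, share)
```

In this made-up case the birth rate for women over 40 drops from 10 to 7 per 1,000, but because total births fall even faster, their share of all births rises from 2% to 3.5%.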

* Note in demography terms, “over 40” means older than “exact age” 40, so it includes people from the moment they turn 40.

Filed under In the news

Now-you-know data graphic series

As I go about my day, revising my textbook, arguing with Trump supporters online, and looking at data, I keep an eye out for easily-told data short stories. I’ve been putting them on Twitter under the label Now You Know, and people seem to appreciate it, so here are some of them. Happy to discuss implications or data issues in the comments.

1. The employment rate of women with a child under age 1 rose rapidly to the late 1990s and then stalled out. The difference between these two lines is the percentage of such women who have a job but were not at work the week of the survey, which may mean they are on leave. That gap is also not growing much anymore, which might or might not be good.

2. In the long run both the dramatic rise and complete stall of women’s employment rates are striking. I’m not as agitated about the decline in employment rates for men as some are, but it’s there, too.

3. What looked in 2007 like a big shift among mothers away from paid work as an ideal (greater desire for part-time work among employed mothers, more desire for no work among at-home mothers) hasn’t held up. From a repeated Pew survey. Maybe people have looked at this from other sources, too, so we can tell whether these are sample fluctuations or more durable swings.

4. Over age 50 or so divorce is dominated by people who’ve been married more than once, especially in the range 65-74 — Baby Boomers, mostly — where 60% of divorcers have been married more than once.


5. People with higher levels of education receive more of the child support they are supposed to get.


Filed under Me @ work

African American marital status by age, Du Bois replication edition

At the 1900 Paris Exposition, sociologist W. E. B. Du Bois presented some of the work of his students. In The Scholar Denied: W. E. B. Du Bois and the Birth of Modern Sociology, Aldon Morris writes:

Du Bois’s meticulousness as a teacher is apparent in the charts and graphs that he prepared with his students. For example, as part of his gold medal-winning exhibit for the 1900 Paris Exposition, Du Bois and his students produced detailed hand-drawn artistically colored graphs and charts that depicted the journey of black Georgians from slavery to freedom.

Some of the collection is shown in this post at the Public Domain Review (shared by Tressie McMillan Cottom yesterday); the full collection is online at the Library of Congress (LOC).

The one that caught my eye was this, showing marital status (“conjugal condition”) by age and sex for the Black population. I can’t find the source details in the LOC record, so I don’t know if it’s Georgia or national, but I presume it’s from tabulations of the 1890 decennial census or earlier:


It’s artistic and meticulous and clearly informative, beautiful. So I tried to make a 2015 update to complement it. I used data from the 2015 American Community Survey, and did it a little differently.* Most importantly, I added two more conjugal conditions, cohabiting and separated/divorced. Second, I used five-year age groupings all the way up, instead of ten. Third, I detailed the age groups up to age 85. Here’s what I got:

Some very big differences: Much smaller proportions of African Americans married now. Also, much later marriage. In the 1900 figure more than 30% of men and 60% of women have been married by age 25; those numbers are 5-6% now. I don’t know how they counted separated/divorced people in 1900, but those numbers are high now at 31% for women and 24% for men at age 60-64. Widowhood is later now, as 42% of women were widowed before age 65 in 1900, compared with only 13% now (of course, that’s off a lower marriage rate, and remarried people are just counted as married). And of course cohabitation, which the chart doesn’t show for 1900. Note I included people in same-sex as well as different-sex couples.
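For anyone who wants to try this kind of tabulation, here is a rough Python sketch of the five-year age grouping and the share of each marital status within a group. It is not the SAS code used for the post. The starting age of 15, the open-ended 85+ top group, the status labels, and the sample records are all assumptions for illustration.

```python
from collections import Counter, defaultdict


def age_group(age, start=15, top=85, width=5):
    """Label a five-year age group such as '60-64'; ages at or above `top` become '85+'."""
    if age < start:
        return None  # below the youngest group shown
    if age >= top:
        return f"{top}+"
    low = start + ((age - start) // width) * width
    return f"{low}-{low + width - 1}"


def status_shares(people):
    """Proportion in each marital status within each age group.

    `people` is an iterable of (age, status) pairs.
    """
    counts = defaultdict(Counter)
    for age, status in people:
        group = age_group(age)
        if group is not None:
            counts[group][status] += 1
    return {
        group: {status: n / sum(c.values()) for status, n in c.items()}
        for group, c in counts.items()
    }


# Invented records, for illustration only.
sample = [(62, "married"), (61, "married"), (63, "widowed"),
          (60, "separated/divorced"), (27, "single"), (26, "cohabiting")]
shares = status_shares(sample)
print(shares["60-64"])
```

A real version would also split by sex and apply the survey weights, which this sketch leaves out.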

So, thanks for indulging me. I hope you don’t think it’s frivolous. I just love staring at the old charts, and going through the (very different) steps of replicating one was really satisfying. (I also just love that in another 100 years someone might look back on this and say, “Wait, which one was Earth again?”)

Note: If you want to compare them side-by-side, here’s a go at that. The age ranges don’t line up perfectly but you can get the idea:

* SAS code, ACS data, images, and the spreadsheet used for this post are shared as an Open Science Framework project, here.


Filed under Me @ work