Tag Archives: pew

Intermarriage rates relative to diversity

Addendum: Metro-area analysis added at the end.

The Pew Research Center has a new report out on race/ethnic intermarriage, which I recommend, by Gretchen Livingston and Anna Brown. This is mostly a methodological note, which also nods at some other issues.

How do you judge the amount of intermarriage? For example, in the U.S., smaller groups (Asians and American Indians) marry exogamously at higher rates. Is that because they have fewer same-race people to choose from? Or is it because Whites shun them less than they do Blacks, who are also a larger group? To answer this, you can look at the intermarriage rates relative to group size in various ways.

The Pew report gives some detail about different groups marrying each other, but the topline number is the total intermarriage rate:

In 2015, 17% of all U.S. newlyweds had a spouse of a different race or ethnicity, marking more than a fivefold increase since 1967, when 3% of newlyweds were intermarried, according to a new Pew Research Center analysis of U.S. Census Bureau data.

Here’s one way to assess that topline number, which I’ll do by state just to illustrate the variation in the U.S. (and then I repeat this by metro area below, by popular request).*

The American Community Survey (which I download from IPUMS.org) identified people who married within the previous 12 months, whom I’ll call newlyweds. I use the 2011-2015 combined data file to increase the sample size in small states. I define intermarriage a little differently than Pew does (for convenience, not because it’s better). I call a couple intermarried if they don’t match each other in a five-category scheme: White, Black, Asian/Pacific Islander, American Indian, Hispanic. I discard those newlyweds (about 2%) who are multiracial or specified other race and not Hispanic. I only include different-sex couples.
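To make the definition concrete, here is a minimal sketch of the matching rule in Python. The category labels, the example couples, and the choice to drop a couple when either spouse falls outside the five categories are my assumptions for illustration; the sketch also ignores survey weights, which a real analysis of ACS data would need.

```python
# Hypothetical labels for the five-category scheme described above.
CATEGORIES = {
    "White",
    "Black",
    "Asian/Pacific Islander",
    "American Indian",
    "Hispanic",
}


def is_intermarried(spouse_a, spouse_b):
    """Return True/False for a couple, or None if the couple is discarded.

    Assumption: a couple is dropped when either spouse is outside the
    five categories (for example multiracial or other race, not Hispanic).
    """
    if spouse_a not in CATEGORIES or spouse_b not in CATEGORIES:
        return None
    return spouse_a != spouse_b


# Made-up couples, not real survey records.
couples = [
    ("White", "White"),
    ("White", "Hispanic"),
    ("Black", "Black"),
    ("Multiracial", "White"),  # discarded under the assumption above
]

flags = [is_intermarried(a, b) for a, b in couples]
kept = [flag for flag in flags if flag is not None]
rate = sum(kept) / len(kept)  # unweighted share intermarried among kept couples
```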

The Herfindahl index is used by economists to measure market concentration. It looks like this:

H =\sum_{i=1}^N s_i^2

where s_i is the market share of firm i in the market, and N is the number of firms. It’s the sum of the squared proportions held by each firm (or race/ethnicity). The higher the score, the greater the concentration. In race/ethnic terms, if you subtract the Herfindahl index from 1, you get the probability that two randomly selected people are in a different race/ethnic group, which I call diversity.
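As a rough sketch, the diversity measure can be computed from group counts like this. The function names and the counts are hypothetical, chosen only to show the arithmetic; they are not the figures for any state.

```python
def herfindahl(shares):
    """Sum of squared shares; higher means more concentrated."""
    return sum(s * s for s in shares)


def diversity(counts):
    """One minus the Herfindahl index of the group shares."""
    total = sum(counts)
    shares = [c / total for c in counts]
    return 1 - herfindahl(shares)


# Made-up counts for five race/ethnic groups (illustration only).
example_counts = [90, 5, 3, 1, 1]
example_diversity = diversity(example_counts)

# Two equal-sized groups give the maximum for two categories.
two_equal_groups = diversity([50, 50])
```

With two equal groups the measure is 0.5; with a single group it is 0, since any two people drawn would match.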

Consider Maine. In my analysis of newlyweds in 2011-2015, 4.55% were intermarried as defined above. The diversity calculation for Maine looks like this (ignore the scale):


So in Maine two newlyweds have a 5.2% chance of being intermarried if you scramble up the marriage applications, compared with 4.6% who are actually intermarried. (A very important decision here is to use the newlywed population to calculate diversity, instead of the single population or the total population; it’s easy to change that.) Taking the ratio of these, I calculate that Maine is operating at 87% of its intermarriage potential (4.55 / 5.23). Maybe call it a diversity-adjusted intermarriage propensity. So here are all the states (and D.C.), showing diversity and intermarriage. (The diagonal line shows what you’d get if people married at random; the two illegible clusters are DC+NY and WA+KS; click to enlarge.)

State intermarriage
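The ratio used for Maine above can be written as a one-line function. This is just a sketch of the arithmetic, with a function name I made up; the two inputs are the Maine figures quoted in the text.

```python
def adjusted_propensity(intermarriage_rate, diversity):
    """Observed intermarriage divided by the rate expected under random matching."""
    if diversity <= 0:
        raise ValueError("diversity must be positive")
    return intermarriage_rate / diversity


# Maine, 2011-2015 newlyweds: 4.55% intermarried, 5.23% diversity.
maine = adjusted_propensity(0.0455, 0.0523)
print(round(maine, 2))  # 0.87
```

A value of 1 would put a state on the diagonal (marriage at random with respect to these categories); values below 1 fall under it.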

How far each state is off the line is the diversity-adjusted intermarriage propensity (intermarriage divided by diversity). Here it is in map form (using maptile):


And here are the same calculations for the 50 metro areas with the most newlyweds in the sample; the smallest is Tucson, with a sample of 478. First, the figure (click to enlarge):

State intermarriage

And here’s the list of metro areas, sorted by diversity-adjusted intermarriage propensity:

Diversity-adjusted intermarriage propensity
Birmingham-Hoover, AL .083
Memphis, TN-MS-AR .127
Richmond, VA .133
Atlanta-Sandy Springs-Roswell, GA .147
Detroit-Warren-Dearborn, MI .155
Philadelphia-Camden-Wilmington, PA-NJ-D .157
Louisville/Jefferson County, KY-IN .170
Columbus, OH .188
Baltimore-Columbia-Towson, MD .197
St. Louis, MO-IL .204
Nashville-Davidson–Murfreesboro–Frank .206
Cleveland-Elyria, OH .213
Pittsburgh, PA .215
Dallas-Fort Worth-Arlington, TX .219
New York-Newark-Jersey City, NY-NJ-PA .220
Virginia Beach-Norfolk-Newport News, VA .224
Washington-Arlington-Alexandria, DC-VA- .224
New Orleans-Metairie, LA .229
Jacksonville, FL .234
Houston-The Woodlands-Sugar Land, TX .235
Los Angeles-Long Beach-Anaheim, CA .239
Indianapolis-Carmel-Anderson, IN .246
Chicago-Naperville-Elgin, IL-IN-WI .249
Charlotte-Concord-Gastonia, NC-SC .253
Raleigh, NC .264
Cincinnati, OH-KY-IN .266
Providence-Warwick, RI-MA .278
Milwaukee-Waukesha-West Allis, WI .284
Tampa-St. Petersburg-Clearwater, FL .286
San Francisco-Oakland-Hayward, CA .287
Orlando-Kissimmee-Sanford, FL .295
Boston-Cambridge-Newton, MA-NH .305
Buffalo-Cheektowaga-Niagara Falls, NY .305
Riverside-San Bernardino-Ontario, CA .311
Miami-Fort Lauderdale-West Palm Beach, .312
San Jose-Sunnyvale-Santa Clara, CA .316
Austin-Round Rock, TX .318
Kansas City, MO-KS .342
San Diego-Carlsbad, CA .343
Sacramento–Roseville–Arden-Arcade, CA .345
Minneapolis-St. Paul-Bloomington, MN-WI .345
Seattle-Tacoma-Bellevue, WA .346
Phoenix-Mesa-Scottsdale, AZ .362
Tucson, AZ .363
Portland-Vancouver-Hillsboro, OR-WA .378
San Antonio-New Braunfels, TX .388
Denver-Aurora-Lakewood, CO .396
Las Vegas-Henderson-Paradise, NV .406
Provo-Orem, UT .421
Salt Lake City, UT .473

At a glance no big surprises compared to the state list. Feel free to draw your own conclusions in the comments.

* I put the data, codebook, code, and spreadsheet files on the Open Science Framework here, for both states and metro areas.


Filed under In the news, Me @ work

Now-you-know data graphic series

As I go about my day, revising my textbook, arguing with Trump supporters online, and looking at data, I keep an eye out for easily-told data short stories. I’ve been putting them on Twitter under the label Now You Know, and people seem to appreciate it, so here are some of them. Happy to discuss implications or data issues in the comments.

1. The percentage of women with a child under age 1 rose rapidly to the late 1990s and then stalled out. The difference between these two lines is the percentage of such women who have a job but were not at work the week of the survey, which may mean they are on leave. That gap is also not growing much anymore, which might or might not be good.

2. In the long run both the dramatic rise and complete stall of women’s employment rates are striking. I’m not as agitated about the decline in employment rates for men as some are, but it’s there, too.

3. What looked in 2007 like a big shift among mothers away from paid work as an ideal (greater desire for part-time work among employed mothers, more desire for no work among at-home mothers) hasn’t held up. From a repeated Pew survey. Maybe people have looked at this from other sources, too, so we can tell whether these are sample fluctuations or more durable swings.

4. Over age 50 or so divorce is dominated by people who’ve been married more than once, especially in the range 65-74 — Baby Boomers, mostly — where 60% of divorcers have been married more than once.


5. People with higher levels of education receive more of the child support they are supposed to get.



Filed under Me @ work

Fertility trends and the myth of Millennials

The other day I showed trends in employment and marriage rates, and made the argument that the generational term “Millennial” and others are not useful: they are imposed before analyzing data and then trends are shoe-horned into the categories. When you look closely you see that the delineation of “generations” is arbitrary and usually wrong.

Here’s another example: fertility patterns. By the definition of “Millennial” used by Pew and others, the generation is supposed to have begun with those born after 1980. When you look at birth rates, however, you see a dramatic disruption within that group, possibly triggered by the timing of the 2009 recession in their formative years.

I do this by using the American Community Survey, conducted annually from 2001 to 2015, which asks women if they have had a birth in the previous year. The samples are very large, with all the data points shown including at least 8,000 women and most including more than 60,000.
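For anyone who wants to try this, here is a minimal sketch of the tabulation in Python. Everything in it is an assumption for illustration: the record layout (survey year, age, whether the woman had a birth in the previous year), the approximation of birth year as survey year minus age, and the made-up records. It is unweighted, which a real ACS analysis would not be.

```python
from collections import defaultdict


def cohort_start(birth_year, width=5):
    """First year of the five-year birth cohort (e.g. 1987 -> 1985)."""
    return birth_year - birth_year % width


def birth_rates(records):
    """Share of women with a birth in the past year, by (cohort, age)."""
    women = defaultdict(int)
    births = defaultdict(int)
    for survey_year, age, had_birth in records:
        # Assumption: birth year is approximated as survey year minus age.
        key = (cohort_start(survey_year - age), age)
        women[key] += 1
        births[key] += int(had_birth)
    return {key: births[key] / women[key] for key in women}


def age_in_2009(cohort, width=5):
    """Age at which the cohort's midpoint birth year reached 2009."""
    return 2009 - (cohort + width // 2)


# Made-up records: (survey_year, age, had_birth_last_year).
records = [
    (2010, 25, True),
    (2010, 25, False),
    (2012, 27, False),
    (2014, 22, True),
]
rates = birth_rates(records)
```

Under this midpoint rule the 1980-84 cohort reaches 2009 at age 27, the 1985-89 cohort at 22, and the 1990-94 cohort at 17.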

The figure below shows the birth rates by age for women across six five-year birth cohorts. The dots on each line mark the age at which the midpoint of each cohort reached 2009. The oldest three groups are supposed to be “Generation X.” The three youngest groups shown in yellow, blue, and green — those born 1980-84, 1985-89, and 1990-94 — are all Millennials according to the common myth. But look how their experience differs!

cohort birth rates ACS.xlsx

Most of the recession’s effect on fertility was felt at young ages, as women postponed births. The oldest Millennial group was in their late twenties when the recession hit, and it appears their fertility was not dramatically affected. The 1985-89 group clearly took a big hit before rebounding. And the youngest group started their childbearing years under the burden of the economic crisis, and if that curve at 25 holds they will not recover. Within this arbitrarily-constructed “generation” is a great divergence of experience driven by the timing of the great recession within their early childbearing years.

You could collapse these six arbitrary birth cohorts into two arbitrary “generations,” and you would see some of the difference I describe. I did that for you in the next figure, which is made from the same data. And you could make up some story about the character and personality of Millennials versus previous generations to fit that data, but you would be losing a lot of information to do that.

cohort birth rates ACS.xlsx

Of course, any categories reduce information — even single years of age — so that’s OK. The problem is when you treat the boundaries between categories as meaningful before you look at the data — in the absence of evidence that they are real with regard to the question at hand.


Filed under In the news

Two examples of why “Millennials” is wrong

When you make up “generation” labels for arbitrary groups based on year of birth, and start attributing personality traits, behaviors, and experiences to them as if they are an actual group, you add more noise than light to our understanding of social trends.

According to generation-guru Pew Research, “millennials” are born during the years 1981-1997. A Pew essay explaining the generations carefully explains that the divisions are arbitrary, and then proceeds to analyze data according to these divisions as if they are already real. (In fact, in the one place the essay talks about differences within generations, with regard to political attitudes, it’s clear that there is no political consistency within them, as they have to differentiate between “early” and “late” members of each “generation.”)

Amazingly, despite countless media reports on these “generations,” especially millennials, in a 2015 Pew survey only 40% of people who are supposed to be millennials could pick the name out of a lineup — that is, asked, “These are some commonly used names for generations. Which of these, if any, do you consider yourself to be?”, and then given the generation names (silent, baby boom, X, millennial), 40% of people born after 1980 picked “millennial.”

“What do they know?” you’re saying. “Millennials.”

Two examples

The generational labels we’re currently saddled with create false divisions between groups that aren’t really groups, and then obscure important variation within the groups that are arbitrarily lumped together. Here is just one example: the employment experience of young men around the 2009 recession.

In this figure, I’ve taken three birth cohorts: men born four years apart in 1983, 1987, and 1991 — all “millennials” by the Pew definition. Using data from the 2001-2015 American Community Surveys via IPUMS.org, the figure shows their employment rates by age, with 2009 marked for each, coming at age 26, 22, and 18 respectively.


Each group took a big hit, but their recoveries look pretty different, with the earlier cohort not recovered as of 2015, while the youngest 1991 group bounced up to surpass the employment rates of the 1987s by age 24. Timing matters. I reckon the year they hit that great recession matters more in their lives than the arbitrary lumping of them all together compared with some other older “generations.”

Next, marriage rates. Here I use the Current Population Survey and analyze the percentage of young adults married by year of birth for people ages 18-29. This is from a regression that controls for year of age and sex, so it can be interpreted as marriage rates for young adults (click to enlarge).


From the beginning of the Baby Boom generation to those born through 1987 (who turned 29 in 2016, the last year of CPS data), the marriage rate fell from 57% to 21%, or 36 percentage points. Most of that change, 22 points, occurred within the Baby Boom. The marriage experience of the “early” and “late” Baby Boomers is not comparable at all. The subsequent “generations” are also marked by continuously falling marriage rates, with no clear demarcation between the groups. (There is probably some fancy math someone could do to confirm that, with regard to marriage experience, group membership by these arbitrary criteria doesn’t tell you more than any other arbitrary grouping would.)

Anyway, there are lots of fascinating and important ways that birth cohort — or other cohort identifiers — matter in people’s lives. And we could learn more about them if we looked at the data before imposing the categories.


Filed under Research reports

Couple fact patterns about sexuality and attitudes

Working on the second edition of my book, The Family, involves updating facts as well as rethinking their presentation, and the choice of what to include. The only way I can do that is by making figures to look at myself. Here are some things I’ve worked up recently; they might not end up in the book, but I think they’re useful anyway.

1. Attitudes on sexuality and related family matters continue to grow more accepting or tolerant, but acceptance of homosexuality is growing faster than the others – at least those measured in the repeated Gallup surveys:


2. Not surprisingly, there is wide divergence in the acceptance of homosexuality across religious groups. This uses the Pew Religious Landscape Study, which includes breakouts for atheists, agnostics, and two kinds of “nones,” or unaffiliated people — those for whom religion is important and those for whom it’s not:


3. Updated same-sex behavior and attraction figures from the National Survey of Family Growth. For some reason the NSFG reports don’t include the rates of same-sex partner behavior in the previous 12 months for women anymore, so I analyzed the data myself, and found a much lower rate of last-year behavior among women than they reported before (which, when I think about it, was unreasonably high – almost as high as the ever-had-same-sex-partner rates for women). Anyway, here it is:


FYI, people who follow me on Twitter get some of this stuff quicker; people who follow on Instagram get it later or not at all.


Filed under Research reports

On Asian-American earnings

In a previous post I showed that generalizations about Asian-American incomes often are misleading, as some groups have above-average incomes and some have below-average incomes (also, divorce rates) and that inequality within Asian-American groups was large as well. In this post I briefly expand that to show breakdowns in individual earnings by gender and national-origin group.

The point is basically the same: This category is usually not useful for economic statistics, and should usually be dropped for data on specific groups when possible.

Today’s news

What’s new is a Pew report by Eileen Patten showing trends in race and gender wage gaps. The report isn’t focused on Asian-American earnings, but they stand out in their charts. This led Charles Murray, who is fixated on what he believes is the genetic origin of Asian cognitive superiority, to tweet sarcastically, “Oppose Asian male privilege!” Here is one of Pew’s charts:


The figure, using the Current Population Survey (CPS), shows Asian men earning about 14.5% more per hour than White men, and Asian women earning 11% more than White women. This is not wrong, exactly, but it’s not good information either, as I’ll argue below.

First a note on data

The CPS data is better for some labor force questions (including wages) than the American Community Survey, which is much larger. However, it’s too small a sample to get into detail on Asian subgroups (notice the Pew report doesn’t mention American Indians, an even smaller group). To do that I will need to activate the ACS, which is better for race/ethnic detail.

As a reminder, this is the “race” question on the 2014 American Community Survey, which I use for this post:


There is no “Asian” or “Pacific Islander” box to check. So what do you do if you are thinking, “I’m Asian, what do I check?” The question is premised on the assumption that this is not what you’re thinking. Instead, you choose from a list of national origins, which the Census Bureau then combines to make “Asian” (the first 7 boxes) and “Pacific Islander” (the last 3) categories. And you can check as many as you like, which is good because there’s a lot of intermarriage among Asians, and between Asians and other groups (mostly Whites). This is a lot like the Hispanic origin question, which also lists national origins, except that question is prefaced by the unifying phrase, “Is Person 1 of Hispanic, Latino, or Spanish origin?” before listing the options, each beginning with “Yes”, as in “Yes, Cuban.”

Although changes have not been announced, it is likely that future questions will combine the race and Hispanic-origin questions, and also preface the Asian categories with the umbrella term. This may mark the progress of getting Asian immigrants to internalize the American racial classification system, so that descendants from groups that in some cases have centuries-old cultural differentiation start to identify and label themselves as from the same racial group (who would have put Pakistanis and Japanese in the same “race” group 100 years ago?). It’s hard to make this progress, naturally, when so many people from these groups are immigrants — in my sample below, for example, 75% of the full-time, year-round workers are foreign-born.


The problem with the earnings chart Pew posted, and which Charles Murray loved, is that it lumps all the different Asian-origin groups together. That is not crazy but it’s not really good. Of course every group has diversity within it, so any category masks differences, but in my opinion this Asian grouping is worse in that regard than most. If someone argued that all these groups see themselves as united under a common identity that would push me in the direction of dropping this complaint. In any event, the diversity is interesting even if you don’t object to the Pew/Census grouping.

Here are two breakouts. The first is immigration. As I noted, 75% of the full-time, year-round workers (excluding self-employed people, like Pew does) with an Asian/Pacific Islander (Asian for short) racial identification are foreign born. That ranges from less than 4% for Hawaiians, to around 20% for the White+Asian multiple-race people, to more than 90% for Asian Indian men. It turns out that the wage advantage is mostly concentrated among these immigrants. Here is a replication of the Pew chart using the ACS data (a little different because I had to use FTFY workers), using the same colors. On the left is their chart, on the right is the same data limited to US-born workers.


Among the US-born workers the Asian male advantage is reduced from 14.5% to 4.2% (the women’s advantage is not much changed; as in Pew’s chart, Hispanics are a mutually exclusive category). There are some very high-earning Asian immigrants, especially Indians. Here are the breakdowns, by gender, comparing each of the larger Asian-American groups to Whites:


Seven groups of men and nine groups of women have hourly earnings higher than Whites’, while nine groups of men and seven groups of women have lower earnings. In fact, among Laotians, Hawaiians, and Hmong, even the men earn less than White women. (Note, in my old post, I showed that Asian household incomes are not as high as they look when they are compared instead with those of their local peers, because they are concentrated in expensive metropolitan markets.)
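The group-by-group comparison boils down to a percent gap against a reference group. Here is a small sketch with made-up hourly wages and placeholder group names; none of these numbers come from the ACS or the Pew report, and whether the underlying statistic is a median or a mean is left open.

```python
def pct_gap(group_wage, reference_wage):
    """Percent by which a group's hourly wage exceeds the reference group's."""
    return 100 * (group_wage - reference_wage) / reference_wage


# Hypothetical hourly wages (illustration only, not real estimates).
hypothetical_wages = {"Group A": 24.0, "Group B": 20.0, "Group C": 16.0}
reference = 20.0  # stands in for the comparison group's wage

gaps = {group: pct_gap(wage, reference) for group, wage in hypothetical_wages.items()}
above = [group for group, gap in gaps.items() if gap > 0]
below = [group for group, gap in gaps.items() if gap < 0]
```

A pooled category averages over groups like A and C here, which is how a single topline gap can hide earnings on both sides of the reference group.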

Sometimes when I have a situation like this I just drop the relatively small, complex group, which leads some people to accuse me of trying to skew results. (For example, I might show a chart that has Blacks in the worst position, even though American Indians have it even worse.)

But generalization has consequences, so we should use it judiciously. In most cases “Asian” doesn’t work well. It may make more sense to group people by regions, such as East-, South-, and Southeast Asia, and/or according to immigrant status.


Filed under In the news

Old people are getting older and younger

The Pew Research Center recently put out a report on the share of U.S. older women living alone. The main finding they reported was a reversal in the long trend toward old women living alone after 1990. After rising to a peak of 38% in 1990, the share of women age 65+ living alone fell to 32% by 2014. It’s a big turnaround. The report attributes it in part to the rising life expectancy of men, so fewer old women are widowed.


The tricky thing about this is the changing age distribution of the old population (the Pew report breaks the group down into 65-84 versus 85+, but doesn’t dwell on the changing relative size of those two groups). Here’s an additional breakdown, from the same Census data Pew used (from IPUMS.org), showing percent living alone by age for women:


Two things in this figure: the percent living alone is much lower for the 65-69s, and the decline in living alone is much sharper in the older women.

The age distribution in the 65+ population has changed in two ways: in the long run it’s getting older as life expectancy at old age increases. However, the Baby Boom (born 1946-1964) started hitting age 65 in 2010, resulting in a big wave of 65-69s pouring into the 65+ population. You can see both trends in the following figure, which shows the age distribution of the 65+ women (the lines sum to 100%). The representation of 80+ women has doubled since 1960, showing longer life expectancy, but look at that spike in the 65-69s!


Given this change in the trends, you can see that the decrease in living alone in the 65+ population partly reflects greater representation of young-old women in the population. These women are less likely to live alone because they’re more likely to still be married.
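To see how composition alone can move the overall number, here is a minimal sketch with hypothetical figures. The age groups, shares, and rates are invented for illustration and are not the Census estimates; the point is only that the overall rate is a share-weighted average of the age-specific rates.

```python
def overall_rate(shares, rates):
    """Share-weighted average of group-specific rates."""
    if abs(sum(shares.values()) - 1) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(shares[group] * rates[group] for group in shares)


# Hypothetical percent-living-alone rates, held fixed in both periods.
rates = {"65-69": 0.20, "70-79": 0.30, "80+": 0.45}

# Hypothetical age distributions before and after an influx of 65-69s.
shares_before = {"65-69": 0.25, "70-79": 0.45, "80+": 0.30}
shares_after = {"65-69": 0.35, "70-79": 0.40, "80+": 0.25}

before = overall_rate(shares_before, rates)
after = overall_rate(shares_after, rates)
composition_effect = after - before  # negative: the overall rate falls
```

In this made-up case the overall rate drops even though no age group changed its behavior, which is the reason to check the age-specific trends separately.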

On the other hand, why is there such a steep drop in living alone among 80+ women? Some of this is the decline in widowhood as men live longer. But it’s an uphill climb, because among this group there is no Baby Boom spike of young-olds (yet) — the 80+ population is still just getting older and older. Here’s the age distribution among 80+ women (these sum to 100 again):


You can see the falling share of 80-84s as the population ages. If this is the group whose rate of living alone has fallen the most because their husbands are living longer, that’s pretty impressive, because the group is aging fast. One boost the not-alones get is that they are increasingly likely to live in extended households: since 1990 there’s been a 5 percentage point increase in them living in households of at least 3 people, from 13% to 18%. Finally, at this age you also have to look at the share living in nursing homes (some of whom seem to be counted as living alone and some not).

In addition to the interesting gerontological questions this all raises, it’s a good reminder that the Baby Boom can have sudden effects on within-group age distributions (as I discussed previously in this post on changing White mortality patterns). Everyone should check their within-group distributions when assessing trends over time.


Filed under In the news