Tag Archives: census

African American marital status by age, Du Bois replication edition

At the 1900 Paris Exposition, sociologist W. E. B. Du Bois presented some of the work of his students. In The Scholar Denied: W. E. B. Du Bois and the Birth of Modern Sociology, Aldon Morris writes:

Du Bois’s meticulousness as a teacher is apparent in the charts and graphs that he prepared with his students. For example, as part of his gold medal-winning exhibit for the 1900 Paris Exposition, Du Bois and his students produced detailed hand-drawn artistically colored graphs and charts that depicted the journey of black Georgians from slavery to freedom.

Some of the collection is shown in this post at the Public Domain Review (shared by Tressie McMillan Cottom yesterday); the full collection is online at the Library of Congress (LOC).

The one that caught my eye was this, showing marital status (“conjugal condition”) by age and sex for the Black population. I can’t find the source details in the LOC record, so I don’t know if it’s Georgia or national, but I presume it’s from tabulations of the 1890 decennial census or earlier:


It’s artistic and meticulous and clearly informative, beautiful. So I tried to make a 2015 update to complement it. I used data from the 2015 American Community Survey via IPUMS.org, and did it a little differently.* Most importantly, I added two more conjugal conditions, cohabiting and separated/divorced. Second, I used five-year age groupings all the way up, instead of ten. Third, I detailed the age groups up to age 85. Here’s what I got:

du bois marstat replication.xlsx
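For anyone who wants to try this kind of tabulation, here is a minimal sketch of the weighted percent in each marital status within five-year age groups. It is not the SAS code used for this post. The records, the `age_group` and `status_shares` helpers, and the open-ended top group at 85 are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical person records: (age, marital status, survey weight).
# These values are invented for illustration; they are not ACS data.
people = [
    (22, "single", 120.0),
    (23, "cohabiting", 80.0),
    (24, "married", 50.0),
    (41, "married", 100.0),
    (43, "separated/divorced", 60.0),
    (44, "single", 40.0),
    (86, "widowed", 90.0),
    (88, "married", 10.0),
]

def age_group(age, width=5, top=85):
    """Label a five-year age group, with an open-ended top group (assumed)."""
    if age >= top:
        return f"{top}+"
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def status_shares(records):
    """Weighted percent in each marital status within each age group."""
    totals = defaultdict(float)   # total weight per age group
    cells = defaultdict(float)    # weight per (age group, status) cell
    for age, status, weight in records:
        group = age_group(age)
        totals[group] += weight
        cells[(group, status)] += weight
    return {key: 100 * w / totals[key[0]] for key, w in cells.items()}

shares = status_shares(people)
for (group, status), pct in sorted(shares.items()):
    print(f"{group:>6} {status:<20} {pct:5.1f}%")
```

Within each age group the percentages sum to 100, which is what lets them stack into a chart like the one above.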

Some very big differences: Much smaller proportions of African Americans married now. Also, much later marriage. In the 1900 figure more than 30% of men and 60% of women have been married by age 25; those numbers are 5-6% now. I don’t know how they counted separated/divorced people in 1900, but those numbers are high now at 31% for women and 24% for men at age 60-64. Widowhood is later now, as 42% of women were widowed before age 65 in 1900, compared with only 13% now (of course, that’s off a lower marriage rate, and remarried people are just counted as married). And of course cohabitation, which the chart doesn’t show for 1900. Note I included people in same-sex as well as different-sex couples.

So, thanks for indulging me. I hope you don’t think it’s frivolous. I just love staring at the old charts, and going through the (very different) steps of replicating it was really satisfying. (I also just love that in another 100 years someone might look back on this and say, “Wait, which one was Earth again?”)

Note: If you want to compare them side-by-side, here’s a go at that. The age ranges don’t line up perfectly but you can get the idea:

* SAS code, ACS data, images, and the spreadsheet used for this post are shared as an Open Science Framework project, here.


Filed under Me @ work

Is there sex selection among Asian immigrants in the US?

There is a 2008 paper reported in the New York Times in 2009, which found skewed sex ratios among children of immigrants from China, Korea, and India, if their older siblings were girls, using the 2000 Census. The implication was that some parents were using IVF or abortion to select boy children if their first two were girls — as is the case in their home countries. There has been some other research on this from the early 2000s, but I haven’t seen it updated since then.

I took a quick stab at it, but don’t have time right now to pursue it more thoroughly. So here’s the quick answer I got, and I shared my data, code, and results in an Open Science Framework project, here. I hope someone will be interested and pursue it further (using my approach or not). The files there include all different ethnic/racial groups.

This is preliminary.

Using the American Community Survey data from 2010-2015, from IPUMS.org, I took U.S.-born children ages 0-5 whose parents were both born in China, Korea, or India and were both present in the household. I counted the sex of any present siblings under age 15 (excluding step- and adopted children). Then I restricted the data to those with 2 older siblings, and compared the sex ratios among those who had 0 or 1 older sister to those who had 2 older sisters. I did this in a logistic regression controlling for individual years of age, and using ACS person weights. There are judgment calls to make about age, siblings, data and other issues. The older the children you include, the more likely you are to have kids moving out in a way that is not sex-neutral (for example, if parents with girls are more or less likely to divorce), and so on. Should parents be matched on immigration status? Should siblings born abroad be included? Why the years 2010-2015? And so on. This is what I mean by preliminary. But these results are interesting enough to prompt me to post them and encourage discussion and more analysis.
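To make the comparison concrete, here is a rough sketch of its unadjusted core: the weighted share of boys among children with 0 or 1 older sister versus 2 older sisters, and the odds ratio between them. It is not the shared code from the OSF project, and it leaves out the age controls. The records and the `weighted_boy_share` and `odds` helpers are hypothetical.

```python
# Hypothetical child records: (sex, number of older sisters, person weight).
# Every child here is assumed to have exactly two older siblings.
# The numbers are invented for illustration; they are not ACS results.
children = [
    ("M", 0, 100.0), ("F", 0, 100.0),
    ("M", 1, 150.0), ("F", 1, 150.0),
    ("M", 2, 120.0), ("F", 2, 80.0),
]

def weighted_boy_share(records, sister_counts):
    """Weighted proportion male among children with the given sister counts."""
    boys = sum(w for sex, sisters, w in records
               if sisters in sister_counts and sex == "M")
    total = sum(w for sex, sisters, w in records if sisters in sister_counts)
    return boys / total

def odds(p):
    """Convert a proportion to odds."""
    return p / (1 - p)

share_mixed = weighted_boy_share(children, {0, 1})    # 0 or 1 older sister
share_two_girls = weighted_boy_share(children, {2})   # 2 older sisters
odds_ratio = odds(share_two_girls) / odds(share_mixed)

print(f"boys, 0/1 older sister: {share_mixed:.3f}")
print(f"boys, 2 older sisters:  {share_two_girls:.3f}")
print(f"odds ratio:             {odds_ratio:.2f}")
```

In a logistic regression with no other covariates, the coefficient on the "2 older sisters" indicator is the log of this odds ratio; adding single years of age adjusts it.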

Here’s what I got:

sex selection.xlsx

The sex differences between those with 0/1 older sister and 2 older sisters are not statistically significant at p < .05 in any of the three groups, but they are for the combined set (p = .046). These comparisons involve a few hundred cases. Here are the unweighted, unadjusted results:


As you can see, just a few families intervening to choose boys — or some other force rearranging the living arrangements, or survival, of children and families, and the difference would not hold. Still, I think it’s worth pursuing. Maybe someone already has. If you decide to get into it, feel free to use this stuff, and let me know what you come up with!


Filed under Me @ work

Why Heritage is wrong on the new Census race/ethnicity question

Sorry this is long and rambly. I just want to get the main points down and I’m in the middle of other things. I hope it helps.

Mike Gonzalez, a Bush-era speech writer with no background in demography (not that there’s anything wrong with that), now a PR person for the Heritage Foundation, has written a noxious and divisive op-ed in the Washington Post that spreads some completely wrong information about the U.S. Census Bureau’s attempts to improve data collection on race and ethnicity. It’s also a scary warning of what the far right politicization of the Census Bureau might mean for social science and democracy.

Gonzalez is upset that “the Obama administration is rushing to institute changes in racial classifications,” which include two major changes: combining the Hispanic/Latino Origin question with the Race question, and adding a new category, Middle Eastern or North African (MENA). Gonzalez (who, it must be noted, perhaps with some sympathy, recently wrote one of those useless books about how the Republican party can reach Hispanics, made instantly obsolete by Trump), says that what Obama has in mind “will only aggravate the volatile social frictions that created today’s poisonous political climate in the first place.” Yes, the “poisonous political climate” he is upset about (did I mention he works for the Heritage Foundation?) is the result of the way the government divides people by race and ethnicity. Not actually dividing them, of course (which is a real problem), but dividing them on Census forms. (I hadn’t heard this particular version of why Trump is Obama’s fault — who knew?)

How will the new reforms make the Trump situation he helped create worse? Basically, by measuring race and ethnicity, which Gonzalez would rather not do (as suggested by the title, “Think of America as one people? The census begs to differ,” which could have been written at any time in the past two centuries).

Specifically, Gonzalez claims, completely factually inaccurately, that Census would “eliminate a second question that lets [Hispanics] also choose their race.” By combining Hispanic origin and race into one question — on which, as before, people will be free to mark as many responses as they like — Gonzalez thinks Census would “effectively make ‘Hispanic’ their sole racial identifier.” He is upset that many Latinos will not identify themselves as “White” if they have the option of “Hispanic” on the same question, even if they are free to mark both (which he doesn’t mention). Some will, but that is not because anyone is taking away any of their choices.

The Census Bureau, of course, because they always do, because they are excellent, has done years of research on these questions, including all the major stakeholders in a long interactive process that is scrupulously documented and (for a government bureaucracy) quite transparent. Naturally not everyone is happy, but in the end the trained demographic professionals come down on the side of the best science.

Race that Latino

The most recent report on the research I found was a presentation by Nicholas Jones and Michael Bentley from the Census Bureau. This is my source for the research on the new question.

First, why combine Hispanic with race? You have probably seen the phrase “Hispanics may be of any race” on lots of reports that use Census or other government data. The figure below is from the first edition of my book, using 2010 data, in which I group all 50 million Hispanics, and show the races they chose: about half White, the rest other race or more than one race (usually White and other race). Notice that by this convention Hispanics are removed from the White group anyway, just because we don’t want to have people in the same picture twice (“non-Hispanic Whites” is already a common construction).


The “may be of any race” language is the awkward outcome of an approach that treats Hispanic as an “ethnicity” (actually a bunch of national origins, maybe a panethnicity), while White, Black, Asian, Pacific Islander, and American Indian are treated as “races.” The distinction never really made sense. These things have been measured using self-identification for more than half a century, so we’re not talking about genetics and blood tests, we’re talking about how people identify themselves. And there just isn’t a major categorical difference between race and ethnicity for most people: people of any race or ethnicity may identify with a specific national origin (Italian, Pakistani, Mexican), as well as a “race” or panethnic identity such as Asian, or Latino. And now that the government allows people to select multiple races (since 2000), as well as answering the Hispanic question, there really is no good justification for keeping them separate. As you can see from my figure above, when we analyze the data we mostly pull all the Hispanics together regardless of their races. The new approach just encourages them to decide how they want that done, which is usually a better approach.

Of course, Asians and Pacific Islanders have been answering the “race” question with national origin prompts for several decades. There was no “Asian” checkbox in 2000 or 2010 (or on the American Community Survey). So they have been using their ethnicity to answer the race question all along. That’s because for some reason you just can’t get “Asian” immigrants, especially recent immigrants (that is, people from India, Korea, Japan, Vietnam, and so on), to see themselves as part of one panethnic group. Go figure, must be the centuries of considering themselves separate peoples, even “races.” So, a new question that combines the more ethnic categories (Mexican, Pakistani, etc.), with America’s racial identities (Black, White, etc.), just works better, as long as you let people check as many boxes as they want. This is what the “race” question looked like in 2014. Note there is no “Asian” checkbox:


As a general guide, the questionnaire scheme works best when (a) everyone has a category they like, and (b) few people choose “other.” That is the system that will yield the most scientifically useful data. It also will tend to match the way people interact socially, including how they discriminate against each other, burn crosses on each other’s lawns, and randomly attack each other in public. We want data that helps us understand all that.

Through extensive testing, it has become apparent that, when given a question that offers both race and Hispanic origin together, Latino respondents are much more likely to answer Hispanic/Latino only, rather than cluttering up the race question with “some other race” responses (often writing in “Hispanic” or “Latino” as their “other race”). If I read the presentation right, in round numbers, given the choice of answering the “race” question with “Hispanic,” in the test data about 70% chose Hispanic alone, about 20% chose White along with Hispanic, and 5% chose two races. In fact, the number of Latinos saying their only race is White probably won’t change much; the biggest difference is that you no longer have almost 40% of Latinos saying they are “some other race,” or choosing more than one race (usually White and Other) which usually just means they don’t see a race that fits them on the list.

In the end, the sizes of the major groups (Hispanics and the major races) are not changed much. Here’s the summary:


In fact, the only major group that will shrink is probably the non-group “multiracial” population, which today is dominated by Hispanics choosing White and “some other race.”

It’s really just better data. It’s not a conspiracy. It’s not eliminating the White race or discouraging assimilation of Hispanics. In short, keep calm and collect better data. We can fight about all that other stuff, too.

I’m sure Gonzalez doesn’t really think this will “eliminate Hispanics’ racial choices.” He’s dog-whistling to people who think the government is trying to reduce the number of Whites by not letting Hispanics be White. His statements are factually incorrect and the Washington Post shouldn’t have printed them. (I don’t know how the Post does Op-Eds; when I wrote one for the NY Times it was pretty thoroughly fact-checked.)


The Migration Policy Institute estimates there are about 2 million MENAs in the U.S. now, about half of them immigrants. This is a pretty small population, mostly Arabic-speaking immigrants and their descendants, and more Christian (relative to Muslim) than the countries they left. This is especially true of the more recent immigrants, who don’t include a lot of Iranians (who aren’t Arab).

Census could have instead defined them by linguistic origin (Arab), and captured most, but they instead are going with country of origin, which is consistent with how the other race/ethnic groups are identified (for better or worse). Their testing showed that this measure captures most people with MENA ancestry, encourages them to identify their ancestry, cuts down on them identifying as White, and cuts down on them using “some other race.”

The difference is dramatic for those identifying as White, which fell from 85% to 20% in the test once a MENA category was offered. Would it be better if they just identified as White? I’m really not trying to shrink the count of Whites, I just think this is more accurate. I don’t care about the biology of Whiteness and whether Iranians are part of it, for example (and don’t ever say “Caucasian,” please), I care about the experience and identity of the people we’re talking about — as well as the beliefs of the people who hate them and those who want to protect them from discrimination. Counting them seems better than shoehorning them into a category most of them avoid when given the chance.

Here’s one version of the proposed new combined question, from that Census presentation:



Why not Mike Gonzalez to run Census? Unbelievably, he probably knows more about it than Trump’s education and HUD department heads know about their new portfolios.

But that’s just one odious possibility. It makes me kind of sick to think of the possible idiots and fanatics Trump might put in charge of the Census Bureau, after all this work on research and testing, designed to get the best data we can out of a very messy and imperfect situation.

What else would they do? Will they continue to develop ways to identify and count same-sex couples? The Supreme Court says they can get married, but there is no law that says the Census Bureau has to count them. What about multilingual efforts to reach immigrant communities? This has been a focus of Census Bureau development as well. And so on.

It is absolutely in Trump’s interest, and the interests of those who he serves (not the people who voted for him), to reduce the quality and quantity of social science data the government produces and enables us to produce.


Filed under In the news

How do Black-White parents identify their children?

In 2015 the American Community Survey yields an estimate of 66,913 infants who have one Black parent and one White parent present in the household. (Either parent may be multiracial, too.)

What is the race of those infants? 73% of them were identified as both White and Black by whoever filled out the Census form.


(Note “other” doesn’t mean they specified “other,” it just means they used some other combination of races.)

These are children age 0 living with both parents, so it’s a pretty good bet they’re mostly biological parents, though some are presumably adopted. This is based on a sample of 507 such infants. If you pooled some years of ACS there is plenty to study here. Someone may already have done this – feel free to post in the comments.
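With only 507 sample cases it is worth a quick look at how precise that 73% is. Here is a back-of-the-envelope sketch using the figures above. It assumes simple random sampling, which the ACS is not, so the real margin of error would differ (the Census Bureau's replicate weights are the proper tool).

```python
import math

share = 0.73            # share identified as both White and Black (from the post)
n = 507                 # unweighted sample infants (from the post)
weighted_total = 66913  # weighted estimate of such infants (from the post)

# Simple-random-sample approximation; this ignores the ACS sample design.
se = math.sqrt(share * (1 - share) / n)
low, high = share - 1.96 * se, share + 1.96 * se

# Implied weighted count of infants identified as both White and Black.
implied_count = weighted_total * share

print(f"standard error: {se:.3f}")
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
print(f"implied count: {implied_count:,.0f}")
```

Under that assumption the interval is roughly plus or minus 4 percentage points around 73%, which is one more reason to pool several years of ACS.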

That’s it, just FYI.


Filed under In the news

Poverty, marriage, and single mother update

With the annual Census report on poverty out, here are two quick updates.

First, updating this post, the share of all poverty (using official rates) found in single-mother families remains lower than it was from 1974 to 2000. Since 1995, as the poverty rate has gone up and down between 10 and 15 percent, the share of poor people in single-mother families has fallen. As of 2015, 34% of poor people are found in single-mother families.


Marriage has declined, and single motherhood has increased, but that has not produced a poverty population more dominated by single-mother families. Of course these families are more likely to be poor than married-couple families, but they’re not the main poverty story.

Second, updating this post a little, it’s important to keep two major trends in the back of your mind when thinking about social change. The first is that marriage has declined precipitously since 1960. Its unremitting decline is one of the major social facts of our time. The other trend to keep in mind is that poverty rates fell a lot after the 1960s, but since then have bounced around at an atrocious 10-15%. Now try to keep them both in mind at once: marriage falls, poverty goes up and down. This year’s update puts those together (sorry people who hate this kind of figure), as change in the percentage of women married, and change in the percentage of the population poor.


For a recent op-ed on poverty and marriage, here’s the unpaywalled version of my essay in the Washington Post’s Post Everything.


Filed under Uncategorized

On Asian-American earnings

In a previous post I showed that generalizations about Asian-American incomes often are misleading, as some groups have above-average incomes and some have below-average incomes (also, divorce rates) and that inequality within Asian-American groups was large as well. In this post I briefly expand that to show breakdowns in individual earnings by gender and national-origin group.

The point is basically the same: This category is usually not useful for economic statistics, and should usually be dropped in favor of data on specific groups when possible.

Today’s news

What’s new is a Pew report by Eileen Patten showing trends in race and gender wage gaps. The report isn’t focused on Asian-American earnings, but they stand out in their charts. This led Charles Murray, who is fixated on what he believes is the genetic origin of Asian cognitive superiority, to tweet sarcastically, “Oppose Asian male privilege!” Here is one of Pew’s charts:


The figure, using the Current Population Survey (CPS), shows Asian men earning about 14.5% more per hour than White men, and Asian women earning 11% more than White women. This is not wrong, exactly, but it’s not good information either, as I’ll argue below.

First a note on data

The CPS data is better for some labor force questions (including wages) than the American Community Survey, which is much larger. However, it’s too small a sample to get into detail on Asian subgroups (notice the Pew report doesn’t mention American Indians, an even smaller group). To do that I will need to activate the ACS, which is better for race/ethnic detail.

As a reminder, this is the “race” question on the 2014 American Community Survey, which I use for this post:


There is no “Asian” or “Pacific Islander” box to check. So what do you do if you are thinking, “I’m Asian, what do I check?” The question is premised on the assumption that this is not what you’re thinking. Instead, you choose from a list of national origins, which the Census Bureau then combines to make “Asian” (the first 7 boxes) and “Pacific Islander” (the last 3) categories. And you can check as many as you like, which is good because there’s a lot of intermarriage among Asians, and between Asians and other groups (mostly Whites). This is a lot like the Hispanic origin question, which also lists national origins, except that question is prefaced by the unifying phrase, “Is Person 1 of Hispanic, Latino, or Spanish origin?” before listing the options, each beginning with “Yes”, as in “Yes, Cuban.”

Although changes have not been announced, it is likely that future questions will combine the race and Hispanic-origin questions, and also preface the Asian categories with the umbrella term. This may mark the progress of getting Asian immigrants to internalize the American racial classification system, so that descendants from groups that in some cases have centuries-old cultural differentiation start to identify and label themselves as from the same racial group (who would have put Pakistanis and Japanese in the same “race” group 100 years ago?). It’s hard to make this progress, naturally, when so many people from these groups are immigrants — in my sample below, for example, 75% of the full-time, year-round workers are foreign-born.


The problem with the earnings chart Pew posted, and which Charles Murray loved, is that it lumps all the different Asian-origin groups together. That is not crazy but it’s not really good. Of course every group has diversity within it, so any category masks differences, but in my opinion this Asian grouping is worse in that regard than most. If someone argued that all these groups see themselves as united under a common identity that would push me in the direction of dropping this complaint. In any event, the diversity is interesting even if you don’t object to the Pew/Census grouping.

Here are two breakouts. The first is immigration. As I noted, 75% of the full-time, year-round workers (excluding self-employed people, like Pew does) with an Asian/Pacific Islander (Asian for short) racial identification are foreign born. That ranges from less than 4% for Hawaiians, to around 20% for the White+Asian multiple-race people, to more than 90% for Asian Indian men. It turns out that the wage advantage is mostly concentrated among these immigrants. Here is a replication of the Pew chart using the ACS data (a little different because I had to use FTFY workers), using the same colors. On the left is their chart, on the right is the same data limited to US-born workers.


Among the US-born workers the Asian male advantage is reduced from 14.5% to 4.2% (the women’s advantage is not much changed; as in Pew’s chart, Hispanics are a mutually exclusive category.) There are some very high-earning Asian immigrants, especially Indians. Here are the breakdowns, by gender, comparing each of the larger Asian-American groups to Whites:


Seven groups of men and nine groups of women have hourly earnings higher than Whites’, while nine groups of men and seven groups of women have lower earnings. In fact, among Laotians, Hawaiians, and Hmong, even the men earn less than White women. (Note, in my old post, I showed that Asian household incomes are not as high as they look when they are compared instead with those of their local peers, because they are concentrated in expensive metropolitan markets.)
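The percent gaps discussed above can be sketched as follows. This is not the code behind the charts. It assumes the comparison uses weighted median hourly earnings (the post does not say which statistic was used), and the group names, wages, and `weighted_median` and `percent_gaps` helpers are hypothetical.

```python
from collections import defaultdict

# Hypothetical workers: (group, hourly earnings in dollars, person weight).
# "Group A" and "Group B" are placeholders, not real national-origin groups.
workers = [
    ("White", 18.0, 1.0), ("White", 20.0, 1.0), ("White", 25.0, 1.0),
    ("Group A", 22.0, 1.0), ("Group A", 24.0, 2.0), ("Group A", 30.0, 1.0),
    ("Group B", 15.0, 1.0), ("Group B", 16.0, 1.0), ("Group B", 21.0, 1.0),
]

def weighted_median(pairs):
    """Smallest value at which cumulative weight reaches half the total."""
    pairs = sorted(pairs)
    half = sum(weight for _, weight in pairs) / 2
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value

def percent_gaps(records, reference="White"):
    """Percent difference in median earnings, each group vs. the reference."""
    by_group = defaultdict(list)
    for group, wage, weight in records:
        by_group[group].append((wage, weight))
    medians = {g: weighted_median(p) for g, p in by_group.items()}
    base = medians[reference]
    return {g: 100 * (m - base) / base
            for g, m in medians.items() if g != reference}

gaps = percent_gaps(workers)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.1f}% relative to White")
```

Running this separately for men and women, and again for US-born workers only, would give the kind of breakdown shown in the charts.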

Sometimes when I have a situation like this I just drop the relatively small, complex group, which leads some people to accuse me of trying to skew results. (For example, I might show a chart that has Blacks in the worst position, even though American Indians have it even worse.)

But generalization has consequences, so we should use it judiciously. In most cases “Asian” doesn’t work well. It may make more sense to group people by regions, such as East-, South-, and Southeast Asia, and/or according to immigrant status.


Filed under In the news

Update: Adjusted divorce risk, 2008-2014

Quick update to yesterday’s post, which showed this declining refined divorce rate for the years 2008-2014:

On Twitter Kelly Raley suggested this could have to do with increasing education levels among married people. As I’ve reported using these data before, there is a much lower divorce risk for people with BA degrees or higher education.

Yesterday I quickly (but I hope accurately) replicated my basic model from that previous paper, so now I can show the trend as a marginal effect of year holding constant marital duration (from year of marriage), age, education, race/ethnicity, and nativity.*

2014 update

This shows that there has been a decrease in the adjusted odds of divorce from 2008 to 2014. You could interpret this as a continuous decline with a major detour caused by the recession, but that case is weaker than it was yesterday, looking at just the unadjusted trend.
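For reference, the unadjusted trend rests on a refined divorce rate. Here is a minimal sketch, assuming the usual definition of divorces in the past year per 1,000 people at risk (those married a year earlier). The records and the `refined_divorce_rate` helper are made up for illustration. This is not my Stata code, and it does none of the adjustment described above.

```python
from collections import defaultdict

# Hypothetical records: (survey year, divorced in past 12 months?, person weight).
# Each record is someone assumed to be at risk of divorce in that year.
# The numbers are invented; they are not ACS estimates.
records = [
    (2008, True, 20.0), (2008, False, 980.0),
    (2014, True, 17.0), (2014, False, 983.0),
]

def refined_divorce_rate(rows):
    """Weighted divorces per 1,000 people at risk, by survey year."""
    divorces = defaultdict(float)
    at_risk = defaultdict(float)
    for year, divorced, weight in rows:
        at_risk[year] += weight
        if divorced:
            divorces[year] += weight
    return {year: 1000 * divorces[year] / at_risk[year] for year in at_risk}

rates = refined_divorce_rate(records)
for year in sorted(rates):
    print(f"{year}: {rates[year]:.1f} divorces per 1,000 at risk")
```

The adjusted version replaces this simple ratio with a model of each person's odds of divorce, so the year effect holds marital duration, age, education, race/ethnicity, and nativity constant.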

If it turns out that increase in 2010-2012 is related to the recession, it’s not so different from my original view — a recession drop followed by rebound, it’s just that the drop is less and the rebound is more, and took longer, than I thought.  In any event, this should undermine any effort to resuscitate the old idea that the recession caused a decline in divorce by causing families to pull together during troubled times.

This does not contradict the results from Kennedy and Ruggles that show age-adjusted divorce rising between 1980 and 2008, since I’m not trying to compare these ACS trends with the older data sources. For time beyond 2008, they wrote in that paper:

If current trends continue, overall age-standardized divorce rates could level off or even decline over the next few decades. We argue that the leveling of divorce among persons born since 1980 probably reflects the increasing selectivity of marriage.

That would fit the idea of a long-term decline with a stress-induced recession bounce (with real-estate delay).

Alternative interpretations welcome.

* This takes a really long time for Stata to compute on my sad little public-university computer because it’s a non-linear model with 4.8 million cases – so please don’t ask for a lot of different iterations of this figure. I don’t have my code and output cleaned up for sharing, but if you ask me I’ll happily send it to you.


Filed under In the news