The Pew Research Center last week released a lengthy research report on Asians in the U.S., titled “The Rise of Asian Americans.” It combines information from the Census and government sources with the results of Pew’s own national survey of attitudes and opinions.
The report has lots of good information, but there are some thorny problems here. I’ll describe a few problems, then offer one data exercise to help clarify. This gets technical and it’s long, so I will give you the substantive conclusion at the top:
- Because Asians are a diverse category made up of groups with very different profiles, and their household composition and geographic distribution vary by national origin group, generalizations are often unhelpful.
- Among the 10 largest Asian groups, five (Japanese, Indian, Chinese, Filipino, Korean) are above average in income and five (Vietnamese, Pakistani, Laotian, Cambodian, Hmong) are below. But all 10 Asian groups are doing better compared to the national average than they are compared to the average incomes in the places they live — they are richer nationally than they are locally.
- The amount of income inequality within Asian groups varies as well. Pakistanis, Chinese, Koreans and Indians have the highest levels of inequality, while Filipinos and Laotians have low levels of inequality.
But first: Who is Asian? On the Census questionnaire, Asian is not exactly a category – rather, the category is created from all the responses of people who specify Asian national origins in the race question. To refresh, this is the question:
So “Asian” is all the people who specify Asian Indian, Chinese, Filipino, Japanese, Korean, Vietnamese or “Other Asian.” (The right-hand column is for Pacific Islanders.) Yes, in the U.S., Hispanic/Latino national origins are “ethnicities,” but Asian national origins are “races.” Go figure.
That lack of a common definition is compounded by two factors: First, there is so much diversity among Asians that using a single category is as challenging statistically as it is politically. And second, Asians – as the Pew report shows – have a high rate of intermarriage with Whites, as well as (among some groups) across Asian national-origin lines. As a result, some Asian groups have high rates of “multiple-race identification,” especially those whose immigration was generations ago.
The controversy over the Pew report is summarized in this Color Lines story and this response from the Asian American / Pacific Islander Policy Research Consortium. The gist of it is that the report was too rosy in its description of Asian advantages and too homogenizing in its treatment of Asian diversity – as a result repeating the “divisive trope” of the “model minority.” Here’s part of the summary from the New York Times:
Drawing on Census Bureau and other government data as well as telephone surveys from Jan. 3 to March 27 of more than 3,500 people of Asian descent, the 214-page study found that Asians are the highest-earning and best-educated racial group in the country.
Among Asians 25 or older, 49 percent hold a college degree, compared with 28 percent of all people in that age range in the United States. Median annual household income among Asians is $66,000 versus $49,800 among the general population.
In the survey, Asians are also distinguished by their emphasis on traditional family mores. About 54 percent of the respondents, compared with 34 percent of all adults in the country, said having a successful marriage was one of the most important goals in life; another was being a good parent, according to 67 percent of Asian adults, compared with about half of all adults in the general population.
Asians also place greater importance on career and material success, the study reported, values reflected in child-rearing styles. About 62 percent of Asians in the United States believe that most American parents do not put enough pressure on their children to do well in school.
Did Pew homogenize or glorify too much? I don’t know. Here’s a graph from the report, which shows that Asian groups differ, but they all have higher-than-average household incomes:
The Color Lines story quotes Deepa Iyer, head of the National Council of Asian Pacific Americans and executive director of South Asian Americans Leading Together:
The danger in framing the study the way Pew did, and the way the media picked up on it, is that folks who are in the general public and institutional stakeholders and policy makers might get the impression that they don’t necessarily need to dig deep into our communities to understand any sort of disparities that exist.
The problem of homogenizing Asians is longstanding in American sociology. In most data analyses, the Asian sample is small to begin with, so they are often collapsed into one category (which I’ve done) or dropped from the story (which I’ve also done, angering some readers). Here is a typical passage, from a 2001 article by Leslie McCall:
That didn’t stop her (or lots of other people) from extensively analyzing Asians as a combined group, and offering speculation on her results.
There are other examples. In my experience, Jen’nan Read and I broke out six Asian groups for a study of women’s employment with the 2000 Decennial Census data — which reinforced my conviction that disaggregating is best. (This 2010 Census report gives some detail on more than 20 national-origin groups.)
Some new numbers
Anyway, I’ve got four specific issues to address with Pew’s comparison of household incomes (some of which they acknowledge in the report): a) household composition differs between groups (more or fewer kids, grandparents); b) Asians disproportionately live in parts of the U.S. with high costs of living (like Hawaii and California, and urban areas generally); c) different members of a household might have different “race” identities (so, a Korean man married to a Chinese woman might define their child as either or both); and d) levels of inequality differ between groups, so central tendency comparisons don’t capture the whole story.
In this exercise I address these problems. I adjust for household size and composition, count individuals’ own “race” rather than imposing a single identity on the household, compare incomes to the average in the local metropolitan area as well as the national average, and compare levels of within-group inequality.
All in one blog post! Someone might want to work this up into a real paper (and maybe someone else already has? The last time I really read about this was more than 10 years ago.) So I’m just offering this approach as a suggestion, and making my code available if anyone wants to pursue it (see below).
I use the 2006-2010 combined American Community Survey, from IPUMS, for maximum recent sample size. This is about 15 million people, and the Asian samples range from about 160,000 Chinese to 7,500 Laotians. I identify individuals according to their individual “race.”
I calculate their incomes as per capita household income, adjusted for economies of scale. To do that, I count adults as 1 person, kids under 18 as .7 of a person, and divide the total household income by that count to the power of .65 for economies of scale (see here for details). Then I take the natural log of all that to pull in the right tail of the distribution (so the mean isn’t pulled up by the ~1%). When I’m done, everyone in the household has the same income, and the distribution is pretty normal. Nice!
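The adjustment described above can be sketched in a few lines. This is a Python illustration, not the SAS used for the post; the function name, its parameter names, and the example household are hypothetical, while the 0.7 weight for kids and the 0.65 exponent come from the description above.

```python
import math

def equivalized_log_income(hh_income, adults, kids, kid_weight=0.7, scale=0.65):
    """Natural log of household income divided by (adults + kid_weight * kids) ** scale.

    Every member of the household gets this same value.
    """
    if hh_income <= 0:
        raise ValueError("household income must be positive to take a log")
    units = adults + kid_weight * kids   # adults count as 1, kids as 0.7
    equiv = hh_income / units ** scale   # economies-of-scale adjustment
    return math.log(equiv)

# Hypothetical household: two adults, two kids, $80,000 total income.
ln_equiv = equivalized_log_income(80_000, adults=2, kids=2)
print(f"logged adjusted income: {ln_equiv:.2f}")
print(f"back in dollars: ${math.exp(ln_equiv):,.0f}")
```

A single adult living alone has a divisor of 1, so the adjusted income is just the household income; larger households get scaled down, but by less than a straight per-person division would.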
To see what this does: The mean household income for individuals in the country in 2006-2010 is $79,174, and the natural log of the composition-and-scale adjusted per capita income is 10.26 (see figure), which works out to $28,439. In comparison, the logged incomes for Asians range from 10.6 (~$40,000) for Indians and Japanese, down to 9.7 (~$16,000) for Hmong.
To deal with the issue of living in expensive areas, I take the mean of that logged income in each metropolitan area, and compare each person’s own per capita income to that. So a score of 0 means you have the average income in your area — more than 0 means richer than average, less than zero is poorer.
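As a rough sketch of that comparison (in Python rather than the post's SAS, with made-up records and hypothetical field names), each person's score is their logged income minus the weighted mean of logged income in their metro area:

```python
from collections import defaultdict

def relative_to_metro(people):
    """Return each person's log income minus the weighted mean log income of their metro.

    `people` is a list of dicts with hypothetical keys 'metro', 'ln_income', 'weight'.
    """
    totals = defaultdict(float)
    weights = defaultdict(float)
    for p in people:
        totals[p["metro"]] += p["ln_income"] * p["weight"]
        weights[p["metro"]] += p["weight"]
    metro_mean = {m: totals[m] / weights[m] for m in totals}
    return [p["ln_income"] - metro_mean[p["metro"]] for p in people]

# Made-up records: two metro areas with different average incomes.
people = [
    {"metro": "A", "ln_income": 10.0, "weight": 1.0},
    {"metro": "A", "ln_income": 11.0, "weight": 1.0},
    {"metro": "B", "ln_income": 9.5, "weight": 2.0},
    {"metro": "B", "ln_income": 10.5, "weight": 2.0},
]
scores = relative_to_metro(people)
print(scores)
```

In this toy data the person with a logged income of 10.0 in metro A and the person with 9.5 in metro B get the same score, because each is equally far below their own local mean.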
There is not one correct answer about how to do this: Having an average income in a rich area still means you can buy more stuff on Amazon than someone with a lower absolute income. But it might also mean having a smaller house, or not being considered rich by your neighbors. On the third hand, if a rich family moves to a rich area, we shouldn’t feel sorry for them for not being above average in their neighborhood. For your consideration, I show the incomes compared with the national average and with the local metro mean, for the 10 largest Asian groups:
To interpret the figure, you can see that Japanese and Indians are about 0.36 higher in log dollars than the national average but only 0.26 higher than their metro-area averages. On the downside, Hmong individuals have adjusted per capita incomes of 0.58 less than the national average, but 0.63 less than their local average.
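Gaps in log dollars can be hard to read. One common way to translate them, assuming each gap is a difference in mean logged income, is to exponentiate: a gap of g corresponds to a ratio of exp(g) between the groups' geometric-mean incomes. The Python sketch below applies that to the four gaps quoted above; the function name is hypothetical.

```python
import math

def log_gap_to_percent(log_gap):
    """Convert a difference in mean log income to a percent difference in geometric means."""
    return (math.exp(log_gap) - 1) * 100

gaps = {
    "Japanese/Indian vs. national": 0.36,
    "Japanese/Indian vs. metro": 0.26,
    "Hmong vs. national": -0.58,
    "Hmong vs. metro": -0.63,
}
for label, gap in gaps.items():
    print(f"{label}: {log_gap_to_percent(gap):+.1f}%")
```

On this reading, 0.36 is roughly 43% above the national figure and 0.26 roughly 30% above the local one, while -0.58 and -0.63 are roughly 44% and 47% below.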
Higher-than-average-income Japanese, Indians, Filipinos and Chinese are about 73% of the total; Koreans are about average, and the lower-than-average groups are 17% of the total. By this method, then, a big majority of Asians in the U.S. belong to above-local-average income groups, but a substantial fraction are well below average. And they are all doing worse relative to their metro area neighbors than they are to the national average.
Notice how it’s different from the Pew figure. In that, Vietnamese households had higher incomes than Koreans, and both were above the national average. Here Koreans are doing substantially better, mostly as a result of the household size adjustments. Also, the smaller groups I show – the ones Pew did not detail in that figure – are the poorer ones. And they are also doing worse locally relative to their national position.
Finally, consider the inequality within groups. Without doing a full-blown analysis of this, I can show the importance of the question with a simple box-and-whisker plot. This shows the distribution of income — adjusted as described above for household composition and size — for each group, including non-Asians for comparison.
The graph shows a lot of information in a small space:
- The line through the middle of each box is the median, or mid point, of each income distribution.
- The blue + sign is the mean. The further the mean is above the median, the more rich people there are pulling the mean up.
- The top and bottom of the boxes are the 75th and 25th percentiles. The further apart they are, the greater the income gap between top and bottom.
(The top whiskers, which can be used to show the highest point in each distribution, aren’t shown here, because they’re so far away it would make the graph unreadable.)
As I mentioned at the top, the graph shows that Pakistanis and Chinese, and to a lesser extent Koreans and Indians, have high levels of inequality — their + signs are far from their median lines, and their 75/25 spreads are large. On the other hand, Filipinos, Laotians and Hmong have much narrower spreads.
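The quantities read off the box plot (median, mean, 25th and 75th percentiles) are easy to compute directly. The following is a minimal Python sketch with made-up incomes for two hypothetical groups, unweighted for simplicity; it is not the code behind the figure.

```python
import statistics

def box_summary(incomes):
    """Summary statistics of the kind a box plot displays, plus two inequality hints."""
    q1, median, q3 = statistics.quantiles(incomes, n=4, method="inclusive")
    mean = statistics.fmean(incomes)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": mean,
        "iqr": q3 - q1,                      # the 75/25 spread (height of the box)
        "mean_minus_median": mean - median,  # how far the + sits above the median line
    }

# Made-up incomes (in $1,000s): one even group, one with a single very high earner.
even = box_summary([10, 20, 30, 40, 50])
skewed = box_summary([10, 20, 30, 40, 200])
print(even)
print(skewed)
```

The two toy groups have the same box, but the mean of the second sits well above its median, which is the pattern described above for groups with a long right tail.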
Practically speaking, all this means that some groups are misrepresented by measures of the overall status of “Asians,” especially the smaller, poorer groups. And further, that generalizing will represent some groups worse than others because of their internal diversity. For example, the average Chinese American is quite a bit richer than the average non-Asian American, but the poorest 25% of Chinese are not much better off than the poorest 25% of the population at large.
Like I said, just an idea, with a few examples.
Take it away
Feel free to do it more, and/or better, yourself. Here’s my SAS code. Please credit me if it works, but don’t blame me if it’s wrong. This has not been peer-reviewed – it’s rough work product. Send any corrections written on the back of a $20 bill. (Everyone else: You can stop reading now!)
Just get these variables from IPUMS:
SERIAL METAREA HHINCOME PERWT AGE RACED
And then do this to them:
/* read in the IPUMS extract; "ipums" is a placeholder dataset name (the original listing starts inside this data step) */
data temp;
  set ipums;

  /* exclude households with no income */
  if hhincome>0;

  /* this codes folks into this scheme, with Asians from richest to poorest:
       0="Not Asian"
       1="Japanese"
       2="Indian"
       3="Filipino"
       4="Chinese"
       5="Korean"
       6="Vietnam"
       7="Pakistani"
       8="Laotian"
       9="Cambodian"
      10="Hmong"
      11="OtherA"
      12="twoplusA" */

  /* these codes refer to RACED, the detailed race variable on IPUMS */
  /* Count asians as those who are asian alone, multiple asian, asian and white, asian and PI, or white-asian-PI */
  asian=0;
  if raced in (400 410 420 811 861 911) then asian=4;
  if raced in (610 814) then asian=2;
  if raced in (600 813 864 865 914) then asian=3;
  if raced in (640 816) then asian=6;
  if raced in (620 815) then asian=5;
  if raced in (500 812) then asian=1;
  if raced in (660) then asian=9;
  if raced in (661) then asian=10;
  if raced in (662) then asian=8;
  if raced in (669) then asian=7;
  if raced in (663 664 665 666 667 668 670 671 672 810 817 818 860 867 868 910 915) then asian=11;
  if raced in (673 674 675 676 677 678 679 819 869) then asian=12;

  /* so the variable labels display in output */
  /* (assumes the METAREA_f. and asian. formats have already been defined) */
  format METAREA METAREA_f. ASIAN asian.;

  /* add the decimal to the weight variable */
  format PERWT 11.2;
run;

/* this counts up the number of kids and adults in each household */
proc sort data=temp; by serial; run;

data hh;
  set temp (keep=serial age);
  by serial;
  if first.serial then do;
    kids=0;
    adults=0;
  end;
  retain kids adults;
  /* note: "le 18" counts 18-year-olds as kids */
  if age le 18 then kids=kids+1;
  if age gt 18 then adults=adults+1;
  keep serial kids adults;
  if last.serial;
run;

proc sort data=hh; by serial; run;

/* this merges in those people counts, and then calculates the household income variable */
data people;
  merge temp hh;
  by serial;
  equiv = hhincome/((adults+(.7*kids))**.65);
  lnequiv = log(equiv);
run;

/* this outputs the mean logged household equivalent income for each metro area (with non-metro folks as 0) */
proc means noprint data=people;
  var lnequiv;
  class metarea;
  weight perwt;
  output out=msa mean=msaequiv;
run;

proc sort data=msa; by metarea; run;
proc sort data=people; by metarea; run;

/* this merges in the metro area variable and calculates the income-difference variable */
data merged;
  merge people (in=a) msa;
  by metarea;
  if a;
  relhhinc = lnequiv-msaequiv;
run;

/* Distribution of the logged income variable */
proc univariate data=merged;
  var lnequiv;
run;

proc univariate data=merged;
  var lnequiv;
  class asian;
run;

/* Boxplots */
proc sort data=merged; by asian; run;

title 'Income distributions, household composition- and scale-adjusted';
proc boxplot data=merged;
  plot equiv*asian / clipfactor = 1.5 grid;
  where asian le 10;
run;
title;

/* National income means */
proc means mean data=merged;
  var lnequiv;
  weight perwt;
run;

/* National asian income means by group */
proc means mean missing data=merged;
  var lnequiv;
  class asian;
  weight perwt;
run;

/* Relative income for each Asian group, for metro people only */
proc means mean data=merged;
  var relhhinc;
  class asian;
  weight perwt;
  where asian>0 and metarea>0;
run;