What is ‘nationally representative,’ and did Regnerus have it?

I’m off to Minneapolis to present a talk tomorrow on “The Regnerus Affair” at the Minnesota Population Center, subtitle: “Gay Marriage, the Supreme Court, and the Politics of Sociology.”

In my preparation, I was putting together notes from previous posts, the critique I co-authored with Andrew Perrin and Neal Caren, the infamous paper itself, and the media coverage of the scandal. And one piece of it I never really questioned got me thinking: his insistence that his dataset was “a random, nationally-representative sample of the American population.” The news media repeated this assertion routinely, but what does it mean?

The data, collected by Knowledge Networks, are definitely not truly random. But not much is. They have a standing panel of participants who get rewards for participating in a certain number of online surveys. The recruitment of the original panel is where the randomness comes in, with dialing (more or less) random phone numbers. But who chooses to be in it is not random, of course. What the firm does, then, is apply weights to the sample. That is, you don’t count each person as one person, you count them as a certain multiple of a person, so that the weighted total sample looks like the target population: in this case, all noninstitutionalized American adults ages 18-39.

In the paper, Regnerus offers an appendix which compares his New Family Structures Study to the national population as represented in better, larger samples, such as the Current Population Survey (CPS). He writes:

Appendix A presents a comparison of age-appropriate summary statistics from a variety of socio-demographic variables in the NFSS, alongside the most recent iterations of the Current Population Survey, the National Longitudinal Study of Adolescent Health (Add Health), the National Survey of Family Growth, and the National Study of Youth and Religion—all recent nationally-representative survey efforts. The estimates reported there suggest the NFSS compares very favorably with other nationally-representative datasets.

So, he eyeballs the comparisons and determines the result is “very favorable.” I had previously eyeballed the first few rows of that table and reached the same conclusion. This is the distribution of age, race/ethnicity, region and sex from that table:

[Figure: NFSS comparisons]

So, it looks very similar to the national population as counted by the benchmark CPS. But both of these surveys are weighted on these factors. That is, after the sample is drawn, they change the counts of people to make them match what we know from Census data (which are weighted, too, incidentally). So the fact that NFSS matches CPS on these characteristics just means they did the weights right, so far.

Think about it this way. If I collect data on 6 men and 4 women, it’s easy to call my data “representative” if I weight those 6 men by .83 and the 4 women by 1.25. The more variables you try to match on the harder the math gets, but the principle is the same.
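To make the arithmetic concrete, here is a minimal sketch of that kind of weighting. It assumes the target population is half men and half women (which is what the .83 and 1.25 weights imply); the function name and variables are made up for illustration and are not anything Knowledge Networks uses.

```python
from collections import Counter


def poststratification_weights(sample, target_shares):
    """Weight for each group = target share / observed sample share."""
    counts = Counter(sample)
    n = len(sample)
    return {group: target_shares[group] / (counts[group] / n) for group in counts}


# Hypothetical sample from the example: 6 men and 4 women.
sample = ["man"] * 6 + ["woman"] * 4

# Assumed target population: 50% men, 50% women.
target_shares = {"man": 0.5, "woman": 0.5}

weights = poststratification_weights(sample, target_shares)

# Weighted head count per group: each person counts as `weight` people.
weighted_totals = {g: weights[g] * c for g, c in Counter(sample).items()}

print(weights)          # men get a weight below 1, women a weight above 1
print(weighted_totals)  # both groups now count as about 5 people each
```

With more than one weighting variable, firms typically adjust the weights iteratively until every margin matches, but each step is the same ratio of target share to sample share.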

But now I looked further down the table, and Regnerus’s data don’t compare “very favorably” to the national data on some other variables. Here are household income (from CPS) and self-reported health (from the National Survey of Family Growth):


[Figure: NFSS household income and self-reported health comparisons]

This means that, when you apply the weights to the NFSS data, which produces comparable distributions on age, sex, race/ethnicity and region, you get a sample that is quite a bit poorer and less healthy than the national average as represented by the better surveys.

I was confused by this partly because according to the Knowledge Networks documentation on the NFSS, income was one of the weighting variables.

I don’t know how big an issue this is. Do you? And do you know of a standard by which a researcher or research firm can declare data “nationally representative” in this age of small, fast, low-response, online surveys?


8 thoughts on “What is ‘nationally representative,’ and did Regnerus have it?”

  1. Not that I’m an expert on survey weighting, but my way of judging online panel (e.g. YouGov) surveys is to have a look and see how much the measure you’re interested in varies across different population groups. If it doesn’t vary much, then probably a perfect sample (or census) would give similar results. If the measure is really different among different groups, then this kind of thing matters a lot.

    Likewise, if I’m looking at something relatively rare where non-responders are going to be very different to responders (e.g. injecting heroin), then even a perfectly weighted sample with a high response rate is likely to produce really biased results.

    A couple of useful papers here are Groves & Peytcheva Public Opin Q 2008;72:167-189 (meta-analysis of response rate vs. bias), and my particular favourite, Groves et al 2006’s (doi: 10.1093/poq/nfl036) experiment on non-response bias, which (to my mind) convincingly shows that you can’t just think about the sample representativeness in the abstract, you have to think about what the measure is, what the survey topic is, and what sort of non-response bias you’re likely to get.

    Not sure if this helps. But really interested in the whole Regnerus debate, so keen to hear how the paper & discussion go.


  2. If you look at the distribution of respondents by state (which was hard to do, given the way he reported it in the codebook), you get a lot of respondents in California once you put the states in one column and the number of respondents in another. Here I’ll try to copy and paste.
    The columns are: position in the report, some numbering system (I don’t know what it is), state, number of respondents, percent of total, and cumulative percent.

    Position  Code  State  Respondents  Percent  Cumulative
    49        93    CA     378          12.65%   12.65%
    38        74    TX     227          7.60%    20.25%
    7         21    NY     161          5.39%    25.64%
    30        59    FL     143          4.79%    30.42%
    12        33    IL     139          4.65%    35.07%
    9         23    PA     137          4.59%    39.66%
    10        31    OH     119          3.98%    43.64%
    13        34    MI     116          3.88%    47.52%
    14        35    WI     85           2.84%    50.37%

    Here were the low-end states:

    Position  Code  State  Respondents  Percent  Cumulative
    40        82    ID     11           0.37%    94.78%
    51        95    HI     9            0.30%    95.08%
    39        81    MT     9            0.30%    95.38%
    22        51    DE     7            0.23%    95.62%
    3         13    VT     7            0.23%    95.85%
    24        53    DC     5            0.17%    96.02%
    18        44    ND     5            0.17%    96.18%
    5         15    RI     4            0.13%    96.32%
    41        83    WY     2            0.07%    96.39%

    I think I copied and pasted correctly, so I don’t know how this happened, but if you put the number of respondents by state in Excel and add them up, there are only 2,880 respondents, not 2,988 as he reports. I couldn’t figure that out. Maybe I just did a bad copy and paste?

    It’s funny that you are only now mentioning how the NFSS sample is disproportionately in the really poor category; I wrote numerous comments about that. Here is how I analyzed it (percent of respondents in each household income band, NFSS versus Census):

    Household income    NFSS    Census
    Under 10,000        11.9    5.7
    10,000 – 19,999     9.2     7.4
    20,000 – 29,999     10.5    9.5
    30,000 – 39,999     9.6     9.4
    40,000 – 49,999     9.9     9.1
    50,000 – 74,999     19.2    20.3
    75,000 or more      29.8    38.6
    Total               100.1   100

    This has to be because, if you keep taking their surveys, they give you a free laptop and internet service if you don’t have internet. Then there are the respondents whose mother or father had a same-sex romance (as recalled by the now-adult child): 11% of that sample came from “Withdrawn” respondents, people the data company had previously removed from its pool. These people were offered $20 to complete the survey. (The 11% figure comes from the Sherkat audit.)

    What nobody has done, and what I would like to see, is what the records look like if you remove the “dirty data records,” the wild answers (for example, people who had sex 100 times in the last two weeks). Since I am not a scholar, I do not have access to these records. I really wish somebody would clean the data and then run the numbers on the cleaned data. I am just curious as to what that would look like. There should be some column in the data file that indicates which respondents are on the free laptop and internet plan, and which respondents got paid the $20. I would be analyzing those records.

    I hope my comment does not take up too much room on your website.
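The income comparison in the comment above can be boiled down to one number. The sketch below uses the index of dissimilarity (half the sum of absolute differences between two percentage distributions), which is a choice made here for illustration and not something the NFSS documentation or the commenter uses. The figures are the NFSS and Census percentages exactly as the commenter reported them; the variable names are made up.

```python
# Household income bands and percentages as reported in the comment above.
income_bands = [
    "Under 10,000",
    "10,000-19,999",
    "20,000-29,999",
    "30,000-39,999",
    "40,000-49,999",
    "50,000-74,999",
    "75,000 or more",
]
nfss_pct = [11.9, 9.2, 10.5, 9.6, 9.9, 19.2, 29.8]
census_pct = [5.7, 7.4, 9.5, 9.4, 9.1, 20.3, 38.6]


def dissimilarity_index(a, b):
    """Half the sum of absolute differences between two distributions.

    Reads as the share (in percentage points) of one sample that would
    have to change categories to match the other distribution.
    """
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


# Per-band gap (NFSS minus Census), to see where the mismatch sits.
gaps = {band: round(n - c, 1) for band, n, c in zip(income_bands, nfss_pct, census_pct)}

index = dissimilarity_index(nfss_pct, census_pct)

print(gaps)
print(round(index, 2))
```

Most of the gap sits in the lowest and highest bands, which fits the description of the weighted NFSS sample as poorer than the benchmark.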


  3. Oh, here is another thing. In Regnerus’ rebuttal paper he says, “Oh I removed just under 400 records in the Step-Family category.” Then he re-ran the numbers with more categories, more family structures. BUT he didn’t just remove records from the step-parent category; he also removed, I think it was 70 records, from the single-parent category. He never disclosed that removal in his report. In his report he says he removed approximately 400 records from the step-parent category ONLY.

    See, by removing records he made it impossible to catch him by comparing the numbers in his first report to the numbers in his second report. He created again what he is now infamous for: a second apples-to-oranges comparison. He used two different data sets, so now you could NOT make a valid comparison between his first and his second study. He should have kept the same data set and, in his second report, gone ahead and made more numerous family structures. No explanation was given for why he removed the almost 400 records in the second report. But the result is that no one is able to compare the two reports. And what is published, what is peer-reviewed and published, is the FIRST report.


  4. It’s simple; there are not enough young adults raised by gay parents in the total population to be reached by screening only 15,000 people.

