Tag Archives: gss

Wilcox and colleagues plagiarized my work in the New York Times

In the New York Times yesterday, W. Bradford Wilcox, Jason S. Carroll and Laurie DeRose published an Op-Ed with the ridiculous title, “Religious Men Can Be Devoted Dads, Too.” In it they included this figure:

[Figure: the chart from the Wilcox, Carroll, and DeRose Op-Ed]

In 2015 I wrote a post titled, “That thing about Republican marriages being happier (isn’t true),” which included this figure:

[Figure: marital happiness by party identification]

There are trivial differences between these figures. Theirs is from the General Social Survey for 2010-2018; mine was for 2010-2014. Theirs used political views while mine used party identification. Theirs is just women, and controls for age, education, and race; mine included men and women while controlling for gender, and I also controlled for income and religious attendance. (And they used gray for the middle bar, instead of purple.) However, in a subsequent 2017 post titled “Who’s happy in marriage? (Not just rich, White, religious men, but kind of),” I redid the analysis for the years 2012-2016, using political views instead of party identification. The results are almost identical to theirs in the Times (on the right, here):

[Figure: happiness in marriage, 2012-2016, with political views on the right]

Did they know about my pieces? I am certain they did, though I can’t prove it. It’s relevant that my first post, “That thing about Republican marriages…” was a critique of a post by Wilcox and Nick Wolfinger, which they called “The Republican Advantage in Marital Satisfaction,” and which had only reported that Republicans were slightly happier in marriage than Democrats. My post was a correction, showing the U-shape that emerged when you broke out the categories, which is the change Wilcox and colleagues have now adopted. My follow-up post was reported by Bloomberg (and carried in the Chicago Tribune), and the Daily Mail. Both of my posts were tweeted by popular journalists who work in this area. I expect they would claim they never noticed my little blog posts.

You also could split hairs on the definition of plagiarism to try to defend this unethical behavior. The relevant passages of the American Sociological Association Code of Ethics:

(b) In their publications, presentations, teaching, practice, and service, sociologists provide acknowledgment of and reference to the use of their own and others’ work, even if the work is paraphrased and not quoted verbatim.
(c) While sociologists utilize and build on the concepts, theories, and paradigms of others, they may not claim credit for creating such ideas and must cite the creator of such ideas where appropriate.

But no one can seriously argue they shouldn’t have referenced my work.

Wilcox has done much worse, of course, most importantly leading a conspiracy to gin up research to turn the Supreme Court against same-sex marriage and then lying about his role in that conspiracy (the subject of a chapter in my book Enduring Bonds). And this is not a very important idea (their explanation is very flimsy, and I have no real theory to explain the pattern). But this one goes on the list somewhere.

Why?

Why do I care? Is this just petty partisanship and even jealousy because Wilcox paid himself $80,000 of right-wing foundation money in 2016, and continues to publish low-quality research in important outlets like the New York Times? Draw your own conclusions. Of course his views are noxious to me. But more than that, in the game of trust that is the research ecosystem, reputations matter a lot. Once someone is tenured, and funded by unaccountable political actors, our options for defending the system are limited. The norms of publishing, especially outside academia, don’t require research transparency (like their current report, made to order for conservative funders, not the research community or peer review). If someone says, “This is my finding,” publishers (like the Times) usually vet the researcher instead of the research. 

I don’t believe in lifetime bans, and I don’t care about atonement for research ethics. My question is, “Can we trust this person’s research?” Before we can answer that affirmatively, we need to have an accounting of past malfeasance that makes clear future work will be clean. Until then, I don’t mind spending a few minutes now and then reminding people that Wilcox (like Mark Regnerus) is not trustworthy.


Fertility rate implications explained

(Sorry for the over-promising title; thanks for the clicks.)

First where we are, then projections, with figures.

For background: Caroline Hartnett has an essay putting the numbers in context. Leslie Root has a recent piece explaining how these numbers are deployed by white supremacists (key point: over-hyping the downside of lower fertility rates has terrible real-world implications).

Description

The National Center for Health Statistics released the 2018 fertility numbers yesterday, showing another drop in birth rates, and the lowest fertility since the Baby Boom. We are continuing a historical process of moving births from younger to older ages, which shows up as fewer births in the transition years. I illustrate this each year by updating this figure, showing the relative change in birth rates by age since 1989:

[Figure: relative change in birth rates by age since 1989]

Historically, postponement was associated with reduction in lifetime births — which is what really matters for population trends. When people were having lots of children, any delay reduced the total number. With birth rates around two per woman, however, there is a lot more room for postponement — a lot of time to get to two. (At the societal level, both reduction and postponement are generally good for gender equality, if women have good health and healthcare.)

This means that drops in what we demographers call “period” fertility (births right now) are not the same as drops in “completed” fertility (births in a lifetime), or falling population in the long run. The period fertility measure most often used, the unfortunately named total fertility rate (TFR), is often misunderstood as an indicator of how many children women will have. It is actually how many births they are having right now, expressed in lifetime terms (I describe it in this video, with instructions).
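To make "expressed in lifetime terms" concrete, here is a minimal Python sketch of the usual TFR arithmetic: add up one year's age-specific birth rates and multiply by the width of the age groups. The rates below are made-up round numbers for illustration only (not NCHS figures), and the function name is hypothetical.

```python
# Hypothetical births per 1,000 women per year, by five-year age group.
# Illustrative values only; these are NOT the NCHS rates.
asfr_per_1000 = {
    "15-19": 20.0,
    "20-24": 70.0,
    "25-29": 95.0,
    "30-34": 100.0,
    "35-39": 50.0,
    "40-44": 12.0,
    "45-49": 1.0,
}


def total_fertility_rate(rates_per_1000, width=5):
    """Births per woman if she lived through every age group at this year's rates.

    Each rate applies for `width` years, so the lifetime total is
    width * (sum of the annual rates), converted from per-1,000 to per-woman.
    """
    return width * sum(rates_per_1000.values()) / 1000.0


tfr = total_fertility_rate(asfr_per_1000)
print(f"TFR = {tfr:.2f}")  # prints "TFR = 1.74"
```

Nothing in this calculation follows an actual group of women through their lives. That is why a year of postponement can pull the TFR down even if the women involved eventually have the same number of children.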

Lawrence Wu and Nicholas Mark recently showed that despite several periods of below “replacement” fertility (in terms of TFR), no U.S. cohort of women has yet finished their childbearing years with fewer than two births per woman. Here is the completed fertility of U.S. women, by year of birth, as recorded by the General Social Survey. By this account, women born in the early 1970s (in their late forties by 2018) have had an average of 2.3 children.

[Figure: completed fertility of U.S. women by year of birth, General Social Survey]

Whether our streak of over-two completed fertility persists depends on what happens in the next few years (and of course on immigration, which I’ll get to).

Last year at this time I summed up the fertility situation and concluded, “sell stock now,” because birth rates fell for women at all ages except over 40. That kind of postponement, I figured, based on history, reflected economic uncertainty and thus was an ill omen for the economy. The S&P 500 is up 5% since then, which isn’t bad as far as my advice goes. And I’m still bearish based on these birth trends (I bet I’ll be right before fertility increases).

Projection

It is very hard to have an intuitive sense of what demographic indicators mean, especially for the future. So I’ve made some projections to show the math of the situation, to get the various factors into scale. My point is to show what the current (or future) birth rates imply about future growth, and the relative role of immigration.

These projections run from 2016 to 2100. I made them using the Census Bureau’s Demographic Analysis and Population Projection System software, which lets me set the birth, death, and migration rates.* I started with the 2016 population because that’s the most recent set of life tables NCHS has released for mortality. Starting in 2018 I apply the current age-specific birth rates.

First, the most basic projection. This is what would happen if birth rates stayed the same as those in 2018 and we completely cut off all immigration (Projection A), or if we had net migration running at the current level of just under +1 million each year, using Census estimates for age and sex of the migrants (Projection B).

[Figure: population projections A and B, with percentage of the population over age 65]

From the 2016 population of 323 million, if the birth rates by age in 2018 were locked in, the population would peak at 329 million in 2029 and then start to decline, reaching 235 million by 2100. However, if we maintain current immigration levels (by age and sex), the population would keep growing till 2066 before tapering only slightly. (Note this assumes, unrealistically, that the immigrants and their children have the same birth rates as the current population; they have generally been higher.) This is the most important bottom line: there is no reason for the U.S. to experience population decline, with even moderate levels of immigration, and assuming no rebound in fertility rates. Immigration rates do not have to increase to maintain the current population indefinitely.
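For readers who want to see the mechanics, here is a toy cohort-component sketch in Python of the kind of arithmetic behind a comparison like Projection A versus Projection B. It is not the DAPPS implementation and does not use the Census or NCHS inputs. Every number is hypothetical, the age groups are a crude 20 years wide, both sexes are lumped together, and births are computed from the start-of-period population.

```python
def project_step(pop, survival, asfr, migrants, female_share=0.5, width=20):
    """Advance the population by one step of `width` years.

    pop[i]      people in age group i (last group is open-ended)
    survival[i] share of group i still alive one step later
    asfr[i]     annual births per woman in group i
    migrants[i] net migrants added to group i over the step
    """
    births = width * sum(female_share * p * f for p, f in zip(pop, asfr))
    new = [0.0] * len(pop)
    new[0] = births
    for i in range(len(pop) - 1):
        new[i + 1] += pop[i] * survival[i]   # survivors age into the next group
    new[-1] += pop[-1] * survival[-1]        # open-ended oldest group
    return [n + m for n, m in zip(new, migrants)]


def project_totals(pop, steps, **rates):
    """Return total population at the start and after each step."""
    totals = [sum(pop)]
    for _ in range(steps):
        pop = project_step(pop, **rates)
        totals.append(sum(pop))
    return totals


# Hypothetical inputs (millions of people): ages 0-19, 20-39, 40-59, 60+.
start = [80.0, 85.0, 85.0, 73.0]
survival = [0.99, 0.97, 0.85, 0.30]
asfr = [0.0, 0.085, 0.0, 0.0]          # implies a TFR of 20 * 0.085 = 1.7
no_migration = [0.0, 0.0, 0.0, 0.0]
with_migration = [5.0, 12.0, 3.0, 0.0]  # about 1 million per year, hypothetical ages

totals_a = project_totals(start, 4, survival=survival, asfr=asfr, migrants=no_migration)
totals_b = project_totals(start, 4, survival=survival, asfr=asfr, migrants=with_migration)
for step, (a, b) in enumerate(zip(totals_a, totals_b)):
    print(f"year {step * 20:>2}: no migration {a:6.1f}, with migration {b:6.1f}")
```

The point of the sketch is only the structure: the same birth and death rates produce very different totals depending on the migration row, which is the comparison the two projections make with real inputs.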

Note I also added the percentage of the population over age 65 on the figure. That number is about 16% now. If we cut off immigration and maintain current birth rates, it would rise to 25% by the end of the century, increasing the need for investment in old age stuff. If we allow current migration to continue, that growth is less and it only reaches 23%. This is going up no matter what.

To show the scale of other changes that we might expect — again, not predictions — I added a few other factors. Here are the same projections, but adding a transition to higher life expectancies by 2080 (using Japan’s current life tables; we can dream). In these scenarios, population decline is later and slower (and not just at older ages, since Japan also has lower child mortality).

[Figure: population projections with a transition to higher life expectancies by 2080]

Under these scenarios, with rising life expectancies, the old population rises more, to between 27% and 29%. Generally experts assume life expectancies will rise more than this, but that’s the assumed direction (now, unbelievably, in doubt).

Finally, I’ve been assuming birth rates will not fall further. If what we’re seeing now is fertility postponement, we wouldn’t expect much more decline. But what if fertility keeps falling? Here is what you get with the assumptions in Projection D, plus total fertility rates falling to 1.6, either by 2030 or 2050. As you can see, in the 1.6 to 1.8 range, the effects on population size aren’t great in this time scale.

[Figure: population projections with total fertility rates falling to 1.6 by 2030 or 2050]

Conclusion: We are on track for slowing population growth, followed by a plateau or modest decline, with population aging, by the end of the century, and immigration is a bigger question than fertility rates, for both population growth and aging.

Perspective

In a global context where more people want to come here than want to leave (to date), worrying about low birth rates tends to lend itself to myopic, religious, or racist perspectives which I don’t share. I don’t think American culture is superior, whites are in danger of extinction, or God wants us to have more children.

I do not agree with Dowell Myers, who was quoted yesterday as saying, “The birthrate is a barometer of despair.” I say that even though some people are having fewer children than they want, or delaying childbearing when they would rather not. In the most recent cohort to finish childbearing, 23% gave an “ideal number of children for a family to have” that was greater than the number they had, and that number has trended up, as you can see here:

[Figure: share of women whose ideal number of children is greater than the number they had, by cohort]

Is this rising despair? As individuals, people don’t need to have children any more. Ideally, they have as many as they want, when they want, but they are expensive and time consuming and it’s not surprising people end up with fewer than they think “ideal.” Not to be crass about it, but I assume the average person also has fewer boats than they consider ideal.

And how do we know what is the right level of fertility for the population? As Marina Adshade said on Twitter, “Did women actually have a desire for more children in the past? Or did they simply lack the bargaining power and means to avoid births?”

However, to the extent that low birth rates reflect frustrated dreams, or fear and uncertainty, or insufficient support for families with children, of course those are real problems. But then let’s name those problems and address them, rather than trying to change fertility rates or grow the population, which is a policy agenda with a very bad track record.


* I put the DAPPS file package I created on the Open Science Framework, here. If you install DAPPS you can open this and look at the projections output, with graphs and tables and population pyramids.


Do rich people like bad data tweets about poor people? (Bins, slopes, and graphs edition)

Almost 2,000 people retweeted this from Brad Wilcox the other day.

[Figure: screenshot of the tweet, with the graph of TV hours by income]

Brad shared the graph from Charles Lehman (who noticed later that he had mislabeled the x-axis, but that’s not the point). First, as far as I can tell the values are wrong. I don’t know how they did it, but when I look at the 2016-2018 General Social Survey, I get 4.3 average hours of TV for people in the poorest families, and 1.9 hours for the richest. They report higher highs (looks like 5.3) and lower lows (looks like 1.5). More seriously, I have to object to drawing what purports to be a regression line as if those are evenly-spaced income categories, which makes it look much more linear than it is.

I fixed those errors — the correct values, and the correct spacing on the x-axis — then added some confidence intervals, and what I get is probably not worth thousands of self-congratulatory woots, although of course rich people do watch less TV. Here is my figure, with their line (drawn in by hand) for comparison:

[Figure: average TV hours by family income with confidence intervals, and their line for comparison]

Charles and Brad’s post got a lot of love from conservatives, I believe, because it confirmed their assumptions about self-destructive behavior among poor people. That is, here is more evidence that poor people have bad habits and it’s just dragging them down. But there are reasons this particular graph worked so well. First, the steep slope, which partly results from getting the data wrong. And second, the tight fit of the regression line. That’s why Brad said, “Whoa.” So, good tweet — bad science. (Surprise.) Here are some critiques.

First, this is the wrong survey to use. Since 1975, GSS has been asking people, “On the average day, about how many hours do you personally watch television?” It’s great to have a continuous series on this, but it’s not a good way to measure time use because people are bad at estimating these things. Also, GSS is not a great survey for measuring income. And it’s a pretty small sample. So if those are the two variables you’re interested in, you should use the American Time Use Survey (available from IPUMS), in which respondents are drawn from the much larger Current Population Survey samples, and asked to fill out a time diary. On the other hand, GSS would be good for analyzing, for example, whether people who believe the Bible is “the actual word of God and is to be taken literally, word for word” watch TV more than those who believe it is “an ancient book of fables, legends, history, and moral precepts recorded by men.” (Yes, they do, about an hour more.) Or looking at all the other social variables GSS is good for.

On the substantive issue, Gray Kimbrough pointed out that the connection between family income and TV time may be spurious, and is certainly confounded with hours spent at work. When I made a simple regression model of TV time with family income, hours worked, age, sex, race/ethnicity, education, and marital status (which again, should be done better with ATUS), I did find that both hours worked and family income had big effects. Here they are from that model, as predicted values using average marginal effects.

[Figure: predicted TV hours by hours worked and by family income]

The banal observation that people who spend more time working spend less time watching TV probably wouldn’t carry the punch. Anyway, neither resolves the question of cause and effect.

Fits and slopes

On the issue of the presentation of slopes, there’s a good lesson here. Data presentation involves trading detail for clarity. And statistics have both a descriptive and an analytical purpose. Sometimes we use statistics to present information in simplified form, which allows better comprehension. We also use statistics to discover relationships we couldn’t otherwise see, such as multivariate relationships that you can’t discern visually. The analyst and communicator has to choose wisely what to present. A good propagandist knows what to manipulate for political effect (a bad one just tweets out crap until they get lucky).

Here’s a much less click-worthy presentation of the relationship between family income and TV time. Here I truncate the y-axis at 12 hours (cutting off 1% of the sample), translate the binned income categories into dollar values at the middle of each category, and then jitter the scatterplot so you can see how many points are piled up in each spot. The fitted line is Stata’s median spline, with 9 bands specified (so it’s the median hours at the median income in 9 locations on the x-axis). I guess this means that, at the median, rich people in America watch about an hour of TV per day less than poor people, and the action is mostly under $50,000 per year. Woot.

[Figure: jittered scatterplot of TV hours by family income, with median spline]
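The step of translating binned income categories into dollar values at the middle of each category can be sketched in a few lines of Python. The brackets below are hypothetical, chosen only to mimic the way survey income categories get wider at higher incomes; they are not the actual GSS categories.

```python
# Hypothetical income brackets in dollars (lower bound, upper bound).
# NOT the actual GSS income categories.
brackets = [
    (0, 10_000),
    (10_000, 20_000),
    (20_000, 30_000),
    (30_000, 50_000),
    (50_000, 75_000),
    (75_000, 110_000),
    (110_000, 170_000),
]


def bracket_midpoints(brackets):
    """Return the dollar value at the middle of each (lower, upper) bracket."""
    return [(low + high) / 2 for low, high in brackets]


midpoints = bracket_midpoints(brackets)

# Distance between neighboring categories once they sit at their dollar values.
gaps = [right - left for left, right in zip(midpoints, midpoints[1:])]

for rank, ((low, high), mid) in enumerate(zip(brackets, midpoints), start=1):
    print(f"category {rank}: ${low:,} to ${high:,} -> plotted at ${mid:,.0f}")
print("gaps between neighbors:", gaps)
```

Plotting by category rank (1, 2, 3, ...) treats every gap as equal. Plotting at the midpoints stretches the upper categories apart, which is what takes the apparent straightness out of a line drawn through evenly spaced bins.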

Finally, a word about binning and the presentation of data (something I’ve written about before, here and here). We make continuous data into categories all the time, starting from measurement. We usually measure age in years, for example, although we could measure it in seconds or decades. Then we use statistics to simplify information further, for example by reporting averages. In the visual presentation of data, there is a particular problem with using averages or data bins to show relationships — you can show slopes that way nicely, but you run the risk of making relationships look more closely correlated than they are. This happens in the public presentation of data when analysts are showing something of their work product — such as a scatterplot with a fitted line — to demonstrate the veracity of their findings. When they bin the data first, this can be very misleading.

Here’s an example. I took about 1000 men from the GSS, and compared their age and income. Between the ages of 25 and 59, older men have higher average incomes, but the fit is curved with a peak around 45. Here is the relationship, again using jittering to show all the individuals, with a linear regression line. The correlation is .23.

[Figure: jittered scatterplot of income by age for individual men, with linear regression line]

That might be nice to look at but it’s hard to see the underlying relationship. It’s hard to even see how the fitted line relates to the data. So you might reduce it by showing the average income at each age. By pulling the points together vertically into average bins, this shows the relationship much more clearly. However, it also makes the relationship look much stronger. The correlation in this figure is .65. Now the reader might think, “Whoa.”

[Figure: average income at each age, with linear regression line]

Note this didn’t change the slope much (it still runs from about $30k to $60k), it just put all the dots closer to the line. Finally, here it is pulling the averages together in horizontal bins, grouping the ages in fives (25-29, 30-34 … 55-59). The correlation shown here is .97.

[Figure: average income by five-year age group, with linear regression line]

If you’re like me, this is when you figured out that reducing this to two dots would produce a correlation of 1.0 (as long as the dots aren’t exactly level).
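The same effect shows up in simulated data. The Python sketch below is not the GSS analysis: it invents about 1000 "men" with a straight-line age-income relationship plus a lot of noise (the slope, intercept, and noise level are assumptions picked for illustration), then computes the correlation for individuals, for averages at each age, for five-year averages, and for two dots.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def bin_means(xs, ys, key):
    """Group points by key(x) and return the mean x and mean y of each group."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(key(x), []).append((x, y))
    mean_x = [sum(x for x, _ in g) / len(g) for g in groups.values()]
    mean_y = [sum(y for _, y in g) / len(g) for g in groups.values()]
    return mean_x, mean_y


def simulate(n=1000, seed=1):
    """Simulated ages 25-59 and noisy incomes (hypothetical parameters)."""
    rng = random.Random(seed)
    ages = [rng.randint(25, 59) for _ in range(n)]
    incomes = [30_000 + 900 * (a - 25) + rng.gauss(0, 38_000) for a in ages]
    return ages, incomes


ages, incomes = simulate()
r_individual = pearson(ages, incomes)
r_by_year = pearson(*bin_means(ages, incomes, key=lambda a: a))
r_by_five = pearson(*bin_means(ages, incomes, key=lambda a: (a - 25) // 5))
r_two_dots = pearson(*bin_means(ages, incomes, key=lambda a: a >= 42))

print(f"individuals:        r = {r_individual:.2f}")
print(f"average at each age: r = {r_by_year:.2f}")
print(f"five-year averages:  r = {r_by_five:.2f}")
print(f"two dots:            r = {r_two_dots:.2f}")
```

The underlying data never change between the four lines of output. Only the amount of averaging does, and each round of averaging strips out individual variation that the correlation would otherwise have to account for.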

To make good data presentation tradeoffs requires experimentation and careful exposition. And, of course, transparency. My code for this post is available on the Open Science Framework here (you gotta get the GSS data first).


Equal-education and wife-more-education married couples don’t have sex less often

In my review of Mark Regnerus’s book, Cheap Sex, I wrote: “The book is an extended rant on the theme, ‘Why buy the cow when you can get the milk for free?’ wrapped in a misogynist theory about sexual exchange masquerading as economics, and motivated by the author’s misogynist religious and political views.”

Someone just reposted an old book-rehash essay of Regnerus’s called, “The Death of Eros.” In it he links to my post documenting the decline in sexual frequency among married couples in the General Social Survey. In marriage, Regnerus writes, “equality is the enemy of eros,” before selectively characterizing some research about the relationship between housework and sex. (Here’s a recent analysis finding egalitarian couples don’t have sex less.)

But I realized I never looked at sexual frequency in married couples by the relative education of the spouses, which is available in the GSS. So here’s a quick take: Married man-woman couples in which the wife has equal or more education don’t have sex less frequently.

I modeled sexual frequency (an interval scale from “not at all” = 0 to “4+ times per week” = 6) as a function of age, age-squared, respondent education, respondent sex, decade, and relative education (wife has lower degree, wife has same degree, wife has higher degree). The result is in this figure. Note the means are between 3 (“2-3 times per month”) and 4 (“weekly”). Stata code for GSS below.

[Figure: predicted sexual frequency by relative education of the spouses and decade]

OK, that’s it. Here’s the code (I prettied the figure a little by hand afterwards):

* keep married people
keep if marital==1

* with non-missing own and spouse education
keep if spdeg<4 & degree<4

* collapse age and survey year into categories
recode age (18/29=18) (30/39=30) (40/49=40) (50/59=50) (60/109=60), gen(agecat)
recode year (1970/1979=1970) (1980/1989=1980) (1990/1999=1990) (2000/2008=2000) (2010/2016=2010), gen(decade)

* simple indicators (not used in the models below)
gen erosdead = spdeg>degree
gen equal=spdeg==degree

* relative education of the wife; respondent is the husband when sex==1
gen eros=0
replace eros=1 if spdeg<degree & sex==1
replace eros=2 if spdeg==degree
replace eros=3 if spdeg>degree & sex==1

* respondent is the wife when sex==2, so the comparison flips
replace eros=1 if spdeg>degree & sex==2
replace eros=3 if spdeg<degree & sex==2

label define de 1 "wife less"
label define de 2 "equal", add
label define de 3 "wife more", add
label values eros de

* first pass with categorical age
reg sexfreq i.sex i.agecat i.decade i.degree i.eros [weight=wtssall]
* model described above: age, age-squared, and relative education by decade
reg sexfreq i.sex c.age##c.age i.degree i.eros##i.decade [weight=wtssall]
* predicted values from the second model, plotted as bars by decade
margins i.eros##i.decade
marginsplot, recast(bar) by(decade)

Note: On 25 Dec 2018 I fixed a coding error and replaced the figure; the results are the same.
