Genetics is the truther conspiracy of racial inequality

After reading Gideon Lewis-Kraus’s New Yorker profile of Paige Harden (discussing her new book, The Genetic Lottery), but before reading her book, I went back and re-read the piece Harden wrote with Eric Turkheimer and Richard Nisbett in Vox about Charles Murray’s appearance on Sam Harris’s podcast, and listened to her follow-up conversation with Harris about it (subscription required, but they give it to you free if you ask).

In the conversation between Harris and Harden, something Harris said made me think that the genetic explanation for race differences in what they’re calling “outcomes,” such as educational attainment, intelligence, or other phenotypic “complex behavior traits” is like a truther conspiracy theory for people like Harris and Charles Murray. Any time they think they have poked a hole in the “racism explains everything” worldview, they believe it’s evidence for genetics.

Harden takes positions that are at odds with the viewpoint of many sociologists. She believes intelligence is measurable, and heritable (people get it from their parents), and has important causal effects on people’s lives. But she doesn’t believe that explains why Whites score higher than some other groups on IQ tests. In the interview, Harden had patiently explained that in the absence of evidence of group differences in genetics (and there is none), you can’t assume the direction of as-yet undiscovered genetic differences, even when you have group differences in phenotypes and evidence of heritability at the individual level. She said:

“It’s a really, really, really basic statistical point, which is that if you know the direction of the association within a group, you don’t know anything about whether that plays out between groups, not even in the sign of that direction…. It could be that Africans are at a genetic advantage for cognitive ability that’s been swamped by environmental risks and adversity. … That’s labeled the ecological fallacy, that’s like a basic statistical point. … We have no information, no default, about what is the relationship between differences in genetic ancestry and the causes of these cognition differences that we see on average, between groups. And in the absence of any data, and really good data, the only priors we have are informed by what? So that’s why I think the prior that there is a genetic difference, and what is more that it works in this particular direction, is not informed by the science.”
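The statistical point in that quote can be checked with a toy calculation. The sketch below uses made-up numbers for two hypothetical groups, A and B; the names `x` (an individual-level predictor), `y` (an outcome), and `offset` (a group-level environmental penalty) are illustrative assumptions, not anything taken from the interview or from real data. Within each group `y` rises with `x`, yet the group with the higher average `x` has the lower average `y`.

```python
from statistics import mean


def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# HYPOTHETICAL data: y = x + offset, where offset is a group-level
# environmental penalty that has nothing to do with x.
groups = {
    "A": {"x": [1, 2, 3], "offset": 0},
    "B": {"x": [2, 3, 4], "offset": -3},
}

summary = {}
for name, g in groups.items():
    ys = [x + g["offset"] for x in g["x"]]
    summary[name] = {
        "within_slope": slope(g["x"], ys),
        "mean_x": mean(g["x"]),
        "mean_y": mean(ys),
    }
    print(name, summary[name])

# Between-group comparison: difference in mean y per difference in mean x.
between_slope = (summary["B"]["mean_y"] - summary["A"]["mean_y"]) / (
    summary["B"]["mean_x"] - summary["A"]["mean_x"]
)
print("between-group slope:", between_slope)
```

In this constructed case the within-group slopes are both positive while the between-group slope is negative, so knowing the first tells you nothing about the sign of the second.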

Murray and Harris can’t see this, or refuse to. They keep “defaulting” to the assumption that because traits like intelligence are somewhat heritable at the individual level, and there are group differences in the observed traits (“outcomes”) — therefore group differences are at least partly the result of genetics. It’s just a matter of time till we find out. Harris responds:

“My default assumption here … for the hundred things we care about in a person, given how much we are learning about the role that genes play in making us who we are, physically, and psychologically … we will find that genes are involved for virtually everything, to some degree. And in many cases it will be the difference that really makes most of the difference, and this is true for individuals, and we will find it true for groups.”

When he says it’s an assumption, and he won’t change it regardless of the lack of evidence, there’s no point in arguing. He’s not talking about science. I would make stronger arguments against the enterprise itself — the idea of trying to find genetic group differences to explain inequality between groups — than she did in the interview, but in any event Harden is clear that Murray and Harris are wrong on this point, as many others have explained as well. (Of course, the existence of meaningful genetic differences between ancestral populations in complex behavioral traits like intelligence is itself purely speculative. Variations evolve at random and might end up sticking if they provide a survival advantage, like lack of skin pigmentation, but it’s not likely humanity was divided up into different populations where some groups were selected for intelligence and others weren’t — unlike skin pigmentation, intelligence is handy for everyone.) Anyway, that’s all backstory to the quote below. Harris keeps saying his real concern is with intellectual honesty and the perils of cancel culture. And he says this:

“The real question is what is the cause of all these disparities. The real problem politically, at the moment, is when you’re talking about White-Black differences in American society, differences in outcome, differences in, you know, inner-city crime, differences in wealth inequality, all of it – anything that people could care about – the only acceptable answer in many quarters, to account for these differences, is White racism, or systemic racism, right, institutional racism. Some holdover effect from slavery and Jim Crow. And a failure to see it that way, just reflexively, is synonymous with being a racist, or being unaware of the depth of racism. White fragility – we’re having this conversation at the moment when the best selling book in the country is White Fragility, right. So to be doubtful that White racism accounts for all of these disparities – you know, White racism, again, in the year 2020, not to be in any doubt about the ugly history of racism in American, to be in doubt about whether racism explains the number of shootings we’re going to see in Chicago this weekend – and the fact that I can predict with something like 100 percent certainty that most of those shootings will be Black-on-Black crime, right, is it White racism that explains that? To have doubt about that will cast you as a malevolent imbecile in many, many quarters, now, and you risk reputational destruction. 
And the only safe space is to say, ‘Of course it’s White racism, that’s the problem we gotta solve.’ And that is such a stultifying and frankly dishonest place to be, intellectually, at the moment, and it’s closing down conversation on dozens of important topics, and it puts us in a position, insofar as we’re fighting from this trench, right, we’re all just hunkered down against all possible future insights, in this spot, it is deeply unstable, because we will find out things – differences among groups, again now speaking widely about all human difference, among all groups – and differences among individuals, that are simply not amenable to a politically correct analysis, and, again, there’s this inconvenient fact that we have these differences between Asians and Whites, right, so if White racism accounts for every possible difference between Whites and Blacks in society, is there a pro-Asian racism that’s account for the fact that Whites are performing so badly on IQ tests? That’s hard to argue for.”

You have to love how he goes from “dozens” of questions, and “all human difference, among all groups” straight to violence and IQ. But the whole rant is the tell that it’s a truther-style conspiracy theory. When a 9/11 truther finds any discrepancy or incomplete element in the official explanation for the 9/11 attacks — like something about the melting point of steel, or a missing document or garbled radio transmission — they assume it’s evidence the CIA did it. See! How can you believe them?! This is what I’m telling you! In fact, any complex scientific story will have potential discrepancies and inadequacies yet to be explained (“science is not designed for proving absolute negatives”). But those things are not evidence for a particular different theory — unless they really are. Once they plant the idea of their counternarrative, any weakness in the accepted story becomes evidence for their side. So, if “Asians” do better on IQ tests than Whites do, and if Black people kill each other in Chicago, facts that supposedly undermine the “White racism causes everything” story, it’s basically evidence that Blacks are genetically inferior — probably or maybe, blah, blah, blah racism — and you just can’t admit it.

Just in case you were wondering whether people who make this kind of assumption are thinking scientifically, they’re not. That’s a thought I had after some reading and listening. I think I’ll read her book.

Pandemic Baby Bust situation update

[Update: California released revised birth numbers, which added a trivial number to previous months, except December, where they added a few thousand, so now the state has a 10% decline for the month, relative to 2019. I hadn’t seen a revision that large before.]

Lots of people are talking about falling birth rates — even more than they were before. First a data snapshot, then a link roundup.

For US states, we have numbers through December for Arizona, California, Florida, Hawaii, and Ohio. They are all showing substantial declines in birth rates from previous years. Most dramatically, California just posted December numbers, and revised the numbers from earlier months, now showing a 10% drop in December (19% before the latest revision). After adding about 500 births to November and a few to October, the drop in those two months is now 9%. The state’s overall drop for the year is now 6.2%. These are, to put it mildly, very large declines in historical terms. Even if California adds 500 to December later, it will still be down 18%. Yikes. One thing we don’t yet know is how much of this is driven by people moving around, rather than just changes in birth rates. California in 2019 had more people leaving the state (before the pandemic) than before, and presumably there have been essentially no international immigrants in 2020. Hawaii also has some “birth tourism”, which probably didn’t happen in 2020, and has had a bad year for tourism generally. So much remains to be learned.

Here are the state trends (figure updated Feb 18):

[Figure: births by month, 2018-2020, small multiples by state]

In the few non-US places where I’m getting monthly data so far, the trend is not so dramatic, although British Columbia posted a steep drop in December. I don’t know why I keep hoping Scotland will settle down their numbers… (updated Feb 18):

[Figure: births by month, 2018-2020, small multiples by country]

Here are some recent items from elsewhere on this topic:

  • That led to some local TV, including this from KARE11 in Minneapolis:

Good news / bad news clarification

There’s an unfortunate piece of editing in the NBCLX piece, where I’m quoted like this: “Well, this is a bad situation. [cut] The declines we’re seeing now are pretty substantial.” To clarify (and I said this in the interview, but accidents happen): I am not saying the decline in births is a bad situation, I’m saying the pandemic is a bad situation, which is causing a decline in births. Unfortunately, this has slipped through, as when the Independent quoted the piece (without talking to me) and said, “Speaking to the outlet, Philip Cohen, a sociologist and demographer at the University of Maryland, called the decline a ‘bad situation’.”


The data for this project is available here: osf.io/pvz3g/. You’re free to use it.


For more on fertility decline, including whether it’s good or bad, and where it might be going, follow the fertility tag.


Acknowledgement: We have lots of good conversation about this on Twitter, where there is great demography going on. Also, Lisa Carlson, a graduate student at Bowling Green State University, who works in the National Center for Family and Marriage Research, pointed me toward some of this state data, which I appreciate.

COVID-19 mortality rates by race/ethnicity and age

Why are there such great disparities in COVID-19 deaths across race/ethnic groups in the U.S.? Here’s a recent review from New York City:

The racial/ethnic disparities in COVID-related mortality may be explained by increased risk of disease because of difficulty engaging in social distancing because of crowding and occupation, and increased disease severity because of reduced access to health care, delay in seeking care, or receipt of care in low-resourced settings. Another explanation may be the higher rates of hypertension, diabetes, obesity, and chronic kidney disease among Black and Hispanic populations, all of which worsen outcomes. The role of comorbidity in explaining racial/ethnic disparities in hospitalization and mortality has been investigated in only 1 study, which did not include Hispanic patients. Although poverty, low educational attainment, and residence in areas with high densities of Black and Hispanic populations are associated with higher hospitalizations and COVID-19–related deaths in NYC, the effect of neighborhood socioeconomic status on likelihood of hospitalization, severity of illness, and death is unknown. COVID-19–related outcomes in Asian patients have also been incompletely explored.

The analysis, interestingly, found that Black and Hispanic patients in New York City, once hospitalized, were less likely to die than White patients were. There are lots of complicated issues here, but the explanation is probably some combination of exposure through conditions of work, transportation, and residence; existing health conditions; and access to and quality of care. My question is more basic, though: What are the age-specific mortality rates by race/ethnicity?

Start tangent on why age-specific comparisons are important. In demography, breaking things down by age is a basic first-pass statistical control. Age isn’t inherently the most important variable, but (1) so many things are so strongly affected by age, (2) so many groups differ greatly in their age compositions, and (3) age is so straightforward to measure, that it’s often the most reasonable first cut when comparing groups. Very frequently we find that a simple comparison is reversed when age is controlled. Consider a classic example: mortality in a richer country (USA) versus a poorer country (Jordan). People in the USA live four years longer, on average, but Americans are more than twice as likely to die each year (9 per 1,000 versus 4 per 1,000). The difference is age: 23% of Americans are over age 60, compared with 6% of Jordanians. More old people means more total deaths, but compare within age groups and Americans are less likely to die. A simple separation by age facilitates more meaningful comparison for most purposes. So that’s how I want to compare COVID-19 mortality across race/ethnic groups in the USA. End tangent.
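To make the reversal concrete, here is a minimal sketch of how a crude death rate is built from age-specific rates. The age shares (23% and 6% over age 60) are the ones quoted above; the age-specific death rates are hypothetical numbers picked for illustration, not actual US or Jordanian rates, and the two-group age split is a simplifying assumption.

```python
# Age shares come from the post (23% vs. 6% over age 60); the age-specific
# death rates per 1,000 are HYPOTHETICAL, chosen only for illustration.
countries = {
    "richer": {"share": {"under_60": 0.77, "over_60": 0.23},
               "rate": {"under_60": 2.0, "over_60": 32.0}},
    "poorer": {"share": {"under_60": 0.94, "over_60": 0.06},
               "rate": {"under_60": 2.5, "over_60": 40.0}},
}


def crude_rate(country):
    """Crude death rate per 1,000: age-specific rates weighted by age shares."""
    return sum(country["share"][age] * country["rate"][age]
               for age in country["share"])


crude = {name: crude_rate(c) for name, c in countries.items()}
for name, value in crude.items():
    print(f"{name}: crude death rate = {value:.1f} per 1,000")

# Within every age group the richer country has the lower rate...
lower_in_every_age_group = all(
    countries["richer"]["rate"][age] < countries["poorer"]["rate"][age]
    for age in ("under_60", "over_60")
)
# ...but its crude rate is higher because more of its population is old.
print(lower_in_every_age_group, crude["richer"] > crude["poorer"])
```

The crude rate is just a weighted average, so a country can have lower mortality at every age and still post more deaths per capita if its weights tilt toward the old.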

Age-specific mortality rates

It seems like this should be easier, but I can’t find anyone who is publishing them on an ongoing basis. The Centers for Disease Control posts a weekly data file of COVID-19 deaths by age and race/ethnicity, but they do not include the population denominators that you need to calculate mortality rates. So, for example, it tells you that as of December 5 there have been 2,937 COVID-19 deaths among non-Hispanic Blacks in the age range 30-49, compared with 2,186 deaths among non-Hispanic Whites of the same age. So, a higher count of Black deaths. But it doesn’t tell you there are 4.3-times as many Whites as Blacks in that category. So a much higher mortality rate.
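The arithmetic behind that last step can be sketched in a few lines. The death counts and the 4.3 population ratio are the figures quoted above; the variable names are mine, and because only the rounded ratio is given (not the population counts), the result is approximate.

```python
# Death counts as of December 5, quoted in the post (ages 30-49).
black_deaths = 2937
white_deaths = 2186

# The post says there are about 4.3 times as many Whites as Blacks in this
# age range; only that ratio (not the population counts) is given.
white_to_black_population = 4.3

# rate = deaths / population, so
# rate_black / rate_white = (deaths_black / deaths_white) * (pop_white / pop_black)
count_ratio = black_deaths / white_deaths
rate_ratio = count_ratio * white_to_black_population

print(f"ratio of death counts: {count_ratio:.2f}")
print(f"ratio of death rates:  {rate_ratio:.1f}")
```

With these rounded inputs the rate ratio comes out a little under 6, close to the 5.6 reported further down; the small gap is presumably rounding in the 4.3 figure.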

On a different page, they report the percentage of all deaths in each age range that have occurred in each race/ethnic group, but don’t include each group’s percentage in the population. So, for example, 36% of the people ages 30-39 who have died from COVID-19 were Hispanic, and 24% were non-Hispanic White, but that’s not enough information to calculate mortality rates either. I have no reason to think this is nefarious, but it’s clearly not adequate.

So I went to the 2019 American Community Survey (ACS) data distributed by IPUMS.org to get some denominators. These are a little messy for two main reasons. First, ACS is a survey that asks people what their race and ethnicity are, while death counts are based on death certificates, for which the person who has died is not available to ask. So some people will be identified with a different group when they die than they would if they were surveyed. Second, the ACS and other surveys allow people to specify multiple races (in addition to being Hispanic or not), whereas death certificate data generally does not. So if someone who identifies as Black-and-White on a survey dies, how will the death certificate read? (If you’re very interested, here’s a report on the accuracy of death certificates, and here are the “bridges” they use to try to mash up multiple-race and single-race categories.)

My solution to this is to make denominators more or less the way race/ethnicity was defined before multiple race identification was allowed. I put all Hispanic people, regardless of race, into the Hispanic group. Then I put people who are White, non-Hispanic, and no other race into the White category. And then for the Black, Asian, and American Indian categories, I include people who were multiple race (and not Hispanic). So, for example, a Black-White non-Hispanic person is counted as Black. A Black-Asian non-Hispanic person is counted as both Black and Asian. Note I did also do the calculations for Native Hawaiian and Other Pacific Islanders, but those numbers are very small so I’m not showing them on the graph; they’re on the spreadsheet. Note also I say “American Indian” to include all those who are “non-Hispanic American Indian or Alaska Native.”

This is admittedly crude, but I suggest that you trust me that it’s probably OK. (Probably OK, that is, especially for Whites, Blacks, and Hispanics. American Indians and Asians have higher rates of multiple-race identification among the living, so I expect there would be more slippage there.)
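For anyone who wants to see the assignment rule spelled out, here is a rough sketch of it as a function. This is my paraphrase of the rule described above, not the code actually used with the IPUMS data; the function name, the string labels, and the input format are all hypothetical, and the Native Hawaiian and Other Pacific Islander group is left out for brevity.

```python
MULTI_RACE_GROUPS = ("Black", "Asian", "American Indian")


def denominator_groups(hispanic, races):
    """Return the denominator group(s) a survey respondent is counted in.

    hispanic: bool, True if the respondent reports Hispanic ethnicity.
    races: set of race labels the respondent reports (hypothetical labels).
    """
    if hispanic:
        return ["Hispanic"]          # Hispanic regardless of race
    if races == {"White"}:
        return ["White"]             # White alone, non-Hispanic
    # Non-Hispanic respondents count in every listed group they report,
    # so multiple-race people can appear in more than one denominator.
    return [g for g in MULTI_RACE_GROUPS if g in races]


examples = [
    (True, {"White"}),
    (False, {"White"}),
    (False, {"Black", "White"}),
    (False, {"Black", "Asian"}),
]
for hispanic, races in examples:
    print(hispanic, sorted(races), "->", denominator_groups(hispanic, races))
```

One consequence of the rule is that the group denominators can sum to more than the total population, since a Black-Asian respondent is counted twice.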

Anyway, here’s the absolutely egregious result:

This figure allows race/ethnicity comparisons within the five age groups (under 30 isn’t shown). It reveals that the greatest age-specific disparities are actually at the younger ages. In the range 30-49, Blacks are 5.6-times more likely to die, and Hispanics are 6.6-times more likely to die, than non-Hispanic Whites are. In the oldest age group, over 85, where death rates for everyone are highest, the disparities are only 1.5- and 1.4-to-1 respectively.

Whatever the cause of these disparities, this is just the bottom line, which matters. Please note how very high these rates are at old ages. These are deaths per 100,000, which means that over age 85, 1.8% of all African Americans have died of COVID-19 this year (and 1.7% for Hispanics and 1.2% for Whites). That is — I keep trying to find words to convey the power of these numbers — one out of every 56 African Americans over age 85.

Please stay home if you can.

A spreadsheet file with the data, calculations, and figure, is here: https://osf.io/ewrms/.

Alexa, Hillary, Melania/Ivanka, Donald, Mary, and the top 20 names from last year

First, very happy I finally learned how to make this kind of chart in Stata (thanks to a tip from Stephen McKay). These are the top 20 boys’ and girls’ names given in the US last year, compared with how they did in 2018.

Liam and Noah still running away with the top spots for boys, while traditional names like William, Alexander, Jacob, Michael, and Daniel lost ground. Surprisingly little change among girls. (If you want to make these, you can look at my code here, just ignore the messy data stuff and look at the figure code at the end).

Next, updating some trends to watch.

First, what happens when a giant monopoly storms in and takes over a perfectly nice name enjoying a healthy run among semi-popular names: Alexa. Alexa got up to the #43 position in 2009, possibly got a bounce from early Alexa talk around Amazon (which started in 2014), and then was destroyed by becoming a household name. Reminded by Kieran of another disaster with a different etiology, Hillary, I combined them here:

On a percentage basis Hillary had a worse slide, but Alexa has further to fall, and she’s already down 65% in four years. Alexa, get me a nickname.

Quick update on the tension you all know you feel between Ivanka and Melania: this year I’m happy to say they can share a common bond over their names having both peaked in 2017 at under 300 each. For comparison, and just to rub it in, I added the trend for Malia as well:

Donald, on the other hand, was not ruined by the president who ruins everything he touches, but rather is just being ignored (his least favorite experience), with a steady fall apparently unaffected by his omnipresence.

All the people who claim to love the Donald don’t really want their boys to be him, they want to be able to hide their deplorable allegiance from polite society when the shit goes south. No loyalty.

Not completely unlike Mary, a figure revered by many, but not quite enough to name their girls after her anymore. She’s down another 118 girls, to 2209, or 1.21 per 1000. (The whole Mary story, and the methods for how I got data back to 1780, are in my book.)

On the plus side, Mary will be glad to hear that her nemesis Nevaeh is also on the way down, having lost half her number since peaking at more than 6000 girls in 2010.

Data and code for this project are on the Open Science Framework, here.

Are pandemic effects on birth rates already detectable?

As birth data approaches, maybe we can get beyond analyses like Google searches for pregnancy-related terms to see what’s happening with birth rates.

At this writing we are a few days shy of 35 weeks from February 1st. If I read this right, 10% of US births occur at 36 weeks of gestation or less. But the most recent complete data I see is from August, so it’s early. However, most fertilized human eggs do not come to term, being lost either before (30%) or after (30-40%) implantation. That’s from a paper by Jenna Nobles and Amar Hamoudi, who write:

Evidence suggests that multiple mechanisms may be involved in pregnancy survival, including those that affect placental development and function, fetal oxidative stress, fetal neurological development, and likely many others. These, in turn, are shaped by more distal processes that affect maternal nutrition, maternal exposure to biological and psychosocial stress, maternal exposure to infection, and management of chronic conditions. Pregnancy survival varies with women’s body mass index, consumption of folic acid, and in some studies, reports of stressful life events (citations removed).

The pandemic might reasonably have contributed to a higher rate of pregnancy loss from these factors. And then there are abortions, which people have probably needed more even though they had less access to them (see this report from Guttmacher). So the net effect is unclear.

Setting aside how the pandemic might have affected fertility intentions and planning (I assume this is negative, as reported by Guttmacher), there might already be fewer births, from loss and abortion.

I haven’t looked at every state, but Florida and California report births by month. In Florida, there were 9.5% fewer babies born in August 2020 than in the previous year (they revise these as they go, but the August number has been stable for a little while, so probably won’t increase much). In California there were 9.6% fewer births in August of this year compared with last year. Here are the monthly trends, including the last three years (I included Florida’s September number as of today, but that will certainly rise):

This is going to be tricky because birth rates were already falling in many places. But the average decline in the last three years was 2.9% in California and 0.7% in Florida, so these numbers clearly outpace that naïve expectation. Also, what about spring? Maybe the pandemic was already causing a decline in live births in California in March (from immigrants not coming or staying in Mexico or other countries?), but if the decline in March was unrelated, then it’s not clear how to interpret the drop in August. So it will be complicated. But this is a bona fide blip in the expected direction, so I’m posting it with a question mark.
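As a rough way to see how far the August numbers sit from the recent trend, the comparison can be sketched like this. The declines are the percentages quoted above; treating the three-year average decline as the "expected" change is the naive baseline described in the text, and the raw counts passed to `pct_change` at the end are hypothetical, included only to show how a decline is computed from births.

```python
# Percent declines quoted in the post (August 2020 vs. August 2019), and the
# average annual decline over the previous three years.
states = {
    "California": {"august_decline": 9.6, "trend_decline": 2.9},
    "Florida": {"august_decline": 9.5, "trend_decline": 0.7},
}


def pct_change(new, old):
    """Percent change from old to new (negative means a decline)."""
    return 100 * (new - old) / old


excess = {}
for name, s in states.items():
    # Percentage points of decline beyond what the recent trend would predict.
    excess[name] = round(s["august_decline"] - s["trend_decline"], 1)
    print(f"{name}: {excess[name]} points beyond trend")

# HYPOTHETICAL counts, only to show how a decline is computed from raw births.
print(pct_change(new=904, old=1000))
```

This says nothing about whether the gap is statistically meaningful; it only shows the size of the blip relative to the simplest possible expectation.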

I assume other people will be way ahead of me on this, though I haven’t seen anything. Feel free to post other analyses in the comments.

Why there are 3.1 million extra young adults living at home

Answer: The COVID-19 pandemic.

UPDATE: A new post updates this analysis for July 2020

Catherine Rampell tweeted a link to a Zillow analysis showing 2.2 million adults ages 18-25 moving in with their parents or grandparents in March and April. Zillow’s Treh Manhertz estimates these move-homers would cost the rental market the better part of a billion dollars, or 1.4% of total rent if they stay home for a year.

We now have the June Current Population Survey data to work with, so I extended this forward, and did it differently. CPS is the large monthly survey that the Census Bureau conducts for the Bureau of Labor Statistics, principally to track labor market trends. It also includes basic demographics and living arrangement information. Here is what I came up with.*

Among people ages 18-29, there is a large spike in living in the home of a parent or grandparent (of themselves or their spouse), which I’ll call “living at home” for short. This is apparent in a figure that compares 2020 with the previous 5 years:

[Figure: six-year trends in young adults living at home]

From February to April, the percentage of young adults living at home jumped from 43% to 48%, and then up to 49% in June. Clearly, this is anomalous. (I ran it back to 2008 just to make sure there were no similar jumps around the time of the last recession; in earlier years the rates were lower and there were no similar spikes.) This is a very large disturbance in the Force of Family Demography.

To get a better sense of the magnitude of this event, I modeled it by age, sex, and race/ethnicity. Here is the estimated share of adults living at home by age and sex. For this I use just June of each year, and compare 2020 with the pooled set of 2017-2019. This controls for race/ethnicity.

[Figure: share living at home by age, men and women]

The biggest increase is among 21-year-old men and 20-year-old women, and women under 22 generally. These may be people coming home from college, losing their jobs or apartments, canceling their weddings, or coming home to help.

I ran the same models but broke out race/ethnicity instead (for just White, Black, and Latino, as the samples get small).

[Figure: share living at home by age, White, Black, and Latino]

This shows that the 2020 bounce is greatest for Black young adults (below age 24) and the levels are lowest for Latinos (remember that many Latinos are immigrants whose parents and grandparents don’t live in the US).

To show the total race/ethnic and gender pattern, here are the predicted levels of living at home, controlling for age:

[Figure: predicted share living at home by race/ethnicity and gender]

The biggest 2020 bounce is among Black men and women, with Black men having the highest overall levels, 58%, and White women having the lowest at 44%.

In conclusion, millions of young adults are living with their parents and grandparents who would not be if 2020 were like previous years. The effect is most pronounced among Black young adults. Future research will have to determine which of the many possible disruptions to their lives is driving this event.

For scale, there are 51 million (non-institutionalized) adults ages 18-29 in the country. If 2020 were like the previous three years, I would expect there to be 22.2 million of them living with their parents. Instead there are 25.4 million living at home, an increase of 3.1 million from the expected number (numbers updated for June 2020). That is a lot of rent not being spent, but even with that cost savings I don’t think this is good news.
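The scale calculation is simple enough to reproduce from the rounded figures. This is a back-of-the-envelope sketch using only the three numbers quoted above, not the weighted CPS estimates themselves; the variable names are mine.

```python
# Figures quoted in the post, in millions of adults ages 18-29.
population = 51.0
expected_at_home = 22.2   # if 2020 looked like the previous three years
observed_at_home = 25.4   # June 2020

expected_share = expected_at_home / population
observed_share = observed_at_home / population
excess = observed_at_home - expected_at_home

print(f"expected share at home: {expected_share:.1%}")
print(f"observed share at home: {observed_share:.1%}")
print(f"excess living at home:  {excess:.1f} million")
```

Subtracting the rounded figures gives about 3.2 million rather than 3.1 million; the difference presumably comes from rounding in the published numbers.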


* The IPUMS codebook, Stata code, spreadsheet, and figures are in an Open Science Framework project under CC0 license here: osf.io/2xrhc.

What happens next

Wouldn’t you like to know.

“The pandemic has exposed the messiness of science. … We all want answers today, and science is not going to give them. … Science is uncertainty. And the pace of uncertainty reduction in science is way slower than the pace of a pandemic.” —Brian Nosek

[Figure: games of chance]

The math of probability is manageable, up to a point. In principle, you could calculate the odds of, for example (clockwise from top left) getting heads on the fake Trump coin, a novel coronavirus linking up to human proteins, surviving a round of Russian roulette, having someone with COVID-19 at your planned event, rolling 6-6-4-8-20 on the dice, having all the marbles fall under a normal curve on a Galton board, and then surviving a flight from New York to Los Angeles without hitting one of the thousands of other planes in the air. But that doesn’t mean we can tell where humanity will be a year from now.
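Two of those calculations are easy to sketch. The example below assumes a standard six-chamber revolver with one round, and, for the event risk, that attendees are infected independently at a common prevalence; the 1% prevalence and 50-person event are hypothetical inputs, and the function name is mine.

```python
def p_at_least_one(prevalence, attendees):
    """Chance that at least one of `attendees` independent people is infected."""
    return 1 - (1 - prevalence) ** attendees


# Russian roulette: one round in a six-chamber revolver, single pull.
p_survive_roulette = 5 / 6

# HYPOTHETICAL inputs: 1% of people currently infectious, 50-person event.
p_event = p_at_least_one(prevalence=0.01, attendees=50)

print(f"survive one pull: {p_survive_roulette:.3f}")
print(f"someone infected at the event: {p_event:.3f}")
```

The formula is the easy part. The prevalence that goes into it is exactly the kind of quantity nobody knows with confidence.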

Of course things are always too complicated to predict perfectly, but “normally” we bracket uncertainty and make simplifying assumptions so we can work with forecasts of say, tomorrow’s weather or next quarter’s economic growth — which are strongly bounded by past experience. These are “the models” we live by. The problem now is not that reality has become objectively harder to predict, it’s that the uncertainties in the exercise (those most relevant to our lives) involve events with such catastrophic consequences that a normal level of uncertainty now includes outcomes so extreme that we can’t process them meaningfully.

Once nominally predictable events start influencing each other in complex ways, the uncertainty grows beyond the capacity of simple math. Instead of crunching every possibility, we simplify the assumptions, based on past experience and the outcomes we consider possible. Today’s would-be predictions, however, involve giant centrifugal forces, so that small deviations can lead to disintegration. For example, if the pandemic further tanks the economy, which provides more unemployed people to populate mass protests, leading to more military crackdown and turning more people against Trump, it might make him lash out more at China, and then they might not share their vaccine with us, and an epidemic wave could overwhelm the election and its aftermath, giving Trump a pretext for nullifying the results. And so on.

To make matters more ungraspable, we personally want to know what’s going on at the intersection of micro and macro forces, where we don’t have the data to use even if we knew what to do with it.

Examples

For example, as individuals try to ballpark their own risks of COVID-19 infection, and the likelihood of a serious outcome in that event, given their own health history, they might also want to consider whether they have been exposed to tear gas at the hands of police or the military, which might increase the chance of infection. In that case, both the individual and the state are acting without quantifiable information on the risks. For another example, Black people in America obviously have a reasonable fear of police violence, with potentially immeasurable consequences, but taking the risk to participate in protests might contribute to political changes that end up reducing that risk (for themselves and their loved ones). The personal risks are affected by policy decisions and organizational practices, but you can’t predict (much less control) the outcomes.

Individual risks are affected by group positions, of course, creating diverging profiles that splinter out to the individual level. Here’s an example: race and widowhood. We all know that as a married couple ages, the chance that one of the partners will die increases. But that relationship between age and widowhood differs markedly between Blacks and Whites, as this figure shows:

[Figure: annual widowhood probability by age, Black and White women]

Before age 70, the annual probability of a Black woman being widowed is more than twice the chance that White women face. (After that, the odds are higher, but not as dramatically so.) Is this difference big enough to affect people’s decision making, their emotions, their relationships? I think so, though I can’t prove it. Even if people don’t map out the calculation at this level (though they of course think about their own and their partner’s specific health situation), it’s in there somewhere.

For most people, widowhood presents a pretty low annual probability of a very bad event, one that might turn your life upside down. On the other hand, climate change is certain, and observable over the course of a contemporary adult’s lifetime (look at the figure below, from 1980 to 2020). But although climate change presents potentially catastrophic consequences, the risks aren’t easily incorporated into life choices. If you’re lucky, you might have to think about the pros and cons of owning beachfront property. Or you could be losing a coal job, or gaining a windmill job. But I think for most people in the U.S. it’s in the category of background risk — which might motivate political participation, for example, but doesn’t hang over one’s head as a sense of life-threatening risk.

temperature anomalies trend

If not imminent fear, however, climate change undoubtedly contributes to a climate of uncertainty about the future. Interestingly, there is a robust debate about whether and how climate change is also increasing climate variability. Rising temperatures alone would create more bad storms, floods, and droughts, but more temperature fluctuation would also have additional consequences. I was interested to read this paper which showed models predicting greater change in temperature variability (on the y-axis) for the rest of the century in countries with lower per-capita income (x-axis). When it comes to inequality, it rains and it pours. And for people in poorer countries of the world, it’s raining uncertainty.

tempvar

What comes next

I wrote about unequal uncertainties in April, and possible impacts on marriage rates, and I’ve commented elsewhere on fertility and family violence. But I’m not making a lot of predictions. Are other social scientists? My impression is that they’re mostly wisely holding off. My sense is also that this may be part of a longer-term pattern, where social scientists once made more definitive predictions with less sophisticated models than we do now that we’re buried in data. Is it the abundance of data that makes predicting seem like a bad business? I don’t think that’s it. I think it’s the diminished general confidence in the overall direction of social change. Or maybe predictions have just become more narrow: less world revolution and more fourth quarter corn prices.

One of the books I haven’t written yet, crappily titled Craptastic when I pitched it in 2017, would address this:

My theory for Craptastic is that the catastrophic thinking and uncontrollable feelings of impending doom go beyond the very reasonable reaction to the Trump shitshow that any concerned person would have, and reflect a sense that things are turning around in a suddenly serious way, rupturing what Anthony Giddens describes as the progress narratives of modernity people use to organize their identities. People thought things were sort of going to keep getting better, arc of the moral universe and all that, but suddenly they realize what a naive fantasy that was. It’s not just terrible, it’s craptastic. …

I suspect that if America lives to see this chapter of its decline written, Trump will not be as big a part of the story as it seems he is right now. And that impending realization is one reason for the Trump-inspired dysphoria that so many people are feeling.

Social science is unlikely anytime soon to be the source of reassurance about the future some people might be looking for — not even the reassurance that things will get better, but just confidence that we know what direction we’re headed, and at what speed. I don’t know, but if you know, feel free to leave it in the comments. (Which are moderated.)

Rural COVID-19 paper peer reviewed. OK?

Twelve days ago I posted my paper on the COVID-19 epidemic in rural US counties. I put it on the blog, and on the SocArXiv paper server. At this writing the blog post has been shared on Facebook 69 times, the paper has been downloaded 149 times, and tweeted about by a handful of people. No one has told me it’s wrong yet, but no one has formally endorsed it yet, either.

Until now, that is. The paper, which I then submitted to the European Journal of Environment and Public Health, has now been peer reviewed and accepted. I’ve updated the SocArXiv version to the journal page proofs. Satisfied?

It’s a good question. We’ll come back to it.

Preprints

The other day (I think, not good at counting days anymore) a group of scholars published — or should I say posted — a paper titled, “Preprinting a pandemic: the role of preprints in the COVID-19 pandemic,” which reported that there have already been 16,000 scientific articles published about COVID-19, of which 6,000 were posted on preprint servers. That is, they weren’t peer-reviewed before being shared with the research community and the public. Some of these preprints are great and important, some are wrong and terrible, some are pretty rough, and some just aren’t important. This figure from the paper shows the preprint explosion:

F1.large

All this rapid scientific response to a worldwide crisis is extremely heartening. You can see the little sliver that SocArXiv (which I direct) represents in all that — about 100 papers so far (this link takes you to a search for the covid-19 tag), on subjects ranging from political attitudes to mortality rates to traffic patterns, from many countries around the world. I’m thrilled to be contributing to that, and really enjoy my shifts on the moderation desk these days.

On the other hand some bad papers have gotten out there. Most notoriously, an erroneous paper comparing COVID-19 to HIV stoked conspiracy theories that the virus was deliberately created by evil scientists. It was quickly “withdrawn,” meaning no longer endorsed by the authors, but it remains available to read. More subtly, a study (by more famous researchers) done in Santa Clara County, California, claimed to find a very high rate of infection in the general population, implying COVID-19 has a very low death rate (good news!), but it was riddled with design and execution errors (oh well), and accusations of bias and corruption. And some others.

Less remarked upon has been the widespread reporting by major news organizations on preprints that aren’t as controversial but have become part of the knowledge base of the crisis. For example, the New York Times ran a report on this preprint on page 1, under the headline, “Lockdown Delays Cost at Least 36,000 Lives, Data Show” (which looks reasonable in my opinion, although the interpretation is debatable), and the Washington Post led with, “U.S. Deaths Soared in Early Weeks of Pandemic, Far Exceeding Number Attributed to Covid-19,” based on this preprint. These media organizations offer a kind of endorsement, too. How could you not find this credible?

postpreprint

Peer review

To help sort out the veracity or truthiness of rapid publications, the administrators of the bioRxiv and medRxiv preprint servers (who are working together) have added this disclaimer in red to the top of their pages:

Caution: Preprints are preliminary reports of work that have not been certified by peer review. They should not be relied on to guide clinical practice or health-related behavior and should not be reported in news media as established information.

That’s reasonable. You don’t want people jumping the gun on clinical decisions, or news reports. Unless they should, of course. And, on the other hand, lots of peer reviewed research is wrong, too. I’m not compiling examples of this, but you can always consult the Retraction Watch database, which, for example, lists 130 papers published in Elsevier journals in 2019 that have been retracted for reasons ranging from plagiarism to “fake peer review” to forged authorship to simple errors. The database lists a few peer-reviewed COVID-19 papers that have already been retracted as well.

This comparison suggests that the standard of truthiness cannot be down to the simple dichotomy of peer reviewed or not. We need signals, but they don’t have to be that crude. In real life, we use a variety of signals for credibility that help determine how much to trust a piece of research. These include:

  • The reputation of the authors (their degrees, awards, twitter following, media presence)
  • The institutions that employ them (everyone loves to refer to these when they are fancy universities reporting results they favor, e.g., “the Columbia study showed…”)
  • Who published it (a journal, an association, a book publisher), which implies a whole secondary layer of endorsements (e.g., the editor of the journal, the assumed expertise of the reviewers, the prestige or impact factor of the journal as a whole, etc.)
  • Perceived conflicts of interest among the authors or publishers
  • The transparency of the research (e.g., are the data and materials available for inspection and replication)
  • Informal endorsements, from, e.g., people we respect on social media, or people using the Plaudit button (which is great and you should definitely use if you’re a researcher)
  • And finally, of course, our own assessment of the quality of the work, if it’s something we believe ourselves qualified to assess

As with the debate over the SAT/GRE for admissions, the quiet indicators sometimes do a lot of the work. Call something a “Harvard study” or a “New York Times report,” and people don’t often pry into the details of the peer review process.

Analogy: People who want to eat only kosher food need something to go on in daily life, and so they have erected a set of institutional devices that deliver such a seal (in fact, there are competing seal brands, but they all offer the same service: a yes/no endorsement by an organization one decides to trust). The seals cost money, which is added to the cost of the food; if people like it, they’re willing to pay. But, as God would presumably tell you, the seal should not always substitute for your own good judgment because even rabbis or honest food producers can make mistakes. And in the absence of a good kosher inspection to rely on altogether, you still have to eat; you just have to reason things through to the best of your ability. (In a pinch, maybe follow the guy with the big hat and see what he eats.) Finally, crucially for the analogy, anyone who tells you to ignore the evidence before you and always trust the authority that’s selling the dichotomous indicator is probably serving their own interests at least as much as they’re serving yours.

In the case of peer review, giant corporations, major institutions, and millions of careers depend on people believing that peer review is what you need to decide what to trust. And they also happen to be selling peer review services.

My COVID-19 paper

So should you trust my paper? Looking back at our list, you can see that I have degrees and some minor awards, some previous publications, some twitter followers, and some journalists who trust me. I work at a public research university that has its own reputation to protect. I have no apparent way of profiting from you believing one thing or another about COVID-19 in rural areas (I declared no conflicts of interest on the SocArXiv submission form). I made my data and code available (even if no one checks it, the fact that it’s there should increase your confidence). And of course you can read it.

And then I submitted it to the European Journal of Environment and Public Health, which, after peer review, endorsed its quality and agreed to publish it. The journal is published by Veritas Publications in the UK with the support of Tsinghua University in China. It’s an open access journal that has been publishing for only three years. It’s not indexed by Web of Science or listed in the Directory of Open Access Journals. It is, in short, a low-status journal. On the plus side, it has an editorial board of real researchers, albeit mostly at lower status institutions. It publishes real papers, and (at least for now) it doesn’t charge authors any publication fee, it does a little peer review, and it is fast. My paper was accepted in four days with essentially no revisions, after one reviewer read it (based on the summary, I believe they did read it). It’s open access, and I kept my copyright. I chose it partly because one of the papers I found on Google Scholar during my literature search was published there and it seemed OK.

So, now it’s peer reviewed.

Here’s a lesson: when you set a dichotomous standard like peer-reviewed yes/no and tell the public to trust it, you create the incentive for people to do the least they can to just barely get over that bar. This is why we have a giant industry of tens of thousands of academic journals producing products all branded as peer reviewed. Half a century ago, some academics declared themselves the gatekeepers of quality, and called their system peer review. To protect the authority of their expertise (and probably because they believed they knew best), they insisted it was the standard that mattered. But they couldn’t prevent other people from doing it, too. And so we have a constant struggle over what gets to be counted, and an effort to disqualify some journals with labels like “predatory,” even though it’s the billion-dollar corporations at the top of this system that are preying on us the most (along with lots of smaller scam purveyors).

In the case of my paper, I wouldn’t tell you to trust it much more because it’s in EJEPH, although I don’t think the journal is a scam. It’s just one indicator. But I can say it’s peer reviewed now and you can’t stop me.

Aside on service and reciprocity: Immediately after I submitted my paper, the EJEPH editors sent me a paper to review, which I respect. I declined because it wasn’t qualified, and then they sent me another. This assignment I accepted. The paper was definitely outside my areas of expertise, but it was a small study quite transparently done, in Nigeria. I was able to verify important details — like the relevance of the question asked (from cited literature), the nature of the study site (from Google maps and directories), the standards of measurement used (from other studies), the type of the instruments used (widely available), and the statistical analysis. I suggested some improvements to the contextualization of the write-up and recommended publication. I see no reason why this paper shouldn’t be published with the peer review seal of approval. If it turns out to be important, great. If not, fine. Like my paper, honestly. I have to say, it was a refreshing peer review experience on both ends.

How big will the drop in weddings be? Big

With data snapshot addendum at the end.

In the short run, people are canceling their weddings that were already booked, or not planning the ones they were going to have this summer or fall. In the long run, we don’t know.

To look at the short run effect, I used Google Trends to extract the level of traffic for five searches over the last five years: wedding dresses, bridal shower, wedding license, wedding shower, and wedding invitations (here is the link to one, just change the terms to get the others). These are things you Google when you’re getting married. Google reports search volume for each term weekly, scaled from 0 to 100.

Search traffic for these terms is highly correlated across weeks, with pairwise correlations between .45 and .76. I used Stata to combine them into an index (alpha = .92), which ranges from 22 to 87 for 261 weeks, from May 2015 to last week.
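For readers who want to see what goes into that kind of index, here is a minimal sketch in Python. It assumes the index is a simple average of the items and uses the textbook formula for Cronbach's alpha; the Stata options I used may differ, and the weekly numbers below are made up for illustration, not the Google Trends data.

```python
from statistics import mean, pvariance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of equal-length series (one per search term)."""
    k = len(items)
    # Total score for each week: the sum across all items.
    totals = [sum(week) for week in zip(*items)]
    # Sum of the individual item variances.
    item_var = sum(pvariance(series) for series in items)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

def mean_index(items):
    """Weekly index: the simple average of the items (an assumption)."""
    return [mean(week) for week in zip(*items)]

# Hypothetical weekly search volumes (0-100 scale), not the real data.
searches = {
    "wedding dresses": [60, 80, 70, 50, 40],
    "bridal shower": [55, 75, 72, 48, 42],
    "wedding invitations": [50, 78, 65, 45, 35],
}
items = list(searches.values())
alpha = cronbach_alpha(items)
index = mean_index(items)
print(f"alpha = {alpha:.2f}")
print("index:", [round(v, 1) for v in index])
```

Alpha gets closer to 1 as the items move together more closely, which is the justification for treating them as one index.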

For the graph, I smoothed the trend with a 5-week average. Here is the trend, with dates for the peaks and troughs (click to enlarge):

wedding plans searches.xlsx
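The 5-week smoothing can be sketched like this. I am assuming a centered window here (two weeks on either side), with shorter windows at the ends of the series; a trailing window would be an equally reasonable choice. The input values are hypothetical.

```python
def moving_average(values, window=5):
    """Centered moving average; windows are shortened at the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Hypothetical weekly index values, not the real series.
weekly_index = [40, 42, 80, 85, 60, 45, 30]
smoothed = moving_average(weekly_index)
print([round(v, 1) for v in smoothed])
```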

The annual pattern is very strong. Each year people do a lot of wedding searches for about two months, from mid-January to early-March, before traffic falls for the rest of the year, until after Thanksgiving. There is a decline over these five years, but I don’t put too much stock in that because maybe the terms people use are changing over time.

But this year there is a break. After starting out with a normal spike in mid-January, searches lurched downward into February, and then collapsed to their lowest level in five years — at what should have been the height of the wedding Google search season.

Clearly, there will be a decline in weddings this spring and summer, or until we “reopen,” whatever that means. A lot of people just can’t get married. When you think about the timing of marriage, most people getting married in a given year are probably already planning to at least half a year in advance. So even if no one’s relationships are affected, and their long term plans don’t change, we’ll still see a decline in marriages this year just from canceled plans.

Beyond that, however, people probably aren’t meeting and falling in love as much. People who are dating probably aren’t as likely to advance their relationships through what would have been a normal development – dating, love, kids, marriage, and so on. So a lot of existing relationships – even for people who weren’t engaged – probably aren’t moving toward marriage. Even if they get back on track later, that’s a delay of a year or two or however long. This says nothing about people being stressed, miserable, sick (or worse), and otherwise not in any kind of mood.

In the longer term, what does the pandemic mean for confidence in the future? The crisis will undermine people’s ability to make long term decisions and commitments. Unless the cultural or cognitive model of marriage changes, insecurity or instability will mean less marriage in the future. This could be a long term effect even after the acute period passes.

What about a rebound? Eventually – again, whenever that is – there probably will be some rebound. At least, just practically, some people who put off marriage will go ahead and do it later. Although, as with delayed births, some postponed marriages probably will end up being forgone. On a larger scale, when people can get out and get together and get married again, there might well be a marriage bounce (and also even a baby boom). Presumably that would depend on a very positive scenario: a vaccine, an economic resurgence, maybe a big government boost, like after WWII. A surge in optimism about the future, happiness. That’s all possible. This also depends on the cultural model of marriage we have now, so that good times equals more marriage (and childbearing). In real life, any such effect might be small, dwarfed by big declines from chaos, fear, and uncertainty. I can’t predict how these different impulses might play against each other. However, on balance, my out-on-a-limb forecast is a decline in marriage.

kissing sailor

Data snapshot addendum

I didn’t realize there was monthly data available already. For example, in Florida they release monthly marriage counts by county, and they have released the April numbers. These show a 1% increase in marriages year-over-year in January, a 31% increase in February, then a 31% drop in March and a 72% drop in April. [Since I first posted this, Florida added 477 more marriages in April, and a few in the earlier months, changing these percentages by a couple points as of June 5. -pnc.] Here is a scatter plot [updated] showing the count of marriages by county in 2019 and 2020. Counties below the diagonal have fewer marriages in 2020 than they did in 2019. Not surprising, but still dramatic to see it happening in “real time” (not really, just in quickly available data).

florida marriages.xlsx
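The arithmetic behind the year-over-year percentages and the "below the diagonal" reading of the scatter plot is simple enough to sketch. The county names and counts here are invented placeholders, not Florida's numbers.

```python
def pct_change(before, after):
    """Percent change from `before` to `after`."""
    return (after - before) / before * 100

# Made-up county counts for the same month in two years (not Florida's data).
april_2019 = {"County A": 200, "County B": 80, "County C": 50}
april_2020 = {"County A": 150, "County B": 20, "County C": 55}

# Statewide year-over-year change for the month.
statewide = pct_change(sum(april_2019.values()), sum(april_2020.values()))

# Counties "below the diagonal" of the scatter plot: fewer marriages in 2020.
declined = [c for c in april_2019 if april_2020[c] < april_2019[c]]

print(f"statewide change: {statewide:.0f}%")
print("declined:", declined)
```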

COVID lecture for Social Problems class

On March 2, I opened up my Social Problems class to questions on the emerging coronavirus epidemic. One of the things I did was show them a graph of worldwide cases on a log scale, and told them that it implied the world would have a million cases a month later. We hit that number to the day two days ago. Here are my notes from that day:

A month later, with school indeed canceled (which I had given only a 10-20% chance on March 2), I recorded this 28-minute lecture for them as an update. Feel free to use any part of it any way you like*:


* Two notes, having watched it over myself and gotten some feedback:

  1. At 4:40 I said of the graph shown: “The number of new cases confirmed by testing, every day, in the country, since February.” I should have said, “in the world” (as the figure is labeled).
  2. It’s been pointed out that social distancing and other responses to the outbreak are not the only things that differentiate trajectories of the different outbreaks around the country. Also relevant is the demography of the area, including age, as well as health status and healthcare infrastructure. Those factors will emerge as the pandemic matures.