Author meets critic: Margaret K. Nelson, Like Family


These are notes for my discussion of Like Family: Narratives of Fictive Kinship, by Margaret K. Nelson, for an Author Meets Critics session at the Eastern Sociological Society, 21 Feb 2021.

Like Family is a fascinating, enjoyable read, full of thought-provoking analysis and rich stories, with detailed scenarios that let the reader consider many possibilities, even those not mentioned in the text. The "economical prose" suggests lots of subtext and brings to mind many different questions (some of which are raised in the wide-ranging footnotes).

It's about people choosing relationships, and choosing to make them "like" family, and about how that means they are and are not "like" family. In the process it tells us a lot about how people think of families altogether, in terms of bonds, obligations, language, and personal history.

In my textbook I use three definitions: the legal family, the personal family, and the family as an institutional arena. This is the personal family, which is people one considers family, on the assumption or understanding they feel the same way.

Why this matters, from a demographer's perspective: Most research uses household definitions of family. That's partly a matter of measurement: household rosters ensure we count each person only once (without a population registry or universal identification) and correctly attribute births to birth parents. But it comes at a cost: we assume household definitions of family too often.

We need formal, legal categories for things like incest laws and hospital rights, and the categories take on their own power. (Consider young-adult semi-step-siblings, with semi-together parents, living together some of the time or not, wondering about the propriety of a sexual relationship with each other.) Reality doesn't just line up with demographic / legal / bureaucratic categories – there is a dance between them. As the Census "relationship" categories proliferate – from 6 in 1960 to about 16 today – people both want to create new relationships (which Nelson calls a "creative" move) and to make their relationships fit within acceptable categories (like same-sex marriage).


Methods and design

The categories investigated here – sibling-like relationships among adults, temporary adult-adolescent relationships, and informal adoptions – are so very different it’s hard to see what they have in common except some language. The book doesn’t give the formal selection criteria, so it’s hard to know exactly how the boundaries around the sample were drawn.

Nelson uses a very inductive process: "Having identified respondents and created a typology, I could refine both my specific and more general research questions" (p. 11). That's not how I think of designing research projects, which just shows the diversity among sociologists.

Over more than one chapter, there is an extended case study of Nicole and her erstwhile guardians Joyce and Don, with whom she fell in when her poorer family of origin essentially broke up. Fascinating story.

The book focuses on White, (mostly) straight, middle-class people. This is somewhat frustrating. The rationale is that they are understudied. That's useful, but it would be more challenging (a challenge, I guess, for subsequent research) to more actively analyze their White, straight, middle-class-ness as part of the research.

Compared to what

A lot of insights in the book come from people comparing their fictive kin relationships to their other family or friend relationships. This raises a methodological issue: these are people with active fictive kin relationships, so it's a select sample from which to draw conclusions about non-fictive relationships. In an ideal world you would have a bigger, unrestricted sample, ask people about all their relationships, and then compare the fictive and the non-fictive ones. It's understandable not to have that, but it needs to be wrestled with (by people doing future research).

Nelson establishes that the sibling-like relationships are neither like friendships nor like family, but a third category. That's just for these people, though. Maybe people without fictive kin like this have family or friend relationships that look just like this in terms of reciprocity, obligation, closeness, and so on. (This applies especially to the adult sibling-like relationships.)

Modern contingency

Great insight with regard to adult “like-sibling” relationships: It’s not just that they are not as close as “family,” it’s that they are not “like family” in the sense of “baggage,” they don’t have that “tarnished reality” – and in that sense they are like the way family relationships are moving, more volitional and individualized and contingent.

Does this research show that family relationships generally in a post-traditional era are fluid and ambiguous and subject to negotiation and choice? It’s hard to know how to read this without comparison families. But here’s a thought. John, who co-parents a teenage child named Ricky, says, “To me family means somebody is part of your life that you are committed to. You don’t have to like everything about them, but whatever they need, you’re willing to give them, and if you need something, you’re willing to ask them, and you’re willing to accept if they can or can’t give it to you” (p. 130). It’s an ideal. Is it a widespread ideal? What if non-fictive family members don’t meet that ideal? The implication may be they aren’t your family anymore. Which could be why we are seeing so many people rupturing their family of origin relationships, especially young adults breaking up with their parents.

It reminds me of what happened with marriage half a century ago, where people set a high standard, and defined relationships that didn’t meet it as “not a marriage.” Or when people say abusive families aren’t really families. Conservatives hate this, because it means you can “just” walk away from bad relationships. There are pros and cons to this view.

Nelson writes at the end of the chapter on informal parents, “The possibility is always there that either party will, at some point in the near or distant future, make a different choice. That is both the simple delight and the heartrending anxiety of these relationships” (p. 133). We can’t know, however, how unique such feelings are to these relationships – I suspect not that much. This sounds so much like Anthony Giddens and the “pure” relationships of late modernity.

This contingency comes up a few times, and I always have the same question. Nelson writes in the conclusion, “Those relationships feel lighter, more buoyant, more simply based in deep-seated affection than do those they experience with their ‘real’ kin.” But that tells us how these people feel about real kin, not how everyone does. It raises a question for future research. Maybe outside this population lots of people feel the same way about their “real” kin (ask the growing number of parents who have been “unfriended” by their adult children).

I definitely recommend this book, to read, teach, and use to think about future research.

Note: In the discussion Nelson replied that most people have active fictive-kin relationships, so this sample is not so select in that respect.

Data analysis shows Journal Impact Factors in sociology are pretty worthless

The impact of Impact Factors

Some of this first section is lifted from my blockbuster report, Scholarly Communication in Sociology, where you can also find the references.

When a piece of scholarship is first published it’s not possible to gauge its importance immediately unless you are already familiar with its specific research field. One of the functions of journals is to alert potential readers to good new research, and the placement of articles in prestigious journals is a key indicator.

Since at least 1927, librarians have been using the number of citations to the articles in a journal as a way to decide whether to subscribe to that journal. More recently, bibliographers introduced a standard method for comparing journals, known as the journal impact factor (JIF). This requires data for three years, and is calculated as the number of citations in the third year to articles published over the two prior years, divided by the total number of articles published in those two years.

For example, in American Sociological Review there were 86 articles published in the years 2017-18, and those articles were cited 548 times in 2019 by journals indexed in Web of Science, so the JIF of ASR is 548/86 = 6.37. This allows for a comparison of impact across journals. Thus, the comparable calculation for Social Science Research is 531/271 = 1.96, and it's clear that ASR is a more widely-cited journal. However, comparisons of journals in different fields using JIFs are less helpful. For example, the JIF for the top medical journal, New England Journal of Medicine, is currently 75, because there are many more medical journals publishing and citing more articles at higher rates, and more quickly than do sociology journals. (Or maybe NEJM is just that much more important.)
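For readers who want the arithmetic spelled out, here is a minimal sketch (in Python, though nothing here depends on the language); the function is my own, and the numbers are just the ones reported above.

```python
def jif(cites_in_year_3, articles_in_years_1_2):
    """Journal impact factor: citations received in year 3 by articles
    published in years 1 and 2, divided by the number of those articles."""
    return cites_in_year_3 / articles_in_years_1_2

print(round(jif(548, 86), 2))   # ASR, 2019: 6.37
print(round(jif(531, 271), 2))  # Social Science Research, 2019: 1.96
```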

In addition to complications in making comparisons, there are problems with JIFs (besides the obvious limitation that citations are only one possible evaluation metric). They depend on what journals and articles are in the database being used. And they mostly measure short-term impact. Most important for my purposes here, however, is that they are often misused to judge the importance of articles rather than journals. That is, if you are a librarian deciding what journal to subscribe to, JIF is a useful way of knowing which journals your users might want to access. But if you are evaluating a scholar’s research, knowing that they published in a high-JIF journal does not mean that their article will turn out to be important. It is especially wrong to look at an article that’s old enough to have citations you could count (or not) and judge its quality by the journal it’s published in — but people do that all the time.

To illustrate this, I gathered citation data from the almost 2,500 articles published in 2016-2019 in 15 sociology journals from the Web of Science category list.* In JIF these rank from #2 (American Sociological Review, 6.37) to #46 (Social Forces, 1.95). I chose these to represent a range of impact factors, and because they are either generalist journals (e.g., ASR, Sociological Science, Social Forces) or sociology-focused enough that almost any article they publish could have been published in a generalist journal as well. Here is a figure showing the distribution of citations to those articles as of December 2020, by journal, ordered from higher to lower JIF.

After ASR, Sociology of Education, and American Journal of Sociology, it’s hard to see much of a slope here. Outliers might be playing a big role (for example that very popular article in Sociology of Religion, “Make America Christian Again: Christian Nationalism and Voting for Donald Trump in the 2016 Presidential Election,” by Whitehead, Perry, and Baker in 2018). But there’s a more subtle problem, which is the timing of the measures. My collection of articles is 2016-2019. The JIFs I’m using are from 2019, based on citations to 2017-2018 articles. These journals bounce around; for example, Sociology of Religion jumped from 1.6 to 2.6 in 2019. (I address that issue in the supplemental analysis below.) So what is a lazy promotion and tenure committee, which is probably working off a mental reputation map at least a dozen years old, to do?

You can already tell where I’m going with this: In these sociology journals, there is so much noise in citation rates within the journals, compared to any stable difference between them, that outside the very top the journal ranking won’t much help you predict how much a given paper will be cited. If you assume a paper published in AJS will be more important than one published in Social Forces, you might be right, but if the odds that you’re wrong are too high, you just shouldn’t assume anything. Let’s look closer.

Sociology failure rates

I recently read this cool paper (also paywalled in the Journal of Informetrics) that estimates this "failure probability": the odds that your guess about which paper will be more impactful, based on the journal title, turns out to be wrong. When JIFs are similar, the odds of an error are very high, like a coin flip. "In two journals whose JIFs are ten-fold different, the failure probability is low," Brito and Rodríguez-Navarro conclude. "However, in most cases when two papers are compared, the JIFs of the journals are not so different. Then, the failure probability can be close to 0.5, which is equivalent to evaluating by coin flipping."

Their formulas look pretty complicated to me, so for my sociology approach I just did it by brute force (or if you need tenure you could call it a Monte Carlo approach). I randomly sampled 100,000 times from each possible pair of journals, then calculated the percentage of times the article with more citations was from a journal with a higher impact factor. For example, in 100,000 comparisons of random pairs sampled from ASR and Social Forces (the two journals with the biggest JIF spread), 73% of the time the ASR article had more citations.
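My actual code is Stata (linked at the end of the post); here is a rough sketch of the same brute-force idea in Python, with made-up citation counts standing in for the real per-article data. In this version ties count against the higher-JIF journal, which is one of several choices you could make.

```python
import numpy as np

rng = np.random.default_rng(2020)

def match_rate(cites_high_jif, cites_low_jif, n_draws=100_000):
    """Share of randomly drawn article pairs in which the article from the
    higher-JIF journal has strictly more citations than the other one."""
    a = rng.choice(np.asarray(cites_high_jif), size=n_draws, replace=True)
    b = rng.choice(np.asarray(cites_low_jif), size=n_draws, replace=True)
    return (a > b).mean()

# Illustrative citation-count lists, not the real data (which is at osf.io/zutws)
higher_jif_journal = [40, 12, 8, 25, 3, 60, 17, 9, 5, 30]
lower_jif_journal = [10, 2, 7, 1, 15, 4, 9, 3, 6, 8]
print(match_rate(higher_jif_journal, lower_jif_journal))
```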

Is 73% a lot? It’s better than a coin toss, but I’d hate to have a promotion or hiring decision be influenced by an instrument that blunt. Here are results of the 10.5 million comparisons I made (I love computers). Click to enlarge:

Outside of the ASR column, these are very bad; even in the ASR column they're pretty bad. For example, a random article from AJS has more citations than one from the 12 lower-JIF journals only 59% of the time. So if you're reading CVs, and you see one candidate with a two-year-old AJS article and one with a two-year-old Work & Occupations article, what are you supposed to do? You could compare the actual citations the two articles have gotten, or you could assess their quality or impact some other way. You absolutely should not just skim the CV and assume the AJS article is or will be more influential based on the journal title alone; the failure probability of that assumption is too high.

On my table you can also see some anomalies, of the kind which plague this system. See all that brown in the BJS and Sociology of Religion columns? That’s because both of those journals had sudden increases in their JIF, so their more recent articles have more citations, and most of the comparisons in this table (like in your memory, probably) are based on data from a few years before that. People who published in these journals three years ago are today getting an undeserved JIF bounce from having these titles on their CVs. (See the supplemental analysis below for more on this.)

Conclusion

Using JIF to decide which papers in different sociology journals are likely to be more impactful is a bad idea. Of course, lots of people know JIF is imperfect, but they can’t help themselves when evaluating CVs for hiring or promotion. And when you show them evidence like this, they might say “but what is the alternative?” But as Brito & Rodríguez-Navarro write: “if something were wrong, misleading, and inequitable the lack of an alternative is not a cause for continuing using it.” These error rates are unacceptably high.

In sociology most people won’t own up to relying on impact factors, but most people (in my experience) do judge research by where it’s published all the time. If there is a very big difference in status — enough to be associated with an appreciably different acceptance rate, for example — that’s not always wrong. But it’s a bad default.

In 2015 the biologist Michael Eisen suggested that tenured faculty should remove the journal titles from their CVs and websites, and just give readers the title of the paper and a link to it. He’s done it for his lab’s website, and I urge you to look at it just to experience the weightlessness of an academic space where for a moment overt prestige and status markers aren’t telling you what to think. I don’t know how many people have taken him up on it. I did it for my website, with the explanation, “I’ve left the titles off the journals here, to prevent biasing your evaluation of the work before you read it.” Whatever status I’ve lost I’ve made up for in virtue-signaling self-satisfaction — try it! (You can still get the titles from my CV, because I feel like that’s part of the record somehow.)

Finally, I hope sociologists will become more sociological in their evaluation of research — and of the systems that disseminate, categorize, rank, and profit from it.

Supplemental analysis

The analysis thus far is, in my view, a damning indictment of real-world reliance on the Journal Impact Factor for judging articles, and thus the researchers who produce them. However, it conflates two problems with the JIF. First is the statistical problem of imputing status from an aggregate to an individual, when the aggregate measure fails to capture variation that is very wide relative to the difference between groups. Second, more specific to JIF, is the reliance on a very time-specific comparison: citations in year three to publications in years one and two. Someone could do (maybe already has) an analysis to determine the best lag structure for JIF to maximize its predictive power, but the conclusions from the first problem imply that’s a fool’s errand.

Anyway, in my sample the second problem is clearly relevant. My analysis relies strictly on the rank-ordering provided by the JIF to determine whether article comparisons succeed or fail. However, the sample I drew covers four years, 2016-2019, and counts citations to all of them through 2020. This difference in time window produces a rank ordering that differs substantially (the rank order correlation is .73), as you can see:

In particular, three journals (BJS, SOR, and SFO) moved more than five spots in the ranking. A glance at the results table above shows that these journals are dragging down the matching success rate. To pull these two problems apart, I repeated the analysis using the ranking produced within the sample itself.

The results are now much more straightforward. First, here is the same box plot but with the new ordering. Now you can see the ranking more clearly, though you still have to squint a little.

And in the match rate analysis, the result is now driven by differences in means and variances rather than by the mismatch between JIF and sample-mean rankings (click to enlarge):

This makes a more logical pattern. The most differentiated journal, ASR, has the highest success rate, and the journals closest together in the ranking fail the most. However, please don’t take from this that such a ranking becomes a legitimate way to judge articles. The overall average on this table is still only 58%, up only 4 points from the original table. Even with a ranking that more closely conforms to the sample, this confirms Brito and Rodríguez-Navarro’s conclusion: “[when rankings] of the journals are not so different … the failure probability can be close to 0.5, which is equivalent to evaluating by coin flipping.”

These match numbers are too low to responsibly use in such a way. These major sociology journals have citation rates that are too variable, and too similar at the mean, to be useful as a way to judge articles. ASR stands apart, but only because of the rest of the field. Even judging an ASR paper against its lower-ranked competitors produces a successful one-to-one ranking of papers just 72% of the time — and that only rises to 82% with the least-cited journal on the list.

The supplemental analysis is helpful for differentiating the multiple problems with JIF, but it does nothing to solve the problem of using journal citation rates to evaluate individual articles.


*The data and Stata code I used is up here: osf.io/zutws. This includes the lists of all articles in the 15 journals from 2016 to 2020 and their citation counts as of the other day (I excluded 2020 papers from the analysis, but they’re in the lists). I forgot to save the version of the 100k-case random file that I used to do this, so I guess that can never be perfectly replicated; but you can probably do it better anyway.

COVID-19 mortality rates by race/ethnicity and age

Why are there such great disparities in COVID-19 deaths across race/ethnic groups in the U.S.? Here’s a recent review from New York City:

The racial/ethnic disparities in COVID-related mortality may be explained by increased risk of disease because of difficulty engaging in social distancing because of crowding and occupation, and increased disease severity because of reduced access to health care, delay in seeking care, or receipt of care in low-resourced settings. Another explanation may be the higher rates of hypertension, diabetes, obesity, and chronic kidney disease among Black and Hispanic populations, all of which worsen outcomes. The role of comorbidity in explaining racial/ethnic disparities in hospitalization and mortality has been investigated in only 1 study, which did not include Hispanic patients. Although poverty, low educational attainment, and residence in areas with high densities of Black and Hispanic populations are associated with higher hospitalizations and COVID-19–related deaths in NYC, the effect of neighborhood socioeconomic status on likelihood of hospitalization, severity of illness, and death is unknown. COVID-19–related outcomes in Asian patients have also been incompletely explored.

The analysis, interestingly, found that Black and Hispanic patients in New York City, once hospitalized, were less likely to die than White patients were. Lots of complicated issues here, but some combination of exposure through conditions of work, transportation, and residence; existing health conditions; and access to and quality of care. My question is more basic, though: What are the age-specific mortality rates by race/ethnicity?

Start tangent on why age-specific comparisons are important. In demography, breaking things down by age is a basic first-pass statistical control. Age isn't inherently the most important variable, but (1) so many things are so strongly affected by age, (2) so many groups differ greatly in their age compositions, and (3) age is so straightforward to measure, that it's often the most reasonable first cut when comparing groups. Very frequently we find that a simple comparison is reversed when age is controlled. Consider a classic example: mortality in a richer country (USA) versus a poorer country (Jordan). People in the USA live four years longer, on average, but Americans are more than twice as likely to die each year (9 per 1,000 versus 4 per 1,000). The difference is age: 23% of Americans are over age 60, compared with 6% of Jordanians. More old people means more total deaths, but compare within age groups and Americans are less likely to die. A simple separation by age facilitates more meaningful comparison for most purposes. So that's how I want to compare COVID-19 mortality across race/ethnic groups in the USA. End tangent.
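To make the logic of that tangent concrete, here is a toy calculation; the numbers are invented for illustration (they are not the actual U.S. or Jordan rates). A country with lower death rates at every age can still have a higher crude death rate if its population is older.

```python
# Hypothetical illustration: country A has lower death rates at every age
# but an older population, so its crude death rate comes out higher.
ages = ["0-59", "60+"]
rate_a = {"0-59": 2.0, "60+": 40.0}    # deaths per 1,000, country A (invented)
rate_b = {"0-59": 4.0, "60+": 50.0}    # deaths per 1,000, country B (invented)
share_a = {"0-59": 0.77, "60+": 0.23}  # population shares, older country
share_b = {"0-59": 0.94, "60+": 0.06}  # population shares, younger country

crude_a = sum(rate_a[g] * share_a[g] for g in ages)
crude_b = sum(rate_b[g] * share_b[g] for g in ages)
print(round(crude_a, 1), round(crude_b, 1))  # 10.7 vs 6.8: A looks worse overall
```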

Age-specific mortality rates

It seems like this should be easier, but I can't find anyone who is publishing them on an ongoing basis. The Centers for Disease Control posts a weekly data file of COVID-19 deaths by age and race/ethnicity, but they do not include the population denominators that you need to calculate mortality rates. So, for example, it tells you that as of December 5 there have been 2,937 COVID-19 deaths among non-Hispanic Blacks in the age range 30-49, compared with 2,186 deaths among non-Hispanic Whites of the same age. So, a higher count of Black deaths. But it doesn't tell you that there are 4.3-times as many Whites as Blacks in that category, which means the Black mortality rate is much higher.

On a different page, they report the percentage of all deaths in each age range that have occurred in each race/ethnic group, but don't include each group's percentage of the population. So, for example, 36% of the people ages 30-39 who have died from COVID-19 were Hispanic, and 24% were non-Hispanic White, but that's not enough information to calculate mortality rates either. I have no reason to think this is nefarious, but it's clearly not adequate.

So I went to the 2019 American Community Survey (ACS) data distributed by IPUMS.org to get some denominators. These are a little messy for two main reasons. First, ACS is a survey that asks people what their race and ethnicity are, while death counts are based on death certificates, for which the person who has died is not available to ask. So some people will be identified with a different group when they die than they would if they were surveyed. Second, the ACS and other surveys allow people to specify multiple races (in addition to being Hispanic or not), whereas death certificate data generally does not. So if someone who identifies as Black-and-White on a survey dies, how will the death certificate read? (If you’re very interested, here’s a report on the accuracy of death certificates, and here are the “bridges” they use to try to mash up multiple-race and single-race categories.)

My solution to this is to make denominators more or less the way race/ethnicity was defined before multiple-race identification was allowed. I put all Hispanic people, regardless of race, into the Hispanic group. Then I put people who are White, non-Hispanic, and no other race into the White category. And then for the Black, Asian, and American Indian categories, I include people who were multiple-race (and not Hispanic). So, for example, a Black-White non-Hispanic person is counted as Black. A Black-Asian non-Hispanic person is counted as both Black and Asian. Note I did also do the calculations for Native Hawaiian and Other Pacific Islanders, but those numbers are very small so I'm not showing them on the graph; they're in the spreadsheet. Note also I say "American Indian" to include all those who are "non-Hispanic American Indian or Alaska Native."
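In case it helps, here is a minimal sketch of that recode and of the rate calculation itself. The variable names (hispan, white, black, asian, aian) are illustrative stand-ins, not the actual ACS/IPUMS field names; the real calculations are in the spreadsheet linked at the end of the post.

```python
def race_eth_groups(hispan, white, black, asian, aian):
    """Return the denominator group(s) a person counts toward, following the
    rules described above (arguments are booleans for each reported race)."""
    if hispan:
        return ["Hispanic"]                 # any race, Hispanic
    groups = []
    if white and not (black or asian or aian):
        groups.append("White")              # single-race non-Hispanic White only
    if black:
        groups.append("Black")              # multiple-race included
    if asian:
        groups.append("Asian")              # multiple-race included
    if aian:
        groups.append("American Indian")    # multiple-race included
    return groups

def rate_per_100k(deaths, population):
    """Age-specific mortality rate per 100,000."""
    return deaths / population * 100_000
```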

This is admittedly crude, but I suggest that you trust me that it’s probably OK. (Probably OK, that is, especially for Whites, Blacks, and Hispanics. American Indians and Asians have higher rates of multiple-race identification among the living, so I expect there would be more slippage there.)

Anyway, here’s the absolutely egregious result:

This figure allows race/ethnicity comparisons within the five age groups (under 30 isn’t shown). It reveals that the greatest age-specific disparities are actually at the younger ages. In the range 30-49, Blacks are 5.6-times more likely to die, and Hispanics are 6.6-times more likely to die, than non-Hispanic Whites are. In the oldest age group, over 85, where death rates for everyone are highest, the disparities are only 1.5- and 1.4-to-1 respectively.

Whatever the cause of these disparities, this is just the bottom line, which matters. Please note how very high these rates are at old ages. These are deaths per 100,000, which means that over age 85, 1.8% of all African Americans have died of COVID-19 this year (and 1.7% for Hispanics and 1.2% for Whites). That is — I keep trying to find words to convey the power of these numbers — one out of every 56 African Americans over age 85.

Please stay home if you can.

A spreadsheet file with the data, calculations, and figure, is here: https://osf.io/ewrms/.

Measuring inequality, and what the Gini index does (video)

I produced a short video on measuring inequality, focusing on the construction of the Gini index, the trend in US family inequality, and an example of using it to measure world inequality. It’s 15 minutes, intended for intro-level sociology students.

I like teaching this not because so many of my students end up calculating and analyzing Gini indexes, but because it's a readily interpretable example of the value of condensing a lot of numbers down to one useful one — which opens up the possibility of the kind of analysis we want to do (Going up? Going down? What about France? etc.). It also helps introduce the idea that the social study of inequality is systematic and scientific, and fun for people who like math, too.
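The video does the construction graphically; for the mathematically inclined, here is a minimal sketch of one standard formula, the mean absolute difference between all pairs of incomes divided by twice the mean.

```python
def gini(incomes):
    """Gini index: mean absolute difference across all pairs, divided by
    twice the mean. 0 = perfect equality; approaches 1 as one unit gets everything."""
    n = len(incomes)
    mean = sum(incomes) / n
    abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return abs_diff / (2 * n * n * mean)

print(gini([10, 10, 10, 10]))  # 0.0: everyone equal
print(gini([0, 0, 0, 100]))    # 0.75: one unit has everything (max is 1 - 1/n)
```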

The video is below, or you can watch it (along with my other videos) on YouTube. The slides are available here, including one I left out of the video, briefly discussing Corrado Gini and his bad (fascist, eugenicist) politics. Comments welcome.

Framing social class with sample selection

A lot of qualitative sociology makes comparisons across social class categories. Many researchers build class into their research designs by selecting subjects using broad criteria, most often education level, income level, or occupation. Depending on the set of questions at hand, the class selection categories will vary, focusing on, for example, upbringing and socialization, access to resources, or occupational outlook.

In the absence of a substantive review, here are a few arbitrarily selected exemplar books from my areas of research:

This post was inspired by the question Caitlyn Collins asked the other day on Twitter:

She followed up by saying, “Social class is nebulous, but precision here matters to make meaningful claims. What do we mean when we say we’re talking to poor, working class, middle class, wealthy folks? I’m looking for specific demographic questions, categories, scales sociologists use as screeners.” The thread generated a lot of good ideas.

Income, education, occupation

Screening people for research can be costly and time consuming, so you want to maximize simplicity as well as clarity. So here’s a way of looking at some common screening variables, and what you might get or lose by relying on them in different combinations. This uses the 2018 American Community Survey, provided by IPUMS.org (Stata data file and code here).

  • I used income, education, and occupation to identify the status of individuals, and generated household class categories by the presence or absence of types of people in each. That means everyone in each household is in the same class category (a choice you might or might not want to make).
  • Income: Total household income divided by an equivalency scale (for cost of living). The scale counts each adult as 1 person and each child under 18 as .70, and then raises that count to the power of .70. I divided the resulting distribution into thirds, so households are in the top, middle, or bottom third. The top third is what I called "middle/upper" class; the bottom third is "lower class."
  • Education: I use BA completion to identify households that do (middle/upper) or don't (lower) have a four-year college graduate present. BA holders are 31% of adults.
  • Occupation: I used the 2018 ACS occupation codes, and coded people as middle/upper class if their code was in the range 10 to 3550, which covers management, business, and financial occupations; computer, engineering, and science occupations; education, legal, community service, arts, and media occupations; and healthcare practitioners and technical occupations. It's pretty close to what we used to call "managerial and professional" occupations. Together, these account for 37% of workers.

So each of these three variables identifies an upper/middle class status of about a third of people.

For lower class status, you can just reverse them. The exception is income, which is in three categories. For that, I counted households as lower class if their household income was in the bottom third of the adjusted distribution. In the figures below, that means households are neither middle/upper class nor lower class if they're in the middle of the income distribution. This is easily adjusted.
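For concreteness, here is a rough sketch of those three screens for a single household. The function and variable names are my own, and the income cutoff is assumed to be computed beforehand from the full equivalized distribution; the actual Stata code is linked at the end of the post.

```python
def equivalized_income(hh_income, n_adults, n_children):
    """Household income divided by the equivalency scale described above:
    (adults + 0.7 * children) raised to the 0.7 power."""
    return hh_income / (n_adults + 0.7 * n_children) ** 0.7

def class_screens(hh_income, n_adults, n_children, any_ba, occ_codes, top_third_cutoff):
    """Middle/upper-class flags by each screen; reverse them (with the
    bottom-third income cutoff) for lower-class status."""
    return {
        "income": equivalized_income(hh_income, n_adults, n_children) >= top_third_cutoff,
        "education": any_ba,                                     # a BA holder is present
        "occupation": any(10 <= c <= 3550 for c in occ_codes),   # 2018 ACS code range
    }

# Hypothetical household: two adults, one child, one BA, example occupation codes
print(class_screens(90_000, 2, 1, True, [2100, 4720], top_third_cutoff=55_000))
```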

Venn diagrams

You can make Venn diagrams in Stata using the pvenn2 add-on, which I naturally discovered after making these. If you must know, I made these by generating tables in Stata, downloading this free plotter app, entering the values manually, copying the resulting figures into PowerPoint and adding the text there, then printing them to PDF and extracting the images from the PDF with Photoshop. Not a recommended workflow.

Here they are. I hope the visuals might help people think about, for example, who they might get if they screened on just one of these variables, or how unusual someone is who has a high income or occupation but no BA, and so on. But draw your own conclusions (and feel free to modify the code and follow your own approach). Click to enlarge.

First middle/upper class:

Venn diagram of overlapping class definitions

Then lower class:

Venn diagram of overlapping class definitions.

I said draw your own conclusions, but please don't draw the conclusion that I think this is the best way to define social class. That's a whole different question. This is just about simple ways to select people to be research subjects. For other posts on social class, follow this tag, which includes this post about class self-identification by income and race/ethnicity.


Data and code: osf.io/w2yvf/

Divorce fell in one Florida county (and 31 others), and you will totally believe what happened next

You can really do a lot with the common public misperception that divorce is always going up. Brad Wilcox has been taking advantage of that since at least 2009, when he selectively trumpeted a decline in divorce (a Christmas gift to marriage) as if it was not part of an ongoing trend.

I have reported that the divorce rate in the U.S. (divorces per married woman) fell 21 percent from 2008 to 2017.  And yet yesterday, Faithwire’s Will Maule wrote, “With divorce rates rocketing across the country, it can be easy to lose a bit of hope in the God-ordained bond of marriage.”

Anyway, now there is hope, because, as right-wing podcaster Lee Habeeb wrote in Newsweek, THE INCREDIBLE SUCCESS STORY BEHIND ONE COUNTY'S PLUMMETING DIVORCE RATE SHOULD INSPIRE US ALL. In fact, we may be on the brink of Reversing Social Disintegration, according to Seth Kaplan, writing in National Affairs. That's because of the Culture of Freedom Initiative of the Philanthropy Roundtable (a right-wing funding aggregator run by people like Art Pope, Betsy DeVos, the Bradley Foundation, the Hoover Institution, etc.), which has now been spun off as Communio, a marriage ministry that uses marriage programs to support Christian churches. Writes Kaplan:

The program, which has recently become an independent nonprofit organization called Communio, used the latest marketing techniques to “microtarget” outreach, engaged local churches to maximize its reach and influence, and deployed skills training to better prepare individuals and couples for the challenges they might face. COFI highlights how employing systems thinking and leveraging the latest in technology and data sciences can lead to significant progress in addressing our urgent marriage crisis.

The program claims 50,000 people attended four-hour “marriage and faith strengthening programs,” and further made 20 million Internet impressions “targeting those who fit a predictive model for divorce.” So, have they increased marriage and reduced divorce? I don’t know, and neither do they, but they say they do.

Funny aside: the results website today says "Communio at work: Divorce drops 24% in Jacksonville," but a few days ago the same web page said 28%. That's probably because Duval County (which is what they're referring to) just saw a SHOCKING 6% INCREASE IN DIVORCE (my phrase) in 2018 — the 10th-largest divorce rate increase among the 40 counties in Florida for which data are available (see below). But anyway, that's getting ahead of the story.

Gimme the report

The 28% result came from this report by Brad Wilcox and Spencer James, although they don’t link to it. That’s what I’ll focus on here. The report describes the many hours of ministrations, and the 20 million Internet impressions, and then gets to the heart of the matter:

We answer this question by looking at divorce and marriage trends in Duval County and three comparable counties in Florida: Hillsborough, Orange, and Escambia. Our initial data analysis suggests that the COFI effort with Live the Life and a range of religious and civic partners has had an exceptional impact on marital stability in Duval County. Since 2016, the county has witnessed a remarkable decline in divorce: from 2015 to 2017, the divorce rate fell 28 percent. As family scholars, we have rarely seen changes of this size in family trends over such a short period of time. Although it is possible that some other factor besides COFI’s intervention also helped, we think this is unlikely. In our professional opinion, given the available evidence, the efforts undertaken by COFI in Jacksonville appear to have had a marked effect on the divorce rate in Duval County.

A couple things about these very strong causal claims. First, they say nothing about how the "comparable counties" were selected. Florida has 67 counties, 40 of which the Census gave me population counts for. Why not use them all? (You'll understand why I ask when we get to the N=4 regression.) Second, how about that "exceptional impact," the "remarkable decline" "rarely seen" in their experience as family scholars? Note there is no evidence in the report of the program doing anything, just the three-year trend. And while it is a big decline, it's one I would call "occasionally seen." (It helps to know that divorce is generally going down — something the report never mentions.)

To put the decline in perspective, first a quick national look. In 2009 there was a big drop in divorce, accelerating the ongoing decline, presumably related to the recession (analyzed here). It was so big that nine states had crude divorce rate declines of 20% or more in that one year alone. Here is what 2008-2009 looked like:


So, a drop in divorce on this scale is not that rare in recent times. This is important background Wilcox is (comfortably) counting on his audience not knowing. So what about Florida?

Wilcox and James start with this figure, which shows the number of divorces per 1,000 population in Duval County (Jacksonville) and the three other counties:

Again, there is no reason given for selecting these three counties. To test the comparison, which evidently shows a faster decline in Duval, they perform two regression models. (To their credit, James shared their data with me when I requested it — although it’s all publicly available this was helpful to make sure I was doing it the same way they did.) First, I believe they ran a regression with an N of 4, the dependent variable being the 2014-2017 decline in divorce rate, and the independent variable being a dummy for Duval. I share the complete dataset for this model here:

div_chg duval
1. -1.116101 1
2. -0.2544951 0
3. -0.3307687 0
4. -0.5048307 0

I don't know exactly what they did with the second model, which must somehow have a larger sample than 4 because it has 8 variables. Maybe 16 county-years? Anyway, it doesn't much matter. Here is their table:


How to evaluate a faster decline among a general trend toward lower divorce rates? If you really wanted to know if the program worked, you would have to study the program, people who were in the program and people who weren’t and so on. (See this writeup of previous marriage promotion disasters, studied correctly, for a good example.) But I’m quite confident that this conclusion is ridiculous and irresponsible: “In our professional opinion, given the available evidence, the efforts undertaken by COFI in Jacksonville appear to have had a marked effect on the divorce rate in Duval County.” No one should take such a claim seriously except as a reflection on the judgment or motivations of its author.

Because the choice of "comparison counties" was bugging me, I got the divorce counts from Florida's Vital Statistics office (available here), and combined them with Census data on county populations (table S0101 on data.census.gov). Since the 2018 data have now come out, I'm showing the change in each county's crude divorce rate from 2015, before Communio, through 2018.


You can see that Duval has had a bigger drop in divorce than most Florida counties — 32 of which saw divorce rates fall in this period. Of the counties that had bigger declines, Monroe and Santa Rosa are quite small, but Lake County is mid-sized (population 350,000), and bigger than Escambia, which is one of the comparison counties. How different their report could have been with different comparison cases! This is why it’s a good idea to publicly specify your research design before you collect your data, so people don’t suspect you of data shenanigans like goosing your comparison cases.

What about that 2018 rebound? Wilcox and James stopped in 2017. With the 2018 data we can look further. Eighteen counties had increased divorce rates in 2018, and Duval's increase was large, at 6%. Two of the comparison cases (Hillsborough and Escambia) had decreases in divorce, as did the state's largest county, Miami-Dade (down 5%).

To summarize, Duval County had a larger than average decline in divorce rates in 2014-2017, compared with the rest of Florida, but then had a larger-than-average increase in 2018. That’s it.

Marriage

Obviously, Communio wants to see more marriage, too, but here not even Wilcox can turn the marriage frown upside down.


Why no boom in marriage, with all those Internet hits and church sessions? They reason:

This may be because the COFI effort did not do much to directly promote marriage per se (it focused on strengthening existing marriages and relationships), or it may be because the effort ended up encouraging Jacksonville residents considering marriage to proceed more carefully. One other possibility may also help explain the distinctive pattern for Duval County. Hurricane Irma struck Jacksonville in September of 2017; this weather event may have encouraged couples to postpone or relocate their weddings.

OK, got it — so they totally could have increased marriage if they had wanted to. Except for the hurricane. I can't believe I did this, but I did wonder about the hurricane hypothesis. Here is the number of marriages per month in Duval County, from 13 months before Hurricane Irma (September 2017) to 13 months after, with Septembers highlighted.


There were fewer marriages in September 2017 than in September 2016 (51 fewer), but September is a slow month anyway. And they almost made up for it with a jump in December, which could reflect hurricane-related postponements. But then the following September was no better, so this hypothesis doesn't look good. (Sheesh, how much did they get paid to do this report? I'm not holding back any of the analysis here.)

Aside: Kristen & Jessica had a beautiful wedding in Jacksonville just a few days after Hurricane Irma. Jessica recalled, “Hurricane Irma hit the week before our wedding, which damaged our venue pretty badly. As it was outdoors on the water, there were trees down all over the place and flooding… We were very lucky that everything was cleaned up so fast. The weather the day of the wedding turned out to be perfect!” I just had to share this picture, for the Communio scrapbook:

Photo by Jazi Davis in JaxMagBride.

So, to recap: Christian philanthropists and intrepid social scientists have pretty much reversed social disintegration and the media is just desperate to keep you from finding out about it.

Also, Brad Wilcox lies, cheats, and steals. And the people who believe in him, and hire him to carry their social science water, don’t care.

Do rich people like bad data tweets about poor people? (Bins, slopes, and graphs edition)

Almost 2,000 people retweeted this from Brad Wilcox the other day.


Brad shared the graph from Charles Lehman (who noticed later that he had mislabeled the x-axis, but that’s not the point). First, as far as I can tell the values are wrong. I don’t know how they did it, but when I look at the 2016-2018 General Social Survey, I get 4.3 average hours of TV for people in the poorest families, and 1.9 hours for the richest. They report higher highs (looks like 5.3) and lower lows (looks like 1.5). More seriously, I have to object to drawing what purports to be a regression line as if those are evenly-spaced income categories, which makes it look much more linear than it is.

I fixed those errors — the correct values, and the correct spacing on the x-axis — then added some confidence intervals, and what I get is probably not worth thousands of self-congratulatory woots, although of course rich people do watch less TV. Here is my figure, with their line (drawn in by hand) for comparison:


Charles and Brad’s post got a lot of love from conservatives, I believe, because it confirmed their assumptions about self-destructive behavior among poor people. That is, here is more evidence that poor people have bad habits and it’s just dragging them down. But there are reasons this particular graph worked so well. First, the steep slope, which partly results from getting the data wrong. And second, the tight fit of the regression line. That’s why Brad said, “Whoa.” So, good tweet — bad science. (Surprise.) Here are some critiques.

First, this is the wrong survey to use. Since 1975, GSS has been asking people, "On the average day, about how many hours do you personally watch television?" It's great to have a continuous series on this, but it's not a good way to measure time use, because people are bad at estimating these things. Also, GSS is not a great survey for measuring income. And it's a pretty small sample. So if those are the two variables you're interested in, you should use the American Time Use Survey (available from IPUMS), in which respondents are drawn from the much larger Current Population Survey samples and asked to fill out a time diary. On the other hand, GSS would be good for analyzing, for example, whether people who believe the Bible is "the actual word of God and is to be taken literally, word for word" watch more TV than those who believe it is "an ancient book of fables, legends, history, and moral precepts recorded by men." (Yes, they do, by about an hour.) Or for looking at all the other social variables GSS is good for.

On the substantive issue, Gray Kimbrough pointed out that the connection between family income and TV time may be spurious, and is certainly confounded with hours spent at work. When I made a simple regression model of TV time with family income, hours worked, age, sex, race/ethnicity, education, and marital status (which again, should be done better with ATUS), I did find that both hours worked and family income had big effects. Here they are from that model, as predicted values using average marginal effects.


The banal observation that people who spend more time working spend less time watching TV probably wouldn’t carry the punch. Anyway, neither resolves the question of cause and effect.

Fits and slopes

On the issue of the presentation of slopes, there's a good lesson here. Data presentation involves trading detail for clarity. And statistics have both a descriptive and an analytical purpose. Sometimes we use statistics to present information in simplified form, which allows better comprehension. We also use statistics to discover relationships we couldn't otherwise see — such as multivariate relationships that you can't discern visually. The analyst and communicator has to choose wisely what to present. A good propagandist knows what to manipulate for political effect (a bad one just tweets out crap until they get lucky).

Here’s a much less click-worthy presentation of the relationship between family income and TV time. Here I truncate the y-axis at 12 hours (cutting off 1% of the sample), translate the binned income categories into dollar values at the middle of each category, and then jitter the scatterplot so you can see how many points are piled up in each spot. The fitted line is Stata’s median spline, with 9 bands specified (so it’s the median hours at the median income in 9 locations on the x-axis). I guess this means that, at the median, rich people in America watch about an hour of TV per day less than poor people, and the action is mostly under $50,000 per year. Woot.


Finally, a word about binning and the presentation of data (something I’ve written about before, here and here). We make continuous data into categories all the time, starting from measurement. We usually measure age in years, for example, although we could measure it in seconds or decades. Then we use statistics to simplify information further, for example by reporting averages. In the visual presentation of data, there is a particular problem with using averages or data bins to show relationships — you can show slopes that way nicely, but you run the risk of making relationships look more closely correlated than they are. This happens in the public presentation of data when analysts are showing something of their work product — such as a scatterplot with a fitted line — to demonstrate the veracity of their findings. When they bin the data first, this can be very misleading.

Here's an example. I took about 1,000 men from the GSS and compared their age and income. Between the ages of 25 and 59, older men have higher average incomes, but the fit is curved, with a peak around 45. Here is the relationship, again using jittering to show all the individuals, with a linear regression line. The correlation is .23.

That might be nice to look at, but it's hard to see the underlying relationship. It's hard to even see how the fitted line relates to the data. So you might reduce it by showing the average income at each age. By pulling the points together vertically into average bins, this shows the relationship much more clearly. However, it also makes the relationship look much stronger. The correlation in this figure is .65. Now the reader might think, "Whoa."

Note this didn't change the slope much (it still runs from about $30k to $60k); it just put all the dots closer to the line. Finally, here it is pulling the averages together in horizontal bins, grouping the ages in fives (25-29, 30-34 … 55-59). The correlation shown here is .97.


If you’re like me, this is when you figured out that reducing this to two dots would produce a correlation of 1.0 (as long as the dots aren’t exactly level).
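Here is a small simulation, not the GSS data, that reproduces the pattern: binning barely moves the slope, but it pulls the points onto the line and drives the correlation toward 1.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated individuals: a weak positive age-income relationship plus noise.
age = rng.integers(25, 60, size=1000)
income = 30_000 + 800 * (age - 25) + rng.normal(0, 25_000, size=1000)

def r(x, y):
    return round(np.corrcoef(x, y)[0, 1], 2)

print(r(age, income))                       # individual-level: weak

ages = np.unique(age)                       # vertical bins: mean income by year of age
means = np.array([income[age == a].mean() for a in ages])
print(r(ages, means))                       # much stronger

groups = (ages - 25) // 5                   # horizontal bins too: five-year age groups
g_age = np.array([ages[groups == g].mean() for g in np.unique(groups)])
g_inc = np.array([means[groups == g].mean() for g in np.unique(groups)])
print(r(g_age, g_inc))                      # nearly 1
```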

To make good data presentation tradeoffs requires experimentation and careful exposition. And, of course, transparency. My code for this post is available on the Open Science Framework here (you gotta get the GSS data first).

Decadally-biased marriage recall in the American Community Survey

Do people forget when they got married?

In demography, there is a well-known phenomenon called age heaping, in which people round off their ages, or misremember them, and report numbers ending in 0 or 5. We have a measure, known as Whipple's index, that estimates the extent to which this is occurring in a given dataset. To calculate it, you take five times the number of people reporting ages ending in 0 or 5 (25, 30 … 60) and compare it with the total number of people reporting ages 23 to 62 (inclusive); because that range contains five times as many total ages as ages ending in 0 or 5, the ratio (times 100) comes out around 100 when there is no heaping.

If the index is less than 105, the data are "highly accurate" by the United Nations standard; 105 to 110 is "fairly accurate"; and in the range 110 to 125 age data should be considered "approximate."
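Here is a minimal sketch of that index (the function name is my own); with no heaping it sits at about 100.

```python
def whipple(ages):
    """Whipple's index: 5 x reported ages ending in 0 or 5 (within 23-62),
    divided by all reported ages 23-62, times 100."""
    in_range = [a for a in ages if 23 <= a <= 62]
    heaped = [a for a in in_range if a % 5 == 0]
    return 500 * len(heaped) / len(in_range)

print(whipple(list(range(23, 63))))                           # 100.0: no heaping
print(round(whipple(list(range(23, 63)) + [30, 40, 50]), 1))  # extra 0-reports push it up
```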

I previously showed that the American Community Survey’s (ACS) public use file has a Whipple index of 104, which is not so good for a major government survey in a rich country. The heaping in ACS apparently came from people who didn’t respond to email or mail questionnaires and had to be interviewed by Census Bureau staff by phone or in person. I’m not sure what you can do about that.

What about marriage?

The ACS has great data on marriage and marital events, which I have used to analyze divorce trends, among other things. Key to the analysis of divorce patterns is the question, "When was this person last married?" (YRMARR). Recorded as a calendar year, this allows the analyst to take into account the duration of marriage preceding divorce or widowhood, the birth of children, and so on. It's very important and useful information.

Unfortunately, it may also have an accuracy problem.

I used the ACS public use files made available by IPUMS.org, combining all years 2008-2017, the years in which the variable YRMARR is included. The figure shows the number of people reported to have last married in each year from 1936 to 2015. The decadal years are highlighted in black. (The dropoff at the end is because the most recent marriage years are only captured by the last few surveys.)


Yikes! That looks like some decadal marriage year heaping. Note I didn’t highlight the years ending in 5, because those didn’t seem to be heaped upon.

To describe this phenomenon, I hereby invent the Decadally-Biased Marriage Recall index, or DBMR. This is 10-times the number of people married in years ending in 0, divided by the number of people married in all years (starting with a 6-year and ending with a 5-year). The ratio is multiplied by 100 to make it comparable to the Whipple index.

The DBMR for this figure (years 1936-2015) is 110.8. So there are 1.108-times as many people in those decadal years as you would expect from a continuous year function.
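Spelled out as code, with my own function name and a toy example rather than the actual ACS counts, the index looks like this:

```python
def dbmr(marriages_by_year, start, end):
    """Decadally-Biased Marriage Recall index: 10 x marriages reported in years
    ending in 0, divided by all marriages in a span running from a ...6 year
    through a ...5 year, times 100. With no heaping the index is about 100."""
    assert start % 10 == 6 and end % 10 == 5, "span must run from a 6-year to a 5-year"
    years = range(start, end + 1)
    total = sum(marriages_by_year.get(y, 0) for y in years)
    decadal = sum(marriages_by_year.get(y, 0) for y in years if y % 10 == 0)
    return 1000 * decadal / total

# Toy data: 100 marriages every year, plus 10 extra reported in each decadal year.
counts = {y: 100 for y in range(1936, 2016)}
for y in range(1940, 2011, 10):
    counts[y] += 10
print(round(dbmr(counts, 1936, 2015), 1))   # about 109
```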

Maybe people really do get married more in decadal years. I was surprised to see a large heap at 2000, which is recent enough that you might expect good recall for those weddings. Maybe people got married that year because of the millennium hoopla. When you end the series at 1995, however, the DBMR is still 110.6. So maybe some people who would have gotten married at the end of 1999 waited until New Year's Day, or rushed to marry on New Year's Eve 2000, but that's not the issue.

Maybe this has to do with who is answering the survey. Do you know what year your parents got married? If you answered the survey for your household, and someone else lives with you, you might round off. This is worth pursuing. I restricted the sample to just those who were householders (the person in whose name the home is owned or rented), and still got a DBMR of 110.7. But that might not be the best test.

Another possibility is that people who started living together before they were married — which is most Americans these days — don’t answer YRMARR with their legal marriage date, but some rounded-off cohabitation date. I don’t know how to test that.

Anyway, something to think about.

Theology majors marry each other a lot, but business majors don’t (and other tales of BAs and marriage)

The American Community Survey collects data on the college majors of people who’ve graduated college. This excellent data has lots of untapped potential for family research, because it tells us something about people’s character and experience that we don’t have from any other variables in this massive annual dataset. (It even asks about a second major, but I’m not getting into that.)

To illustrate this, I did two data exercises that combine college major with marital events, in this case marriage. Looking at people who just married in the previous year, and college major, I ask: Which majors are most and least likely to marry each other, and which majors are most likely to marry people who aren’t college graduates?

I combined eight years of the ACS (2009-2016), which gave me a sample of 27,806 college graduates who got married in the year before they were surveyed (to someone of the other sex). Then I cross-tabbed the major of wife and major of husband, and produced a table of frequencies. To see how majors marry each other, I calculated a ratio of observed to expected frequencies in each cell on the table.

Example: With weights (rounding here), there were a total of 2,737,000 BA-BA marriages. I got 168,000 business majors marrying each other, out of 614,000 male and 462,000 female business majors marrying altogether. So I figured the expected number of business-business pairs was the proportion of all marrying men who were business majors (.22) times the number of women who were business majors (461,904), for an expected count of 103,677 pairs. Because there were 168,163 business-business pairs, the ratio is 1.6. (When I got the same answer flipping the genders, I figured it was probably right, but if you've got a different or better way of doing it, I wouldn't be surprised!)
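Written out, the calculation for that one cell looks like this (using the rounded weighted counts from the text):

```python
total_pairs = 2_737_000   # all BA-BA marriages (weighted)
biz_men = 614_000         # marrying male business majors
biz_women = 462_000       # marrying female business majors
observed = 168_000        # business-business pairs

# Expected pairs if business-major men matched with business-major women at random
expected = (biz_men / total_pairs) * biz_women
print(round(observed / expected, 2))   # about 1.6
```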

It turns out business majors, which are the most numerous of all majors (sigh), have the lowest tendency to marry each other of any major pair. The most homophilous major is theology, where the ratio is a whopping 31. (You have to watch out for the very small cells though; I didn’t calculate confidence intervals.) You can compare them with the rest of the pairs along the diagonal in this heat map (generated with conditional formatting in Excel):


Of course, not all people with college degrees marry others with college degrees. In the old days it was more common for a man with higher education to marry a woman without than the reverse. Now that more women have BAs, I find in this sample that 35% of the women with BAs married men without BAs, compared to just 22% of BA-wielding men who married “down.” But the rates of down-marriage vary a lot depending on what kind of BA people have. So I made the next figure, which shows the proportion of male and female BAs, by major, marrying people without BAs (with markers scaled to the size of each major). At the extreme, almost 60% of the female criminal justice majors who married ended up with a man without a BA (quite a bit higher than the proportion of male crim majors who did the same). On the other hand, engineering had the lowest overall rate of down-marriage. Is that a good thing about engineering? Something people should look at!


We could do a lot with this, right? If you’re interested in this data, and the code I used, I put up data and Stata code zips for each of these analyses (including the spreadsheet): BA matching, BA’s down-marrying. Free to use!