Tag Archives: methods

Do rich people like bad data tweets about poor people? (Bins, slopes, and graphs edition)

Almost 2,000 people retweeted this from Brad Wilcox the other day.

bradpoorstv

Brad shared the graph from Charles Lehman (who noticed later that he had mislabeled the x-axis, but that’s not the point). First, as far as I can tell the values are wrong. I don’t know how they did it, but when I look at the 2016-2018 General Social Survey, I get 4.3 average hours of TV for people in the poorest families, and 1.9 hours for the richest. They report higher highs (looks like 5.3) and lower lows (looks like 1.5). More seriously, I have to object to drawing what purports to be a regression line as if those are evenly-spaced income categories, which makes it look much more linear than it is.

I fixed those errors — the correct values, and the correct spacing on the x-axis — then added some confidence intervals, and what I get is probably not worth thousands of self-congratulatory woots, although of course rich people do watch less TV. Here is my figure, with their line (drawn in by hand) for comparison:

tvfaminc-bradcharles

Charles and Brad’s post got a lot of love from conservatives, I believe, because it confirmed their assumptions about self-destructive behavior among poor people. That is, here is more evidence that poor people have bad habits and it’s just dragging them down. But there are reasons this particular graph worked so well. First, the steep slope, which partly results from getting the data wrong. And second, the tight fit of the regression line. That’s why Brad said, “Whoa.” So, good tweet — bad science. (Surprise.) Here are some critiques.

First, this is the wrong survey to use. Since 1975, GSS has been asking people, “On the average day, about how many hours do you personally watch television?” It’s great to have a continuous series on this, but it’s not a good way to measure time use because people are bad at estimating these things. Also, GSS is not a great survey for measuring income. And it’s a pretty small sample. So if those are the two variables you’re interested in, you should use the American Time Use Survey (available from IPUMS), in which respondents are drawn from the much larger Current Population Survey samples, and asked to fill out a time diary. On the other hand, GSS would be good for analyzing, for example, whether people who believe the Bible is “the actual word of God and is to be taken literally, word for word” watch TV more than those who believe it is “an ancient book of fables, legends, history, and moral precepts recorded by men.” (Yes, they do, about an hour more.) Or looking at all the other social variables GSS is good for.

On the substantive issue, Gray Kimbrough pointed out that the connection between family income and TV time may be spurious, and is certainly confounded with hours spent at work. When I made a simple regression model of TV time with family income, hours worked, age, sex, race/ethnicity, education, and marital status (which again, should be done better with ATUS), I did find that both hours worked and family income had big effects. Here they are from that model, as predicted values using average marginal effects.

tv work faminc

The banal observation that people who spend more time working spend less time watching TV probably wouldn’t carry the punch. Anyway, neither resolves the question of cause and effect.

Fits and slopes

On the issue of the presentation of slopes, there’s a good lesson here. Data presentation involves trading detail for clarity. And statistics have both a descriptive and an analytical purpose. Sometimes we use statistics to present information in simplified form, which allows better comprehension. We also use statistics to discover relationships we couldn’t otherwise, such as multivariate relationships that you can’t discern visually. The analyst and communicator has to choose wisely what to present. A good propagandist knows what to manipulate for political effect (a bad one just tweets out crap until they get lucky).

Here’s a much less click-worthy presentation of the relationship between family income and TV time. Here I truncate the y-axis at 12 hours (cutting off 1% of the sample), translate the binned income categories into dollar values at the middle of each category, and then jitter the scatterplot so you can see how many points are piled up in each spot. The fitted line is Stata’s median spline, with 9 bands specified (so it’s the median hours at the median income in 9 locations on the x-axis). I guess this means that, at the median, rich people in America watch about an hour of TV per day less than poor people, and the action is mostly under $50,000 per year. Woot.

gss tv income

Finally, a word about binning and the presentation of data (something I’ve written about before, here and here). We make continuous data into categories all the time, starting from measurement. We usually measure age in years, for example, although we could measure it in seconds or decades. Then we use statistics to simplify information further, for example by reporting averages. In the visual presentation of data, there is a particular problem with using averages or data bins to show relationships — you can show slopes that way nicely, but you run the risk of making relationships look more closely correlated than they are. This happens in the public presentation of data when analysts are showing something of their work product — such as a scatterplot with a fitted line — to demonstrate the veracity of their findings. When they bin the data first, this can be very misleading.

Here’s an example. I took about 1000 men from the GSS, and compared their age and income. Between the ages of 25 and 59, older men have higher average incomes, but the fit is curved with a peak around 45. Here is the relationship, again using jittering to show all the individuals, with a linear regression line. The correlation is .23.

c1

That might be nice to look at but it’s hard to see the underlying relationship. It’s hard to even see how the fitted line relates to the data. So you might reduce it by showing the average income at each age. By pulling the points together vertically into average bins, this shows the relationship much more clearly. However, it also makes the relationship look much stronger. The correlation in this figure is .65. Now the reader might think, “Whoa.”

c2

Note this didn’t change the slope much (it still runs from about $30k to $60k), it just put all the dots closer to the line. Finally, here it is pulling the averages together in horizontal bins, grouping the ages in fives (25-29, 30-34 … 55-59). The correlation shown here is .97.

c3

If you’re like me, this is when you figured out that reducing this to two dots would produce a correlation of 1.0 (as long as the dots aren’t exactly level).
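As a rough illustration of the binning effect described above, here is a minimal Python sketch. It uses synthetic data (not the GSS sample, and not the Stata code behind the figures): 1,000 made-up individuals with a noisy linear age-income relationship, where the slope and noise level are assumptions chosen only to give a weak individual-level correlation. The helper names `pearson` and `bin_means` are hypothetical.

```python
import random
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def bin_means(xs, ys, key):
    """Average x and y within the groups defined by key(x)."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[key(x)].append((x, y))
    mean_xs, mean_ys = [], []
    for k in sorted(groups):
        points = groups[k]
        mean_xs.append(sum(p[0] for p in points) / len(points))
        mean_ys.append(sum(p[1] for p in points) / len(points))
    return mean_xs, mean_ys


# Synthetic data (an assumption, not GSS): ages 25-59, noisy linear income.
rng = random.Random(0)
ages = [rng.randint(25, 59) for _ in range(1000)]
incomes = [30000 + 900 * (a - 25) + rng.gauss(0, 38000) for a in ages]

# Same data, four levels of aggregation.
r_individuals = pearson(ages, incomes)
r_by_age = pearson(*bin_means(ages, incomes, key=lambda a: a))
r_by_five = pearson(*bin_means(ages, incomes, key=lambda a: (a - 25) // 5))
r_two_dots = pearson(*bin_means(ages, incomes, key=lambda a: a >= 42))

print(round(r_individuals, 2), round(r_by_age, 2),
      round(r_by_five, 2), round(r_two_dots, 2))
```

The underlying slope is the same in every version; only the scatter around the line shrinks as more individuals are averaged into each dot, and the two-dot version is a perfect correlation by construction.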

To make good data presentation tradeoffs requires experimentation and careful exposition. And, of course, transparency. My code for this post is available on the Open Science Framework here (you gotta get the GSS data first).

Filed under In the news

Decadally-biased marriage recall in the American Community Survey

Do people forget when they got married?

In demography, there is a well-known phenomenon known as age-heaping, in which people round off their ages, or misremember them, and report them as numbers ending in 0 or 5. We have a measure, known as Whipple’s index, that estimates the extent to which this is occurring in a given dataset. To calculate this you take the number of people between ages 23 and 62 (inclusive), and compare it to five-times the number of those whose ages end in 0 or 5 (25, 30 … 60), so there are five-times as many total years as 0 and 5 years.

If the ratio of 0/5s to the total is less than 105, that’s “highly accurate” by the United Nations standard, a ratio 105 to 110 is “fairly accurate,” and in the range 110 to 125 age data should be considered “approximate.”
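The Whipple calculation described above can be sketched in a few lines of Python. This is a minimal illustration with made-up age counts, not ACS data; the function name `whipple_index` and the input format (a dict of single-year age counts) are assumptions.

```python
def whipple_index(age_counts):
    """Whipple's index from a dict mapping single year of age -> count.

    Compares five times the count at ages ending in 0 or 5 (25, 30 ... 60)
    with the count at all ages 23 through 62, scaled so 100 means no heaping.
    """
    total = sum(age_counts.get(age, 0) for age in range(23, 63))
    on_fives = sum(age_counts.get(age, 0) for age in range(25, 61, 5))
    return 100 * 5 * on_fives / total


# Hypothetical data: a flat age distribution, then one with 10% extra
# people piled onto each age ending in 0 or 5.
uniform_ages = {age: 1000 for age in range(0, 100)}
heaped_ages = dict(uniform_ages)
for age in range(25, 61, 5):
    heaped_ages[age] += 100

print(whipple_index(uniform_ages))
print(round(whipple_index(heaped_ages), 1))
```

With a flat distribution the index is exactly 100, since ages ending in 0 or 5 are one-fifth of the 40 ages in the window.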

I previously showed that the American Community Survey’s (ACS) public use file has a Whipple index of 104, which is not so good for a major government survey in a rich country. The heaping in ACS apparently came from people who didn’t respond to email or mail questionnaires and had to be interviewed by Census Bureau staff by phone or in person. I’m not sure what you can do about that.

What about marriage?

The ACS has great data on marriage and marital events, which I have used to analyze divorce trends, among other things. Key to the analysis of divorce patterns is the question, “When was this person last married?” (YRMARR) Recorded as a year date, this allows the analyst to take into account the duration of marriage preceding divorce or widowhood, the birth of children, and so on. It’s very important and useful information.

Unfortunately, it may also have an accuracy problem.

I used the ACS public use files made available by IPUMS.org, combining all years 2008-2017, the years they have included the variable YRMARR. The figure shows the number of people reported to have last married in each year from 1936 to 2015. The decadal years are highlighted in black. (The dropoff at the end is because I included surveys earlier than those years.)

year married in 2016.xlsx

Yikes! That looks like some decadal marriage year heaping. Note I didn’t highlight the years ending in 5, because those didn’t seem to be heaped upon.

To describe this phenomenon, I hereby invent the Decadally-Biased Marriage Recall index, or DBMR. This is 10-times the number of people married in years ending in 0, divided by the number of people married in all years (starting with a 6-year and ending with a 5-year). The ratio is multiplied by 100 to make it comparable to the Whipple index.

The DBMR for this figure (years 1936-2015) is 110.8. So there are 1.108-times as many people in those decadal years as you would expect from a continuous year function.
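For anyone who wants to try the DBMR on their own counts, here is a minimal Python sketch of the definition above. The counts are invented for illustration (they are not the ACS tabulation), and the function name `dbmr` and its arguments are assumptions.

```python
def dbmr(year_counts, first_year, last_year):
    """Decadally-Biased Marriage Recall index.

    year_counts maps year of marriage -> number of people. The window must
    start in a year ending in 6 and end in a year ending in 5, so that
    exactly one year in ten ends in 0.
    """
    if first_year % 10 != 6 or last_year % 10 != 5:
        raise ValueError("window should run from a 6-year to a 5-year")
    years = range(first_year, last_year + 1)
    total = sum(year_counts.get(y, 0) for y in years)
    decadal = sum(year_counts.get(y, 0) for y in years if y % 10 == 0)
    return 100 * 10 * decadal / total


# Hypothetical counts: flat, then with 12% extra reports in decadal years.
flat = {year: 1000 for year in range(1936, 2016)}
heaped = {year: (1120 if year % 10 == 0 else 1000) for year in range(1936, 2016)}

print(dbmr(flat, 1936, 2015))
print(round(dbmr(heaped, 1936, 2015), 1))
```

A value of 100 means decadal years hold exactly their one-in-ten share; anything above that is heaping (or a real preference for marrying in those years).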

Maybe people really do get married more in decadal years. I was surprised to see a large heap at 2000, which is very recent so you might think there was good recall for those weddings. Maybe people got married that year because of the millennium hoopla. When you end the series at 1995, however, the DBMR is still 110.6. So maybe some people who would have gotten married at the end of 1999 waited till New Year’s Day or something, or rushed to marry on New Year’s Eve 2000, but that’s not the issue.

Maybe this has to do with who is answering the survey. Do you know what year your parents got married? If you answered the survey for your household, and someone else lives with you, you might round off. This is worth pursuing. I restricted the sample to just those who were householders (the person in whose name the home is owned or rented), and still got a DBMR of 110.7. But that might not be the best test.

Another possibility is that people who started living together before they were married — which is most Americans these days — don’t answer YRMARR with their legal marriage date, but some rounded-off cohabitation date. I don’t know how to test that.

Anyway, something to think about.

Filed under Research reports

Theology majors marry each other a lot, but business majors don’t (and other tales of BAs and marriage)

The American Community Survey collects data on the college majors of people who’ve graduated college. This excellent data has lots of untapped potential for family research, because it tells us something about people’s character and experience that we don’t have from any other variables in this massive annual dataset. (It even asks about a second major, but I’m not getting into that.)

To illustrate this, I did two data exercises that combine college major with marital events, in this case marriage. Looking at people who just married in the previous year, and college major, I ask: Which majors are most and least likely to marry each other, and which majors are most likely to marry people who aren’t college graduates?

I combined eight years of the ACS (2009-2016), which gave me a sample of 27,806 college graduates who got married in the year before they were surveyed (to someone of the other sex). Then I cross-tabbed the major of wife and major of husband, and produced a table of frequencies. To see how majors marry each other, I calculated a ratio of observed to expected frequencies in each cell on the table.

Example: With weights (rounding here), there were a total of 2,737,000 BA-BA marriages. I got 168,000 business majors marrying each other, out of 614,000 male and 462,000 female business majors marrying altogether. So I figured the expected number of business-business pairs was the proportion of all marrying men that were business majors (.22) times the number of women that were business majors (461,904), for an expected number of 103,677 pairs. Because there were 168,163 business-business pairs, the ratio is 1.6. (When I got the same answer flipping the genders, I figured it was probably right, but if you’ve got a different or better way of doing it, I wouldn’t be surprised!)
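The arithmetic in the business-major example can be checked with a short Python sketch. It uses the figures quoted above; because the total and the count of male business majors are rounded in the text, the expected count it produces is close to, but not exactly, the 103,677 reported. The function name is hypothetical.

```python
def observed_to_expected(observed_pairs, husbands_in_major,
                         wives_in_major, total_marriages):
    """Ratio of observed same-major pairs to the number expected if
    husbands' and wives' majors were independent."""
    expected = husbands_in_major / total_marriages * wives_in_major
    return observed_pairs / expected, expected


# Figures from the text (the total and husbands' count are rounded there).
ratio, expected = observed_to_expected(
    observed_pairs=168_163,
    husbands_in_major=614_000,
    wives_in_major=461_904,
    total_marriages=2_737_000,
)

# Flipping the genders: share of wives in the major times number of husbands.
expected_flipped = 461_904 / 2_737_000 * 614_000

print(round(ratio, 1), round(expected), round(expected_flipped))
```

Flipping the genders has to give the same expected count, because both versions reduce to (husbands × wives) / total.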

It turns out business majors, which are the most numerous of all majors (sigh), have the lowest tendency to marry each other of any major pair. The most homophilous major is theology, where the ratio is a whopping 31. (You have to watch out for the very small cells though; I didn’t calculate confidence intervals.) You can compare them with the rest of the pairs along the diagonal in this heat map (generated with conditional formatting in Excel):

spouse major matching

Of course, not all people with college degrees marry others with college degrees. In the old days it was more common for a man with higher education to marry a woman without than the reverse. Now that more women have BAs, I find in this sample that 35% of the women with BAs married men without BAs, compared to just 22% of BA-wielding men who married “down.” But the rates of down-marriage vary a lot depending on what kind of BA people have. So I made the next figure, which shows the proportion of male and female BAs, by major, marrying people without BAs (with markers scaled to the size of each major). At the extreme, almost 60% of the female criminal justice majors who married ended up with a man without a BA (quite a bit higher than the proportion of male crim majors who did the same). On the other hand, engineering had the lowest overall rate of down-marriage. Is that a good thing about engineering? Something people should look at!

spouse matching which BAs marry down

We could do a lot with this, right? If you’re interested in this data, and the code I used, I put up data and Stata code zips for each of these analyses (including the spreadsheet): BA matching, BA’s down-marrying. Free to use!

Filed under Research reports

No, early marriage is not more common for college graduates

Update: IFS has taken down the report I critiqued here, and put up a revised report. They have added an editor’s note, which doesn’t mention me or link to this post:

Editor’s Note: This post is an update of a post published on March 14, 2018. The original post looked at marriage trends by education among all adults under age 25. It gave the misimpression that college graduates were more likely to be married young nowadays, compared to non-college graduates.


At the Institute for Family Studies, Director of Research Wendy Wang has a post up with the provocative title, “Early Marriage is Now More Common For College Graduates” (linking to the Internet Archive version).

She opens with this:

Getting married at a young age used to be more common among adults who didn’t go to college. But the pattern has reversed in the past decade or so. In 2016, 9.4% of college graduates ages 18 to 24 have ever been married, which is higher than the share among their peers without a college degree (7.9%), according to my analysis of the most recent Census data.

And then the dramatic conclusion:

“What this finding shows is that even at a young age, college-educated adults today are more likely than their peers without a college degree to be married. And this is new.”

That would be new, and surprising, if it were true, but it’s not.

Here’s the figure that supports the conclusion:

figure1wendyupdate-w640

It shows that 9.4% of college graduates in the age range 18-24 have been married, compared with 7.9% of those who did not graduate from college. (The drop has been faster for non-graduates, but I’m setting aside the time trend for now.) Honestly, I guess you could say, based on this, that young college graduates are more likely than non-graduates to “be married,” but not really.

The problem is there are very, very few college graduates in the ages 18-19. The American Community Survey, which they used here, reports only about 12,000 in the whole country, compared with 8.7 million people without college degrees ages 18-19 (this is based on the public use files from IPUMS.org, which is what I use in the analysis below). Wow! There are lots and lots of non-college graduates below age 20 (including almost everyone who will one day be a college graduate!), and very few of them are married. So it looks like the marriage rate is low for the group 18-24 overall. Here is the breakdown by age and marital status for the two groups: less than BA education, and BA or higher education, on the same population scale, to help illustrate the point:

ifs1

ifs2

If you pool all the years together, you get a higher marriage rate for the college graduates, mostly because there are so few college graduates in the younger ages when hardly anyone is married.
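To see how pooling can flip the comparison, here is a toy Python example. Every number in it is hypothetical (these are not ACS counts or rates); they are chosen so that non-graduates have the higher ever-married rate at every single age, while graduates are concentrated at the older ages.

```python
# Hypothetical populations by age (NOT ACS data).
non_grad_pop = {age: 4_000_000 for age in range(18, 25)}
grad_pop = {18: 5_000, 19: 7_000, 20: 30_000, 21: 150_000,
            22: 600_000, 23: 900_000, 24: 1_000_000}

# Hypothetical ever-married rates by age: non-graduates higher at every age.
non_grad_rate = {18: 0.01, 19: 0.02, 20: 0.04, 21: 0.07,
                 22: 0.11, 23: 0.15, 24: 0.20}
grad_rate = {18: 0.005, 19: 0.01, 20: 0.02, 21: 0.04,
             22: 0.06, 23: 0.09, 24: 0.13}


def pooled_rate(pop, rate):
    """Ever-married share for the whole 18-24 group, weighting by population."""
    married = sum(pop[age] * rate[age] for age in pop)
    return married / sum(pop.values())


pooled_non_grad = pooled_rate(non_grad_pop, non_grad_rate)
pooled_grad = pooled_rate(grad_pop, grad_rate)

print(round(100 * pooled_non_grad, 1), round(100 * pooled_grad, 1))
```

In this made-up case the pooled rate comes out higher for graduates even though they are less likely to have married at each age, purely because the graduate group is weighted toward ages 22-24.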

To show the whole thing in terms of marriage rates, here is the marital status for the two groups at every age from 15 (when ACS starts asking about marital status) to 54.

ifs3

Ignoring 19-21, where there are a tiny number of college graduates, you see a much more sensible pattern: college graduates delay marriage longer, but then have higher rates at older ages (starting at age 28), for all the reasons we know marriage is ultimately more common among college graduates. In fact, if you used ages 15-24 (why not?), you get an even bigger difference — with 9.4% of college graduates married and just 5.7% of non-college graduates. Why not? In fact, what about ages 0-24? It would make almost as much sense.

Another way to do this is just to look at 24-year-olds. Since we’re talking about the ever-married status, and mortality is low at these ages, this is a case where the history is implied in the cross-sectional data. At age 24, as the figure shows, 19.9% of non-college graduates have been married, compared with 12.9% of college graduates. Early marriage is not more common for college graduates.

In general, I don’t recommend comparing college graduates and non-graduates, at least in cross-sectional data, below age 25. Lots of people are finishing college below age 25 (and increasingly after that age as well). There is also an important issue of endogeneity here, which always makes education and age analysis tricky. Some people (mostly women) don’t finish college because they get married and have children.

Anyway, it looks to me like someone working for a pro-marriage organization saw what seemed like a story implying marriage is good (that’s why college graduates do it, after all), and one that also fits with the do-what-I-say-not-what-I-do criticism of liberals, who are supposedly not promoting marriage among poor people while they themselves love to get married (a critique made by Charles Murray, Brad Wilcox, and others). And, before thinking it through, they published it.

Mistakes happen. Fortunately, I dislike the Institute for Family Studies (see the whole series under this tag), and so I read it and pointed out this problem within a couple hours (first on Twitter, less than two hours after Wang tweeted it). It’s a social media post-publication peer review success story! If they correct it.

Filed under Research reports

For social relationships outside marriage

Stephanie Coontz has a great piece in tomorrow’s New York Times titled, “For a Better Marriage, Act Like a Single Person.” From her intro:

Especially around Valentine’s Day, it’s easy to find advice about sustaining a successful marriage, with suggestions for “date nights” and romantic dinners for two. But as we spend more and more of our lives outside marriage, it’s equally important to cultivate the skills of successful singlehood. And doing that doesn’t benefit just people who never marry. It can also make for more satisfying marriages.

From there she develops the case with, as usual, a lot of the right research. Well worth a read.

Stephanie used two empirical bits from my work:

No matter how much Americans may value marriage, we now spend more time living single than ever before. In 1960, Americans were married for an average of 29 of the 37 years between the ages of 18 and 55. That’s almost 80 percent of what was then regarded as the prime of life. By 2015, the average had dropped to only 18 years.

In many ways, that’s good news for marriages and married people. Contrary to some claims, marrying at an older age generally lowers the risk of divorce. It also gives people time to acquire educational and financial assets, as well as develop a broad range of skills — from cooking to household repairs to financial management — that will stand them in good stead for the rest of their lives, including when a partner is unavailable.

The first figure, the average years spent in marriage between the ages of 18 and 55, is very easy to calculate. You just sum the proportion of people married at each age. Here’s what it looks like, comparing 1960 (from the decennial Census) and 2015 (from the American Community Survey), both from IPUMS.org:

YearsMarried

I think it’s a nice, simple way to show the declining footprint of marriage in American life. (I first did this, and described the rationale, in 2010.)
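The sum-of-proportions calculation described above is short enough to sketch in Python. The proportions here are placeholders, not Census or ACS estimates, and treating the 37 years as single ages 18 through 54 is an assumption about how the window is counted.

```python
def expected_years_married(prop_married_by_age, start_age=18, end_age=55):
    """Sum the proportion married at each single year of age from start_age
    up to (but not including) end_age. Each age contributes that fraction
    of one year, so the sum is the average number of years spent married."""
    return sum(prop_married_by_age[age] for age in range(start_age, end_age))


# Placeholder schedule: half of people married at every age 18-54.
flat_half = {age: 0.5 for age in range(18, 55)}

print(expected_years_married(flat_half))
```

With half of people married at every age, the result is half of the 37-year window.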

The bit about older age at marriage being associated with lower odds of divorce is from this post. Here’s the result, showing odds of divorce in one year by age at marriage, with controls for duration, education, race/ethnicity, and nativity, for women in their first marriages:

Divorce by age at marriage

There’s more discussion in the post, as well as in this followup post, which has this cool figure, where red is the highest odds of divorce and green is the lowest, and the axes are years married and age at marriage:

Divorce By Age And Duration


My new book is out! Enduring Bonds: Inequality, Marriage, Parenting, and Everything Else That Makes Families Great and Terrible. Available all the usual places, plus here at the University of California Press, where Chapter 1 is available as a sample, and where instructors can request a review copy.

Filed under Research reports

Data analysis: Are older newlyweds saving marriage?

COS Open Data badge, COS Open Materials badge


Is the “institution” still in decline if the incidence of marriage rebounds, but only at older ages?

In my new book I’ve revisited old posts and produced this figure, which shows the refined marriage rate* from 1940 to 2015, with a discussion of possible futures:

f15

The crash scenario, showing marriage ending around 2050, is there to show where the 1950-2014 trajectory is headed (it’s also a warning against using linear extrapolation to predict the future). The rebound scenario is intended to show how unrealistic the “revive marriage culture” people are. The taper scenario emerges as the most likely alternative; in fact, it’s grown more likely since I first made the figure a few years ago, as you can see by the 2010-2014 jag.

So let’s consider the tapering scenario more substantively — what would it look like? One way to get a declining marriage rate is if marriage is increasingly delayed, even if it doesn’t become less common; people still marry, but later. (If everyone got married at age 99, we would have universal marriage and a very low refined marriage rate.) I give some evidence for this scenario here.

These trends are presented with minimal discussion; I’m not looking at race/ethnicity or social class, childbearing or the recession; I’m not discussing divorce and remarriage and cohabitation, and I’m not testing hypotheses. (This is a list of research suggestions!) To make the subject more enticing as a research topic (and for accountability), I’ve shared the Census data, Stata code, and spreadsheet file used to make this post in this OSF project. You can use anything there you want. You can also easily fork the project — that is, make a duplicate of its contents, which you then own, and take off on your own trajectory, by adding to or modifying them.

Trends

For some context, here is the trend in percentage of men and women ever married, by age, from 1960. (“Ever married” means currently married, separated, divorced, or widowed.) This clearly shows both life-course delay and lifetime decline, but delay is much more prominent, at least so far. Even now, almost 90% of people have been married by age 60 or so, while the marriage rates for people under 35 have plummeted.

evmar6016

People become ever-married when they get first-married. We measure ever-married prevalence from a survey question on current marital status, but first-marriage incidence requires a question like the American Community Survey asks, “In the past 12 months, did this person get married?” Because they also ask how many times each person has been married, you can calculate a first marriage rate with this ratio:

(once married & married in the past 12 months) / (never married + (once married & married in the past 12 months))
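The ratio above translates directly into code. This is a minimal Python sketch with made-up counts; the function and argument names are hypothetical, and in practice the counts would be weighted tabulations within each age and sex group.

```python
def first_marriage_rate(first_married_past_year, never_married):
    """First marriages in the past 12 months divided by the people at risk:
    those still never married plus those who just first-married."""
    at_risk = never_married + first_married_past_year
    return first_married_past_year / at_risk


# Hypothetical counts for one age group.
rate = first_marriage_rate(first_married_past_year=50, never_married=950)
print(rate)
```

The people who just married go in the denominator too, because they were never married (and so at risk of a first marriage) at the start of the 12-month window.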

Until recently it hasn’t been easy to measure first-marriage across all ages; now that we have the ACS marital events data (since 2008) we can. This allows us to look at the timing of first marriage, which means we can use current age-specific first-marriage rates to project lifetime ever-married rates under current conditions.

Here are the first-marriage rates for men and women, by age. Each set of bars shows the trend from 2008 to 2016. The left side shows men, by age; the right side shows women, by age; the totals for men and women are in the middle. This shows that first-marriage rates have fallen for men and women under age 35, but increased for those over age 35. The total first-marriage rate has rebounded from the 2013 crater, but is still lower than 2008.

1stmarage

This is a short-range trend, 9 years. It could be recession-specific, with people delaying marriage because of hardships, or relationships falling apart under economic stress, and then hurrying to marry a few years later. But it also fits the long-term trend of delay over decline.

The overall rates for men and women show that the 2014-2016 rebound has not brought first-marriage rates back to their 2008 level. However, what about lifetime odds of marriage? The next figure uses women’s age-specific first-marriage rates to project lifetime odds of marriage for three years: 2008, the 2013 crater, and 2016. This shows, for example, that at 2008 rates 59% of women would have married by age 30, compared with 53% in both 2013 and 2016.

1stmarproj

The 2013 and 2016 lines diverge after age 30, and by age 65 the projected lifetime ever-married rates have fully recovered. This implies that marriage has been delayed, but not forgone (or denied).
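The post does not spell out the projection method, so the following is only a sketch of one standard way to do it, under the assumption of a synthetic cohort: apply each age-specific first-marriage rate in turn to the share still never married. The rate schedules are toy numbers at a few illustrative ages, not the ACS estimates, and the function name is hypothetical.

```python
def project_ever_married(rates_by_age):
    """Cumulative share ever married by each age, applying age-specific
    first-marriage rates in sequence to those still never married."""
    never_married = 1.0
    ever_married = {}
    for age in sorted(rates_by_age):
        never_married *= 1 - rates_by_age[age]
        ever_married[age] = 1 - never_married
    return ever_married


# Toy schedules: "late" has lower rates at young ages and higher rates later.
early = project_ever_married({20: 0.4, 30: 0.4, 40: 0.2, 50: 0.1})
late = project_ever_married({20: 0.3, 30: 0.4, 40: 0.3, 50: 0.2})

for age in (20, 30, 40, 50):
    print(age, round(early[age], 3), round(late[age], 3))
```

In the toy numbers, the later schedule trails at young ages and catches up by the last age, which is the delayed-but-not-forgone pattern.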

Till now I’ve shown age and sex-specific rates, but haven’t addressed other things that might have changed in the never-married population. Finally, I estimated logistic regressions predicting first-marriage among never married men and women. The models include race, Hispanic origin, nativity, education, and age. In addition to the year and age patterns above, the models show that all races have lower rates than Whites, Hispanics have lower rates than non-Hispanics, foreign-born people have higher rates (which explains the Hispanic result), and people with more education first-marry more (code and results in the OSF project).

To see whether changes in these other variables change the story, I used the regressions to estimate first-marriage rates at the overall mean of all variables. These show a significant rebound from the bottom, but not returning to 2008 levels, quite similar to the unadjusted trends above:

1stmaradj

This is all consistent with the taper scenario described at the top. Marriage is delayed, which reduces the annual marriage rate, but with later marriage picking up much of the slack, so that the decline in lifetime marriage prevalence is modest.


* The refined marriage rate is the number of marriages as a fraction of unmarried people. This is more informative than the crude marriage rate (which the National Center for Health Statistics tracks), which is marriages as a fraction of the total population. In this post I use what I guess you would call an age-specific refined first-marriage rate, defined above.

Filed under Research reports