Tag Archives: code

COVID-19 code, data, codebooks, figures

Every day for who knows how long I’ve tinkered with COVID-19 data and made graphs using Stata. Now I’ve condensed my tools down to several elements, updated daily, which I’m sharing:

  • A program that assembles the COVID death and case data, by date, at the county, state, and country level. To this I have added some population, income, and political variables. The program is here, along with the codebook it outputs.
  • The data file is here in Stata format and CSV format. It’s in long shape, so one record for each place on each date.
  • A Stata program that makes my favorite graphs right now (currently 24 per day). The figures are stored here in PNG format.
  • The Stata scheme I use to make them look the way I like is here.
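To make "long shape" concrete, here is a minimal sketch with invented numbers. The variable names (place, date, cases) are illustrative assumptions, not necessarily the names in the shared file, and date is just a day counter. Each place-date pair is one row; reshape wide shows the alternative layout for comparison.

```stata
* Toy data, invented for illustration: two places, three days, long shape
clear
input str10 place int date int cases
"Alpha" 1 5
"Alpha" 2 8
"Alpha" 3 13
"Beta"  1 0
"Beta"  2 2
"Beta"  3 3
end

* Long shape: each place-date combination appears exactly once
isid place date
list, sepby(place)

* Same information in wide shape: one row per place, one column per day
reshape wide cases, i(place) j(date)
list
```

Long shape is usually the convenient one for by-place graphs and for collapsing, which is presumably why the file is stored that way.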

These files are linked to my laptop so they update automatically when I revise them. Yay, Open Science Framework, which is non-profit, open source, free to use, and deserves your support.

I hope someone finds these helpful, for teaching or exploring on their own. It’s all yours.

Here are a few figures from today’s runs (click to enlarge):

[Figure: counties with any cases]

[Figure: deaths and GDP scatter]


Filed under Me @ work

Equal-education and wife-more-education married couples don’t have sex less often

In my review of Mark Regnerus’s book, Cheap Sex, I wrote: “The book is an extended rant on the theme, ‘Why buy the cow when you can get the milk for free?’ wrapped in a misogynist theory about sexual exchange masquerading as economics, and motivated by the author’s misogynist religious and political views.”

Someone just reposted an old book-rehash essay of Regnerus’s called, “The Death of Eros.” In it he links to my post documenting the decline in sexual frequency among married couples in the General Social Survey. In marriage, Regnerus writes, “equality is the enemy of eros,” before selectively characterizing some research about the relationship between housework and sex. (Here’s a recent analysis finding egalitarian couples don’t have sex less.)

But I realized I never looked at sexual frequency in married couples by the relative education of the spouses, which is available in the GSS. So here’s a quick take: Married man-woman couples in which the wife has equal or more education don’t have sex less frequently.

I modeled sexual frequency (an interval scale from “not at all” = 0 to “4+ times per week” = 6) as a function of age, age-squared, respondent education, respondent sex, decade, and relative education (wife has lower degree, wife has same degree, wife has higher degree). The result is in this figure. Note the means are between 3 (“2-3 times per month”) and 4 (“weekly”). Stata code for GSS below.

[Figure: death of eros (predicted sexual frequency by relative education and decade)]

OK, that’s it. Here’s the code (I prettied the figure a little by hand afterwards):

*keep married people
keep if marital==1

* with non-missing own and spouse education
keep if spdeg<4 & degree<4
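* age groups and survey decades
recode age (18/29=18) (30/39=30) (40/49=40) (50/59=50) (60/109=60), gen(agecat)
recode year (1970/1979=1970) (1980/1989=1980) (1990/1999=1990) (2000/2008=2000) (2010/2016=2010), gen(decade)

* spouse has more / the same education as respondent (not used in the models below)
gen erosdead = spdeg>degree
gen equal=spdeg==degree

* relative education from the wife's side: 1 = wife less, 2 = equal, 3 = wife more
* male respondents (sex==1): the spouse is the wife
gen eros=0
replace eros=1 if spdeg<degree & sex==1
replace eros=2 if spdeg==degree
replace eros=3 if spdeg>degree & sex==1

* female respondents (sex==2): the respondent is the wife
replace eros=1 if spdeg>degree & sex==2
replace eros=3 if spdeg<degree & sex==2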

label define de 1 "wife less"
label define de 2 "equal", add
label define de 3 "wife more", add
label values eros de

reg sexfreq i.sex i.agecat i.decade i.degree i.eros [weight=wtssall]
reg sexfreq i.sex c.age##c.age i.degree i.eros##i.decade [weight=wtssall]
margins i.eros##i.decade
marginsplot, recast(bar) by(decade)
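
Because the relative-education coding depends on who the respondent is, it is easy to get backwards. Here is a minimal cross-check on invented data (not GSS records): it builds the same three categories a second way, by first assigning each spouse's degree, and asserts that the two versions agree. The names wifedeg, husbdeg, eros2, and de2 are made up for this sketch, and it assumes missing degrees have already been dropped, as in the code above.

```stata
* Toy data, invented for illustration (not GSS records)
clear
input byte sex byte degree byte spdeg
1 3 1
1 2 2
1 0 3
2 3 1
2 2 2
2 0 3
end

* Assign each spouse's degree by respondent sex (1 = man, 2 = woman)
gen wifedeg = cond(sex == 2, degree, spdeg)
gen husbdeg = cond(sex == 2, spdeg, degree)

* 1 = wife less, 2 = equal, 3 = wife more
gen byte eros2 = 2 + sign(wifedeg - husbdeg)

* The replace-based coding, condensed
gen eros = 0
replace eros = 1 if (spdeg < degree & sex == 1) | (spdeg > degree & sex == 2)
replace eros = 2 if spdeg == degree
replace eros = 3 if (spdeg > degree & sex == 1) | (spdeg < degree & sex == 2)

* The two versions should match on every row
assert eros == eros2

label define de2 1 "wife less" 2 "equal" 3 "wife more"
label values eros2 de2
tab eros2 sex
```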

Note: On 25 Dec 2018 I fixed a coding error and replaced the figure; the results are the same.


Filed under Me @ work, Research reports

That thing where you have a lot of little graphs (single-parent edition)

Yesterday I was on an author-meets-critics panel for The Triple Bind of Single-Parent Families: Resources, Employment, and Policies to Improve Well-Being, a new collection edited by Rense Nieuwenhuis and Laurie Maldonado. The book is excellent, and it’s available free under a Creative Commons license.

Most of the chapters are comparative, with data from multiple countries. I like looking at the figures, especially the ones like this, which give a quick general sense and let you see anomalies and outliers. I made a couple, too, which I share below, with code.


Here’s an example, showing the proportion of new births to mothers who aren’t married, by education, for U.S. states. For this I used the 2012-2016 combined American Community Survey file, which I got from IPUMS.org. I created a sample extract that included only women who reported having a child in the previous year, which gives me about 177,000 cases over the five years. The only other variables are state, education, and marital status. I put the raw data file on the Open Science Framework here. Code below.

My first attempt was bar graphs for each state. This is easiest because Stata’s graph bar command can plot means directly (click to enlarge).

[Figure: bar graphs of proportion not married by education, one panel per state]

The code for this is very simple. I made a dummy variable for single, so the mean of that is the proportion single. Edcat is a four-category education variable.

gr bar (mean) single [weight=perwt], over(edcat) bar(1,color(green)) yti("Proportion not married") by(state)

The bar graph is easy, and good for scanning the data for weird cases or interesting stories. But maybe it isn’t ideal for presentation, because the bars run from one state to the next. Maybe little lines would be better. This takes another step, because it requires making the graph with twoway, which doesn’t want to calculate means on the fly. So I do a collapse to shrink the dataset down to just means of single by state and edcat.

collapse (mean) single psingle=single [fw=perwt], by(state edcat)
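
To see what the collapse step leaves behind, here is a minimal sketch on invented data (two states, two education groups, made-up weights). It keeps only one copy of the mean, and state is a plain numeric stand-in for the state variable in the real file. The result is one row per state-education cell holding the weighted proportion single.

```stata
* Toy data, invented for illustration: 2 states x 2 education groups
clear
input byte state byte edcat byte single double perwt
1 1 1 100
1 1 0 300
1 2 0 150
1 2 0 50
2 1 1 200
2 1 1 200
2 2 1 100
2 2 0 300
end

* By hand, state 1 / edcat 1: 100 / (100 + 300) = .25 weighted share single
collapse (mean) single [fw=perwt], by(state edcat)

* Four rows remain, one per state-education cell
list, sepby(state)
```

Frequency weights have to be whole numbers, so this only works when the weights are integers, as they are in this toy example.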

Then I use a scatter graph, with line connectors between the dots. I like this better:

[Figure: proportion not married by education, connected lines, one panel per state]

You can see the overall levels (e.g., high in DC, low in Utah) as well as the different slopes (flatter in New York, steeper in South Dakota), and it’s still clear that the single-mother incidence is lowest in every state for women with BA degrees.

Here’s the code for that graph. Note the weights are now baked into the means so I don’t need them in the graph command. And to add the labels to the scatter plot you have to specify you want that. Still very simple:

gr twoway scatter single edcat , xlab(1 2 3 4, valuelabel) yti("Proportion not married") lcolor(green) msymbol(O) connect(l) by(state)

Sadly, I can’t figure out how to put one title and footnote on the graph, rather than a tiny title and footnote on every state graph, so I left titles out of the code and I then added them by hand in the graph editor. Boo.
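
One possible way around this, sketched on Stata's built-in auto data rather than the ACS extract: the by() option takes its own title() and note() suboptions, and those apply to the combined graph rather than to each panel. The title and note text here are placeholders, and I have not run this against the state graphs above, so treat it as a starting point.

```stata
* Built-in example data, standing in for the state-by-education means
sysuse auto, clear

* title() and note() placed inside by() label the combined graph once,
* instead of repeating on every panel
twoway scatter mpg weight, connect(l) sort ///
    by(foreign, title("Mileage by weight") note("Source: Stata auto data"))
```

For the state graphs, the same suboptions would go inside by(state, ...) in the twoway command.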

Here’s the full code:

set more off

quietly infix ///
 byte statefip 1-2 ///
 double perwt 3-12 ///
 byte marst 13-13 ///
 byte fertyr 14-14 ///
 byte educ 15-16 ///
 int educd 17-19 ///
 using "[PATHNAME]\usa_00366.dat"

/* the sample is all women who reported having a child in the previous year, FERTYR==2 */
replace perwt = perwt / 100

format perwt %10.2f

label var statefip "State (FIPS code)"
label var perwt "Person weight"
label var marst "Marital status"
label var educd "Educational attainment [detailed version]"

label define statefip_lbl 01 "Alabama"
label define statefip_lbl 02 "Alaska", add
label define statefip_lbl 04 "Arizona", add
label define statefip_lbl 05 "Arkansas", add
label define statefip_lbl 06 "California", add
label define statefip_lbl 08 "Colorado", add
label define statefip_lbl 09 "Connecticut", add
label define statefip_lbl 10 "Delaware", add
label define statefip_lbl 11 "District of Columbia", add
label define statefip_lbl 12 "Florida", add
label define statefip_lbl 13 "Georgia", add
label define statefip_lbl 15 "Hawaii", add
label define statefip_lbl 16 "Idaho", add
label define statefip_lbl 17 "Illinois", add
label define statefip_lbl 18 "Indiana", add
label define statefip_lbl 19 "Iowa", add
label define statefip_lbl 20 "Kansas", add
label define statefip_lbl 21 "Kentucky", add
label define statefip_lbl 22 "Louisiana", add
label define statefip_lbl 23 "Maine", add
label define statefip_lbl 24 "Maryland", add
label define statefip_lbl 25 "Massachusetts", add
label define statefip_lbl 26 "Michigan", add
label define statefip_lbl 27 "Minnesota", add
label define statefip_lbl 28 "Mississippi", add
label define statefip_lbl 29 "Missouri", add
label define statefip_lbl 30 "Montana", add
label define statefip_lbl 31 "Nebraska", add
label define statefip_lbl 32 "Nevada", add
label define statefip_lbl 33 "New Hampshire", add
label define statefip_lbl 34 "New Jersey", add
label define statefip_lbl 35 "New Mexico", add
label define statefip_lbl 36 "New York", add
label define statefip_lbl 37 "North Carolina", add
label define statefip_lbl 38 "North Dakota", add
label define statefip_lbl 39 "Ohio", add
label define statefip_lbl 40 "Oklahoma", add
label define statefip_lbl 41 "Oregon", add
label define statefip_lbl 42 "Pennsylvania", add
label define statefip_lbl 44 "Rhode Island", add
label define statefip_lbl 45 "South Carolina", add
label define statefip_lbl 46 "South Dakota", add
label define statefip_lbl 47 "Tennessee", add
label define statefip_lbl 48 "Texas", add
label define statefip_lbl 49 "Utah", add
label define statefip_lbl 50 "Vermont", add
label define statefip_lbl 51 "Virginia", add
label define statefip_lbl 53 "Washington", add
label define statefip_lbl 54 "West Virginia", add
label define statefip_lbl 55 "Wisconsin", add
label define statefip_lbl 56 "Wyoming", add
label define statefip_lbl 61 "Maine-New Hampshire-Vermont", add
label define statefip_lbl 62 "Massachusetts-Rhode Island", add
label define statefip_lbl 63 "Minnesota-Iowa-Missouri-Kansas-Nebraska-S.Dakota-N.Dakota", add
label define statefip_lbl 64 "Maryland-Delaware", add
label define statefip_lbl 65 "Montana-Idaho-Wyoming", add
label define statefip_lbl 66 "Utah-Nevada", add
label define statefip_lbl 67 "Arizona-New Mexico", add
label define statefip_lbl 68 "Alaska-Hawaii", add
label define statefip_lbl 72 "Puerto Rico", add
label define statefip_lbl 97 "Military/Mil. Reservation", add
label define statefip_lbl 99 "State not identified", add
label values statefip statefip_lbl

label define educd_lbl 000 "N/A or no schooling"
label define educd_lbl 001 "N/A", add
label define educd_lbl 002 "No schooling completed", add
label define educd_lbl 010 "Nursery school to grade 4", add
label define educd_lbl 011 "Nursery school, preschool", add
label define educd_lbl 012 "Kindergarten", add
label define educd_lbl 013 "Grade 1, 2, 3, or 4", add
label define educd_lbl 014 "Grade 1", add
label define educd_lbl 015 "Grade 2", add
label define educd_lbl 016 "Grade 3", add
label define educd_lbl 017 "Grade 4", add
label define educd_lbl 020 "Grade 5, 6, 7, or 8", add
label define educd_lbl 021 "Grade 5 or 6", add
label define educd_lbl 022 "Grade 5", add
label define educd_lbl 023 "Grade 6", add
label define educd_lbl 024 "Grade 7 or 8", add
label define educd_lbl 025 "Grade 7", add
label define educd_lbl 026 "Grade 8", add
label define educd_lbl 030 "Grade 9", add
label define educd_lbl 040 "Grade 10", add
label define educd_lbl 050 "Grade 11", add
label define educd_lbl 060 "Grade 12", add
label define educd_lbl 061 "12th grade, no diploma", add
label define educd_lbl 062 "High school graduate or GED", add
label define educd_lbl 063 "Regular high school diploma", add
label define educd_lbl 064 "GED or alternative credential", add
label define educd_lbl 065 "Some college, but less than 1 year", add
label define educd_lbl 070 "1 year of college", add
label define educd_lbl 071 "1 or more years of college credit, no degree", add
label define educd_lbl 080 "2 years of college", add
label define educd_lbl 081 "Associates degree, type not specified", add
label define educd_lbl 082 "Associates degree, occupational program", add
label define educd_lbl 083 "Associates degree, academic program", add
label define educd_lbl 090 "3 years of college", add
label define educd_lbl 100 "4 years of college", add
label define educd_lbl 101 "Bachelors degree", add
label define educd_lbl 110 "5+ years of college", add
label define educd_lbl 111 "6 years of college (6+ in 1960-1970)", add
label define educd_lbl 112 "7 years of college", add
label define educd_lbl 113 "8+ years of college", add
label define educd_lbl 114 "Masters degree", add
label define educd_lbl 115 "Professional degree beyond a bachelors degree", add
label define educd_lbl 116 "Doctoral degree", add
label define educd_lbl 999 "Missing", add
label values educd educd_lbl

recode educd (0/61=1) (62/64=2) (65/90=3) (101/116=4), gen(edcat)

label define edlbl 1 "<HS"
label define edlbl 2 "HS", add
label define edlbl 3 "SC", add
label define edlbl 4 "BA+", add
label values edcat edlbl

label define marst_lbl 1 "Married, spouse present"
label define marst_lbl 2 "Married, spouse absent", add
label define marst_lbl 3 "Separated", add
label define marst_lbl 4 "Divorced", add
label define marst_lbl 5 "Widowed", add
label define marst_lbl 6 "Never married/single", add
label values marst marst_lbl

gen married = marst==1 /* this is married spouse present */
gen single=marst>3 /* this is divorced, widowed, and never married */

gr bar (mean) single [weight=perwt], over(edcat) bar(1,color(green)) yti("Proportion not married") by(state)

collapse (mean) single psingle=single [fw=perwt], by(state edcat)

gr twoway scatter single edcat , xlab(1 2 3 4, valuelabel) yti("Proportion not married") lcolor(green) msymbol(O) connect(l) by(state)




Filed under Me @ work