2016 U.S. population pyramid, with Baby Boom

I’m finishing up revisions for the second edition of The Family, and that means it’s time to update the population pyramids.

Because it’s not so easy (for me) to find population by age and sex for single years of age for the current year, and because there is a little trick to making population pyramids in Excel, and because I’m happy to be nearing the end of the revision, I took a few minutes to make one to share.

The data for single year population estimates for July 1, 2016 are here, and more specifically in the file called NC-EST2016-AGESEX-RES.csv, here. To make the pyramid in Excel, you multiply one of the columns of data by -1 and then display the results as absolute values by setting the number to a custom format, like this: #,###;#,###. Then in the bar graph you set the two series to overlap 100%.*
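For anyone working outside Excel, here is a minimal Python sketch of the same trick: negate one series so its bars run left of zero, then label the values by their absolute size. The sample rows, the function names, and the text rendering are all my own illustration, not the contents of the Census file or the Excel workbook.

```python
# Hypothetical sample rows: (age, male_population, female_population).
# These numbers are made up for illustration; they are not Census estimates.
SAMPLE = [
    (0, 2_000, 1_900),
    (1, 2_100, 2_000),
    (2, 2_050, 1_950),
]


def pyramid_series(rows):
    """Return (age, plotted_male, plotted_female) with the male values negated.

    Negating one series makes its bars extend left of zero, which is
    the same idea as multiplying the Excel column by -1.
    """
    return [(age, -male, female) for age, male, female in rows]


def axis_label(value):
    """Format a plotted value as an absolute number with thousands separators.

    This plays the role of the Excel custom format #,###;#,###, which
    displays negative numbers without a minus sign.
    """
    return f"{abs(value):,}"


def text_pyramid(rows, scale=100):
    """Draw a crude text pyramid, one line per age, oldest at the top."""
    lines = []
    for age, male, female in reversed(pyramid_series(rows)):
        left = "#" * (abs(male) // scale)   # bar length for the negated series
        right = "#" * (female // scale)
        lines.append(
            f"{axis_label(male):>7} {left:>25}|{right:<25} {axis_label(female)}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(text_pyramid(SAMPLE))
```

The two bar strings share a center line, which is the text equivalent of setting the two series to overlap 100%.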

In this figure I highlighted the Baby Boom so you can see the tsunami rolling into the 70s now. Unlike when I discussed cohorts previously, when I let it slide, here I actually adjusted this from what you would get applying the official Baby Boom years (1946-1964) with subtraction from 2016. That would give you ages 52 to 70, but the boom obviously starts at age 69 and ends at age 51 here, so that’s what I highlighted. Maybe this has to do with the timing within years (nine months after the formal end of WWII would be May 2, 1946). Anyway, this is not the official Baby Boom, just the boom you see.
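To see how timing within the year could shift the ages by one, here is a small sketch that computes age in completed years on July 1, 2016, the reference date of the estimates. The birth dates are hypothetical, picked only to show the effect; this illustrates one possible reason for the shift, not a confirmed explanation.

```python
from datetime import date

ESTIMATE_DATE = date(2016, 7, 1)  # reference date of the population estimates


def age_on(birth, reference=ESTIMATE_DATE):
    """Age in completed years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (birth.month, birth.day)
    return reference.year - birth.year - (0 if had_birthday else 1)


# Simple subtraction of the official birth years from 2016.
print(2016 - 1964, 2016 - 1946)   # 52 70

# Hypothetical birth dates, chosen only to show the within-year effect.
print(age_on(date(1946, 3, 1)))   # 70: birthday already passed by July 1
print(age_on(date(1946, 9, 1)))   # 69: birthday still to come
print(age_on(date(1964, 9, 1)))   # 51
```

Anyone born after July 1 in a given year is still one year younger than simple subtraction suggests when the estimate is taken.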

[Figure: 2016 U.S. population pyramid, with the Baby Boom highlighted]

* I put the data file, the Census Bureau description, and the Excel file on the Open Science Framework here: https://osf.io/qanre/.


Fertility trends and the myth of Millennials

The other day I showed trends in employment and marriage rates, and made the argument that the generational term “Millennial” and others are not useful: they are imposed before analyzing data and then trends are shoe-horned into the categories. When you look closely you see that the delineation of “generations” is arbitrary and usually wrong.

Here’s another example: fertility patterns. By the definition of “Millennial” used by Pew and others, the generation is supposed to have begun with those born after 1980. When you look at birth rates, however, you see a dramatic disruption within that group, possibly triggered by the timing of the 2009 recession in their formative years.

I do this by using the American Community Survey, conducted annually from 2001 to 2015, which asks women if they have had a birth in the previous year. The samples are very large, with all the data points shown including at least 8,000 women and most including more than 60,000.

The figure below shows the birth rates by age for women across six five-year birth cohorts. The dots on each line mark the age at which the midpoint of each cohort reached 2009. The oldest three groups are supposed to be “Generation X.” The three youngest groups shown in yellow, blue, and green — those born 1980-84, 1985-89, and 1990-94 — are all Millennials according to the common myth. But look how their experience differs!
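Here is a rough sketch of the bookkeeping behind those cohorts and dots. It assumes the six cohorts run from 1965-69 through 1990-94 (the text names only the youngest three, so the starting year is my inference), and it takes the "midpoint" of a five-year cohort to be its middle birth year. The function names are hypothetical.

```python
def cohort_label(birth_year, start=1965, width=5):
    """Return a label like '1980-84' for the five-year cohort containing birth_year.

    start=1965 is an assumption about where the oldest cohort begins.
    """
    first = start + ((birth_year - start) // width) * width
    return f"{first}-{str(first + width - 1)[-2:]}"


def midpoint_age_in(year, first_birth_year, width=5):
    """Age that the cohort's middle birth year reaches in the given calendar year."""
    midpoint = first_birth_year + width // 2
    return year - midpoint


# The three cohorts usually lumped together as Millennials.
for first in (1980, 1985, 1990):
    print(cohort_label(first), midpoint_age_in(2009, first))
# 1980-84 27
# 1985-89 22
# 1990-94 17
```

Under these assumptions the recession marker lands ten years apart in age between the oldest and youngest of the three groups, which is the spread the figure is meant to show.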

[Figure: birth rates by age for six five-year birth cohorts, ACS]

Most of the fertility effect of the recession was felt at young ages, as women postponed births. The oldest Millennial group was in their late twenties when the recession hit, and it appears their fertility was not dramatically affected. The 1985-89 group clearly took a big hit before rebounding. And the youngest group started their childbearing years under the burden of the economic crisis, and if that curve at 25 holds they will not recover. Within this arbitrarily-constructed “generation” is a great divergence of experience driven by the timing of the great recession within their early childbearing years.

You could collapse these six arbitrary birth cohorts into two arbitrary “generations,” and you would see some of the difference I describe. I did that for you in the next figure, which is made from the same data. And you could make up some story about the character and personality of Millennials versus previous generations to fit that data, but you would be losing a lot of information to do that.

[Figure: birth rates by age, with the six cohorts collapsed into two “generations,” ACS]

Of course, any categories reduce information — even single years of age — so that’s OK. The problem is when you treat the boundaries between categories as meaningful before you look at the data — in the absence of evidence that they are real with regard to the question at hand.

Two examples of why “Millennials” is wrong

When you make up “generation” labels for arbitrary groups based on year of birth, and start attributing personality traits, behaviors, and experiences to them as if they are an actual group, you add more noise than light to our understanding of social trends.

According to generation-guru Pew Research, “millennials” are born during the years 1981-1997. A Pew essay on the generations carefully explains that the divisions are arbitrary, and then proceeds to analyze data according to these divisions as if they are already real. (In fact, in the one place the essay talks about differences within generations, with regard to political attitudes, it’s clear that there is no political consistency within them, as they have to differentiate between “early” and “late” members of each “generation.”)

Amazingly, despite countless media reports on these “generations,” especially millennials, in a 2015 Pew survey only 40% of people who are supposed to be millennials could pick the name out of a lineup — that is, asked, “These are some commonly used names for generations. Which of these, if any, do you consider yourself to be?”, and then given the generation names (silent, baby boom, X, millennial), 40% of people born after 1980 picked “millennial.”

“What do they know?” you’re saying. “Millennials.”

Two examples

The generational labels we’re currently saddled with create false divisions between groups that aren’t really groups, and then obscure important variation within the groups that are arbitrarily lumped together. Here is the first example: the employment experience of young men around the 2009 recession.

In this figure, I’ve taken three birth cohorts: men born four years apart in 1983, 1987, and 1991 — all “millennials” by the Pew definition. Using data from the 2001-2015 American Community Surveys via IPUMS.org, the figure shows their employment rates by age, with 2009 marked for each, coming at age 26, 22, and 18 respectively.
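For readers who want to try this kind of tabulation, here is a minimal sketch of a weighted employment rate by age for a single birth-year cohort. The record layout, the weights, and the values are all hypothetical; real ACS extracts from IPUMS have their own variable names and codes, which I am not reproducing here.

```python
from collections import defaultdict

# Hypothetical person records: (survey_year, birth_year, employed, weight).
RECORDS = [
    (2009, 1983, True, 100.0),
    (2009, 1983, False, 50.0),
    (2010, 1983, True, 120.0),
    (2010, 1983, False, 30.0),
    (2009, 1991, False, 80.0),
    (2009, 1991, True, 20.0),
]


def employment_by_age(records, birth_year):
    """Weighted share employed at each age for people born in birth_year."""
    totals = defaultdict(float)
    employed = defaultdict(float)
    for survey_year, born, is_employed, weight in records:
        if born != birth_year:
            continue
        age = survey_year - born  # approximate age: survey year minus birth year
        totals[age] += weight
        if is_employed:
            employed[age] += weight
    return {age: employed[age] / totals[age] for age in sorted(totals)}


# Age of each cohort in 2009, matching the markers described in the text.
print([2009 - year for year in (1983, 1987, 1991)])  # [26, 22, 18]
print(employment_by_age(RECORDS, 1983))
```

Keeping single birth years separate, rather than pooling them into a "generation" first, is what lets the different recession timing show up at all.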


Each group took a big hit, but their recoveries look pretty different, with the earlier cohort not recovered as of 2015, while the youngest 1991 group bounced up to surpass the employment rates of the 1987s by age 24. Timing matters. I reckon the year they hit that great recession matters more in their lives than the arbitrary lumping of them all together compared with some other older “generations.”

Next, marriage rates. Here I use the Current Population Survey and analyze the percentage of young adults married by year of birth for people ages 18-29. This is from a regression that controls for year of age and sex, so it can be interpreted as marriage rates for young adults.


From the beginning of the Baby Boom generation to those born through 1987 (who turned 29 in 2016, the last year of CPS data), the marriage rate fell from 57% to 21%, or 36 percentage points. Most of that change, 22 points, occurred within the Baby Boom. The marriage experience of the “early” and “late” Baby Boomers is not comparable at all. The subsequent “generations” are also marked by continuously falling marriage rates, with no clear demarcation between the groups. (There is probably some fancy math someone could do to confirm that, with regard to marriage experience, group membership by these arbitrary criteria doesn’t tell you more than any other arbitrary grouping would.)
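The arithmetic behind "most of that change" is simple enough to check by hand, but here it is spelled out, using only the figures quoted above. The relative decline at the end is an extra calculation of mine, included to keep percentage points and percent change apart.

```python
# Figures quoted in the text, in percent married among ages 18-29.
rate_start = 57         # cohort at the beginning of the Baby Boom
rate_end = 21           # cohort born in 1987
drop_within_boom = 22   # percentage points, as stated in the text

total_drop = rate_start - rate_end           # percentage points
boom_share = drop_within_boom / total_drop   # share of the total decline
relative_drop = total_drop / rate_start      # decline relative to the starting rate

print(total_drop)                # 36
print(round(boom_share, 2))      # 0.61
print(round(relative_drop, 2))   # 0.63
```

So roughly three-fifths of the whole decline happened inside a single "generation."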

Anyway, there are lots of fascinating and important ways that birth cohort — or other cohort identifiers — matter in people’s lives. And we could learn more about them if we looked at the data before imposing the categories.


This word ‘generation,’ I do not think it means what you think it means

The people who make up these things drive me bananas.

NPR launched a new series on “millennials” yesterday, called “New Boom,” with this dramatic declaration: “There are more millennials in America right now than baby boomers — more than 80 million of us.”

The definition NPR gives for this generation is “people born between 1980 and 2000.” And it’s true there are more than 80 million of them. In fact, there are 91 million of them, according to the 2012 American Community Survey data you can get from IPUMS.org.* That’s OK, though, because there are only 76 million Baby Boomers, so the claim checks out.

But what’s a generation?

The Baby Boom was a demographic event. In 1946, after the end of World War II, the crude birth rate — the number of births per 1,000 population — jumped from 20.4 to 24.1, the biggest one-year change recorded in U.S. history. The birth rate didn’t fall back to its previous level until 1965. That’s why the Baby Boom went down in history as 1946 to 1964. Because that’s when it happened.
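For reference, the crude birth rate is just births divided by population, times 1,000. A quick sketch follows; the birth and population counts are made-up round numbers to show the formula, not historical figures, while 20.4 and 24.1 are the rates quoted above.

```python
def crude_birth_rate(births, population):
    """Births per 1,000 population in a given year."""
    return births / population * 1000


# Made-up round numbers, only to show the formula; not historical counts.
print(round(crude_birth_rate(births=24_100, population=1_000_000), 1))  # 24.1

# The one-year jump quoted in the text.
before, after = 20.4, 24.1
jump = after - before
print(round(jump, 1))            # 3.7 births per 1,000
print(round(jump / before, 3))   # 0.181, about an 18 percent increase
```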

This figure shows the number of living people by birth year, and the crude birth rate recorded in each year, using the NPR definition of millennials (in red), compared with the baby boom (purple):


Even with population growth I reckon the people born in the years 1946-1964 might outnumber the self-promoting millennials if not for the weight of mortality pulling down the purple bars. But if the young NPR reporters want to brag about outnumbering a generation that is starting to lose its older members to old age (and who are, after all, their parents), then I guess the shoe fits.

The Baby Boom was not a generation. It was a cohort, “a group of people sharing a common demographic experience” (in this case birth during the same period). That demographic event happens to have lasted 18 years, which is unfortunate because that may have contributed to the tendency to declare “generations” of similar lengths.

The Pew Research people, who do lots of interesting work on social change that uses generational concepts, use these slightly different definitions for four generations: Silent Generation, born 1928-1945; the Baby Boom Generation, born 1946-1964; Generation X, born 1965-1980; Millennial Generation, born 1981 and later (Pew says “no chronological endpoint has been set for this group,” which is awkward because if they’re really still going, the oldest are 33 and they have children that are the same generation as themselves**). Ironic, isn’t it, that Pew constructs “Generation X” as the shortest of the four (some generation, a mere 16 years!) before declaring them “America’s neglected ‘middle child.’”

Real generations rarely have starting and ending points on a population level. Populations usually just keep having births every year in smooth patterns of increase or decrease without discrete edges, so generations overlap. Even in families it gets hard to nail down generations once you start moving horizontally; siblings born many years apart are in the same generation, but the cousins get all confused.

Meaningful cohorts, on the other hand, can be defined all over the place, such as: the people who graduated college during the Great Recession, people who introduced the Internet to their parents, and so on. These are not generations.

In 2010, when crisis was really in the air, I was on the NPR show The State of Things in North Carolina, discussing the Baby Boom (no audio online). After attempting to clarify the difference between a generation and a cohort, I offered this dramatic example of a cohort — people born in 1960 specifically:

So if you were born in 1960, graduated college in 1982, and entered the labor force in the middle of an awful recession, then managed to pull some kind of career together, got married and divorced, by the 90s it was time to be downsized already for the first time, you’re 40 in 2000, and it’s time for the dot-com bubble, you’re out of your job again, and here you are ready for your retirement, finally, you’ve been left in your own 401(k), having to put together your own pension, and of course now that’s in the tank and your house isn’t worth anything. So that insecurity and instability is really imprinted this group. We talk about the 60s, and civil rights and antiwar, and great music and everything, but that’s seeming like a long time ago now for people who are looking at retirement.

I don’t know if anyone actually had that experience, but it seems likely.

Anyway, if people really want to keep using these generation labels, and it seems unlikely to stop now given the marketing payoff from naming rights, then that’s the way it goes. But please don’t ask demographers to define them.


* This is a little different from the population estimates the Census Bureau produces, which are coded by age rather than year of birth. I use the ACS data because they report year of birth, and because it’s easier. The differences are very small.

** Thanks to Mo Willow for pointing this out.


