In this post I present the most comprehensive analysis ever reported of the gender of New York Times writers (I think), with a sample of almost 30,000 articles.
This subject has been in the news, with a good piece the other day by Liza Mundy — in the New York Times — who wrote on the media’s Woman Problem, prompted by the latest report from the Women’s Media Center. The WMC checked newspapers’ female byline representation from the last quarter of 2013, and found levels ranging from a low of 31% female at the NYT to a high of 46% at the Chicago Sun-Times. That’s a broad study that covers a lot of other media, and worth reading. But we can go deeper on the NYTimes. The WMC report, it appears (in full here), only focused on the A-section of each newspaper, with articles coded by topic according to unspecified criteria. Thanks to the awesome data collecting powers of my colleague Neal Caren, a sociology professor at UNC, we can do better.*
I started this project with a snap survey of the gender of writers on the front page of each section of NYTimes.com: result, 36% female from a sample of 164 writers. Then I followed the front page of the website for a month: result, 29% female from a sample of 421. For this analysis, Neal gave me everything the NYTimes published online from October 23, 2013 to February 25, 2014: a total of 29,880 items, including online-only and print items. After eliminating the 7,669 pieces that had no author listed (mostly wire stories), we tried to determine the gender of the first author of each piece. To start, Neal gave me the gender for all first names that were more than 90% male or female in the Social Security name database in the years 1945-1970. That covered 97% of the total. For the remainder, I investigated the gender of all writers who had published 10 pieces or more during the period (attempting to find both images and gendered pronouns). That resolved all but 255 pieces, and left me with a sample of 21,440.** These are the results.
1. Women were the first author on 34% of the articles. This is a little higher than the WMC got with their A-section analysis, which is not surprising given the distribution of writers across sections.
2. Women wrote the majority of stories in five out of 21 major sections, from Fashion (52% women) to Dining, Home, Travel, and Health (76% women). Those five sections account for 11% of the total.
3. Men wrote the majority of stories in the seven largest sections. Two sections were more than three-fourths male (Sports, 89%; and Opinion, 76%). U.S., World, and Business were between 66% and 73% male.
Here is the breakdown by section:
Since we have all this text, we can go a little beyond the section headers served up by the NYTimes’ API. What are men and women writing about? Using the words in the headlines, I compiled a list of those headline words with the biggest gender difference in rates of appearance. That is, I calculated the frequency of occurrence of each headline word, as a fraction of all headline words in female-authored versus male-authored stories.
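For anyone who wants to try the headline comparison on their own data, here is a minimal sketch of the idea in Python. It is not the code used for this post. The function names are made up, the headlines are invented for illustration, and it ranks words by a simple difference in rates, which is one reasonable reading of "gender difference in rates of appearance" (a ratio would be another).

```python
import re
from collections import Counter


def word_rates(headlines):
    """Return each word's share of all headline words in `headlines`."""
    words = [w for h in headlines for w in re.findall(r"[a-z']+", h.lower())]
    total = len(words)
    if total == 0:
        return {}
    return {w: c / total for w, c in Counter(words).items()}


def gender_gaps(female_headlines, male_headlines):
    """Women's rate minus men's rate, for every word seen in either set."""
    f = word_rates(female_headlines)
    m = word_rates(male_headlines)
    return {w: f.get(w, 0.0) - m.get(w, 0.0) for w in set(f) | set(m)}


# Invented headlines, for illustration only (not from the NYTimes data).
female = [
    "Children and the New School Lunch",
    "A Guide to Spring Travel",
]
male = [
    "Iran Talks Resume",
    "Iran Sanctions Debate Continues in Senate",
    "Children Return to School",
]

gaps = gender_gaps(female, male)
# Most female-skewed words first, most male-skewed words last.
ranked = sorted(gaps, key=gaps.get, reverse=True)
```

Dividing by each group's own total is what matters here: a word can show up fewer times in one group's headlines and still be more characteristic of that group if the group wrote fewer headline words overall.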
For example, “Children” occurred 36 times in women’s headlines, and 24 times in men’s headlines. Since men used more than twice as many headline words as women, this produced a very big gender spread in favor of women for the word “Children.” On the other hand, women’s headlines had 10 instances of “Iran,” versus 85 for men. Repeating this comparison zillions of times, I generated these lists:
[Table: NYTimes headline words used disproportionately in stories by women and by men]
Here is the same table arranged as a word cloud, with pink for women and blue for men (sue me), and the more disproportionate words larger:
What does it mean?
It’s just one newspaper but it matters a lot. According to Alexa, NYTimes.com is the 34th most popular website in the U.S. and the 119th most popular in the world, and it is the most popular website of a printed newspaper in the U.S. In the JSTOR database of academic scholarship, “New York Times” appeared almost four times more frequently than the next most commonly mentioned newspaper, the Washington Post.
Research (including this paper I wrote with Matt Huffman and Jessica Pearlman) shows that women in charge, on average, produce better outcomes for women below them in the organizational hierarchy. Jill Abramson, the NYTimes’ executive editor, is the 19th most powerful woman in the world, behind only Sheryl Sandberg and Oprah Winfrey among media executives on that list. She is aware of this issue, and proudly told the Women’s Media Center that she had reached the “significant milestone” of having a half-female news masthead (which is significant). So why are women underrepresented in such prominent sections? That’s not a rhetorical question; I’m really wondering how this happens. The NYTimes doesn’t even do as well as the national average: 41% of the 55,000 “News Analysts, Reporters and Correspondents” working full-time, year-round in 2012 were women.
Organizational research finds that large companies are less likely to discriminate against women, and we suspect three main reasons: greater visibility to the public, which may complain about bias; greater visibility to the government, which may enforce anti-discrimination laws; and greater use of formal personnel procedures, which limits managerial discretion and is supposed to weaken old-boy networks. Among writers, however, an informal, back-channel norm still apparently prevails, at least according to a recent essay by Ann Friedman. Maybe the NYTimes’ big-company, formalized practices apply more to departments other than those that select and hire writers.
Finally, I am sorry I’m not doing this for race/ethnicity. It’s just a much different project to do that, because the names don’t tell you the identities as well. If someone wants to figure out the race/ethnicity of NYTimes authors (such as someone, say, inside their HR department) and send it to me, I would love to analyze it.
* Neal has a series of tutorials on analyzing text as data, and he has posted some slides on how to do this with the NYT’s application programming interface (API).
** A couple of other notes. This is a count of stories by the gender of their authors, not a count of authors. If men or women write more stories per person then this will differ from the gender composition of authors. So it’s not a workplace study but a content study. It asks: When you see something in the NYTimes, what is the chance it was written by a woman versus a man? I combined Sunday Review (which was small) with Opinion, since they have the same editor and are the same on Sundays. I combined Style (which was small) into Fashion, since they’re “Fashion and Style” in the paper. I combined T Mag (which was small) into T:Style, since they seem to be the same thing. Also, I coded Reed Abelson’s articles as female because I know she’s a woman even though “Reed” is male more than 90% of the time.
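For the curious, the name-based coding rule described in this post (first names more than 90% male or female, plus hand-coded exceptions like Reed Abelson) can be sketched roughly as follows. This is an illustration, not Neal's actual code: the function name, the data layout, and the name counts are all invented, and real Social Security counts would have to be loaded separately.

```python
def code_byline(byline, name_counts, overrides=None, threshold=0.90):
    """Return 'F', 'M', or None (unresolved) for the first author's name.

    name_counts maps a lowercase first name to (female_count, male_count).
    overrides maps a full byline to a hand-coded gender.
    """
    overrides = overrides or {}
    if byline in overrides:
        return overrides[byline]  # hand-coded writers win over the name rule
    parts = byline.split()
    if not parts:
        return None
    female, male = name_counts.get(parts[0].lower(), (0, 0))
    total = female + male
    if total == 0:
        return None  # first name not in the database
    if female / total > threshold:
        return "F"
    if male / total > threshold:
        return "M"
    return None  # ambiguous name, left for manual checking


# Invented counts (female, male) for illustration; NOT Social Security data.
NAME_COUNTS = {
    "liza": (980, 20),
    "neal": (5, 995),
    "reed": (40, 960),
    "jordan": (300, 700),
}
OVERRIDES = {"Reed Abelson": "F"}

coded = {
    name: code_byline(name, NAME_COUNTS, OVERRIDES)
    for name in ["Liza Mundy", "Neal Caren", "Reed Abelson", "Jordan Example"]
}
```

Bylines that come back as `None` are the ones that would need the manual step of looking for images and gendered pronouns.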