How broken is our system (hit me with that figure again edition)

Why do sociologists publish in academic journals? Sometimes it seems improbable that the main goal is sharing information and advancing scientific knowledge. Today’s example of our broken system, brought to my attention by Neal Caren, concerns three papers by Eran Shor, Arnout van de Rijt, Charles Ward, Aharon Blank-Gomel, and Steven Skiena (Shor et al.).

May 13, 2016 update: Eran Shor has sent me a response, which I posted here.

In a paywalled 2013 paper in Journalism Studies, the team used an analysis of names appearing in newspapers to report the gender composition of people mentioned. They analyzed the New York Times back to 1880, and then a larger sample of 13 newspapers from 1982 through 2005. Here’s one of their figures:

[Figure from Shor et al. 2013: the gender composition of people mentioned in newspapers over time]

The 2013 paper was a descriptive analysis, establishing that men are mentioned more than women over time.

In a paywalled 2014 article in Social Science Quarterly (SSQ), the team followed up. Except for a string-cite mention in the methods section, the second paper makes no reference to the first, giving no indication that the two are part of a developing project. They use this figure to motivate the analysis in the second paper, with no acknowledgment that it also appeared in the first:

[The same figure, reproduced in Shor et al. 2014]

Shor et al. 2014 asked,

How can we account for the consistency of these disparities? One possible factor that may explain at least some of these consistent gaps may be the political agendas and choices of specific newspapers.

Their hypothesis was:

H1: Newspapers that are typically classified as more liberal will exhibit a higher rate of female-subjects’ coverage than newspapers typically classified as conservative.

After analyzing the data, they concluded:

The proposition that liberal newspapers will be more likely to cover female subjects was not supported by our findings. In fact, we found a weak to moderate relationship between the two variables, but this relationship is in the opposite direction: Newspapers recognized (or ranked) as more “conservative” were more likely to cover female subjects than their more “liberal” counterparts, especially in articles reporting on sports.

They offered several caveats about this finding, including that the measure of political slant used is “somewhat crude.”

Clearly, much more work to be done. The next piece of the project was a 2015 article in American Sociological Review (which, as the featured article of the issue, was not paywalled by Sage). Again, without mentioning that the figure had been previously published, and with only one passing reference to each of the previous papers, they motivated the analysis with the figure:

[The same figure again in Shor et al. 2015, this time in black and white and without 1982]

Besides not getting the figure in color, ASR readers for some reason also don’t get 1982 in the data. (The paper makes no mention of the difference in period covered, which makes sense because it never mentions any connection to the analysis in the previous paper.) The ASR paper asks of this figure, “How can we account for the persistence of this disparity?”

By now I bet you’re thinking, “One way to account for this disparity is to consider the effects of political slant.” Good idea. In fact, as presented in the ASR paper, the rationale for this question has hardly changed at all since the SSQ paper. Here are the two passages justifying the question.

From SSQ:

Former anecdotal evidence on the relationship between newspapers’ political slant and their rate of female-subjects coverage has been inconclusive. … [describing studies by Potter (1985) and Adkins Covert and Wasburn (2007)]…

Notwithstanding these anecdotal findings, there are a number of reasons to believe that more conservative outlets would be less likely to cover female subjects and women’s issues compared with their more liberal counterparts. First, conservative media often view feminism and women’s rights issues in a relatively negative light (Baker Beck, 1998; Brescoll and LaFrance, 2004). Therefore, they may be less likely to devote coverage to these issues. Second, and related to the first point, conservative media may also be less likely to employ female reporters and female editors […]. Finally, conservative papers may be more likely to cover “hard” topics that are traditionally (that is, conservatively) considered to be more important or interesting, such as politics, business, and sports, and less likely to report on issues such as social welfare, education, or fashion, where according to research women have a stronger presence (Holland, 1998; Ross, 2007, 2009; Ross and Carter, 2011).

From ASR:

Some work suggests that conservative newspapers may cover women less (Potter 1985), but other studies report the opposite tendency (Adkins Covert and Wasburn 2007; Shor et al. 2014a).

Notwithstanding these inconclusive findings, there are several reasons to believe that more conservative outlets will be less likely to cover women and women’s issues compared with their more liberal counterparts. First, conservative media often view feminism and women’s issues in a relatively negative light (Baker Beck 1998; Brescoll and LaFrance 2004), making them potentially less likely to cover these issues. Second, and related to the first point, conservative media may also be less likely to employ female reporters and female editors. Finally, conservative papers may be more likely to cover “hard” topics that are traditionally considered more important or interesting, such as politics, business, and sports, rather than reporting on issues such as social welfare, education, or fashion, where women have a stronger presence.

Except for a passing mention among the “other studies,” there is no connection to the previous analysis. The ASR hypothesis is:

Conservative newspapers will dedicate a smaller portion of their coverage to females.

On this question in the ASR paper, they conclude:

our analysis shows no significant relationship between newspaper coverage patterns and … a newspaper’s political tendencies.

It looks to me like the SSQ and ASR papers used the same data to test the same hypothesis (in addition to whatever else is new in the third paper). Given that they are using the same data, how they got from a “weak to moderate relationship” to “no significant relationship” seems important. Should we no longer rely on the previous analysis? Or do these two papers just go into the giant heap of studies in which “some say this, some say that”? What kind of way is this to figure out what’s going on?

Still love your system?

It’s fine to report the same findings in different venues and formats. It’s fine, that is, as long as it’s clear they’re not original in the subsequent tellings. (I personally have been known to regale my students, and family members, with the same stories over and over, but I try to remember to say, “Stop me if I already told you this one” first.)

I’m not judging Shor et al. for any particular violation of specific rules or norms. And I’m not judging the quality of the work overall. But I will just make the obvious observation that this way of presenting ongoing research is wasteful of resources, misleading to readers, and a hindrance to the development of research.

  • Wasteful because reviewers, editors, and publishers are essentially duplicating their efforts to try to figure out what is actually to be learned from these overlapping papers — and then to repackage and sell the duplicative information as new.
  • Misleading to readers because we now have “many studies” that show the same thing (or different things), without the clear acknowledgment that they use the same data.
  • And hindering research because of the wasteful delays and duplicative expenses involved in publishing research that should be clearly presented in cumulative, transparent fashion, in a timely way — which is what we need to move science forward.

Open science

When making (or hearing) arguments against open science as impractical or unreasonable, just weigh the wastefulness, misleadingness, and obstacles to science so prevalent in the current system against whatever advantages you think it holds. We can’t have a reasonable conversation about our publishing system based on the presumption that it’s working well now.

In an open science system researchers publish their work openly (and for free) with open links between different parts of the project. For example, researchers might publish one good justification for a hypothesis, with several separate analyses testing it, making clear what’s different in each test. Reviewers and readers could see the whole series. Other researchers would have access to the materials necessary for replication and extension of the work. People would be judged for hiring and promotion according to the actual quality and quantity of their work and the contribution it makes to advancing knowledge, rather than through arbitrary counts of “publications” in private, paywalled journals. (The non-profit Center for Open Science is building a system like this now, and offers a free Open Science Framework, “A scholarly commons to connect the entire research cycle.”)

There are challenges to building this new system, of course, but any assessment of those challenges needs to be clear-eyed about the ridiculousness of the system we’re working under now.

Previous related posts have covered very similar publications, the opposition to open access, journal self-citation practices, and one publication’s saga.

12 thoughts on “How broken is our system (hit me with that figure again edition)”

  1. I don’t see why you are NOT raising questions about the ethics of the researchers pursuing this publishing strategy. I consider it self-plagiarism, and deceitful in presenting the “new” article as not having been published before, which one needs to attest. I don’t think cheating on the system is something we ought to let folks get away with — at the very least shaming and possibly more stringent penalties are appropriate. This holds — in my view — not merely for reprinting figures or tables but also for inserting block quotes from previously published work.

    1. I don’t disagree, Myra, but I don’t have enough information to draw firm conclusions – not having access to the whole record, etc. – so I limited myself to what can easily be observed.

  2. Myra and I have debated this before. I don’t think “self-plagiarism” is the right concept for this, but rather “academic dishonesty.” People do have the right to repurpose their own intellectual property, and there really are only so many ways you can describe how a sample was selected or how a variable was measured or what the conclusions of a particular other scholar’s research article were. I think complaining about “self-plagiarism” in those sections of articles is ridiculous, especially when you recognize that close paraphrases are also “plagiarism.” What matters is whether “the same” results are being published in multiple places. Again, there are different issues at stake. One is who “owns” the work, which is about copyright assignment. Another is whether you are trying to get credit multiple times for the same work in an inappropriate manner. This is always contextual. Hopefully we are not to be accused of “self-plagiarizing” when we recycle our lectures from term to term. If so, we may as well all go home now.

    Instead, the academic dishonesty here is two-fold: (1) Getting more lines on a c.v. than a person would otherwise “deserve” and thus, possibly, gaining an unwarranted advantage in some tenure or salary-setting process. This is comparable to a student trying to get credit for the same paper in two different classes. There are genuine disciplinary debates about whether the “minimal publishable unit” is how you OUGHT to publish or whether you OUGHT to write just one big good article with all your results. (2) Conning a journal into publishing something it would not have wanted to publish had it known the same work was already published elsewhere. And, in the process, perhaps crowding out other worthy papers that might have been published instead. A journal whose stated policy is that they don’t publish work that has already been published elsewhere is being cheated if this is not true. And a journal that finds its content essentially reprinted in another journal may have a copyright violation issue. But in some cases, journal #2 reaches a different audience than journal #1 and both journals might be happy having the content and not object to it being elsewhere.

    I’ve seen a lot of “twin” articles in AJS/ASR which were clearly produced at the same time, one sent to each journal, each one spinning the same basic project in slightly different ways. Neither was published at the time of submission . . .

    1. The one aspect of this you don’t mention is the other researchers and readers (including people outside the area, like journalists) who are supposed to be learning something from all these publications. This is the “many studies show” problem. All our debating about getting credit and filling up journal pages tends to overshadow that, which is after all supposed to be the point.

      The copied paragraph I produced here is not about variable definition or methods, or even theory — it’s a justification for a research question that is virtually unchanged by the work done by these very authors on this very question. I believe a reader trying to make sense of the findings needs to see an accurate description of the state of the research, which includes previous findings in the same research project.

      In addition to expecting accurate reporting from authors and, obviously, open access publishing, we would be well served by adopting the practice of research registration, such as that facilitated by the Open Science Framework. In their system (see: https://osf.io/faq/):

      “A registration is a frozen version of your project that can never be edited or deleted, but you can issue a withdrawal of it later, leaving behind basic metadata. When you create the registration, you have the option of either making it public immediately or making it private for up to four years through an embargo. A registration is useful for certifying what you did in a project in advance of data analysis, or for confirming the exact state of the project at important points of the lifecycle, such as manuscript submission or the onset of data collection.”

      1. Good point, yes. I’m not trying to defend the article proliferation and the publication of articles that are just minor variations on each other. I am just arguing that “self-plagiarism” is the wrong conceptual tool for complaining about this. I want to be able to debate when and why overlap in publications is acceptable or not without using the “plagiarism” frame, which I think obscures the issues that need to be debated.

        Basically, your concern about proliferating studies that really duplicate each other would be completely valid even if they go to the trouble to generate new graphs and rewrite the papers. It would also be a problem if the work had been done by different researchers. You are concerned about there seeming to be 3 “studies” when there is really only 1. That is an important issue. Addressing that problem involves some way of tallying how many studies are using exactly the same data and measures in assessing the state of an empirical literature.

        Let me be clear: I also share your moral outrage at publishing multiple articles that are pretty much cut and paste versions of each other. I just want to be able to talk clearly about which problem is which.

        Plagiarism also is a thing. It happens often when people rip off the idea from a conference or working paper by a grad student and get the article into print while the grad student is still learning the publishing ropes. Or when someone steals a research project from a low-visibility journal and recycles it. Having a sociology arXiv ethos would help that, I think.

  3. Given the author’s response, this whole debate seems ridiculous to me. Did no one really read the three articles in question? If you did, how can you talk about the “same results being published in multiple places”?

    To me, when looking at the three articles, it becomes very clear that there was no deception or ethical breach, and in fact the only charge that really sticks is the one about having no reference to the graph. But this strikes me as quite technical and negligible, given that this graph is no more than the starting point of the latter two articles and not an old finding masquerading as a new one.

    Given that this is the case, I’m puzzled by Cohen’s clarification that he stands by what he writes here. In my reading, his post includes some pretty serious accusations of dishonesty (selling duplicative information as new and misleading readers), which I would like to see him retract (although the damage is probably already done).

    This looks to me like Cohen was trying to make an important point about open science. But he then picks a victim (and one that is probably not very suitable to make this point) and goes on an unsubstantiated (and largely false) attack against a specific research project. In the process he exaggerates minor issues just to get the point across. This doesn’t seem like very collegial behaviour from an academic.

    Finally, there is one point in Shor’s reply that I find especially valid and troubling, and I do not understand why it is not addressed at all in the debate: the practice of making accusations of an ethical nature without approaching the authors and asking them to present their own side. I find this very distasteful, and I think Dr. Cohen owes the readers of this blog some clarification about why he feels he can do that when people’s good name is on the line.

Comments welcome (may be moderated)