Tag Archives: peer review

No paper, no news (#NoPaperNoNews)


In the abstract, the missions of science and science reporting align. But in the market arena they both have incentives to cheat, stretch, and rush. Members of the two groups sometimes have joint interests in pumping up research findings. Reporters feel pressure to get scoops on cutting-edge research, research they want to appear important as well as true — so they may want to avoid a pack of whining, jealous tweed-wearers seen as more hindrance than help. And researchers (and their press offices) want splashy, positive coverage of their discoveries that isn’t bogged down by the objections of all those whining, jealous tweed-wearers either.

Despite some bad incentives, the alliance between good researchers and good reporters may be growing stronger these days, with the potential to help stem the daily tide of ridiculous stories. Partly due to social media interaction, it’s become easier for researchers to ping reporters directly about their research, or about a problem with a story; and it’s become easier for reporters to find and contact researchers whose work they want to cover, or to get comment and analysis on research they’re covering. The result is an increase in research reporting that is skeptical and exploratory rather than just exuberant or exaggerated. Some of this rapid interaction between expert researchers and expert reporters, in fact, operates as a layer of improved peer review, subjecting potentially important research to more intense vetting at just the right moment.

Those of us in these relationships who want to do the right thing really do need each other. And one way to help is to encourage the development of prosocial norms and best practices. To that end, I think we should agree on a No Paper No News pact. Let’s pledge:

  • If you are a researcher or university press office and you want your research covered, free up the paper — and insist that news coverage link to it. Get the journal to open up a copy, or post a preprint somewhere like SocArXiv.
  • If you are a reporter or editor, and you want to cover new research, insist that the researcher, university, or journal provide open access to its content — then link to it.
  • If you are a consumer of science or research reporting, and you want to evaluate news coverage, look for a clear link to an open access copy of the paper. If you don’t see one, flag it with the #NoPaperNoNews tag, and pressure the news/research collaborators to comply with this basic best practice.

This is not an extremist approach. I’m not saying we must require complete open access to all research (something I would like to see, of course). And this is not dissing the peer review process, which, although seriously flawed in its application, is basically a good idea. But peer review is nothing like a guarantee that research is good, and it’s even less a guarantee that research as translated through a news release and then a reporter and an editor is reliable and responsible. #NoPaperNoNews recognizes that when research enters the public arena through the news media, it may become important in unanticipated ways, and it may be subject to more irresponsible uses, misunderstandings, and exploitation. Providing direct access to the research product itself makes it possible for concerned people to get involved and speak up if something is going wrong. It also enhances the positive impact of the research reporting, which is great when the research is good.

Plenty of reporters, editors, researchers, and universities practice some version of this, but it’s inconsistent. For example, the American Sociological Association currently has a news release up about a paper in the American Sociological Review by Paula England, Jonathan Bearak, Michelle Budig, and Melissa Hodges. And, as is now usually the case, that paper was selected by the ASR editors to be the freebie of the month, so it’s freely available. But the news release (which also lists only England as an author) doesn’t link to the paper. Some news reports link to the free copy but some don’t. ASA could easily add boilerplate language to its news releases, firmly suggesting that coverage link to the original paper, which is freely available.

Some publishers support this kind of approach, laying out free copies of breaking news research. But some don’t. In those cases, reporters and researchers can work together to make preprint versions available. In the social sciences, you can easily and immediately put a preprint on SocArXiv and add the link to the news report (to see which version you are free to post — pre-review, post-review, pre-edit, post-edit, etc. — consult your author agreement or look up the journal in the Sherpa/Romeo database.)

This practice is easy to enforce because it’s simple and technologically trivial. When a New York Times reporter says, “I’d love to cover this research. Just tell me where I can link to the paper,” most researchers, universities, and publishers will jump to accommodate them. The only people who will want to block it are bad actors: people who don’t want their research scrutinized, reporters who don’t want to be double-checked, and publishers who prioritize income over the public good.

#NoPaperNoNews


Filed under In the news

’16 and Pregnant’ and less so


From Flickr/CC: https://flic.kr/p/6dcJgA

Regular readers know I have objections to framing teen pregnancy, as a phenomenon generally and as a problem specifically, as something separate from the overall rising age at childbearing (see also, or follow the teen births tag).

In this debate, one economic analysis of the effect of the popular MTV show 16 and Pregnant has played an outsized role. Melissa Kearney and Phillip Levine showed that there was more decline in teen births in places where the show was popular, and attempted to establish that the relationship was causal — that the show made people under age 20 want to have babies less. As Kearney put it in a video promoting the study: “the portrayal of teen pregnancy, and teen childbearing, is something they took as a cautionary tale.” (The paper also showed spikes in Twitter and Google activity related to birth control after the show aired.)

This was very big news for the marriage promotion people, because it was taken as evidence that cultural intervention “works” to affect family behavior — which really matters because so far they’ve spent $1 billion+ in welfare money on promoting marriage, with no effect (none), and they want more money.

The 16 and Pregnant paper has been cited to support statements such as:

  • Brad Wilcox: “Campaigns against smoking and teenage and unintended pregnancy have demonstrated that sustained efforts to change behavior can work.”
  • Washington Post: “By working with Hollywood to develop smart story lines on popular shows such as MTV’s ’16 and Pregnant’ and using innovative videos and social media to change norms, the [National Campaign to Prevent Teen and Unplanned Pregnancy] has helped teen pregnancy rates drop by nearly 60 percent since 1991.”
  • Boston Globe: “As evidence of his optimism, [Brad] Wilcox points to teen pregnancy, which has dropped by more than 50 percent since the early 1990s. ‘Most people assumed you couldn’t do much around something related to sex and pregnancy and parenthood,’ he said. ‘Then a consensus emerged across right and left, and that consensus was supported by public policy and social norms. . . . We were able to move the dial.’ A 2014 paper found that the popular MTV reality show ’16 and Pregnant’ alone was responsible for a 5.7 percent decline in teen pregnancy in the 18 months after its debut.”

I think a higher age at first birth is better for women overall, health permitting, but I don’t support that as a policy goal in the U.S. now, although I expect it would be an outcome of things I do support, like better health, education, and job opportunities for people of color and people who are poor.

Anyway, this is all just preamble to a new debate over a reanalysis and critique of the 16 and Pregnant paper. I haven’t worked through it enough to reach my own conclusions, and I’d like to hear from others who have. So I’m just sharing the links in sequence.

The initial paper, posted as a (non-peer reviewed) NBER Working Paper in 2014:

Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing, by Melissa S. Kearney, Phillip B. Levine

This paper explores how specific media images affect adolescent attitudes and outcomes. The specific context examined is the widely viewed MTV franchise, 16 and Pregnant, a series of reality TV shows including the Teen Mom sequels, which follow the lives of pregnant teenagers during the end of their pregnancy and early days of motherhood. We investigate whether the show influenced teens’ interest in contraceptive use or abortion, and whether it ultimately altered teen childbearing outcomes. We use data from Google Trends and Twitter to document changes in searches and tweets resulting from the show, Nielsen ratings data to capture geographic variation in viewership, and Vital Statistics birth data to measure changes in teen birth rates. We find that 16 and Pregnant led to more searches and tweets regarding birth control and abortion, and ultimately led to a 5.7 percent reduction in teen births in the 18 months following its introduction. This accounts for around one-third of the overall decline in teen births in the United States during that period.

A revised version, with the same title but slightly different results, was then published in the top-ranked American Economic Review, which is peer-reviewed:

This paper explores the impact of the introduction of the widely viewed MTV reality show 16 and Pregnant on teen childbearing. Our main analysis relates geographic variation in changes in teen childbearing rates to viewership of the show. We implement an instrumental variables (IV) strategy using local area MTV ratings data from a pre-period to predict local area 16 and Pregnant ratings. The results imply that this show led to a 4.3 percent reduction in teen births. An examination of Google Trends and Twitter data suggest that the show led to increased interest in contraceptive use and abortion.

Then last month David A. Jaeger, Theodore J. Joyce, and Robert Kaestner posted a critique in the Institute for the Study of Labor (IZA) discussion paper series, which is not peer-reviewed:

Does Reality TV Induce Real Effects? On the Questionable Association Between 16 and Pregnant and Teenage Childbearing

We reassess recent and widely reported evidence that the MTV program 16 and Pregnant played a major role in reducing teen birth rates in the U.S. since it began broadcasting in 2009 (Kearney and Levine, American Economic Review 2015). We find Kearney and Levine’s identification strategy to be problematic. Through a series of placebo and other tests, we show that the exclusion restriction of their instrumental variables approach is not valid and find that the assumption of common trends in birth rates between low and high MTV-watching areas is not met. We also reassess Kearney and Levine’s evidence from social media and show that it is fragile and highly sensitive to the choice of included periods and to the use of weights. We conclude that Kearney and Levine’s results are uninformative about the effect of 16 and Pregnant on teen birth rates.

And now Kearney and Levine have posted their response on the same site:

Does Reality TV Induce Real Effects? A Response to Jaeger, Joyce, and Kaestner (2016)

This paper presents a response to Jaeger, Joyce, and Kaestner’s (JJK) recent critique (IZA Discussion Paper No. 10317) of our 2015 paper “Media Influences on Social Outcomes: The Impact of MTV’s 16 and Pregnant on Teen Childbearing.” In terms of replication, those authors are able to confirm every result in our paper. In terms of reassessment, the substance of their critique rests on the claim that the parallel trends assumption, necessary to attribute causation to our findings, is not satisfied. We present three main responses: (1) there is no evidence of a parallel trends assumption violation during our sample window of 2005 through 2010; (2) the finding of a false placebo test result during one particular earlier window of time does not invalidate the finding of a discrete break in trend at the time of the show’s introduction; (3) the results of our analysis are robust to virtually all alternative econometric specifications and sample windows that JJK consider. We conclude that this critique does not pose a serious threat to the interpretation of our 2015 findings. We maintain the position that our earlier paper is informative about the causal effect of 16 and Pregnant on teen birth rates.

So?

There are interesting methodological questions here. It’s hard to identify the effects of interventions that are swimming with the tide of change. In fact, the creation of the show, the show’s popularity, the campaign to end teen pregnancy, and the rising age at first birth may all be outcomes of the same general historical trend. So I’m not that invested in the answer to this question, though I am very interested.

There are also questions about the publication process, which I am very invested in. That’s why I work to promote a working paper culture among sociologists (through the SocArXiv project). The original paper was posted on a working paper site without peer review, but NBER is for economists who already are somebody, so that’s a kind of indirect screening. Then it was accepted in a top peer-reviewed journal (somewhat revised), but that was after it had received major attention and accolades, including a New York Times feature before the working paper was even released and a column devoted to it by Nicholas Kristof.

So is this a success story of working paper culture gone right — driving attention to good work faster, and then also drawing the benefits of peer review through the traditional publication process (and now continuing with open debate on non-gated sites)? Or is it a case of political hype driving attention inside and outside the academy — the kind of thing that scares researchers and makes them want to retreat behind the slower, more process-laden research flow, which they hope will protect them from exposure to embarrassment and protect the public from manipulation by credulous news media? I think the process was okay even if we conclude the paper wasn’t all it was made out to be. There were other reputational systems at work — faculty status, NBER membership, New York Times editors and sources — that may be as reliable as traditional peer review, which itself produces plenty of errors.

So, it’s an interesting situation on three fronts: research methods, research implications, and research process.


Filed under Research reports

Perspective on sociology’s academic hierarchy and debate


Keep that gate. (Photo by Rob Nunn, https://flic.kr/p/4DbzCG)

It’s hard to describe the day I got my first acceptance to American Sociological Review. There was no social media back then so I have no record of my reaction, but I remember it as the day — actually, the moment, as the conditional acceptance slipped out of the fax machine — that I learned I was getting tenure, that I would have my dream job for the rest of my life, with a personal income in the top 10 percent of the country for a 9-month annual commitment. At that moment I was not inclined to dwell on the flaws in our publishing system, its arbitrary qualities, or the extreme status hierarchy it helps to construct.

In a recent year ASR considered more than 700 submitted articles and rejected 90% or more of them (depending on how you count). Although many people dispute the rationality of this distinction, publishing in our association’s flagship journal remains the most universally agreed-upon indicator of scholarship quality. And it is rare. I randomly sampled 50 full-time sociology faculty listed in the 2016 ASA Guide to Graduate Departments of Sociology (working in the U.S. and Canada), and found that 9, or 18%, had ever published a research article in ASR.
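That 18 percent comes with wide error bars, given the sample of 50. Here is a quick check of the exact binomial interval in Stata (the syntax below is for current versions; in Stata 13 and earlier it is plain cii 50 9):

* 9 of 50 randomly sampled faculty had ever published in ASR
cii proportions 50 9

The 95 percent interval runs from roughly 9 to 31 percent — rare however you cut it.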

Not only is it rare, but publication in ASR is highly concentrated in high-status departments (and individuals). While many departments have no faculty who have published in ASR (I didn’t count these, but there are a lot), some departments are brimming with them. In my own, second-tier department, I count 16 of 27 faculty with publications in ASR (59%), while at a top-tier, article-oriented department such as the University of North Carolina at Chapel Hill (where I used to work), 19 of the 25 regular faculty, or 76%, have published in ASR (many of them multiple times).

Without diminishing my own accomplishment (or that of my co-authors), or the privilege that got me here, I should be clear that I don’t think publication in high-status journals is a good way to identify and reward scholarly accomplishment and productivity. The reviews and publication decisions are too uneven (although obviously not completely uncorrelated with quality), and the limit on articles published is completely arbitrary in an era in which the print journal, with its cost-determined page limit, is simply ridiculous.

We have a system that is hierarchical, exclusive, and often arbitrary — and the rewards it doles out are both large and highly concentrated.

I say all this to put in perspective the grief I have gotten for publicly criticizing an article published in ASR. In that post, I specifically did not invoke ethical violations or speculate on the motivations or non-public behavior of the authors, about whom I know nothing. I commented on the flaws in the product, not the process. And yet a number of academic critics responded vociferously to what they perceive as the threats this commentary posed to the academic careers and integrity of the authors whose work I discussed. Anonymous critics called my post “obnoxious, childish, time wasting, self promoting,” and urged sociologists to “shun” me. I have been accused of embarking on a “vigilante mission.” In private, a Jewish correspondent referred me to the injunction in Leviticus against malicious gossip in an implicit critique of my Jewish ethics.*

In the 2,500-word response I published on my site — immediately and unedited — I was accused of lacking “basic decency” for not giving the authors a chance to prepare a response before I posted the criticism on my blog. The “commonly accepted way” when “one scholar wishes to criticize the published work of another,” I was told, is to go through a process of submitting a “comment” to the journal that published the original work, which “solicits a response from the authors who are being criticized,” and it’s all published together, generally years later. (Never mind that journals have no obligation or particular inclination to publish such debates, as I have reported on previously, when ASR declined for reasons of “space” to publish a comment pointing out errors that were not disputed by the editors.)

This desire to maintain gatekeepers to police and moderate our discussion of public work is not only quaint, it is corrosive. Despite pointing out uncomfortable facts (which my rabbinical correspondent referred to as the “sin of true speech for wrongful purpose”), my criticism was polite, reasoned, and documented — within the bounds of what would be considered highly civil discourse in any arena other than academia, apparently. Why are the people whose intellectual work is most protected most afraid of intellectual criticism?

In Christian Smith’s book, The Sacred Project of American Sociology (reviewed here), which was terrible, he complains explicitly about the decline of academic civilization’s gatekeepers:

The Internet has created a whole new means by which the traditional double-blind peer-review system may be and already is in some ways, I believe, being undermined. I am referring here to the spate of new sociology blogs that have sprung up in recent years in which handfuls of sociologists publicly comment upon and often criticize published works in the discipline. The commentary published on these blogs operates outside of the gatekeeping systems of traditional peer review. All it takes to make that happen is for one or more scholars who want to amplify their opinions into the blogosphere to set up their own blogs and start writing.

Note he is complaining about people criticizing published work, yet believes such criticism undermines the blind peer-review system. This fear is not rational. The terror over public discussion and debate — perhaps especially among the high-status sociologists who happen to also be the current gatekeepers — probably goes a long way toward explaining our discipline’s pitiful response to the crisis of academic publishing. According to my (paywalled) edition of the Oxford English Dictionary, the definition of “publish” is “to make public.” And yet to hear these protests you would think the whisper of a public comment poses an existential threat to the very people who have built their entire profession around publishing (though, to be consistent, it’s mostly hidden from the public behind paywalls).

This same fear leads many academics to insist on anonymity even in normal civil debates over research and our profession. Of course there are risks, as there tend to be when people make important decisions about things that matter. But at some point, the fear of repression for expressing our views (which is legitimate in some rare circumstances) starts looking more like avoidance of the inconvenience or discomfort of having to stand behind our words. If academics are really going to lose their jobs for getting caught saying, “Hey, I think you were too harsh on that paper,” then we are definitely having the wrong argument.

“After all,” wrote Eran Shor, “this is not merely a matter of academic disagreements; people’s careers and reputations are at stake.” Of course, everyone wants to protect their reputation — and everyone’s reputation is always at stake. But let’s keep this in perspective. For those of us at or near the top of this prestige hierarchy — tenured faculty at research universities — damage to our reputations generally poses a threat only within a very narrow bound of extreme privilege. If my reputation were seriously damaged, I would certainly lose some of the perks of my job. But the penalty would also include a decline in students to advise, committees to serve on, and journals to edit — and no change in that lifetime job security with a top-10% salary for a 9-month commitment. Of course, for those of us whose research really is that important, anything that harms our ability to work in exactly the way that we want to has costs that simply cannot be measured. I wouldn’t know about that.

But if we want the high privilege of an academic career — and if we want a discipline that can survive under scrutiny from an increasingly impatient public and deepening market penetration — we’re going to have to be willing to defend it.

* I think if random Muslims have to denounce ISIS then Jews who cite Leviticus on morals should have to explain whether — despite the obvious ethical merit to some of those commands — they also support the killing of animals just because they have been raped by humans.


Filed under Uncategorized

Journal self-citation practices revealed

I have written a few times about problems with peer review and publishing.* My own experience subsequently led me to the problem of coercive self-citation, defined in one study as “a request from an editor to add more citations from the editor’s journal for reasons that were not based on content.” I asked readers to send me documentation of their experiences so we could air them out. This is the result.

Introduction

First let me mention a new editorial in the journal Research Policy about the practices editors use to inflate the Journal Impact Factor (JIF), a measure of citations that many people use to compare journal quality or prestige. One of those practices is coercive self-citation. The author of that editorial, Ben Martin, approvingly cites a statement signed by a group of management and organizational studies editors:

I will refrain from encouraging authors to cite my journal, or those of my colleagues, unless the papers suggested are pertinent to specific issues raised within the context of the review. In other words, it should never be a requirement to cite papers from a particular journal unless the work is directly relevant and germane to the scientific conversation of the paper itself. I acknowledge that any blanket request to cite a particular journal, as well as the suggestion of citations without a clear explanation of how the additions address a specific gap in the paper, is coercive and unethical.

So that’s the gist of the issue. However, it’s not that easy to define coercive self-citation. In fact, we’re not doing a very good job of policing journal ethics in general, basically relying on weak enforcement of informal community standards. I’m not an expert on norms, but it seems to me that when you have strong material interests — big corporations using journals to print money at will, people desperate for academic promotions and job security, etc. — and little public scrutiny, it’s hard to regulate unethical behavior informally through norms.

The clearest cases involve editors asking for self-citations (a) before final acceptance, (b) to articles published within the last two years, and (c) without substantive reason. But there is a lot short of that to object to as well. Martin suggests that, to answer whether a practice is ethical, we need to ask: “Would I, as editor, feel embarrassed if my activities came to light and would I therefore object if I was publicly named?” (Or, as my friend Matt Huffman used to say when the used-textbook buyers came around offering us cash for books we hadn’t paid for: how would it look in grainy hidden-camera footage?) I think that journal practices, which are generally very opaque, should be exposed to public view so that unethical or questionable practices can be held up to community standards.
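For reference, the two-year impact factor that all of this maneuvering targets is just a citation ratio; in the standard Web of Science definition (my notation):

JIF(Y) = (citations received in year Y by items published in years Y-1 and Y-2) / (citable items published in years Y-1 and Y-2)

Only the two preceding years enter the numerator and denominator, which is why citation requests like those below focus so insistently on recent articles.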

Reports and responses

I received reports from about a dozen journals, but a few could not be verified or were too vague. These 10 were included under very broad criteria — I know that not everyone will agree that these practices are unethical, and I’m unsure where to draw the line myself. In each case below I asked the current editor if they would care to respond to the complaint, doing my best to give the editor enough information without exposing the identity of the informant.

Here, in no particular order, are excerpts of correspondence from editors, with responses to me from the current editors, if any. Some details, including dates, may have been changed to protect informants. I am grateful to the informants who wrote, and I urge anyone who knows, or thinks they know, who the informants are not to punish them for speaking up.

Journal of Social and Personal Relationships (2014-2015 period)

Congratulations on your manuscript “X” having been accepted for publication in Journal of Social and Personal Relationships. … your manuscript is now “in press” … The purpose of this message is to inform you of the production process and to clarify your role in the process …

IMPORTANT NOTICE:

As you update your manuscript:

1. CITATIONS – Remember to look for relevant and recent JSPR articles to cite. As you are probably aware, the ‘quality’ of a journal is increasingly defined by the “impact factor” reported in the Journal Citation Reports (from the Web of Science). The impact factor represents a ratio of the number of times that JSPR articles are cited divided by the number of JSPR articles published. Therefore, the 20XX ratings will focus (in part) on the number of times that JSPR articles published in 20XX and 20XX are cited during the 20XX publication year. So citing recent JSPR articles from 20XX and 20XX will improve our ranking on this particular ‘measure’ of quality (and, consequently, influence how others view the journal). Of course only cite those articles relevant to the point. You can find tables of contents for the past two years at…

Response from editor Geoff MacDonald:

Thanks for your email, and for bringing that to my attention. I agree that encouraging self-citation is inappropriate and I have just taken steps to make sure it won’t happen at JSPR again.

Sex Roles (2011-2013 period)

In addition to my own report, already posted, I received an identical report from another informant. The editor, Irene Frieze, wrote: “If possible, either in this section or later in the Introduction, note how your work builds on other studies published in our journal.”

Response from incoming editor Janice D. Yoder:

As outgoing editor of Psychology of Women Quarterly and as incoming editor of Sex Roles, I have not, and would not, as policy require that authors cite papers published in the journal to which they are submitting.

I have recommended, and likely will continue to recommend, papers to authors that I think may be relevant to their work, but without any requirement to cite those papers. I try to be clear that it is in this spirit of building on existing scholarship that I make these recommendations and to make the decision of whether or not to cite them up to the author. As an editor who has decision-making power, I know that my recommendations can be interpreted as requirements (or a wise path to follow for authors eager to publish) but I can say that I have not further pressured an author whose revision fails to cite a paper I recommended.

I also have referred to authors’ reference lists as a further indication that a paper’s content is not appropriate for the journal I edit. Although never the sole indicator and never based only on citations to the specific journal I edit, if a paper is framed without any reference to the existing literature across journals in the field then it is a sign to me that the authors should seek a different venue.

I value the concerns that have been raised here, and I certainly would be open to ideas to better guide my own practices.

European Sociological Review (2013)

In a decision letter notifying the author of a minor revise-and-resubmit, the editor wrote that the author had left out of the references some recent, unspecified, publications in ESR and elsewhere (also unspecified) and suggested the author update the references.

Response from editor Melinda Mills:

I welcome the debate about academic publishing in general, scrutiny of impact factors and specifically of editorial practices.  Given the importance of publishing in our profession, I find it surprising how little is actually known about the ‘black box’ processes within academic journals and I applaud the push for more transparency and scrutiny in general about the review and publication process.  Norms and practices in academic journals appear to be rapidly changing at the moment, with journals at the forefront of innovation taking radically different positions on editorial practices. The European Sociological Review (ESR) engages in rigorous peer review and most authors agree that it strengthens their work. But there are also new emerging models such as Sociological Science that give greater discretion to editors and focus on rapid publication. I agree with Cohen that this debate is necessary and would be beneficial to the field as a whole.

It is not a secret that the review and revision process can be a long (and winding) road, both at ESR and most sociology journals. If we go through the average timeline, it generally takes around 90 days for the first decision, followed by authors often taking up to six months to resubmit the revision. This is then often followed by a second (and sometimes third) round of reviews and revision, which in the end leaves us at ten to twelve months from original submission to acceptance. My own experience as an academic publishing on other journals is that it can regularly exceed one year. During the year under peer review and revisions, relevant articles have often been published.  Surprisingly, few authors actually update their references or take into account new literature that was published after the initial submission. Perhaps this is understandable, since authors have no incentive to implement any changes that are not directly requested by reviewers.

When there has been a particularly protracted peer review process, I sometimes remind authors to update their literature review and take into account more recent publications, not only in ESR but also elsewhere.  I believe that this benefits both authors, by giving them greater flexibility in revising their manuscripts, and readers, by providing them with more up-to-date articles.  To be clear, it is certainly not the policy of the journal to coerce authors to self-cite ESR or any other outlets.  It is vital to note that we have never rejected an article where the authors have not taken the advice or opportunity to update their references and this is not a formal policy of ESR or its Editors.  If authors feel that nothing has happened in their field of research in the last year that is their own prerogative.  As authors will note, with a good justification they can – and often do – refuse to make certain substantive revisions, which is a core fundament of academic freedom.

Perhaps a more crucial part of this debate is the use and prominence of journal impact factors themselves both within our discipline and how we compare to other disciplines. In many countries there is a move to use these metrics to distribute financing to Universities, increasing the stakes of these metrics. It is important to have some sort of metric gauge of the quality and impact of our publications and discipline. But we also know that different bibliometric tools have the tendency to produce different answers and that sociology fares relatively worse in comparison to other disciplines. Conversely, leaving evaluation of research largely weighted by peer review can produce even more skewed interpretations if the peer evaluators do not represent an international view of the discipline. Metrics and internationally recognized peer reviewers would seem the most sensible mix.

Work and Occupations (2010-2011 period)

“I would like to accept your paper for publication on the condition that you address successfully reviewer X’s comments and the following:

2. The bibliography needs to be updated somewhat … . Consider citing, however critically, the following Work and Occupations articles on the italicized themes:

[concept: four W&O papers, three from the previous two years]

[concept: two W&O papers from the previous two years]

The current editor, Dan Cornfield, thanked me and chose not to respond for publication.

Sociological Forum (2014-2015 period)

I am pleased to inform you that your article … is going to press. …

In recent years, we published an article that is relevant to this essay and I would like to cite it here. I have worked it in as follows: [excerpt]

Most authors find this a helpful step as it links their work into an ongoing discourse, and thus, raises the visibility of their article.

Response from editor Karen Cerulo:

I have been editing Sociological Forum since 2007. I have processed close to 2500 submissions and have published close to 400 articles. During that time, I have never insisted that an author cite articles from our journal. However, during the production process–when an article has been accepted and I am preparing the manuscript for the publisher–I do sometimes point out to authors Sociological Forum pieces directly relevant to their article. I send authors the full citation along with a suggestion as to where the citation be discussed or noted. I also suggest changes to key words and article abstracts. My editorial board is fully aware of this strategy. We have discussed it at many of our editorial board meetings and I have received full support for this approach. I can say, unequivocally, that I do not insist that citations be added. And since the manuscripts are already accepted, there is no coercion involved. I think it is important that you note that on any blog post related to Sociological Forum.

I cannot tell you how often an author sends me a cover letter with their submission telling me that Sociological Forum is the perfect journal for their research because of related ongoing dialogues in our pages. Yet, in many of these cases, the authors fail to reference the relevant dialogues via citations. Perhaps editors are most familiar with the debates and streams of thought currently unfolding in a journal. Thus, I believe it is my job as editor and my duty to both authors and the journal to suggest that authors consider making appropriate connections.

Unnamed journal (2014)

An article was desk-rejected — that is, rejected without being sent out for peer review — with only this explanation: “In light of the appropriateness of your manuscript for our journal, your manuscript has been denied publication in X.” When the author asked for more information, a journal staff member responded with possible reasons, including that the paper did not include any references to articles in that journal. In my view the article was clearly within the subject area of the journal. I didn’t name the journal here because this wasn’t an official editor’s decision letter, and the correspondence only suggested that might be the reason for the rejection.

Sociological Quarterly (2014-2015 period)

In a revise and resubmit decision letter:

Finally, as a favor to us, please take a few moments to review back issues of TSQ to make sure that you have cited any relevant previously published work from our journal. Since our ISI Impact Factor is determined by citations, we would like to make sure papers under consideration by the journal are referring to scholarship we have previously supported.

The current editors, Lisa Waldner and Betty Dobratz, have not yet responded.

Canadian Review of Sociology (2014-2015 period)

In a letter communicating acceptance conditional on minor changes, the editor asked the author to consider citing “additional Canadian Review of Sociology articles” to “help with the journal’s visibility.”

Response from current editor Rima Wilkes:

In the case you cite, the author got a fair review and received editorial comments at the final stages of correction. The request to add a few citations to the journal was not “coercive” because in no instance was it a condition of the paper either being reviewed or published.

Many authors are aware of, and make some attempt to cite the journal to which they are submitting prior to submission and specifically target those journals and to contribute to academic debate in them.

Major publications in the discipline, such as ASR, or academia more generally, such as Science, almost never publish articles that have no reference to debates in them.

Bigger journals are in the fortunate position of having authors submit articles that engage with debates in their own journal. Interestingly, the auto-citation patterns in those journals are seen as “natural” rather than “coerced”. Smaller journals are more likely to get submissions with no citations to that journal and this is the case for a large share of the articles that we receive.

Journals exist within a larger institutional structure that has certain demands. Perhaps the author who complained to you might want to reflect on what it says about their article and its potential future if they and other authors like them do not engage with their own work.

Social Science Research (2015)

At the end of a revise-and-resubmit memo, under “Comment from the Editor,” the author was asked to include “relevant citations from Social Science Research,” with none specified.

The current editor, Stephanie Moller, has not yet responded.

City & Community (2013)

In an acceptance letter, the author was asked to approve several changes made to the manuscript. One of the changes, made to make the paper more conversant with the “relevant literature,” added a sentence with several references, one or more of which were to City & Community papers not previously included.

One of the current co-editors, Sudhir Venkatesh, declined to comment because the correspondence occurred before the current editorial team’s tenure began.

Discussion

The Journal Impact Factor (JIF) is an especially dysfunctional part of our status-obsessed scholarly communication system. Self-citation is only one issue, but it’s a substantial one. I looked at 116 journals classified as sociology in 2014 by Web of Science (which produces the JIF), excluding some misplaced and non-English journals. WoS helpfully also offers a list excluding self-citations, but normal JIF rankings do not make this exclusion. (I put the list here.) On average, removing self-citations reduces the JIF by 14%. But there is a lot of variation. One would expect specialty journals to have high self-citation counts because the work they publish is closely related. Thus Armed Forces and Society has a 31% self-citation rate, and Work & Occupations is also high (25%). But others, like Gender & Society (13%) and Journal of Marriage and Family (15%), are not. On the other hand, you would expect high-visibility journals to have high self-citation rates, if they publish better, more important work; but on this list the correlation between JIF and self-citation rate is -.25. Here is that relationship for the top 50 journals by JIF, with the top four by self-citation labeled (the three top-JIF journals at bottom-right are American Journal of Sociology, Annual Review of Sociology, and American Sociological Review).

[Scatterplot: Journal Impact Factor vs. self-citation rate for the top 50 sociology journals]
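Reproducing those two numbers from the WoS export takes just a few lines. A minimal sketch in Stata, assuming a hypothetical file with each journal’s published impact factor (jif) and the self-citation-excluded version (jif_noself):

use journals, clear
* share of the impact factor attributable to journal self-citation
gen selfrate = (jif - jif_noself) / jif
summarize selfrate
* mean(selfrate) is about .14 in the 2014 sociology list
pwcorr jif selfrate
* the correlation is about -.25 across these journals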

The top four self-citers are low-JIF journals. Two of them are mentioned above, but I have no idea what role self-citation encouragement plays in that. There are other weird distortions in JIFs that may or may not be intentional. Consider the June 2015 issue of Sociological Forum, which includes a special section, “Commemorating the Fiftieth Anniversary of the Civil Rights Laws.” That issue, just a few months old, as of yesterday includes the 9 most-cited articles that the journal published in the last two years. In fact, these 9 pieces have all been cited 9 times, all by each other — and each article currently has the designation of “Highly Cited Paper” from Web of Science (with a little trophy icon). The December 2014 issue of the same journal also gave itself an immediate 24 self-citations for a special “forum” feature. I am not suggesting the journal runs these forum discussion features to pump up its JIF, and I have nothing bad to say about their content — what’s wrong with a symposium-style feature in which the authors respond to each other’s work? But these cases illustrate what’s wrong with using citation counts to rank journals. As Martin’s piece explains, the JIF is highly susceptible to manipulation beyond self-citation promotion, for example by tinkering with the pre-publication queue of online articles, publishing editorial review essays, and of course outright fraud.

Anyway, my opinion is that journal editors should never add or request additional citations without clearly stated substantive reasons related to the content of the research and unrelated to the journal in which they are published. I realize that reasonable people disagree about this — and I encourage readers to respond in the comments below. I also hope that any editor would be willing to publicly stand by their practices, and I urge editors and journal management to let authors and readers see what they’re doing as much as possible.

However, I also think our whole journal system is pretty irreparably broken, so I put limited stock in the idea of improving its operation. My preference is to (1) fire the commercial publishers; (2) make research publication open-access with a very low bar for publication; (3) create an organized system of post-publication review to evaluate research quality; and (4) have professional associations republish or label work to promote what’s most important.

* Some relevant posts cover long review delays for little benefit; the problem of very similar publications; the harm to science done by arbitrary print-page limits; gender segregation in journal hierarchies; and how easy it is to fake data.


Filed under Uncategorized

Stop me before I fake again

In light of the news on social science fraud, I thought it was a good time to report on an experiment I did. I realize my results are startling, and I welcome the bright light of scrutiny that such findings might now attract.

The following information is fake.

An employee training program in a major city promises basic job skills as well as job search assistance for people with a high school degree and no further education, ages 23-52 in 2012. Due to an unusual staffing practice, new applications were, for a period in 2012, allocated at random to one of two caseworkers. One provided the basic services promised but nothing extra. The other embellished his services with extensive coaching on such “soft skills” as “mainstream” speech patterns, appropriate dress for the workplace, and a hard work ethic, among other elements. The program surveyed the participants in 2014 to see what their earnings were in the previous 12 months. The data provided to me does not include any information on response rates, or any information about those who did not respond. And it only includes participants who were employed at least part-time in 2014. Fortunately, the program also recorded which staff member each participant was assigned to.

Since this provides such an excellent opportunity for studying the effects of soft skills training, I think it’s worth publishing despite these obvious weaknesses. To help with the data collection and analysis, I got a grant from Big Neoliberal, a non-partisan foundation.

The data includes 1040 participants, 500 of whom had the bare-bones service and 540 of whom had the soft-skills add-on, which I refer to as the “treatment.” These are the descriptive statistics:

[Table: descriptive statistics for the treatment and control groups]

As you can see, the treatment group had higher earnings in 2014. The difference in logged annual earnings between the two groups is statistically significant. Here are the regression results:

[Table: OLS regression results, Models 1-3]

As you can see in Model 1, the Black workers in 2014 earned significantly less than the White workers. This gap of .15 logged earnings points, or about 15%, is consistent with previous research on the race wage gap among high school graduates. Model 2 shows that the treatment training apparently was effective, raising earnings about 11%. However, the interactions in Model 3 confirm that the benefits of the treatment were concentrated among the Black workers. The non-Black workers did not receive a significant benefit, and the treatment effect among Black workers basically wiped out the race gap.

The effects are illustrated, with predicted values, in this figure:

[Figure: predicted earnings by race and treatment status]

Soft skills are awesome.

I have put the data file, in Stata format, here.

Discussion

What would you do if you saw this in a paper or at a conference? Would you suspect it was fake? Why or why not?

I confess I never seriously thought of faking a research study before. In my day coming up in sociology, people didn’t share code and datasets much (it was never compulsory). I always figured if someone was faking they were just changing the numbers on their tables to look better. I assumed this happens to some unknown, and unknowable, extent.

So when I heard about the LaCour & Green scandal, I thought whoever did it was tremendously clever. But when I looked into it more, I thought it was not such rocket science. So I gave it a try.

Details

I downloaded a sample of adults 25-54 from the 2014 ACS via IPUMS, with annual earnings, education, age, sex, race and Hispanic origin. I set the sample parameters to meet the conditions above, and then I applied the treatment, like this:

First, I randomly selected the treatment group:

gen temp = runiform()
gen treatment=0
replace treatment = 1 if temp >= .5
drop temp

Then I generated the basic effect, and the Black interaction effect:

gen effect = rnormal(.08,.05)
gen beffect = rnormal(.15,.05)

Starting with the logged wage variable, lnwage, I added the basic effect to all the treated subjects:

gen newlnwage = lnwage
replace newlnwage = lnwage+effect if treatment==1

Then I added the Black interaction effect to the treated Black subjects, and subtracted it from the non-treated ones.

replace newlnwage = newlnwage+beffect if (treatment==1 & black==1)
replace newlnwage = newlnwage-beffect if (treatment==0 & black==1)

This isn’t ideal, but when I just added the effect I didn’t have a significant Black deficit in the baseline model, so that seemed fishy.

That’s it. I spent about 20 minutes trying different parameters for the fake effects, trying to get them to seem reasonable. The whole thing took about an hour (not counting the write-up).

I put the complete fake files here: code, data.
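To check that the planted effects come back out, you can rerun the models from the write-up above on the finished variable. A minimal sketch, using only the variables created in this post (the models in the write-up presumably also include demographic controls from the ACS extract):

* Model 2 analogue: average treatment effect, with the race gap
reg newlnwage treatment black
* Model 3 analogue: treatment effect concentrated among Black workers
reg newlnwage i.treatment##i.black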

Would I get caught for this? What are we going to do about this?

BUSTED UPDATE:

In the comments, ssgrad notices that if you exponentiate (unlog) the incomes, you get a funny list — some are binned at whole numbers, as you would expect from a survey of incomes, and some are random-looking and go out to multiple decimal places. For example, one person reports an even $25,000, and another supposedly reports $25251.37. This wouldn’t show up in the descriptive statistics, but is kind of obvious in a list. Here is a list of people with incomes between $20000 and $26000, broken down by race and treatment status. I rounded to whole numbers because even without the decimal points you can see that the only people who report normal incomes are non-Blacks in the non-treatment group. Busted!
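Reproducing ssgrad’s check takes two lines once you unlog the variable. A sketch, assuming the posted Stata file with the faked newlnwage variable (the file name here is a stand-in):

use "fake-data.dta", clear
* real survey incomes bunch at round numbers; the simulated ones don't
gen earnings = exp(newlnwage)
list treatment black earnings if inrange(earnings, 20000, 26000)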

[Table: incomes between $20,000 and $26,000, by race and treatment status]

So, that only took a day — with a crowd-sourced team of thousands of social scientists poring over the replication file. Faith in the system restored?


Filed under In the news, Research reports

My rejection of the National Marriage Project’s “Before ‘I Do'”

All day today, “The Decisive Marriage” has topped the New York Times most-emailed list. The piece is a Well Blog post, written by Tara Parker-Pope, covering a report published by the National Marriage Project and written by Galena Rhoades and Scott Stanley, “Before ‘I Do’: What Do Premarital Experiences Have to Do with Marital Quality Among Today’s Young Adults?”

I have frequently criticized the National Marriage Project, run by Bradford Wilcox (posts listed under this tag), and I ignore their work when I can. But this report is getting a lot of attention now and several people have asked my opinion. Since the research in the report has not been subject to peer review, and the Parker-Pope piece does not include any expert commentary from non-authors, I figured I’d structure this post like the peer review report I would dash off if I had been asked to review it (it’s a little different because I have access to the author and funding information, and I wouldn’t include links or graphics, but this is more or less how it would go).

Before “I Do”

This paper reports results from an original data collection which sampled 1,294 people in 2007/08, and then followed an unknown number of them for five years. The present paper reports on the marriage quality of 418 of the individuals who reported marrying over the period (ages 18-40). The authors provide no information on sample attrition or how it was handled in the analysis, or on the determinants of marriage within the sample. Although they claim (without evidence) that the sample was “reasonably representative of unmarried adults,” they note it is 65% female, so it’s obviously not representative. More importantly, the analysis sample is only those who married, which is highly select. Neither the sexual orientation of the respondents nor the gender composition of the couples is reported.

The outcome variable in the study is a reasonable measure of “marital quality” based on a four-item reduced-form version of the Dyadic Adjustment Scale (originally developed by Graham Spanier), which includes these items:

  • How often do you discuss or have you considered divorce, separation, or terminating your relationship?
  • In general, how often do you think that things between you and your partner are going well?
  • Do you confide in your mate?
  • Please circle the dot which best describes the degree of happiness, all things considered, of your relationship.

The authors provide no details on the coding of these items, but say the scale ranges from 0 to 21, and their sample included people who scored from 0 to 21. However, the mean was 16.5 and the standard deviation was 3.7, indicating a strong skew toward high scores. Inexplicably, for the presentation of results the authors dichotomize the dependent variable into those they classify as “higher quality,” the 40% of respondents who scored (19-21), versus everyone else (0-18). To defend this decision, the authors offer this non-explanation, which means exactly nothing:

This cut point was selected by inspection of the distribution. While it is somewhat arbitrary, we reasoned that these people are not just doing “above average” in their marriages, but are doing quite well.

The average marriage duration is not reported, but the maximum possible is 5 years, so we are talking about marriage quality very early in these marriages.

The main presentation of findings consists of bar graphs misleadingly labeled “Percent in Higher-Quality Marriages, by…” various independent variables. These are misleading because, according to the notes to these figures, “These percentages are adjusted for race/ethnicity, years of education, personal income, religiousness, and frequency of attendance at religious services.” Here is one:

[Bar graph from the report: adjusted percentages in “higher-quality” marriages]

The method for arriving at these “adjusted” percentages is not given. This apparently confused Parker-Pope, who reported them as unadjusted percentages, like this:

People who lived with another person before marrying also reported a lower-quality relationship. In that group, 35 percent had higher-quality marriages. Among those who had not lived with another romantic partner before marriage, 42 percent had higher-quality marriages.

The statistical significance of this difference is not reported. However, if this were a simple difference of proportions, the difference would not be statistically significant at conventional levels (with a sample of 418, 39% of whom lived with someone else before, the test for difference of proportions for .42 and .35 yields a z-score of 1.43, p=.15). The full report includes an appendix which says they used multilevel modeling, but the form of the regression is not specified. The regression table provided includes no fit statistics or variance components so the efficacy of the model cannot be evaluated.
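That difference-of-proportions check is easy to reproduce. A sketch in Stata, with group sizes approximated from the reported n of 418 and the 39 percent who had lived with a previous partner (163 and 255):

* two-sample test of proportions: .35 vs. .42 in "higher-quality" marriages
prtesti 163 .35 255 .42

The z statistic comes out at 1.43 in magnitude, with a two-sided p-value of about .15, matching the numbers above.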

Regression says: Adding 100 people to the wedding party 5 times would not equal the effect on marital quality of not being Black.

Much is made here (and in the Parker-Pope article about these findings) of the wedding-size effect. That is, among married couples, those who reported bigger weddings had higher average marriage quality. The mean wedding size was 117. In the regression model, each additional wedding guest was associated with an increase in marriage quality (on the 0-21 scale) of .005. That is, if this were a real effect, adding 100 wedding guests would increase marital quality by half a point, or less than 1/7 of a standard deviation. For comparison, in the model, the negative effect of being Black (-2.69) is more than five times greater than the effect of a 100-guest swing in wedding attendance. (The issue of effect size did not enter into Parker-Pope’s description of the results.)
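The effect-size arithmetic is worth spelling out; a quick check in Stata, using only the numbers reported above:

display 100 * .005   // 100 extra guests: +.5 points on the 0-21 scale
display .5 / 3.7     // about .14, or roughly 1/7 of a standard deviation
display 2.69 / .5    // the Black coefficient is ~5.4 times the 100-guest effect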

The possibility of nonlinear effects of wedding size or other variables is not discussed.

Are the results plausible?

It is definitely possible that, for example, less complicated relationship histories, or larger weddings, do contribute to marital happiness early in the marriage. The authors speculate, based on psychological research from the 1970s, that the “desire for consistency” means “having more witnesses at a wedding may actually strengthen marital quality.”

Sure. The much bigger issue, however, is two kinds of selection. The first, which they address — very poorly — concerns spurious effects. Thus, the simplest explanation is that (holding income constant) people with larger weddings simply had better relationships to begin with. Or, because personal income (not couple income — and note only one person from each couple was interviewed) is at best a very noisy indicator of resources available to couples, big weddings may simply proxy for wealthier families.

Or, regarding the finding that living with someone else prior to the current relationship is associated with poorer marriage quality: it may simply be that people who have trouble in relationships are more likely both to have lived with someone else and to have poor-quality marriages later. Cherlin et al. have reported, for example, that women with a history of sexual abuse are more likely to be in transitory relationships, including serial cohabiting relationships, so a history of abuse could account for some of these results. And so on.

The authors address this philosophically, which is all they can do given their data:

One obvious objection to this study is that it may be capturing what social scientists call “selection effects” rather than a causal relationship between our independent variables and the outcome at hand. That is, this report’s results may reflect the fact that certain types of people are more likely to engage in certain behaviors—such as having a child prior to marriage—that are correlated with experiencing lower odds of marital quality. It could be that these underlying traits or experiences, rather than the behaviors we analyzed, explain the associations reported here. This objection applies to most research that is not based on randomized experiments. We cannot prove causal associations between the personal and couple factors we explore and marital quality.

However, because they have rudimentary demographic controls, and the independent variables chronologically precede the outcome variable, they think they’re on pretty firm ground:

With the help of our research, we hope current and future couples will better understand the factors that appear to contribute to building a healthy, loving marriage in contemporary America.

This is Wilcox’s standard way of nodding to selection before plowing ahead with unjustified conclusions. This is not a reasonable approach, for reasons apparent in today’s New York Times. Tara Parker-Pope does not mention this issue, and her piece will obviously reach many more people than the original report or this post.

They hope people will take their results as relationship advice. In Parker-Pope’s piece, Stanley offers exactly the same advice he always gives. If that is to be the case, the best advice by far — based on their models — is to avoid being Black, and to finish high school. Living with both one’s biological parents at age 14 helps, too. In relationship terms, unfortunately, most of the results could just as easily reflect wealth or initial relationship quality rather than relationship decisions, and thus tell us that people who have healthy (and less complicated) relationships before marriage have healthy relationships in the first few years after marriage.

Perhaps more serious, however, for this study design, is the second kind of selection: selection into the sample (by marriage). Anything that affects both the odds of marrying and the quality of marriage potentially corrupts these results. This is a big, complicated issue, with a whole school of statistical methods attached to it. Unless they attend to that issue, this analysis should not be published.

On the funding

The authors state the project was “initially funded” by the National Institute of Child Health and Human Development, but the report also acknowledges support from the William E. Simon Foundation, a very conservative foundation that in 2012 gave hundreds of thousands of dollars to the Witherspoon Institute (which funded the notorious Wilcox/Regnerus research on children of same-sex couples), the Heritage Foundation, the Hoover Institute, the Manhattan Institute, and other conservative and Christian activist organizations. Details on funding are not provided.

The National Marriage Project is well-known for publishing only work that supports their agenda of marriage promotion. Some of what they publish may be true, but based on their track record they cannot be trusted as honest brokers of new research.


Filed under In the news