Sociologists: Don’t embargo your dissertation

Your work is important. Don’t hide it. (PNC video)

This post is about the practice of putting your dissertation under an embargo, which means your university library, and probably its agent, ProQuest, don’t let people read it for a certain amount of time, sometimes only a few months, sometimes many years. At my school, the University of Maryland, the graduate school is implementing a new policy that allows two-year embargoes without special permission (down from six years), and longer embargoes only with permission of the advisor and the dean.

Are you in this to advance knowledge? If so, don’t embargo your dissertation. By definition, a dissertation is a contribution to knowledge. By definition, keeping people from reading it stops that from occurring.

Many PhD graduates embargo their dissertations because it feels like the safer thing to do, because they’re vaguely worried about sharing their work, either because it’s so good someone will steal it, or it’s so bad it will embarrass them — or, weirdly, both. Many people don’t seriously think about it, don’t read up on the question, don’t discuss it with knowledgeable mentors (which your PhD advisor is very likely not, at least when it comes to this question). Lots of good people make this mistake, and that’s a shame. I’m writing this post so that, if you see it before you face this choice, there’s a chance my nagging voice will get stuck in your head.

Some graduate students think they’re being exploited and someone is going to make money off their work. Probably not. (You may have been exploited as a graduate student, and you might have good reasons for disliking your university, but this isn’t about making your university happy.) Maybe your dissertation will lead to an important book that lots of people will read — that is wonderful, and I hope it does. Of course, that’s a very small minority of dissertations, even among really good ones that make important contributions to knowledge. That’s just not in the cards in the vast majority of cases. But unless you already have a contract and a publisher telling you that without an embargo the deal is off — a situation that is vanishingly rare if it occurs at all, at least in sociology — making your dissertation publicly available will not hurt (and will probably help) your chances of accomplishing that goal. And if you’re going to publish articles based on your dissertation, no reputable journal will turn them away because they have overlapping content with your dissertation.

Some graduate students are afraid they will get “scooped” or their ideas will be “stolen.” This is profoundly misguided. You are doing the work so that people will read it. People are going to do what they do. You might be taking a small risk to your personal interest by making your work public, but consider it against the benefit of people reading it (which is, after all, the reason you should have written it). This is your finished work. It’s done. By definition it can’t be scooped. It can be plagiarized, like anything else. Would it be awkward or disappointing if someone published something similar that made similar contributions? Maybe. Will that substantially harm your career or personal interests? Very unlikely.* If you had a good idea, it will probably lead to more. Your ideas and your efforts in the dissertation are on the record now. Be proud of them, take credit for them, encourage people to engage with them, and hope that they will be inspired to do work that follows your lead. If your dissertation is good, it’s worth the risk — because you want people to read it. If your dissertation is bad, there is no risk anyway.

Will making your dissertation public hurt your chances of publishing a book? Almost certainly not. As an editor at Harvard University Press wrote:

“Generally speaking, when we at HUP take on a young scholar’s first book, whether in history or other disciplines, we expect that the final product will be so broadened, deepened, reconsidered, and restructured that the availability of the dissertation is irrelevant.”

And they quoted an assistant editor who went further: making your dissertation available improves your chances of getting a book contract:

“I’m always looking out for exciting new scholarship that might make for a good book, whether in formally published journal articles and conference programs, or in the conversation on Twitter and in the history blogosphere, or in conversations with scholars I meet. And so, to whatever extent open access to a dissertation increases the odds of its ideas being read and discussed more widely, I tend to think it increases the odds of my hearing about them.”

Or, as the editorial director at Columbia University Press, Eric Schwartz, wrote in a tweet about sharing dissertations: “No problem. Book and dissertation are for different audiences.”

Of course there may be exceptions. If you have an editor on the hook who insists on an embargo, consider the pros and cons. If you have only a vague hope of publishing it down the road, don’t bother.

Do you want to win awards so everyone is talking about your dissertation? Don’t embargo it. Thanks to a 2015 change in policy at the American Sociological Association:

“To be eligible for the ASA Dissertation Award, nominees’ dissertations must be publicly available in Dissertation Abstracts International or a comparable outlet. Dissertations that are not available in this fashion will not be considered for the award.”

There are real, important principles at stake. Hate on your universities all you want, but some of their lofty rhetoric is true and good — and we should be holding them to it, not scoffing at it. Many universities, like the University of California system, have policies based on such high-minded statements as this:

“The University of California is committed to disseminating research and scholarship conducted at the University as widely as possible…. The University affirms the long-standing tradition that theses and dissertations, which represent significant contributions to the advancement of knowledge and the scholarly record, should be shared with scholars in all disciplines and the general public.”

Embargoing the work for years absolutely violates the spirit of such a principled policy, even if they do allow an embargo. Making your work accessible years later is clearly depriving the public of “significant contributions to the advancement of knowledge and the scholarly record” for the most important period in the life of the work — the years right after it’s done.

Here’s the statement from the University of Chicago:

“The public sharing of original dissertation research is a principle to which the University is deeply committed, and dissertations should be made available to the scholarly community at the University of Chicago and elsewhere in a timely manner. If dissertation authors are concerned that making their research publicly available might endanger research subjects or themselves, jeopardize a pending patent, complicate publication of a revised dissertation, or otherwise be unadvisable, they may, in consultation with faculty in their field (and as appropriate, research collaborators), restrict access to their dissertation for a limited period of time.”

Some people might skim through this policy and say, “Oh, cool, they allow an embargo,” and just check the box requesting it. But that’s making a powerful statement against the important principle articulated in this policy. If you don’t have a really good reason to embargo your dissertation — and you almost certainly don’t — the public interest demands that you make it public. Take the value of your work seriously. Not its commercial value, its actual value — which is to people who want to read it.

There is also an important accountability principle at stake. Should PhDs be awarded in secret, with no accountability beyond the committee room walls, until years later? For those of us on the faculty, how are we to evaluate programs and their candidates if we can’t scrutinize their most important works? How can we claim to be reputable programs if we shroud our work behind embargoes? Without at least this bottom-line transparency, there can be little accountability.

I write this post out of a certain sense of shame. I’m the director of graduate studies in our department, and I haven’t made it a priority to talk to students about this, because I didn’t know it was happening. When I looked at the dissertations from our department, which are archived in the Digital Repository at the University of Maryland (or, if they are embargoed, merely listed), I saw that among the last 19 dissertations, 12 were currently embargoed. The seven that were made public have been downloaded 1,200 times.

If you want to embargo your dissertation, or if someone is telling you that you should, the burden is on you (or them) to prove that the real benefits of the embargo — not just for you, but for the contribution to knowledge that your work represents — are greater than the harm of denying readers access to your research. The default must be to share our dissertations, with rare exceptions only when real (not imagined or rumored) circumstances demand that the public interest in access to knowledge be sacrificed.


* My dissertation, completed in 1999, although excellent, was not especially original. My major contributions were updating research on a longstanding theory to (a) use more recent data, (b) include women, and (c) use hierarchical linear models. My dissertation was titled, “Black Population Size and the Structure of United States Labor Market Inequality.” In 1997, as I was hard at work, and had a chapter under review at Social Forces (which I had already presented at two conferences), an article appeared (in Social Forces!) titled, “Black Population Concentration and Black-White Inequality: Expanding the Consideration of Place and Space Effects.” The authors used (a) the new data I was using, they (b) included women, and their (c) models were fancier than mine. I was crushed. And then, with my advisor’s help, I got over it. My article (with a citation to theirs added) got published the next year anyway, titled, “Black Concentration Effects on Black-White and Gender Inequality: Multilevel Analysis for U.S. Metropolitan Areas.” People read both articles. And then I went on to do a bunch more work in that area, with great collaborators, building up a body of research that drew from my dissertation but went much further in terms of theory, methods, and data. My article got cited plenty, partly because it was part of a group of articles that traveled together. I was “scooped,” but they didn’t get their ideas from sneaking a look at my brilliant work in progress, they were logical next steps in a 40-year trajectory of research on an established set of questions. Their publication strengthened the field in which I was working. (In fact, if they had stolen my ideas their paper would have been worse for them, and less damaging to me.)

I spent my semester as an MIT / CREOS Visiting Scholar and it was excellent

PNC in Cambridge in the fall.

As a faculty sociologist who works in the area of family demography and inequality, my interest in open scholarship falls into the category of “service” among my academic obligations, essentially unrecognized and unremunerated by my employer, and competing with research and teaching responsibilities for my time. In that capacity I founded SocArXiv in 2016 (supported by several small grants) and serve as its director, organized two conferences at the University of Maryland under the title O3S: Open Scholarship for the Social Sciences, and was elected to the Committee on Publications of the American Sociological Association. While continuing that work during a sabbatical leave, I was extremely fortunate to land a half-time position as visiting scholar at the MIT Libraries in fall 2018, which helped me integrate that service agenda with an emerging research agenda around scholarly communication.

The position was sponsored by a group of libraries organized by the Association of Research Libraries — MIT, UCLA, the University of Arizona, Ohio State University, and the University of Pittsburgh — and hosted by the new Center for Research on Equitable and Open Scholarship (CREOS) at MIT. My principal collaborator has been Micah Altman, the director of research at CREOS.

The semester was framed by the MIT Grand Challenges Summit in the spring, which I attended, and the report that emerged from that meeting: A Grand Challenges-Based Research Agenda for Scholarly Communication and Information Science, on which I was a collaborator. The report, published in December, describes a vision for a more inclusive, open, equitable, and sustainable future for scholarship; it also characterizes the barriers to this future, and identifies the research needed to bring it to fruition.

Sociology and SocArXiv

Furthering my commitments to sociology and SocArXiv, I continued to work on the service. SocArXiv is growing, with increased participation in sociology and other social sciences. In the fall the Center for Open Science, our host, opened discussions with its paper-serving communities about weaning the system off its core foundation financial support and using contributions from each service to make it sustainable (thus far we have not paid COS for its development and hosting). This was an expected challenge, which will require some creative and difficult work in the coming months.

Finally, at the start of the semester I noted that most sociologists — even those interested in open access issues — were not familiar with current patterns, trends, and debates in the scholarly communications ecosystem. This has hampered our efforts to build SocArXiv, as well as our ability to press our associations and institutions for policy changes in the direction of openness, equity, and sustainability. In response to this need, especially among graduate students and junior scholars, I drafted a scholarly communication primer for sociology, which reviews major scholarly communication media, policies, economic actors, and recent innovations. I posted a long draft (~13,000 words) for comment in January, and received a very positive response. It appears that a number of programs will incorporate the revised primer into their training, and many individuals are already reading and sharing it with their networks.

Peer review

One of the chief barriers identified in the Grand Challenges report is the lack of systematic theory and empirical evidence to design and guide legal, economic, policy, and organizational interventions in scholarly publishing and in the knowledge ecosystem generally. As social scientists, Micah and I drew on this insight, and used the case of peer review in sociology as an entry point. We presented our formative analysis of this case in the CREOS Research Talk, “Can Fix Peer Review.” Here is the summary of this talk:

Contemporary journal peer review is beset by a range of problems. These include (a) long delay times to publication, during which time research is inaccessible; (b) weak incentives to conduct reviews, resulting in high refusal rates as the pace of journal publication increases; (c) quality control problems that produce both errors of commission (accepting erroneous work) and omission (passing over important work, especially null findings); (d) unknown levels of bias, affecting both who is asked to perform peer review and how reviewers treat authors; and (e) opacity in the process that impedes error correction and more systematic learning, and enables conflicts of interest to pass undetected. Proposed alternative practices attempt to address these concerns — especially open peer review, and post-publication peer review. However, systemic solutions will require revisiting the functions of peer review in its institutional context.

The full slides, with embedded video of the talk (minus the first few minutes), are embedded below:

Research design and intervention

Mapping out the various interventions and proposed alternatives in the peer review space raised a number of questions about how to design and evaluate interventions in a complex system with interdependent parts and actors embedded in different institutional logics — for example, university researchers (some working under state policy), research libraries, for-profit publishers, and academic societies. Working with Jessica Polka, Director of ASAPbio, we are expanding this analysis to consider a range of innovations in open science. This analysis highlights the need for systematic research design that can guide the design of initiatives aimed at altering the scholarly knowledge ecosystem.

Applying the ecosystem approach in the Grand Challenges report, we consider large-scale interventions in public health and safety, and their unintended consequences, to build a model for designing projects with the intention of identifying and assessing such consequences across the system. Addressing problems at scale may have such unintended effects as leading vulnerable populations to adapt to new technology in harmful ways (mosquito nets used for fishing); providing new opportunities for harmful competitors (the pesticide treadmill); the displacement of private actors by public goods (dentists driven away by public water fluoridation); and risk compensation by those who receive public protection (anti-lock brakes and riskier driving, vaccinations). Our forthcoming white paper will address such risks in light of recent open science interventions: PLOS One, bioRxiv and preprints generally, and open peer review, among others. We combine research design methods for field experiments in social science, outcomes identified in the Grand Challenges report, and the ecosystem theory based on an open science lifecycle model.

ARL/SSRC meeting and Next Steps

Coming out of discussions at the first O3S meeting, in December the Association of Research Libraries and the Social Science Research Council convened a meeting on open scholarship in the social sciences, which included leaders from scholarly societies, university libraries, researchers advocating for open science, funders, and staff from ARL, SSRC, and the Coalition for Networked Information. I was fortunate to participate on the planning committee for the meeting, and in that capacity I conducted a series of short video interviews with individual stakeholders from the participating organizations to help expose us all to the range of values, objectives, and concerns we bring to the questions we collectively face in the movement toward open scholarship.

For our own work on peer review, which we presented at the meeting, I was especially drawn to the interviewees’ comments on transparency, incentives, and open infrastructure. In particular, MIT Libraries Director Chris Bourg challenged social scientists to recognize what their own research implies for the peer review system:

Brian Nosek, director of the Center for Open Science, stressed the need to consider incentives for openness in our interventions:

And Kathleen Fitzpatrick, project director for Humanities Commons, described the necessity of open infrastructure that is flexibly interoperable, allowing parallel use by actors on diverse platforms:

These insights about intervention principles for an open scholarly ecosystem helped Micah and me develop a proposal for discussion at the meeting. Our proposed program, IOTA (I Owe The Academy) aims to solve the supply-and-demand problem for quality peer review in open science interventions (the name is likely to change). We understand that most academics are willing to do peer review when it contributes to a better system of scholarship. At the same time, new peer review projects need (good) reviewers in order to launch successfully. And the community needs (good) empirical research on the peer review process itself. The solution is to match reviewers with initiatives that promote better scholarship using a virtual token system, whereby reviewers pledge review effort units, which are distributed to open peer review projects — while collecting data for use in evaluation and assessment. After receiving positive feedback at the meeting, we will develop this proposal further.

Our presentation is embedded in full below:

A report on the ARL/SSRC meeting describes the shared interests, challenges to openness, and conditions for successful action discussed by participants. And it includes five specific projects they agreed to pursue — one of which is peer review on the SocArXiv and PsyArXiv paper platforms.

What’s next…

In the coming several months we expect to produce a white paper on research design, a proposal for IOTA, and a presentation for the Coalition for Networked Information meeting in April, to spark a discussion about the ways libraries can jointly support additional targeted work to promote, inspire, and support evidence-based research. And a revised version of the scholarly communication primer for sociology is on the way.

What do doctors, lawyers, police, and librarians Google?

Now with college teachers!

What do doctors, lawyers, police, and librarians Google? I’ll tell you. But first — if you are going to take this too seriously, please stop now.

Data and Method

Using IPUMS to extract data from the 2010-2012 American Community Survey, I count the number of people ages 25-64, currently employed, in a given occupation. I divide that by each state’s population in that age range (excluding Washington DC from all analyses). I enter those numbers into the Google Correlate tool to see which searches are most highly correlated with the distribution of each occupation across states (the tool reports the top 100 most correlated searches). In other words, these are searches that maximize the difference between, for example, high-lawyer and low-lawyer states — searches that are relatively popular where there are a lot of lawyers, and relatively unpopular where there are not a lot of lawyers.
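The two steps described above can be sketched roughly in code. This is only an illustration under stated assumptions: Google Correlate did the matching on its own servers, so the `pearson` and `rank_searches` functions here are a hypothetical stand-in that assumes a plain Pearson correlation across states, and every number and search-popularity value in the example is made up rather than taken from the ACS or from Google.

```python
from math import sqrt


def occupation_rate(workers_by_state, adults_by_state, exclude=("DC",)):
    """Employed workers in an occupation per adult aged 25-64, by state.

    Both inputs are assumed to be dicts of weighted counts keyed by state.
    Washington DC is dropped, as in the post.
    """
    return {
        state: workers_by_state[state] / adults_by_state[state]
        for state in workers_by_state
        if state not in exclude
    }


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_searches(rates, search_popularity, top=100):
    """Rank search terms by how closely their state pattern tracks the rates.

    search_popularity is a hypothetical dict: term -> {state: popularity}.
    """
    states = sorted(rates)
    target = [rates[s] for s in states]
    scored = [
        (term, pearson(target, [by_state[s] for s in states]))
        for term, by_state in search_popularity.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]


# Made-up illustration data, not real ACS or Google figures.
lawyers = {"NY": 120, "TX": 60, "OH": 30, "DC": 50}
adults = {"NY": 10000, "TX": 12000, "OH": 6000, "DC": 400}
searches = {
    "general counsel": {"NY": 0.9, "TX": 0.4, "OH": 0.3},
    "weight vest": {"NY": 0.2, "TX": 0.6, "OH": 0.5},
}

rates = occupation_rate(lawyers, adults)
ranking = rank_searches(rates, searches)
for term, r in ranking:
    print(f"{term}: r = {r:.2f}")
```

In this toy example the term that is popular in the high-lawyer state rises to the top of the list and the other term gets a negative correlation, which is the sense in which the searches "maximize the difference" between high-lawyer and low-lawyer states.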

Is this what lawyers actually Google? We can’t know. But I think so. Or maybe what people who work in law firms do, or people who live with lawyers. It’s a very sensitive tool. I made this case first in the post, Stuff White People Google. Check that out if you’re skeptical.

For each occupation, I first offer a few highly correlated searches that support the idea that the data are capturing what these people search for. Then I list some of the interesting other hits from each list.

Results

Police

Police per adult

The map of police per adult looks pretty random, but the list of correlated search terms doesn’t. On the list are “security training,” “tsa jobs,” “waist belt,” “weight vest,” and “air marshals.”

After all the security stuff, the only major category left in the 100 searches most correlated with police in the population is women. Specifically, their search taste includes tough actress Rachel Ticotin, body builder Denise Masino, Brazilian actress Alice Braga, actress Rosario Dawson, and, “israeli women.” (Remember, Google suppresses known porn terms, so this is just what got through the filter.) It’s a leap from this data to the statement, “police search for images of these women,” but this is who they would find if that were the case (is this a “type”?):


Librarians

Librarians per adult

On the other hand, librarians. They are the smallest occupation I tried: the average state population aged 25-64 is only one tenth of one percent librarians. Yet, their distribution leaves an unmistakable trace in the Google search patterns. It especially seems to pick up terms associated with public libraries. Correlated terms include, “cataloguing,” and “quiet hours.” And then there are terms one might ask a librarian about, classic reference-desk questions such as, “which vs that,” “turn off track changes,” “think tanks,” “9/11 commission,” and “irs form 6251”; and term paper topics like Shakespeare titles or “human development report.”

What about the librarians themselves, or those close to them? Could it be they who are searching for Ann Taylor dresses, Garnet Hill free shipping, Lands End home, and textile museums? We can’t know for sure. Of course, if anyone knows how to cover their search tracks, it might be this crowd.

Doctors

Doctors per adult

You know they’re doctors, because the search terms most correlated with the map include “md, mph,” “md, phd,” “nejm,” “journal medicine,” “tedmed,” and “groopman.” What else do they like? Chick Corea, Tina Fey, Larry David, Mad Men (season 1) and The West Wing, Laura Linney, John Oliver, Scrabble 2-letter words, and a bunch of Jewish stuff.

Lawyers

Lawyers per adult

That’s the map of lawyers per adult across states. Is it really lawyers? The top 100 searches correlated with the distribution shown above include “general counsel,” and then a lot of financial terms like, “world economic forum,” “international finance corporation,” and “economist intelligence.” Then there are international travel terms, like, “rate euro dollar,” “royal air,” and “swiss embassy.”

Looks like lawyers in lawyer-land are richer and more finance-oriented than lawyers in general. On the cultural side, they search for clothing terms Massimo Dutti, Hugo Boss, and Benetton. They apparently like to eat at Zafferano in London, and drink Caipirinhas. Also, they like “vissi,” which is an aria from Tosca but also a Cypriot celebrity; I lean toward the latter, because Queen Rania is also on the list. Finally, they combine their interests in law, finance, and wealthy attractive women by searching for Debrahlee Lorenzana, the “too-hot-for-work” banker.

By popular demand: Post-secondary teachers

Post-secondary teachers per adult

Finally, here without comment are the results for “post-secondary teachers,” which includes any college teacher who didn’t instead specify a specialty, such as “psychologist” or “economist.” (It’s hard to see on the map, but Rhode Island is the highest.) I broke the results into four rough categories:

Academic

attribution
balderdash
bmi index
body image
citation style
cpdl
critical theory
debt to equity
debt to equity ratio
democracy in america
dihedral
economic inequality
economic statistics
economists
educause
edward elgar
effect size
email forward
equals sign
exogenous
feminists
google scholar
growth rates
homomorphism
inflation rate
inflation rates
intelligibility
international study
isomorphic
journal of
journal of nutrition
marginal propensity
marginal propensity to consume
mediating
meters per second
milieu
overlaying
piano sonata
prefrontal
prefrontal cortex
profile of
psychology studies
quick ratio
rejection letter
returns to scale
routledge
scholar
subgroup
superscript
transglutaminase
ways to end a letter

Personal

1% milk
2006 olympics
best pump up songs
crib safety
easy halloween costume
graco snug
handel
ipod history
jackson superbowl
janet jackson superbowl
mastermind game
maxim online
minesweeper
most popular names
napping
national sleep foundation
olympic figure skating
olympics 2006
pairs figure skating
positioning
refereeing
sandra boynton
senior hockey
snl clips
stuff magazine
stumbled upon
toilet training
verum

Musical

1812 overture
acapella group
acapella groups
africa toto
ave verum
for the longest time
it breaks my heart
pdq bach
taylor swift

Birth control

apri
apri birth control
aviane

Conclusions

Poor social scientists, generations of them spending their lives raising a few thousand dollars to ask a few thousand people a few hundred stilted, arbitrary survey questions. Meanwhile, coursing through the cable wires below their feet, and through the air around them, billions of data bits carry so much more potential information about so many more people, in so many intimate aspects of their lives, than we could even dream of getting our hands on. Just think of the power!

Note: I’ve done many posts like this. Some use time series instead of geographic variation, some use terms from Google Books ngrams. Browse the series under the Google tag, or check out this selection: