Science finds tiny things nowadays (Malia edition)

We have to get used to living in a world where science — even social science — can detect really small things. Understanding how important really small things are, and how to interpret them, is harder nowadays than just finding them.

Remember when Hanna Rosin wrote this?

One of the great crime stories of the last twenty years is the dramatic decline of sexual assault. Rates are so low in parts of the country — for white women especially — that criminologists can’t plot the numbers on a chart.

Besides being wrong about rape (it has declined a lot, but it’s still high compared with most countries), this was a funny statement about science (I’ve heard we can even plot negative numbers now!). But the point is we have problems understanding, and communicating about, small things.

So, back to names.

In 2009, the peak year for the name Malia in the U.S., 1,681 girls were given that name, according to the Social Security Administration, or .041% of the 4.14 million children born that year (there are no male Malias in the SSA’s public database, meaning they have never recorded more than 4 in one year). That year, 7.5% of women ages 18-44 had a baby. Say you know 100 women ages 18-44, and each of them knows 100 others (and there is no overlap in your network). If my arithmetic is right, that would mean there is about a 30% chance that one of your 10,000 friends of a friend had a baby girl and named her Malia in 2009. But probably there is a lot of overlap; if your friend-of-friend network is only 1,000 women 18-44 then that chance would fall to 3%.
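For anyone who wants to check that arithmetic, here is a minimal sketch in Python using only the figures quoted above. The function name `network_odds` is made up for this illustration, and treating each woman as an independent draw is an assumption of the sketch, not something the SSA data can confirm.

```python
# Figures quoted in the post
malias_2009 = 1_681        # girls named Malia in 2009 (SSA)
births_2009 = 4_140_000    # children born that year, per the post
share_mothers = 0.075      # share of women 18-44 who had a baby that year

# Chance that one given woman had a baby in 2009 and named her Malia
p_per_woman = share_mothers * malias_2009 / births_2009


def network_odds(n_women):
    """Return (expected Malias, chance of at least one) among n_women.

    Assumes each woman is an independent draw (an assumption of this sketch).
    """
    expected = n_women * p_per_woman
    at_least_one = 1 - (1 - p_per_woman) ** n_women
    return expected, at_least_one


for n in (10_000, 1_000):
    expected, at_least_one = network_odds(n)
    print(f"{n:>6} women: expected {expected:.3f}, P(at least one) {at_least_one:.3f}")
```

The expected number of Malias in a 10,000-woman network comes out near 0.30, which is where the 30% figure comes from. The chance of at least one Malia is a little lower, around 26%, because a few networks would have two. For the 1,000-woman network the two numbers are nearly identical, at about 3%.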

Here is the trend in girls named Malia, relative to the total number of girls born, from 1960 to 2016:


To make it easier to see the Malias, here is the same chart with the y-axis on a log scale.


This shows that Malia has been on a long upward trend, from less than 50 per year in the 1960s to more than 1,000 per year now. And it also shows a pronounced spike in 2009, the year Malia peaked at .041%. In that year, the number of people naming daughters Malia jumped 75% before declining over the next three years to resume its previous trend. Here is the detail on the figure, just showing the Malias in 2005-2016:


What happened there? We can’t know for sure. Even if you asked everyone why they named their kid what they did, I don’t know what answers you would get. But from what we know about naming patterns, and their responsiveness to names in the news (positive or negative), it’s very likely that the bump in 2009 resulted from the high profile of Barack Obama and his daughter Malia, who was 11 when Obama was elected.

What does a causal statement like that really mean? In 2009, it looks to me like about 828 more people named their daughters Malia than would have otherwise, taking into account the upward trend before 2008. Here’s the actual trend, with a simulated trend showing no Obama effect:


Of course, Obama’s election changed the world forever, which may explain why the upward trend for Malia accelerated again after 2013. But in this simple simulation, which brings the “no Obama” trend back into line with the actual trend in 2014, there were 1,275 more Malias born than there would have been without the Obama election. This implies that over the years 2008-2013, the Obama election increased the probability of someone naming their daughter Malia by .00011, or .011%.
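One way to sanity-check that last figure is to back out the denominator it implies. This is only a rough sketch using the two numbers in the paragraph above; the variable names are made up for illustration, and it assumes the probability is measured per girl born over the six years 2008 through 2013.

```python
excess_malias = 1_275      # extra Malias over 2008-2013, per the post
prob_increase = 0.00011    # stated increase in the probability of naming a daughter Malia
years = 6                  # 2008 through 2013, inclusive

# Number of girls that would make the two figures consistent with each other
implied_girls = excess_malias / prob_increase
implied_girls_per_year = implied_girls / years

print(f"implied girls born, 2008-2013: {implied_girls:,.0f}")
print(f"implied girls born per year:   {implied_girls_per_year:,.0f}")
```

That works out to roughly 1.9 million girls per year, a little under half of the 4.14 million children cited earlier for 2009, so the two figures hang together.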

That is a very small effect. I think it’s real, and very interesting. But what does it mean for anything else in the world? This is not a question of statistical significance, although those tools can help. (These names aren’t a probability sample; they’re a list of all names given.) So this is a question for interpreting research findings now that we have these incredibly powerful tools, and very big data to analyze with them. The number alone doesn’t tell the story.

On artificially intelligent gaydar

A paper by Yilun Wang and Michal Kosinski reports being able to identify gay and lesbian people from photographs using “deep neural networks,” which means computer software.

I’m not going to describe it in detail here, but the gist of it is they picked a large sample of people from a dating website who said they were looking for same-sex partners, and an equal number that were looking for different-sex partners, and trained their computers to learn the facial features that could distinguish the two groups (including facial structure measurements as well as grooming things like hairline and facial hair). For a deep dive on the context of this kind of research and its implications, and more on the researchers and the controversy, please read this post by Greggor Mattson first. These notes will be most useful after you’ve read that.

I also reviewed a gaydar paper five years ago, and some of the same critiques apply.

This figure from the paper gives you an idea:


These notes are how I would start my peer review, if I was peer reviewing this paper (which is already accepted and forthcoming in the Journal of Personality and Social Psychology — so much for peer review [just kidding it’s just a very flawed system]).

The gay samples here are “very” gay, in the sense of being out and looking for same-sex partners. This does not mean that they are “very” gay in any biological, or born-this-way sense. If you could quantitatively score people on the amount of their gayness (say on some kind of scale…), outness and same-sex attraction might be correlated, but they are different things. The correlation here is assumed, and assumed to be strong, but this is not demonstrated. (It’s funny that they think they address the problem of the sample by comparing the results with a sample from Facebook of people who like pages such as “I love being gay” and “Manhunt.”)

Another way of saying this is that the dependent variable is poorly defined, and then conclusions from studying it are generalized beyond the bounds of the research. So I don’t agree that the results:

provide strong support for the PHT [prenatal hormone theory], which argues that same-gender sexual orientation stems from the underexposure of male fetuses and overexposure of female fetuses to prenatal androgens responsible for the sexual differentiation of faces, preferences, and behavior.

If it were my study I might say the results are “consistent” with PHT theory, but it would be better to say, “not inconsistent” with the theory. (There is no data about hormones in the paper, obviously.)

The authors give too much weight to things their results can’t say anything about. For example, gay men in the sample are less likely to have beards. They write:

nature and nurture are likely to be as intertwined as in many other contexts. For example, it is unclear whether gay men were less likely to wear a beard because of nature (sparser facial hair) or nurture (fashion). If it is, in fact, fashion (nurture), to what extent is such a norm driven by the tendency of gay men to have sparser facial hair (nature)? Alternatively, could sparser facial hair (nature) stem from potential differences in diet, lifestyle, or environment (nurture)?

The statement is based on the faulty premise that “nature and nurture are likely to be as intertwined.” They have no evidence of this intertwining. They could just as well have said “it’s possible nature and nurture are intertwined,” or, with as much evidence, “in the unlikely event nature and nurture are intertwined.” So they load the discussion with the presumption of balance between nature and nurture, and then go on to speculate about sparse facial hair, for which they also have no evidence. (This happens to be the same way Charles Murray talks about race and IQ: there must be some intertwining between genetics and social forces, but we can’t say how much; now let’s talk about genetics because it’s definitely in there.)

Aside from the flaws in the study, the accuracy rate reported is easily misunderstood, or misrepresented. To choose one example, the Independent wrote:

According to its authors, who say they were “really disturbed” by their findings, the accuracy of an AI system can reach 91 per cent for homosexual men and 83 per cent for homosexual women.

The authors say this, which is important but of course overlooked in much of the news reporting:

The AUC = .91 does not imply that 91% of gay men in a given population can be identified, or that the classification results are correct 91% of the time. The performance of the classifier depends on the desired trade-off between precision (e.g., the fraction of gay people among those classified as gay) and recall (e.g., the fraction of gay people in the population correctly identified as gay). Aiming for high precision reduces recall, and vice versa.

They go on to give a technical and, I believe, misleading example. People should understand that the computer was always picking between two people, one of whom was identified as gay and the other not. It had a high percentage chance of getting that choice right. That’s not saying, “this person is gay”; it’s saying, “if I had to choose which one of these two people is gay, knowing that one is, I’d choose this one.” What they don’t answer is this: Given 100 random people, 7 of whom are gay, how many would the model correctly identify yes or no? That is the real life question most people probably think the study is answering.
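To see why the pairwise number and the crowd number differ, here is a rough sketch in Python. It is not the authors’ model. It assumes the classifier’s scores follow two normal curves with equal spread (a textbook “binormal” setup), picks the gap between them that produces an AUC of .91, and then asks what happens when 7% of the people scanned are gay. The function names and the three thresholds are invented for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def separation_for_auc(auc):
    """Gap between the two group means that yields this AUC.

    Assumes equal-variance normal scores, where AUC = Phi(gap / sqrt(2)).
    """
    return 2 ** 0.5 * std_normal.inv_cdf(auc)


def screen(auc, prevalence, threshold):
    """Return (precision, recall) when everyone scoring above threshold is flagged."""
    gap = separation_for_auc(auc)
    recall = 1 - std_normal.cdf(threshold - gap)        # share of gay people flagged
    false_pos_rate = 1 - std_normal.cdf(threshold)      # share of straight people flagged
    flagged_true = prevalence * recall
    flagged_false = (1 - prevalence) * false_pos_rate
    precision = flagged_true / (flagged_true + flagged_false)
    return precision, recall


# Thresholds are arbitrary points on the hypothetical score scale
for threshold in (1.0, 2.0, 3.0):
    precision, recall = screen(auc=0.91, prevalence=0.07, threshold=threshold)
    print(f"threshold {threshold}: precision {precision:.2f}, recall {recall:.2f}")
```

Under these assumptions, a lenient threshold flags most of the gay people in the crowd, but most of the people it flags are straight. A strict threshold gets most of its flags right but misses most of the gay people. Neither outcome is “91% accurate” in the everyday sense.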

As technology writer Hal Hodson pointed out on Twitter, if someone wanted to scan a crowd and identify a small number of individuals who were likely to be gay (and ignoring many other people in the crowd who are also gay), this might work (with some false positives, of course).


Probably someone who wanted to do that would be up to no good, like an oppressive government or Amazon, and they would have better ways of finding gay people (like at pride parades, or looking on Facebook, or dating sites, or Amazon shopping history directly — which they already do of course). Such a bad actor could also train people to identify gay people based on many more social cues; the researchers here compare their computer algorithm to the accuracy of untrained people, and find their method better, but again that’s not a useful real-world comparison.

Aside: They make the weird but rarely-necessary-to-justify decision to limit the sample to White participants (and also offer no justification for using the pseudoscientific term “Caucasian,” which you should never ever use because it doesn’t mean anything). Why couldn’t respondents (or software) look at a Black person and a White person and ask, “Which one is gay?” Any artificial increase in the homogeneity of the sample will increase the likelihood of finding patterns associated with sexual orientation, and misleadingly increase the reported accuracy of the method used. And of course statements like this should not be permitted: “We believe, however, that our results will likely generalize beyond the population studied here.”

Some readers may be disappointed to learn I don’t think the following is an unethical research question: Given a sample of people on a dating site, some of whom are looking for same-sex partners and some of whom are looking for different-sex partners, can we use computers to predict which is which? To the extent they did that, I think it’s OK. That’s not what they said they were doing, though, and that’s a problem.

I don’t know the individuals involved, their motivations, or their business ties. But if I were a company or government in the business of doing unethical things with data and tools like this, I would probably like to hire these researchers, and this paper would be good advertising for their services. It would be nice if they pledged not to contribute personally to such work, especially any efforts to identify people’s sexual orientation without their consent.


The sky is falling because of feminist biology, Factual Feminist edition

The other day I explained why, despite her mocking tone, the “Factual Feminist” (Christina Sommers) doesn’t have the factual basis to undermine commonly-used statistics on rape. Now she has a video out on “feminist science.” No, it’s not a joke from The Simpsons, she says:

A new feminist biology program at the University of Wisconsin is all too real… Is feminist biology likely to contribute to our knowledge and understanding of the world? The Factual Feminist is skeptical.

The program in question is really just a post-doctoral fellowship. It looks like a privately-endowed fund to hire one postdoc. This is not a major curriculum intervention. The first postdoc in the program is Caroline VanSickle, a biological anthropologist from the University of Michigan who does work on ancient female pelvic bones and their implications for birth stuff. She was quoted by the right-wing Campus Reform (a project of the Leadership Institute) this way:

“We aren’t doing science well if we ignore the ideas and research of people who aren’t male, white, straight, or rich,” VanSickle said in an email to Campus Reform. “Feminist science seeks to improve our understanding of the world by including people with different viewpoints. A more inclusive science means an opportunity to make new discoveries.”

I don’t know the evidence on whether the ideas of biologists who aren’t male, White, straight, or rich are ignored in science today, but this sentiment seems unobjectionable to me – we aren’t doing science well if we ignore anyone’s (good) ideas. Who could object to “including people with different viewpoints”? But Sommers, for some reason misquoting her only source for the story, says,

She explained to Campus Reform that, quote, in order to do science well, she said, we can’t ignore the ideas and research of people who just don’t happen to be male. But wait a minute. Women are hardly ignored in biology. In fact, they have far surpassed men in earning biology degrees. What is more, women are flourishing, and winning Nobel Prizes in that field.

On the screen flashes a table showing women getting 61% of BA degrees in biology, 59% of MAs, and 54% of PhDs. If we’re talking about whether women are ignored in biology, I think it’s the PhDs that matter, so 54% is not quite “far surpassed.” More to the point, although women first surpassed men in receiving biology BA degrees in 1988 — a quarter of a century ago — they are currently only 23% of full professors in biology. I’m not arguing about whether this reflects job discrimination against female biologists. The point is that if only a small minority of the most influential biologists are women, and if there are common differences in how men and women do biology, then the views of the latter are going to be less well represented.

To show how overblown this worry is, Sommers then flashes this image of all those women winning Nobel Prizes in “that field” (actually the prizes are for “Physiology and Medicine,” since there is no Nobel for biology):


Those women sure seem to be flourishing. And that’s every woman who ever won a Nobel in Physiology and Medicine — all 10 of them. Since the 1940s, when the first of these women flourished, men have been awarded 162 Nobels in that field — the other 94% of the prizes. The peak decade was in the 2000s, when women won 15% of the prizes (the most recent in 2009).

At Wisconsin, the single “feminist biology” postdoc will also develop an undergraduate course in gender and biology. This seems like a fine idea. Maybe it will encourage even more women to overrun the biological sciences. Call me naive, but we’re still not exactly drowning in female biologists.

After going on to pick on a few individual feminists, Sommers concludes that:

…feminist theory [has] been built on a foundation of paranoia about the patriarchy, half-truths, untruths, oversimplifications, and it’s immune to correction.

Raising the question: If feminism is rubber, and the Factual Feminist is glue, does what she says bounce off feminism and stick to her?

Full disclosure: My mother is a biologist. And a feminist. So you know I’m right. And objective.


Fundamentally opposed to science?

Conservative religious fundamentalists really don’t trust the scientific establishment.

In the discussion of academia’s liberalism, we should also consider the public’s mistrust of science, especially the conservative and fundamentalist public. Why would people who don’t trust science become scientists?

Last year Gordon Gauchat reported in American Sociological Review that Americans’ trust in the scientific community was holding steady except for political conservatives and those who attend church regularly, and that the trend was not explained by the lower education levels of conservatives or religious people (in fact, educated conservatives expressed the lowest levels of trust in science). His conclusion was that the trend showed the politicization of science, which is not the way modernity is supposed to go.

In response, Darren Sherkat blogged that Gauchat underestimated the importance of religion in explaining conservatives’ opposition to science because he only used the General Social Survey’s measure of the frequency of religious attendance instead of a measure of beliefs. And he provided a chart from the GSS showing that religious fundamentalists had lower trust in science whether they were Republicans or not. Sherkat wrote:

Any social scientist who studies politics, religion, and science should know that the reason why Republicans are at war against science is to court the vote of fundamentalist Christian simpletons who are opposed to science and reason. … What drives Republican opposition to science is that more Republicans are fundamentalists who believe that the Bible is the literal word of god.

You got your fundamentalism in my conservatism

As I look at it, conservatism and fundamentalism are both at fault. My take on the trends shows that, in addition to the growing divide between politically conservative fundamentalists and politically liberal non-fundamentalists, liberal fundamentalists have grown more trusting of science, while conservative non-fundamentalists have grown less trusting.

I used the GSS from 1974 through the latest 2012 survey. To highlight the polarization I show only those who are “extremely liberal,” “liberal,” “conservative,” or “extremely conservative,” leaving out those who are “slightly” liberal or conservative, or moderate. So this is not the whole population (I’ll return to that below).

The question was:

I am going to name some institutions in this country. As far as the people running these institutions are concerned, would you say you have a great deal of confidence, only some confidence, or hardly any confidence at all in them? … Scientific community.

It’s as close as we get to a question about science itself. For fundamentalism, GSS asked whether the respondent’s religion was fundamentalist, moderate, or liberal. I dichotomized it to fundamentalists versus everyone else (including people with no religion).*

These are the people expressing a great deal of confidence in the scientific community:

These trends are heavily smoothed (down to four decades), because the numbers bounce around a lot from year to year, as the samples are only between 60 and 220 in each cell in the individual years. To do a simple test of the trends, I ran a regression using time and interactions between time and politics-fundamentalism dummy variables, with controls for age and sex (old people and men hate science more than regular people, net of religion and politics).

The regression confirms what the graph shows: significant declines in trust among conservatives whether fundamentalist or not, and an increase in trust among liberal fundamentalists. The trend for liberal non-fundamentalists was flat. (Details on request.)

I left out of that analysis the people who were slightly conservative, moderate, or slightly liberal. That’s a shrinking majority of the population, which breaks down like this from the 1970s to the last decade:

So the bad news for science is that the increasingly anti-science groups are increasing in the population: conservative fundamentalists and non-fundamentalists. The big green majority is not growing more or less anti-science (even when you break it down by fundamentalism), but it’s also shrinking. The liberal fundamentalists are getting more into science, but also vanishing.

Just wait till they find out (some) sociology is part of the “scientific community.”

Note: This is a blog-post, not peer-reviewed research. I might be wrong.

* Sherkat uses a question about how to interpret the Bible (literal word of God, inspired word of God, book of fables) instead of the fundamentalism question. 95% of the people who described themselves as having a “fundamentalist” religion describe the Bible as either the literal or the inspired word of God.




Responses on fatherhood: hormones, science and god

The fatherhood post yesterday has gotten (for this blog) a lot of readers and some interesting responses. As I wrote out some extended, disorganized comment responses, I realized I may as well elevate them to an independent post (still a disorganized rant though).

I like the discussion by the authors on the Scientific American blog suggested by szopeno. Like I said in the original post, it’s quite reasonable that caring behavior affects hormone levels, as we know things like stress and fear do as well, with all kinds of mental and physical effects. If you randomly subjected some people to competitive athletic coaching, and handed others an infant, I wouldn’t be surprised to see the competition people behaving more aggressively and the baby-holders being more nurturing on average three months later. That would be interesting.

What is the implication? Are we shocked that some aspects of fatherhood (or childcare or sex) provoke a “biological” response? If that shocks you, you might like to know that by simply showing people pictures of other people behaving in certain ways, their bodies are more likely to undergo spontaneous physical transformations. Just from sitting there looking at pictures! Also, if you inject an athlete with testosterone he can ride his bike really fast.


It does not follow from these findings of a hormonal response to life events that we should promote certain family arrangements as “natural,” which is where Wilcox and the religious-sociological complex are taking this. If the goal is to change men’s testosterone levels, that might be done with medication. If the goal is to reduce aggressiveness, try teaching meditation in public schools. If we want people to be better parents, we can give them jobs, healthcare, housing and childcare support.

We have lots of ways of trying to promote happiness and pro-social behavior. However, like the crazy list of potential risks and side effects for men taking low-T medication, there are consequences to any such intervention.

Fortunately for individual freedom and human rights, some of us know that we can punish or prevent bad behavior — and reward or encourage good behavior — without attacking or rewarding whole status categories of people. Children with rich, married, college-educated parents are more likely to get into and finish college. So, we ought to fund a public school system, fund student loans for college — and also protect the children of the evil, sick or ineffective rich, married, college-educated parents from harm. But that doesn’t mean we should sterilize poor people.

So, is fatherhood good?

It’s not a question with one answer. One of the things Wilcox and the family “gold standard” promoters do is find ways that people in “traditional” families are doing better on average and use that to promote family conformity. But the averages conceal the sources of variation. Comparing the average father to the average non-father won’t tell you much about how fatherhood affects men because fatherhood occurs along with so many combinations of other transitions, experiences and resources. If you randomly assigned fatherhood to random men — at random moments in their lives — you could come up with an answer. Otherwise I’m not optimistic, and if it’s not answerable I doubt it’s a good question for social science.

Imagine three sets of outcomes: money, happiness and healthiness. Each is affected by social background and context. Then consider men entering fatherhood with different levels of each beforehand, and see how each outcome changes for all the different combinations (e.g., income changes for rich, happy, healthy; income changes for poor, happy, healthy; etc.). The possibilities multiply. If you’re Brad Wilcox you can work back from your goal — married nuclear families — and compare them to everyone else to cherry-pick any worse outcome at any time, and lo, discover that the Bible was right after all. If you really want to know it’s not so easy.

I haven’t yet read Doing the Best I Can: Fatherhood in the Inner City, the new book by Kathryn Edin and Timothy J. Nelson, but that seems promising for an in-depth look at fatherhood in the flow of men’s lives, with a lot of attention to the social context (education, employment, incarceration, complex families and relationships, etc.).


Don’t take my Word for it

If you start from a God-given definition of what’s good, and science can’t change that, then science becomes just a convenient way of explaining what you already knew, which is not science — it’s what the Church calls “natural law.”

Wilcox denies that’s how it works, naturally. At a conference on the family and natural law, he was quoted as saying,

Our support for the renewal of marriage is not predicated on some … religious worldview. Rather, it’s based on a reasonable understanding of the human condition that is accessible to all men and women of good will. … Evidence suggests to us that intact, biological marriage is still the gold standard.

That depends on what you mean by “predicated.” Years before the “Regnerus affair,” during which Mark Regnerus joined Wilcox in a scheme to use science against marriage equality in the courts, Regnerus gave his view of the importance of (a certain kind of) marriage, and it did not originate from his scientific training:

The importance of Christian marriage as a symbol of God’s covenantal faithfulness to his people—and a witness to the future union of Christ and his bride—will only grow in significance as the wider Western culture diminishes both the meaning and actual practice of marriage. Marriage itself will become a witness to the gospel.

That divine law and natural law do not conflict is an article of faith (literally). I think Pope Leo XIII put it well when he wrote:

Now, reason itself clearly teaches that the truths of divine revelation and those of nature cannot really be opposed to one another, and that whatever is at variance with them must necessarily be false. Therefore, the divine teaching of the Church, so far from being an obstacle to the pursuit of learning and the progress of science, or in any way retarding the advance of civilization, in reality brings to them the sure guidance of shining light.

Thus, rather than see science as a candle in the dark, natural law says that science needs a candle in the dark, and God has one. Could any research penetrate that mindset? If your research contradicts the “truths of divine revelation” then your research is wrong. Try again! Science in this vein is just looking for ways to convince secular society that the Church is already right. It is a self-fulfilling prophecy (literally). From the same natural law conference news report:

While there’s limited data on the effects of same-sex marriage on children, Wilcox hypothesized that in a few years, research will show that children in lesbian or gay family situations will exhibit some of the same problems as children from father-less or cohabiting relationships.

That conference was in January 2011. At that point Wilcox already had the New Family Structure Study machinery in motion, which would end up confirming to the faithful what they already knew.


Take it from the Pope


For the “World Day of Peace,” which is today, instead of congratulating the newlyweds, who are upholding the transformed but still living (for better or worse, in sickness and in health) institution of marriage, Pope Benedict (Ratzinger) issued a statement that included this about homogamous marriage:

There is also a need to acknowledge and promote the natural structure of marriage as the union of a man and a woman in the face of attempts to make it juridically equivalent to radically different types of union; such attempts actually harm and help to destabilize marriage, obscuring its specific nature and its indispensable role in society.

These principles are not truths of faith, nor are they simply a corollary of the right to religious freedom. They are inscribed in human nature itself, accessible to reason and thus common to all humanity. The Church’s efforts to promote them are not therefore confessional in character, but addressed to all people, whatever their religious affiliation. Efforts of this kind are all the more necessary the more these principles are denied or misunderstood, since this constitutes an offence against the truth of the human person, with serious harm to justice and peace.

I’m not enough of a Pope-ologist to know how rare this is, but what struck me was his claim that his opinion is “accessible to reason and thus common to all humanity.”

There is a convention in the U.S. that we can criticize each other’s opinions, but it’s impolite to criticize each other’s beliefs (as long as those beliefs are religious, meaning not too recent in origin). So it’s fine for me to say that you are wrong about secular subjects, like physics and sports, but it’s impolite to say you are wrong if you believe that God speaks directly to you or that cavemen played with dinosaurs. Or, more directly relevant to the Pope, scientists can say that virgin conception is generally unlikely, but it would be impolite to say it never ever happened, not even once.

Anyway, that’s a long way of getting around to the point that I find the Pope’s statement galling. If he wants to express political opinions, fine. I have no objection to that as long as the giant, multibillion-dollar real estate and educational empire he runs isn’t tax exempt.

But if he’s going to make statements with that hat on — that is, subject to a declaration of infallibility* — he should lay off the social-science proclamations. If he wants to argue in the realm of reason, rather than faith, then we may weigh his record of expressed belief in fairy tales against his scientific credibility.

Believe it or not

Learning as I go here: turns out the Pope has a whole scientific academy called the Pontifical Academy of Sciences (where the “peer review” is not done by your peers, if you know what I mean). And naturally they’ve been all over this subject of reason and faith. I read a 2006 talk titled “Secularism, Faith and Freedom,” which was apparently presented to this audience:


And I thought the American Sociological Association conference was a dynamic scene!

The paper says it’s necessary for religious people to argue their positions freely in a secular state’s public square. These positions include, “Faith is the root of freedom,” and “a proper secularism requires faith.” That is because liberal democracy otherwise is a moral vacuum of pragmatic consumerism with no higher purpose. So I gather that, just as any “gaps” in the fossil record summon Creation as an explanation, so does any lack of morality in the public sphere demand to be filled by faith — specifically, a “Creator who addresses us and engages us before ever we embark on social negotiation.” Absent that presence, “the liberal ideal becomes deeply anti-humanist.”

Although, after reading this whole paper and the Pope’s statement, I confess (my word choice) that I’m not sure “humanist” is really what they’re going for.

* Can. 749 §1. By virtue of his office, the Supreme Pontiff possesses infallibility in teaching when as the supreme pastor and teacher of all the Christian faithful, who strengthens his brothers and sisters in the faith, he proclaims by definitive act that a doctrine of faith or morals is to be held.


