
Stan at NIPS 2014

For those in Montreal, a few of the Stan developers will be giving talks at the NIPS workshops this week.  On Saturday at 9 AM I’ll be talking about the theoretical foundations of Hamiltonian Monte Carlo at the Riemannian Geometry workshop, while Dan will be talking about Stan at the Software Engineering workshop Saturday afternoon at 4 PM.  We’ll also have an interactive poster at the Probabilistic Programming workshop on Saturday — it should be an…attractive presentation.


If you’re up early, be sure to check out Matt Hoffman talking first thing on Saturday, at 8:30 AM in the Variational Inference workshop.

Dan and I will be around Thursday night and Friday if anyone wants to grab a drink or talk Stan.

The inclination to deny all variation


One thing we’ve been discussing a lot lately is the discomfort many people—many researchers—feel about uncertainty. This was particularly notable in the reaction of psychologists Jessica Tracy and Alec Beall to our “garden of forking paths” paper, but really we see it all over: people find some pattern in their data and they don’t even want to consider the possibility that it might not hold in the general population. (In contrast, when I criticize these studies, I always make it clear that I just don’t know, that their claim could hold in general, I just don’t see convincing evidence.)

The story seems pretty clear to me (but, admittedly, this is all speculation, just amateur psychology on my part): in general, people are uncomfortable with not knowing and would like to use statistics to create fortresses of certainty in a dangerous, uncertain world.

Along with this is an even more extreme attitude, which is not just to deny uncertainty but to deny variation. We see this sometimes in speculations in evolutionary psychology (a field where much well-publicized work can be summarized by the dictum: Because of evolutionary pressures, all people are identical to all other people, except that all men are different from all women and all white people are different from all black people [I’ve removed that last part on the advice of some commenters; apparently my view of evolutionary psychology has been too strongly influenced by the writings of Satoshi Kanazawa and Nicholas Wade.]). But even in regular psychology this attitude comes up, of focusing on similarities between people rather than differences. For example, we learn from Piaget that children can do X at age 3 and Y at age 4 and Z at age 5, not that some children go through one developmental process and others learn in a different order.

We encountered an example of this recently, which I wrote up, under the heading, “When there’s a lot of variation, it can be a mistake to make statements about ‘typical’ attitudes.” My message there is that sometimes variation itself is the story, but there’s a tendency among researchers to express statements in terms of averages.

But then I recalled an even more extreme example, from a paper by Phoebe Clarke and Ian Ayres that claimed that “sports participation [in high school] causes women to be less likely to be religious . . . more likely to have children . . . more likely to be single mothers.” In my post on this paper a few months ago, I focused on the implausibility of the claimed effect sizes and on the problems with trying to identify individual-level causation from state-level correlations in this example. At the time I recommended they give their results a more descriptive spin, both in their journal article and in their mass-media publicity.

But there was one other point that came up, which I wrote about in my earlier post but want to focus on here. The article by Clarke and Ayres includes the following footnote:

It is true that many successful women with professional careers, such as Sheryl Sandberg and Brandi Chastain, are married. This fact, however, is not necessarily opposed to our hypothesis. Women who participate in sports may “reject marriage” by getting divorces when they find themselves in unhappy marriages. Indeed, Sheryl Sandberg married and divorced before marrying her current husband.

This footnote is a striking (to me) example of what Tversky and Kahneman called the fallacy of “the law of small numbers”: the attitude that patterns in the population should appear in any sample, in this case even in a sample of size 1. Even according to their own theories, Clarke and Ayres should not expect their model to work in every case! The above paragraph indicates that they want their theory to be something it can’t be; they want it to be a universal explanation that works in every example. Framed that way, this is obvious. My point, though, is that it appears that Clarke and Ayres were thinking deterministically without even realizing it.
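A quick simulation makes the point concrete. Even if sports participation really did shift the rate of some outcome at the population level (the rates below are invented purely for illustration, not taken from Clarke and Ayres), individual counterexamples would still be everywhere, so a single named case tells us essentially nothing either way:

```python
import random

random.seed(1)

# Hypothetical rates, invented purely for illustration: suppose former
# high-school athletes show some outcome at 15% vs. 10% for non-athletes.
p_athlete, p_non_athlete = 0.15, 0.10
n = 10_000

# Simulate whether each athlete shows the outcome.
athletes = [random.random() < p_athlete for _ in range(n)]

# Even with a real population-level difference, the overwhelming majority
# of athletes do NOT show the outcome -- so any handful of named
# individuals is uninformative about the aggregate claim.
counterexamples = n - sum(athletes)
print(f"{counterexamples / n:.0%} of simulated athletes are counterexamples")
```

The 5-percentage-point gap is perfectly real in the simulated population, yet roughly 85% of the "treated" group contradicts the pattern, which is why arguing about Sheryl Sandberg gets you nowhere.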

Don’t believe everything you read in the (scientific) papers


A journalist writes in with a question:

This study on [sexy topic] is getting a lot of attention, and I wanted to see if you had a few minutes to look it over for me . . .

Basically, I am somewhat skeptical of [sexy subject area] explanations of complex behavior, and in this case I’m wondering whether there’s a case to be made that the researchers are taking a not-too-strong interaction effect and weaving a compelling story about it. . . .

Anyway, obviously not expecting you to chime in on [general subject area], but thoughts on whether the basic stats are sound here would be appreciated.

The paper was attached, and I looked at it.

My reply to the journalist:

I’ll believe it when I see a pre-registered replication. And not before.

P.S. Just to be clear, I wouldn’t give that response to every paper that is sent to me. I’m convinced by lots and lots of empirical research that hasn’t been replicated, pre-registered or otherwise. But in this case, where there’s a whole heap of possible comparisons, all of which are consistent with the general theory being expressed, I’m definitely concerned that we’re in “power = 0.06” territory.
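For readers wondering what “power = 0.06” territory means concretely: if the true effect is small relative to its standard error, a two-sided test at the 5% level rejects barely more often than it would under pure noise. A sketch of the arithmetic (the effect-to-standard-error ratio of 0.3 is an assumed illustrative value, not a number from the paper in question):

```python
from math import erf, sqrt

def norm_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def power_two_sided(effect_over_se: float, alpha_z: float = 1.96) -> float:
    """Power of a two-sided z-test when the true effect sits
    effect_over_se standard errors away from zero."""
    d = effect_over_se
    return norm_cdf(-alpha_z + d) + norm_cdf(-alpha_z - d)

# With a true effect only 0.3 standard errors from zero (assumed value),
# power is about 0.06 -- barely above the 0.05 false-positive rate, so a
# statistically significant result is nearly as likely to have the wrong
# sign or a wildly exaggerated magnitude as to be informative.
print(round(power_two_sided(0.3), 3))
```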

Bayesian Cognitive Modeling Models Ported to Stan

Hats off to Martin Šmíra, who has finished porting the models from Michael Lee and Eric-Jan Wagenmakers’ book Bayesian Cognitive Modeling to Stan.


Martin managed to port 54 of the 57 models in the book and verified that the Stan code got the same answers as BUGS and JAGS.


Buggy-whip update


On 12 Aug I sent the following message to Michael Link, president of the American Association for Public Opinion Research.  (I could not find Link’s email on the AAPOR webpage, but I did some googling and found an email address for him.)

Dear Dr. Link:

A colleague pointed me to a statement released under your name criticizing non-probability opt-in surveys, in which you wrote that “these methods have little grounding in theory.”  Can you please explain what you mean here?  As a statistician and political scientist who has worked in survey research for over twenty years, I find that this statement makes no sense to me. I’ve written about this here, and here (with my colleague David Rothschild), but I thought it would make sense to ask you directly what you are getting at.  The problem I have is that, when probability samples have 90% nonresponse rates, it’s not clear that they have any “grounding in theory” either.  But in your statement you seem to be implying that traditional polls (such as those conducted by Gallup) have some “grounding in theory” that non-probability sampled polls (such as those conducted by YouGov) do not.  And I can’t see where you’re getting that from.  Some references to the relevant theory would help, perhaps.

Thanks much.

Andrew Gelman

I received no response.  If any of you know Michael Link, perhaps you could contact him directly?

I get frustrated when people don’t respond to my queries.  Just to be clear, I’m not saying that Link has any duty or obligation to respond to me:  I’ve done some service for AAPOR on occasion but I’m not a member, and I’m sure he has better things to do at work than to respond to requests for references from statisticians.

On the other hand, Link did stick his neck out and make a strong claim about the theory of survey sampling, so I assume he’d be interested in either backing up his claim with evidence, or backing down from his claim, now that it’s been questioned by an economist and a statistician who work in survey research.

Unless I hear further, I’ll have to assume that Link does not actually have any evidence to back up his claims regarding the theoretical grounding of various survey methods, and I’ll have to continue to assume that the official statement which he signed, badmouthing non-probability opt-in surveys, is just a collection of fine words with no theoretical grounding.

I’m happy to be corrected, though, so if anyone can contact Michael Link or whoever wrote that statement and find out what they meant, I’d be interested in hearing what they have to say.

That’s the scholarly way:  we consider our claims critically and back them up with evidence, we don’t just do drive-by criticism.

P.S.   I see from this news article by Paul Voosen that Link “regrets some of the language chosen for the letter. It was meant as a caution to the public—especially news outlets—and not as a condemnation of the research, he says. ‘Maybe the statement could have been a little clearer.'”  I’m not sure exactly what he means by this since I don’t see that he’s released an updated statement, but perhaps “Maybe the statement could have been a little clearer” is bureaucrat-ese for “Whoops—we made some false statements here.” Or maybe he really does have some research on “grounding in theory” that he can share with us. We’ll see. The comment box remains open, or he could just send me an email.

P.P.S. Commenter G. H. posts some links to reports from 2011 and 2012 on problems with online surveys. I don’t see anything useful in these papers on “theoretical grounding” (yes, the first paper linked by G. H., from Langer Research Associates, says that opt-in surveys “operate outside the realm of inferential statistics, meaning there is no theoretical basis on which to conclude that they produce valid and reliable estimates of broader public attitudes or behavior,” but it’s hard to make much of this criticism, relative to telephone surveys, in an era where the latter are subject to 91% nonresponse rates), but they do present some evidence from 2007-2011 on empirical inaccuracies of estimates from opt-in online surveys. I do think that, in doing such surveys, it can be necessary to put in some effort to adjust using poststratification. I don’t see this as an issue of “theoretical grounding” but it is a real practical concern. And of course if traditional telephone surveys had no problems we’d all be continuing to use them. But they do have problems, and we’ve been doing a lot of research on survey adjustment, and . . . I still see no evidence regarding the point about “theoretical grounding.”
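The poststratification adjustment mentioned above is mechanically simple: estimate within demographic cells, then reweight the cell estimates by known population shares rather than by the (self-selected) sample composition. A toy sketch, with all numbers invented for illustration:

```python
# Toy opt-in survey: (cell mean, sample size) by age group.
# All numbers are made up for illustration.
sample = {
    "18-29": (0.62, 500),
    "30-64": (0.55, 300),
    "65+":   (0.40, 200),
}
# Known population shares, e.g. from the census.
population_share = {"18-29": 0.20, "30-64": 0.55, "65+": 0.25}

# The raw estimate weights cells by who happened to opt in...
n_total = sum(n for _, n in sample.values())
raw = sum(mean * n for mean, n in sample.values()) / n_total

# ...while the poststratified estimate weights them by population shares.
poststratified = sum(sample[c][0] * population_share[c] for c in sample)

print(f"raw: {raw:.3f}  poststratified: {poststratified:.3f}")
```

Here the opt-in sample over-represents the young, so the raw mean is pulled upward; reweighting by the population shares corrects the cell composition, though of course it cannot fix selection bias within cells.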

Steven Pinker on writing: Where I agree and where I disagree


Linguist and public intellectual Steven Pinker recently published an article, “Why Academics Stink at Writing.” That’s a topic that interests me! Like Pinker, I’ve done a lot of writing, both for technical and general audiences. Unlike Pinker, I have not done research on linguistics, but I’ll do my best to comment based on my own experiences.

Pinker begins as follows:

Together with wearing earth tones, driving Priuses, and having a foreign policy, the most conspicuous trait of the American professoriate may be the prose style called academese. . . . No honest professor can deny that there’s something to the stereotype. . . . But the familiarity of bad academic writing raises a puzzle. Why should a profession that trades in words and dedicates itself to the transmission of knowledge so often turn out prose that is turgid, soggy, wooden, bloated, clumsy, obscure, unpleasant to read, and impossible to understand?

I’ll return at the end to the bit about “having a foreign policy”—this is the sort of laugh line that I think works better in a live speech than in a written article—but first I will discuss the ways in which I agree with Pinker’s claim that academic writing is difficult, and how I disagree with his explanations.

Where I agree

Pinker puts it well when he writes:

Fog comes easily to writers; it’s the clarity that requires practice. The naïve realism and breezy conversation in classic style are deceptive, an artifice constructed through effort and skill.

Writing is non-algorithmic. Just about every sentence I write, I need to reconfigure for the purpose of increasing clarity.

And, yes, I realize that the previous sentence is ugly; that’s actually part of my point, that when we put in the effort to make our sentences clearer, they can get ugly, and the sentences’ ugliness then gets in the way of understanding.

That’s part of what makes writing non-algorithmic: even when we know what we want to say, it can take lots of iterations to get there.

And I agree with Pinker that the lack of good feedback is a problem. Academics, like most other people, don’t get a lot of direct or indirect comments on their writing style, so they don’t learn well what has worked and what has not worked or how to do better.

Where I disagree

OK, so you all know about Sturgeon’s law (“ninety percent of everything is crap”).

To put it in the context of Pinker’s article: Why do academics stink at writing? Why does almost everybody stink at writing? Writing is hard.

Pinker writes:

But the familiarity of bad academic writing raises a puzzle. Why should a profession that trades in words and dedicates itself to the transmission of knowledge so often turn out prose that is turgid, soggy, wooden, bloated, clumsy, obscure, unpleasant to read, and impossible to understand?

And also this:

A third explanation shifts the blame to entrenched authority. People often tell me that academics have no choice but to write badly because the gatekeepers of journals and university presses insist on ponderous language as proof of one’s seriousness. This has not been my experience, and it turns out to be a myth. In Stylish Academic Writing (Harvard University Press, 2012), Helen Sword masochistically analyzed the literary style in a sample of 500 scholarly articles and found that a healthy minority in every field were written with grace and verve.

The above seems completely consistent with the notion that it’s difficult to write well, that academics, just like other people, would like to write well but they don’t really know how.

Partly because the path to writing well is not so clear. If it were clear, we’d all have learned to write well, back in high school.

Also there’s the problem with feedback, as discussed above.

Why is academic writing so bad, and why is this such a surprise to Pinker?

In short, I think most academic writing is bad for the same reason that most writing is bad: because writing is hard. It’s difficult to write clearly, it takes effort and it takes practice, and, on top of all that, many people don’t see the path from bad writing to good writing.

But many people have to write, as part of their job. I don’t mean this cynically, in a “publish or perish” sort of way. I mean that if you do research scholarship, you want to convey this to others, and writing is the most direct way to do this. (Maybe at some point we’ll shift to papers being delivered as Youtube mini-lectures, but we’re not there yet.)

So, to me, the problem is simple. Writing is hard, it’s hard to learn and it’s hard to teach, but lots of people use writing to express their ideas. Academics are expected to write well but they’ve never learned how.

The next question, then, is why is Pinker so surprised? Why does he need so many pages to make this point? I’m not sure, but I wonder if he’s forgotten how much work it’s taken him to learn to write fluidly. Writing in a direct voice is easy for him, so it’s natural for him to think that it would be just as easy for other professors to write well, if only they would clear their heads.

For example, Pinker writes:

It’s easy to see why academics fall into self-conscious style. Their goal is not so much communication as self-presentation—an overriding defensiveness against any impression that they may be slacker than their peers in hewing to the norms of the guild. Many of the hallmarks of academese are symptoms of this agonizing self-­consciousness . . .

Sure, defensiveness is part of it. But I suspect that lots and lots of professors (and others) would write more directly, if they just got some feedback on how to do it. It’s my impression that we write prose the way we write code, by working from templates, snapping together segments from different places, etc. And this leads to what looks to Pinker like a self-conscious style but looks to me just like awkwardness, the literary equivalent of someone showing up to a formal event wearing ill-fitting clothes from Sears.

That bit about professors “having a foreign policy”

As promised, here’s my reaction to the very first bit of Pinker’s article, where he characterizes professors as “wearing earth tones, driving Priuses, and having a foreign policy.”

The profs I know don’t wear earth tones and don’t drive Priuses so I don’t really have anything to say about that, except that I guess I don’t hang out with the right class of professor—maybe I need to spend more time in Cambridge?—but I do have a comment on the “foreign policy” line.

My reaction is: what’s so funny about professors having views on foreign policy? We live in a democracy, and we all have a right to express our views. I teach in the political science department and many of my colleagues have expertise in foreign policy. But, even for profs who have no particular knowledge in this area, they’re still citizens (if not of the U.S., then of some other country).

Let me put it another way. Why is it so laughable that professors express their views on foreign policy and even try to affect policy? Like it or not, we do have public participation in this country, and I see no good reason why active voice on foreign policy should be restricted to the likes of David Brooks, Michael Moore, and whatever companies and P.R. firms happen to be lobbying in Washington, D.C., right now.

Speaking both as a political scientist and as a citizen, I think political participation should be encouraged, not mocked.

P.S. Pinker points out that I mischaracterized what he wrote. He did describe “having a foreign policy” as one of the four “most conspicuous trait[s] of the American professoriate,” but nowhere did he say there was anything wrong or even funny about having a foreign policy.

I was reading the “having a foreign policy” quote as a mockery because of how it is phrased (I’m in agreement with commenter Daniel Lakeland here) but it’s true that Pinker is not saying anything about this directly.

And of course it’s just a throwaway line in his article; it just bothers me because I object more generally when people disparage or mock political participation. In this case, though, Pinker is alluding to a stereotype rather than expressing a position himself, so to get annoyed at him for the “foreign policy” comment is a bit of blaming of the messenger.

On deck this week

Mon: Steven Pinker on writing: Where I agree and where I disagree

Tues: Buggy-whip update

Wed: The inclination to deny all variation

Thurs: The Fallacy of Placing Confidence in Confidence Intervals

Fri: Saying things that are out of place

Sat: Don’t, don’t, don’t, don’t . . . We’re brothers of the same mind, unblind

Sun: I like the clever way they tell the story. It’s a straightforward series of graphs but the reader has to figure out where to click and what to do, which makes the experience feel more like a voyage of discovery.

Who should write the new NYT chess column?

Matt Gaffney gives these “three essential characteristics” for writing “a relevant, interesting weekly chess column” in 2014:

1. It must be written by someone who is deeply involved in the chess world. Summaries of information that is already available online won’t cut it anymore. And since newspapers can’t afford to send columnists around the world to cover these big events firsthand, you need someone who’s already there.

2. They have to be world-class players, either past or present. Most likely past, since you won’t find too many active top players willing to spend playing and preparation time writing a weekly column for a general audience. But a great player’s personal experiences and ability to draw comparisons with players and games of yore is as essential to interpreting current chess events as it is in any other game or sport.

3. The person needs to be an engaging writer, highly opinionated, and preferably a bit of a character. Chess readers want informed, strong, and amusing opinions on events in the chess world, not just the who, what, when, and where. Experience writing a weekly column is a huge plus as well.

I agree with the spirit of these comments, and I respect his journalistic experience. But I disagree with most of what he has to say! To get a sense of where I’m coming from, see here; it’s the story of how I enjoy reading a chess blog written by players who are much much better than me, while being much much worse than world-class.

So let me go through Gaffney’s points 1-3:

1. I don’t think the author of this hypothetical column needs to be an insider. He or she just needs to care about the game. Sure, an insider perspective would be valuable, but look at the numbers: there are many thousands of thoughtful outsiders, not so many insiders (and many of them would, I assume, feel too constrained to give their most interesting opinions on the issue).

2. I don’t think the author of this hypothetical column needs to be a world-class player. Yes, “a great player’s personal experiences” would be great, but they’re hardly necessary. Indeed, I think Gaffney himself would realize this upon reflection. After all, he writes that these experiences and ability are “as essential to interpreting current chess events as it is in any other game or sport.” But many—perhaps most—of the great sportswriters were not world-class athletes. Red Smith? A. J. Liebling (ha!)? Bill James? Roger Angell? You get the picture. Great sportswriters are typically great writers. To say that the Times chess columnist has to be a world-class player makes as much sense as to say that the Times baseball columnist has to be able to hit (or have been able to hit) a major-league curveball.

OK, don’t get me wrong, there are differences between a spectator sport such as baseball or boxing, and a participatory sport such as chess, or judo, say. I could imagine someone being an OK baseball writer with no experience at all playing the game, whereas I can’t imagine being interested in a chess column written by a non-player (or, for that matter, by someone who can only play as well as I can). I don’t think the columnist needs to supply unique insights into the games, but he or she should have some sense of what’s happening.

3. I agree that it is a plus for the writer to be engaging and opinionated; this time, the analogy with conventional sportswriting seems to work very well. When we think of the best sportswriters, that’s what we get. And the Times should be aiming for the best. But I disagree with the statement that the writer should be “preferably a bit of a character.” Sometimes it’s ok to be a “character” (Liebling, Bill James, Bill Simmons), other times it’s not necessary (to the extent that Red Smith, say, was a “character,” it was through his writings, not through any personally outrageous behavior on his part).

I suspect that there are many chess writers who could do a good job at a weekly newspaper column—at least for a while. Whether they could keep it up, week after week, for years, is another story. But maybe they could rotate through a few columnists and then keep whoever is still putting out good material a year later. Writing a column that is both accessible to patzers and interesting to experts, that’s not easy, but I think it can be done.

P.S. It’s hard to imagine that the Times will hire a chess columnist in any case. Why do I say this? Simple statistics. Gaffney’s article appeared in Slate on 14 Oct (I happened to come across it today via a link from a different piece in that magazine) and, as of 7 Dec, it’s received all of 6 comments. This suggests a stunning lack of interest in the topic! Really stunning. Actually, the number of comments is so low that it makes me wonder what’s going on. This was published in Slate, which gets about a zillion more hits than we do. And our posts on chess typically get lots more than 6 comments. So I have no idea what’s up here.

P.P.S. I posted this one on Sunday night when nobody is reading, because the topic is so damn unimportant! Still, it interests me; it relates to more general issues of communication, perhaps relevant when considering who becomes a newspaper science columnist. (I think a lot of these people have pretty advanced science knowledge but they’re not world-class scientists; in that way they’re like the chess bloggers who are solid, serious players but not grandmasters.)

Subtleties with measurement-error models for the evaluation of wacky claims

Paul Pudaite writes:

In the latest Journal of the American Statistical Association (September 2014, Vol. 109 No. 507), Andrew Harvey and Alessandra Luati published a paper [preprint here] — “Filtering With Heavy Tails” — featuring the phenomenon you had asked about (“…(non-Gaussian) models for which, as y gets larger, E(x|y) can actually go back toward zero.”).

See their Figure 1, p. 1113:


They have a nice, brief discussion of the figure (also p. 1113):

The Gaussian response is the 45-degree line. For low degrees of freedom, observations that would be seen as outliers for a Gaussian distribution are far less influential. As |y| → ∞, the response tends to zero.

They then note that:

Redescending M-estimators, which feature in the robustness literature, have the same property.

Unfortunately, they don’t provide any robustness literature references for redescending M-estimators. I guess it’s too well known!

(They do provide a reference for “the Huber M-estimator [which] has a Gaussian response until a certain point, whereupon it is constant”.)

The application in Harvey & Luati’s paper is railroad travel, so perhaps psychometricians still need to be alerted to these measurement error models.
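The redescending behavior can be checked numerically. Under a Gaussian prior on x with Student-t measurement error (a toy setup assumed here for illustration, not the exact model in Harvey and Luati’s paper), the posterior mean E(x|y) first tracks y and then falls back toward zero as y grows, because an extreme observation gets attributed to the heavy-tailed noise rather than to x:

```python
from math import exp, gamma, pi, sqrt

NU = 2.0  # degrees of freedom for the t noise (assumed toy value)

def t_pdf(t: float, nu: float = NU) -> float:
    """Density of Student's t with nu degrees of freedom."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)

def posterior_mean(y: float) -> float:
    """E(x|y) with x ~ N(0,1) and y = x + t-noise, by grid integration."""
    xs = [i * 0.01 for i in range(-800, 801)]
    weights = [exp(-x * x / 2) * t_pdf(y - x) for x in xs]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)

# The response is non-monotone: a moderate y pulls the estimate up,
# but a huge y is explained away as noise and the estimate redescends.
for y in (1, 3, 10, 30):
    print(f"E(x|y={y:>2}) = {posterior_mean(y):.3f}")
```

With Gaussian noise instead of t noise the same computation would give the familiar linear shrinkage E(x|y) = y/2, which never turns back toward zero; the heavy tail is what produces the redescending curve in their Figure 1.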

I last posted on this topic in 2011.

Plaig: it’s not about the copying, it’s about the lack of attribution

I think most of you understand this one already but there still seems to be some confusion on how plagiarism works, so here goes . . .