My oldest unpublished paper dates from my sophomore year in college. I can’t remember the title or all the details, but it was a solution to a differential-difference equation. The story of how it came about is here. A couple years after figuring out the proof, I wrote it up and submitted it to a journal, but it was rejected. In retrospect that’s no surprise, as I wrote the paper all by myself and had no idea of the appropriate style for a research paper. Actually, at the time I wrote it up, I don’t think I’d ever read a research article in any field! I wonder what happened to that article . . . if I ever find it, I’ll scan it in and post it here.

My undergraduate thesis, on the errors of Robert Axelrod’s application of game theory to trench warfare, was unpublished in any form for a long time. About 10 or 15 years after graduating, I rewrote the most interesting parts of the thesis in article form and submitted it to a political science journal; it was returned with lots of comments and I didn’t do anything more with it. Then a while later I turned it into a chapter in my edited book with Jeronimo Cortina, A Quantitative Tour of the Social Sciences, and then a few years after *that*, I was contacted by a journal that wanted to publish it, and it was published under the title Methodology as Ideology.

My next unpublished paper, from 1986, came out of my research on zone-melt-reconstruction of silicon; backstory here. I liked that one, and by then I’d read a bunch of research papers (and written a few), so I had a sense of how to get the results down on paper. I never finished the article, though, and I must have lost my copy of it.

My Ph.D. thesis, Topics in Image Reconstruction from Emission Tomography, from 1990, was never published. But lots of people asked me for copies of it; I must have sent out 100 or so. Parts of it went into various published articles, but most of it just served as an education for me.

Next one was from 1991: it was a Bayesian version of the iterative proportional fitting algorithm, using Gibbs sampling, or something like it, to draw from the posterior distribution in a model for contingency tables. This one even got some cites, I think. We never submitted it to a journal, though, because ultimately I wasn’t really happy with the model, which had no structure.

Hmmm, what else? There was a paper from 2003 that I’m a coauthor on, but which I think is just horrible so I removed it from my C.V. It was from a project where I served as statistical consultant, and I wasn’t happy with what was done, which was my fault as much as anyone else’s: had I insisted on something different/better, we probably could’ve done it, but I was just too lazy. That one’s not unpublished, but I’ve done my best to unpublish it, as it were.

After that, I have this list of 24 papers. Many of these, especially the more recent ones, don’t really count, as they’re submitted to journals and in some form will surely get published somewhere. Others on this list have already been published in abridged form but I’ve kept the original, longer versions on the website.

Here are the papers from that list which are unpublished in article form and will probably stay that way:

Fully Bayesian computing (with Jouni Kerman, from 2004). This is from Jouni’s thesis, and we introduce what is now called probabilistic programming. Now that everybody knows about probabilistic programming it’s not clear that there’d be any reason to publish this one.

Sampling for Bayesian computation with large datasets (with Zaiying Huang, from 2005). Our first divide-and-conquer algorithm. I never tried to publish this paper because the speed improvements from parallelization were so underwhelming in our example. It influenced our later work on expectation propagation, and we continue to cite this unpublished paper from 2005.

Moderation in the pursuit of moderation is no vice: the clear but limited advantages to being a moderate for Congressional elections (with Jonathan Katz, from 2007). I like this paper, and it’s mostly done, but we never bothered to get it into final shape. I used much of it in one of the chapters in Red State Blue State, which, in turn, has lots of research material that could’ve been made into articles, had we chosen to do so. (Back when Deb Nolan and I wrote our first edition of Teaching Statistics: A Bag of Tricks, I realized we had lots of publishable material so we quickly extracted about 10 articles from that book and published them in different places. But by the time I was writing Red State Blue State, six years later, I’d lost the motivation to churn out articles in that way. Not that I think there’s anything wrong with churning out articles: it’s a way to reach different audiences that might never otherwise see that material.)

One vote, many Mexicos: Income and vote choice in the 1994, 2000, and 2006 presidential elections (with Jeronimo Cortina and Naryana Lasala, from 2008). This is a spinoff of Red State Blue State; we used some of it in chapter 7 of that book. We submitted the paper to journals and revised it a few times; maybe it will appear at some point, I’m not sure.

Thoughts on new statistical procedures for age-period-cohort analyses (from 2008). This one was really annoying! The editor of the American Journal of Sociology invited me to write this as a comment on a paper to appear in their journal. I wrote this article, which I really like, and then the journal told me they didn’t want it. I never felt like taking the trouble to turn this into a stand-alone article. But it did help Yair and me set up our paper on the Great Society, Reagan’s revolution, and generations of presidential voting, which I’m sure will appear in a journal some day.

Visualizing distributions of covariance matrices (with Tomoki Tokuda, Ben Goodrich, Iven Van Mechelen, and Francis Tuerlinckx, from 2011, maybe?). I’m not actually sure why this never got published, as the paper is crisp and clean, with some good ideas. I guess it was nobody’s #1 priority, so once we got it rejected by a couple journals, we just let it slide.

Why ask why? Forward causal inference and reverse causal questions (with Guido Imbens, from 2013). I loooove this paper. I can’t remember if we ever submitted it anywhere. Guido and I talked about it with Avi Feller, and the consensus was that we’d need to do more literature review to get it acceptable to a journal. We made some plans but then never bothered to go through with them. Again, nobody’s first priority. I incorporated it into one of the causal inference chapters in my upcoming book with Jennifer, so maybe it will reach people in that way.

The problem with p-values is how they’re used (from 2013). Funny story about this one: the journal Ecology solicited it as a comment on a paper they were publishing. Then in the production process, I found out they wanted to charge me, I think it was $300. Huh? They asked me to write the article for them, I wrote it for free, and then they wanted to charge *me* $300?? It turned out this fee was in the fine print all along. I couldn’t believe it. So I said, just forget about it. Nothing I can do with the article, so I just kept it on the unpublished papers site.

Causal inference with small samples and incomplete baseline for the Millennium Villages Project (with Shira Mitchell, Rebecca Ross, Susanna Makela, Elizabeth Stuart, Avi Feller, and Alan Zaslavsky). This one has lots of good stuff; I have no idea if it was ever submitted to a journal in this form or if it was just cannibalized for other papers.

NO TRUMP!: A statistical exercise in priming (with Jonathan Falk, from 2016). An amusing parody, no chance of getting published. I guess I could’ve submitted it to arXiv on April 1, but that’s not a journal publication either. My earlier paper on zombies did get published, actually, in a sampler of modern writing for undergraduates! So all things are possible, I guess.

Attitudes toward amalgamating evidence in statistics (with Keith O’Rourke, from 2016). Someone invited me to write this for some journal, I forget which, and then Keith and I wrote this fine little piece, and the journal rejected it! How annoying. Their call, though. I don’t know what we’ll do with it.

I suppose my collaborators and I have many more unpublished articles to come. But I expect none of them will rival any of the articles in this list of the greatest works of statistics never published.

I’d be interested in your paper on visualizing covariance matrices. Sounds like a useful technique.

Statsgirl:

They’re all at the link above (repeated here for convenience).

You (they?) should at least put that one up on arXiv so it’s citeable (at least in some circles). It’s a great paper and I refer people to it all the time. Everyone should know about the LKJ prior (Lewandowski, Kurowicka, and Joe came up with it as a method for generating random correlation matrices and Ben Goodrich realized its potential as a prior and implemented it in Stan). Stan frees users from the tyranny of conjugate priors (you could only have Wishart or inverse Wishart priors on covariance matrices in BUGS and JAGS, but through the magic of unconstraining transforms, inverses, and Jacobians, Stan lets you use any old prior you can define in code).
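For anyone who hasn’t run into it, the LKJ density on a correlation matrix R is proportional to det(R)^(eta - 1): eta = 1 is uniform over correlation matrices, and eta > 1 pulls mass toward the identity. Here’s a minimal sketch of the unnormalized log density in plain Python. This is not Stan’s implementation; the function name is made up for illustration and the normalizing constant is left out.

```python
import math

def lkj_log_density_unnormalized(corr, eta):
    """Unnormalized LKJ log density: (eta - 1) * log det(corr).

    `corr` is a correlation matrix given as a list of lists.
    Hypothetical helper for illustration only; the normalizing
    constant is omitted.
    """
    n = len(corr)
    # Cholesky factor, so that log det(corr) = 2 * sum(log(diagonal)).
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                d = corr[i][i] - s
                if d <= 0:
                    raise ValueError("matrix is not positive definite")
                chol[i][j] = math.sqrt(d)
            else:
                chol[i][j] = (corr[i][j] - s) / chol[j][j]
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(n))
    return (eta - 1.0) * log_det

identity = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

# With eta > 1 the identity scores higher than a correlated matrix.
print(lkj_log_density_unnormalized(identity, 2.0))
print(lkj_log_density_unnormalized(correlated, 2.0))
```

With eta = 1 the function returns zero for every valid correlation matrix, which is the uniform case.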

Ah, you have “published” your “unpublished” papers. Enjoying the graphs :)

You cited my graphs on zombies, so I recently was asked to release them for use in that undergrad book on writing. I thought it was amusing.

I didn’t realize that The Garden of Forking Paths was unpublished. This was the first paper of yours I read, on the recommendation of a fellow graduate student who over lunch decided he was going to convince me to abandon using p-values. It made a big impression – at the time I knew to keep my eye out for “multiple comparisons” problems, but I had never considered that interpreting a p-value in the traditional sense requires us to make implicit (and typically false) claims about what would have been done had the data been different.

Ben:

The article was published in American Scientist; see here. But they insisted on removing the phrase “garden of forking paths” from the title and they introduced an embarrassing error in the very first paragraph (my bad for not catching it), so I prefer to point people to the preprint version.

You’ve forgotten this one:

Back in about, hmm, 1999 or so, you and I wrote a draft of a paper summarizing all of the stuff from the LBNL ‘High-Radon Project’, mostly recapitulating stuff that was already published piecemeal. A senior colleague asked us not to submit it yet, and even suggested that we quit working on it, because he was working on his own summary and would we please consider merging our stuff with his rather than publishing separately? I checked with him every few months for a couple of years; he never finished his draft but also never agreed that he wasn’t going to. In the end you and I decided to drop it. I think this was fine: the important stuff had already been published, and the political and practical issues involved with monitoring and remediating high residential radon concentrations had passed beyond the influence of science.

Phil,

Sure, but I’m only talking about unpublished papers here, not unfinished papers. If I were to count unfinished papers, I’d have a list of hundreds!

I have a few of these. In undergrad I wrote a paper about why 987654321/123456789 is so close to 8. I sent it to a journal without knowing anything about academic writing and it got rejected in three minutes.
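The near-miss in that ratio is easy to check with exact integer arithmetic. A quick sketch in Python (the variable names are just for illustration):

```python
from fractions import Fraction

numerator = 987654321
denominator = 123456789

# Integer division: how many whole times does the denominator fit,
# and what is left over?
quotient, remainder = divmod(numerator, denominator)

# The exact distance of the ratio above the quotient, as a reduced fraction.
gap = Fraction(numerator, denominator) - quotient

print(quotient, remainder)
print(gap, float(gap))
```

The quotient is 8 with remainder 9, and since 123456789 = 9 × 13717421, the ratio is exactly 8 + 1/13717421, which is why it lands so close to 8.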