Greg Won writes: I manage a team tasked with, among other things, analyzing data on Air Traffic operations to identify factors that may be associated with elevated risk. I think it’s fair to characterize our work as “data mining” (e.g., using rule induction, Bayesian, and statistical methods). One of my colleagues sent me a link […]

**Bayesian Statistics** category.

## Avoiding model selection in Bayesian social research

One of my favorites, from 1995. Don Rubin and I argue with Adrian Raftery. Here’s how we begin: Raftery’s paper addresses two important problems in the statistical analysis of social science data: (1) choosing an appropriate model when so much data are available that standard P-values reject all parsimonious models; and (2) making estimates and […]

## Dave Blei course on Foundations of Graphical Models

Dave Blei writes: This course is cross listed in Computer Science and Statistics at Columbia University. It is a PhD level course about applied probabilistic modeling. Loosely, it will be similar to this course. Students should have some background in probability, college-level mathematics (calculus, linear algebra), and be comfortable with computer programming. The course is […]

## Discussion of “Maximum entropy and the nearly black object”

From 1992. It’s a discussion of a paper by Donoho, Johnstone, Hoch, and Stern. As I summarize: Under the “nearly black” model, the normal prior is terrible, the entropy prior is better and the exponential prior is slightly better still. (An even better prior distribution for the nearly black model would combine the threshold and […]

## “A hard case for Mister P”

Kevin Van Horn sent me an email with the above title (ok, he wrote MRP, but it’s the same idea) and the following content: I’m working on a problem that at first seemed like a clear case where multilevel modeling would be useful. As I’ve dug into it I’ve found that it doesn’t quite fit […]

## My courses this fall at Columbia

Stat 6103, Bayesian Data Analysis, TuTh 1-2:30 in room 428 Pupin Hall: We’ll be going through the book, section by section. Follow the link to see slides and lecture notes from when I taught this course a couple years ago. This course has a serious workload: each week we have three homework problems, one theoretical, […]

## Discussion with Sander Greenland on posterior predictive checks

Sander Greenland is a leading epidemiologist and educator who’s strongly influenced my thinking on hierarchical models by pointing out that often the data do not supply much information for estimating the group-level variance, a problem that can be particularly severe when the number of groups is low. (And, in some sense, the number of groups […]

## Estimated effect of early childhood intervention downgraded from 42% to 25%

Last year I came across an article, “Labor Market Returns to Early Childhood Stimulation: a 20-year Followup to an Experimental Intervention in Jamaica,” by Paul Gertler, James Heckman, Rodrigo Pinto, Arianna Zanolini, Christel Vermeersch, Susan Walker, Susan M. Chang, and Sally Grantham-McGregor, that claimed that early childhood stimulation raised adult earnings by 42%. At the […]

## SciLua 2 includes NUTS

The most recent release of SciLua includes an implementation of Matt’s sampler, NUTS (link is to the final JMLR paper, which is a revision of the earlier arXiv version). According to the author of SciLua, Stefano Peluchetti: Should be quite similar to your [Stan’s] implementation with some differences in the adaptation strategy. If you have […]

## Stan World Cup update

The other day I fit a simple model to estimate team abilities from World Cup outcomes. I fit the model to the signed square roots of the score differentials, using the square root on the theory that when the game is less close, it becomes more variable. 0. Background As you might recall, the estimated […]
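The signed square root mentioned in this excerpt can be illustrated with a minimal sketch. This is not the code from the post; the function name `signed_sqrt` and the example score differentials are made up for illustration. The transform keeps the sign of the differential and takes the square root of its magnitude, which compresses large margins more than small ones.

```python
import math


def signed_sqrt(score_diff: float) -> float:
    """Return sign(score_diff) * sqrt(|score_diff|).

    Hypothetical helper for illustration; not taken from the original post.
    """
    return math.copysign(math.sqrt(abs(score_diff)), score_diff)


# Made-up score differentials (goals for minus goals against).
example_diffs = [3, -1, 0, -4]
transformed = [signed_sqrt(d) for d in example_diffs]
```

On this scale a 4-goal margin is only twice as far from zero as a 1-goal margin, so lopsided games pull less on the fit than they would on the raw scale.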