The other day we had a fun little discussion in the comments section of the sister blog about the appropriateness of stating forecast probabilities to the nearest tenth of a percentage point. It started when Josh Tucker posted this graph from Nate Silver: My first reaction was: this looks pretty but it’s hyper-precise. I’m a [...]
“Intrade to the 57th power”
David Pennock writes: http://PredictWiseQ.com is our (beta) prediction contest which aims to estimate not just the marginal probabilities of election outcomes this November, but millions of correlations among outcomes as well, like the chance Obama will win both Ohio and Florida, or the chance Romney will win if the September jobs numbers are negative. It’s [...]
Bayesian analogue to stepwise regression?
Bill Harris writes: On pp. 250-251 of BDA second edition, you write about multiple comparisons, and you write about stepwise regression on p. 405. How would you look at stepwise regression analyses in light of the multiple comparisons problem? Is there an issue? My reply: In this case I think the right approach is to [...]
Bayesian brains?
Psychology researcher Alison Gopnik discusses the idea that some of the systematic problems with human reasoning can be explained by systematic flaws in the statistical models we implicitly use. I really like this idea and I’ll return to it in a bit. But first I need to discuss a minor (but, I think, ultimately crucial) [...]
Columbo does posterior predictive checks
I’m already on record as saying that Ronald Reagan was a statistician so I think this is ok too . . . Here’s what Columbo does. He hears the killer’s story and he takes it very seriously (it’s murder, and Columbo never jokes about murder), examines all its implications, and finds where it doesn’t fit [...]
Fighting a losing battle
Following a recent email exchange regarding path sampling and thermodynamic integration (sadly, I’ve gotten rusty and haven’t thought seriously about these challenges for many years), a correspondent referred to the marginal distribution of the data under a model as “the evidence.” I hate that expression! As we discuss in chapter 6 of BDA, for continuous-parametered [...]
Computational problems with glm etc.
John Mount provides some useful background and follow-up on our discussion from last year on computational instability of the usual logistic regression solver. Just to refresh your memory, here’s a simple logistic regression with only a constant term and no separation, nothing pathological at all: > y display (glm (y ~ 1, family=binomial(link="logit"))) glm(formula = [...]
Estimating seasonality with a data set that’s just 52 weeks long
Kaiser asks: Trying to figure out what are some keywords to research for this problem I’m trying to solve. I need to estimate seasonality but without historical data. What I have are multiple time series of correlated metrics (think department store sales, movie receipts, etc.) but all of them for 52 weeks only. I’m thinking [...]
Incoherence of Bayesian data analysis
Hogg writes: At the end of this article you wonder about consistency. Have you ever considered the possibility that utility might resolve some of the problems? I have no idea if it would—I am not advocating that position—I just get some kind of intuition from phrases like “Judgment is required to decide…”. Perhaps there is a [...]
Uri Simonsohn is speaking at Columbia tomorrow (Mon)
Noon in the stat dept (room 903 School of Social Work, at 122/Amsterdam). He’ll be talking about ways of finding fishy p-values. See here and here for background. This stuff is cool and important.
Our blog makes connections!
Steve Cohen writes: Thank you for fulfilling another request of mine almost two years ago. I gave you a job description of a senior Bayesian statistical software developer that I was looking to hire. You kindly posted it on your site. THAT EVENING, I received a response from a fellow in Florida who had worked [...]
Using the “instrumental variables” or “potential outcomes” approach to clarify causal thinking
As I’ve written here many times, my experiences in social science and public health research have left me skeptical of statistical methods that hypothesize or try to detect zero relationships between observational data (see, for example, the discussion starting at the bottom of page 960 in my review of causal inference in the American Journal [...]
Commercial Bayesian inference software is popping up all over
Steve Cohen writes: As someone who has been working with Bayesian statistical models for the past several years, I [Cohen] have been challenged recently to describe the difference between Bayesian Networks (as implemented in BayesiaLab software) and modeling and inference using MCMC methods. I hope you have the time to give me (or to write [...]
Prior distributions for regression coefficients
Eric Brown writes: I have come across a number of recommendations over the years about best practices for multilevel regression modeling. For example, the use of t-distributed priors for coefficients in logistic regression and standardizing input variables from one of your 2008 Annals of Applied Statistics papers; or recommendations for priors on variance parameters from [...]
Model checking and model understanding in machine learning
Last month I wrote: Computer scientists are often brilliant but they can be unfamiliar with what is done in the worlds of data collection and analysis. This goes the other way too: statisticians such as myself can look pretty awkward, reinventing (or failing to reinvent) various wheels when we write computer programs or, even worse, [...]
Stan is fast
10,000 iterations for 4 chains on the (precompiled) efficiently-parameterized 8-schools model:
A Stan is Born
Stan 1.0.0 and RStan 1.0.0: It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language [...]
Visualizing Distributions of Covariance Matrices
Since we’ve been discussing prior distributions on covariance matrices, I will recommend this recent article (coauthored with Tomoki Tokuda, Ben Goodrich, Iven Van Mechelen, and Francis Tuerlinckx) on their visualization: We present some methods for graphing distributions of covariance matrices and demonstrate them on several models, including the Wishart, inverse-Wishart, and scaled inverse-Wishart families in [...]
More on scaled-inverse Wishart and prior independence
I’ve had a couple of email conversations in the past couple days on dependence in multivariate prior distributions. Modeling the degrees of freedom and scale parameters in the t distribution: First, in our Stan group we’ve been discussing the choice of priors for the degrees-of-freedom parameter in the t distribution. I wrote that also there’s [...]
Ways of knowing
In this discussion from last month, computer science student and Judea Pearl collaborator Elias Barenboim expressed an attitude that hierarchical Bayesian methods might be fine in practice but that they lack theory, that Bayesians can’t succeed in toy problems. I posted a P.S. there which might not have been noticed so I will put it [...]
Multilevel modeling and instrumental variables
Terence Teo writes: I was wondering if multilevel models can be used as an alternative to 2SLS or IV models to deal with (i) endogeneity and (ii) selection problems. More concretely, I am trying to assess the impact of investment treaties on foreign investment. Aside from the fact that foreign investment is correlated over time, [...]
The scaled inverse Wishart prior distribution for a covariance matrix in a hierarchical model
Since we’re talking about the scaled inverse Wishart . . . here’s a recent message from Chris Chatham: I have been reading your book on Bayesian Hierarchical/Multilevel Modeling but have been struggling a bit with deciding whether to model my multivariate normal distribution using the scaled inverse Wishart approach you advocate given the arguments at [...]
Standardizing regression inputs
Andy Flies, Ph.D. candidate in zoology, writes: After reading your paper about scaling regression inputs by two standard deviations I found your blog post stating that you wished you had scaled by 1 sd and coded the binary inputs as -1 and 1. Here is my question: If you code the binary input as -1 [...]
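A minimal sketch of the arithmetic behind this question, not taken from the paper or the blog post. It assumes a perfectly balanced binary input and a made-up continuous input; the function `rescale` and all data values are hypothetical. It illustrates that a balanced 0/1 input has a standard deviation of 0.5, which matches a continuous input divided by two standard deviations, while the same input coded -1/+1 has a standard deviation of 1, which matches a continuous input divided by one standard deviation.

```python
import statistics


def rescale(values, n_sd=2.0):
    """Center values and divide by n_sd population standard deviations."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / (n_sd * sd) for v in values]


# Hypothetical data: a balanced binary input and a continuous input.
binary_01 = [0, 1, 0, 1, 0, 1, 0, 1]
binary_pm1 = [2 * b - 1 for b in binary_01]  # recode 0/1 as -1/+1
continuous = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sd_01 = statistics.pstdev(binary_01)    # sd of the 0/1 coding
sd_pm1 = statistics.pstdev(binary_pm1)  # sd of the -1/+1 coding

scaled_2sd = rescale(continuous, n_sd=2.0)  # divide by two sd
scaled_1sd = rescale(continuous, n_sd=1.0)  # divide by one sd

print("sd of 0/1 input:", sd_01)
print("sd of -1/+1 input:", sd_pm1)
print("sd of continuous input / 2 sd:", statistics.pstdev(scaled_2sd))
print("sd of continuous input / 1 sd:", statistics.pstdev(scaled_1sd))
```

The match is exact only when the binary input is split 50/50; for an unbalanced input the standard deviation of the 0/1 coding is sqrt(p(1-p)), which is smaller than 0.5.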
“Real data can be a pain”
Michael McLaughlin sent me the following query with the above title. Some time ago, I [McLaughlin] was handed a dataset that needed to be modeled. It was generated as follows: 1. Random navigation errors, historically a binary mixture of normal and Laplace with a common mean, were collected by observation. 2. Sadly, these data were [...]
How I think about mixture models
Larry Wasserman refers to finite mixture models as “beasts” and jokes that they “should be avoided at all costs.” I’ve thought a lot about mixture models, ever since using them in an analysis of voting patterns that was published in 1990. First off, I’d like to say that our model was useful so I’d [...]