Archive of posts filed under the Bayesian Statistics category.

3 things that will surprise you about model validation and calibration for state space models

Gurjinder Mohan writes: I was wondering if you had any advice specific to state space models when attempting model validation and calibration. I was planning on conducting a graphical posterior predictive check. I’d also recommend fake-data simulation. Beyond that, I’d need to know more about the example. I’m posting here because this seems like a […]
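The fake-data simulation recommended in the excerpt can be sketched in a few lines: draw data from known parameter values, refit, and see whether the fit recovers the truth. The model below (an AR(1) latent state observed with noise, with made-up parameter values) is purely illustrative, not the questioner's actual model; it also shows the kind of problem fake data can expose, since the naive estimate is attenuated by the observation noise.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake-data workflow: simulate from known parameters, then refit.
# Illustrative state-space model: AR(1) latent state, noisy observations.
T = 2000
phi_true, sigma_state, sigma_obs = 0.8, 1.0, 0.5

x = np.zeros(T)                        # latent state
for t in range(1, T):
    x[t] = phi_true * x[t - 1] + rng.normal(0, sigma_state)
y = x + rng.normal(0, sigma_obs, T)    # noisy observations

# A naive estimate of phi from the observed series. It is biased toward
# zero by the observation noise -- exactly the kind of model-fitting
# problem that fake-data simulation reveals before you trust real data.
phi_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
print(f"true phi = {phi_true}, naive estimate = {phi_hat:.3f}")
```

If the recovered parameter were close to the truth, that would build some confidence in the fitting procedure; here the systematic shortfall signals that the observation noise must be modeled explicitly.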

Statisticians and economists agree: We should learn from data by “generating and revising models, hypotheses, and data analyzed in response to surprising findings.” (That’s what Bayesian data analysis is all about.)

Kevin Lewis points us to this article by economist James Heckman and statistician Burton Singer, who write: All analysts approach data with preconceptions. The data never speak for themselves. Sometimes preconceptions are encoded in precise models. Sometimes they are just intuitions that analysts seek to confirm and solidify. A central question is how to revise […]

Bayesian, but not Bayesian enough

Will Moir writes: This short New York Times article on a study published in BMJ might be of interest to you and your blog community, both in terms of how the media reports science and also the use of Bayesian vs frequentist statistics in the study itself. Here is the short summary from the news […]

Estimating Public Market Exposure of Private Capital Funds Using Bayesian Inference

I don’t know anything about this work by Luis O’Shea and Vishv Jeet—that is, I know nothing of public market exposure or private capital firms, and I don’t know anything about the model they fit, the data they used, or what information they had available for constructing and checking their model. But what I do […]

Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.

[cat picture] The following email came in: I’m in a PhD program (poli sci) with a heavy emphasis on methods. One thing that my statistics courses emphasize, but that doesn’t get much attention in my poli sci courses, is the problem of simultaneous inferences. This strikes me as a problem. I am a bit unclear […]
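The "analyze all your comparisons" advice in the title can be sketched as follows: rather than testing only the largest difference and correcting for multiplicity, estimate all the group effects with partial pooling and then look at every comparison. The group structure, effect sizes, and the simple empirical-Bayes shrinkage here are all illustrative stand-ins for a full multilevel model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up setup: J groups with true effects theta, observed with known
# standard errors se.
J = 8
theta = rng.normal(0, 0.5, J)          # true group effects
se = np.full(J, 1.0)                   # known standard errors
y = theta + rng.normal(0, se)          # observed group estimates

# Empirical-Bayes partial pooling toward the grand mean (a crude
# stand-in for fitting a hierarchical model).
tau2 = max(np.var(y, ddof=1) - np.mean(se**2), 0.01)
shrink = tau2 / (tau2 + se**2)
theta_hat = np.mean(y) + shrink * (y - np.mean(y))

# All pairwise comparisons from the partially pooled estimates,
# reported together instead of singling out the maximum difference.
diffs = theta_hat[:, None] - theta_hat[None, :]
print(np.round(diffs, 2))
```

The shrinkage pulls the extreme estimates toward the mean, so the largest comparison is automatically less dramatic, without any explicit multiple-comparisons correction.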

Not everyone’s aware of falsificationist Bayes

Stephen Martin writes: Daniel Lakens recently blogged about philosophies of science and how they relate to statistical philosophies. I thought it may be of interest to you. In particular, this statement: From a scientific realism perspective, Bayes Factors or Bayesian posteriors do not provide an answer to the main question of interest, which is the […]

Breaking the dataset into little pieces and putting it back together again

Alex Konkel writes: I was a little surprised that your blog post with the three smaller studies versus one larger study question received so many comments, and also that so many people seemed to come down on the side of three smaller studies. I understand that Stephen’s framing led to some confusion as well as […]

Don’t say “improper prior.” Say “non-generative model.”

[cat picture] In Bayesian Data Analysis, we write, “In general, we call a prior density p(θ) proper if it does not depend on data and integrates to 1.” This was a step forward from the usual understanding, which is that a prior density is improper if it has an infinite integral. But I’m not so thrilled with […]
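The BDA definition quoted above can be checked numerically: a proper prior integrates to 1, while an improper one (such as a flat density on the real line) has an integral that grows without bound as the range widens. The crude midpoint-rule integrator below is just for illustration.

```python
import math

# Approximate the integral of a density over [lo, hi] by the midpoint rule.
def integral(density, lo=-50.0, hi=50.0, n=200_000):
    h = (hi - lo) / n
    return sum(density(lo + (i + 0.5) * h) for i in range(n)) * h

normal = lambda x: math.exp(-x * x / 2) / math.sqrt(2 * math.pi)
flat = lambda x: 1.0            # "flat prior" on the real line

print(integral(normal))          # ~1: proper
print(integral(flat))            # equals the width of the range: improper
```

Widening the range leaves the normal integral at 1 but makes the flat integral as large as you like, which is exactly the "infinite integral" criterion in the excerpt.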

Ride a Crooked Mile

Joachim Krueger writes: As many of us rely (in part) on p values when trying to make sense of the data, I am sending a link to a paper Patrick Heck and I published in Frontiers in Psychology. The goal of this work is not to fan the flames of the already overheated debate, but […]

Statistical Challenges of Survey Sampling and Big Data (my remote talk in Bologna this Thurs, 15 June, 4:15pm)

Statistical Challenges of Survey Sampling and Big Data Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York Big Data need Big Model. Big Data are typically convenience samples, not random samples; observational comparisons, not controlled experiments; available data, not measurements designed for a particular study. As a result, it is […]

PhD student fellowship opportunity! in Belgium! to work with us! on the multiverse and other projects on improving the reproducibility of psychological research!!!

[image of Jip and Janneke dancing with a cat] Wolf Vanpaemel and Francis Tuerlinckx write: We at the Quantitative Psychology and Individual Differences, KU Leuven, Belgium are looking for a PhD candidate. The goal of the PhD research is to develop and apply novel methodologies to increase the reproducibility of psychological science. More information can […]

UK election summary

The Conservative party, led by Theresa May, defeated the Labour party, led by Jeremy Corbyn. The Conservative party got 42% of the vote, Labour got 40% of the vote, and all the other parties received 18% between them. The Conservatives ended up with 51.5% of the two-party vote, just a bit less than Hillary Clinton’s […]
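The two-party figure in the excerpt is a one-line calculation. Note that it has to be done on the unrounded 2017 vote shares (Conservatives 42.4%, Labour 40.0%); the rounded 42% and 40% quoted above would give 51.2% rather than 51.5%.

```python
# Conservative share of the two-party (Con + Lab) vote, 2017 UK election,
# using the unrounded national vote shares.
con, lab = 42.4, 40.0
two_party = 100 * con / (con + lab)
print(f"{two_party:.1f}%")   # 51.5%
```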

The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and media exposure

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]

U.K. news article congratulates YouGov on using modern methods in polling inference

Mike Betancourt pointed me to this news article by Alan Travis that is refreshingly positive regarding the use of sophisticated statistical methods in analyzing opinion polls. Here’s Travis: Leading pollsters have described YouGov’s “shock poll” predicting a hung parliament on 8 June as “brave” and the decision by the Times to splash it on its […]

Another serious error in my published work!

Uh oh, I’m starting to feel like that pizzagate guy . . . Here’s the background. When I talk about my serious published errors, I talk about my false theorem, I talk about my empirical analysis that was invalidated by miscoded data, I talk about my election maps whose flaws were pointed out by an angry […]

Come to Seattle to work with us on Stan!

Our colleague Jon Wakefield in the Department of Biostatistics at the University of Washington is interested in supervising a 2-year postdoc through this training program. We’re interested in finding someone who would work with Jon and another faculty member (who is assigned on the basis of interests) on exciting projects in spatio-temporal modeling and the environmental […]

Static sensitivity analysis

After this discussion, I pointed Ryan Giordano, Tamara Broderick, and Michael Jordan to Figure 4 of this paper with Bois and Jiang as an example of “static sensitivity analysis.” I’ve never really followed up on this idea but I think it could be useful for many problems. Giordano replied: Here’s a copy of Basu’s robustness […]

Visualizing your fitted Stan model using ShinyStan without interfering with your RStudio session

ShinyStan is great, but I don’t always use it because when you call it from R, it freezes up your R session until you close the ShinyStan window. But it turns out that it doesn’t have to be that way. Imad explains: You can open up a new session via the RStudio menu bar (Session […]

The Other Side of the Night

Don Green points us to this quantitative/qualitative meta-analysis he did with Betsy Levy Paluck and Seth Green. The paper begins: This paper evaluates the state of contact hypothesis research from a policy perspective. Building on Pettigrew and Tropp’s (2006) influential meta-analysis, we assemble all intergroup contact studies that feature random assignment and delayed outcome measures, […]

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis […]
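One misinterpretation named in the excerpt, treating non-rejection as acceptance of the null, is easy to demonstrate by simulation: with a real but modest effect and a small sample, most replications fail to reject. The effect size, sample size, and z-test below are made-up illustrations, not anything from the Carlin paper.

```python
import math, random

random.seed(1)

# One simulated study: n draws from N(effect, 1), two-sided z-test of
# the null that the mean is zero (sd known to be 1). Values are made up.
def one_pvalue(n=20, effect=0.3):
    xs = [random.gauss(effect, 1.0) for _ in range(n)]
    z = (sum(xs) / n) / (1.0 / math.sqrt(n))
    # two-sided normal p-value: 2 * (1 - Phi(|z|))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

sims = [one_pvalue() for _ in range(5000)]
nonreject = sum(p > 0.05 for p in sims) / len(sims)
print(f"true nonzero effect, yet p > 0.05 in {100*nonreject:.0f}% of studies")
```

Here the effect is real in every simulated study, yet p exceeds 0.05 most of the time, so reading "p > 0.05" as "no effect" would be wrong in the large majority of these replications.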