Archive of posts filed under the Bayesian Statistics category.

Analyze all your comparisons. That’s better than looking at the max difference and trying to do a multiple comparisons correction.

[cat picture] The following email came in: I’m in a PhD program (poli sci) with a heavy emphasis on methods. One thing that my statistics courses emphasize, but that doesn’t get much attention in my poli sci courses, is the problem of simultaneous inferences. This strikes me as a problem. I am a bit unclear […]

Not everyone’s aware of falsificationist Bayes

Stephen Martin writes: Daniel Lakens recently blogged about philosophies of science and how they relate to statistical philosophies. I thought it may be of interest to you. In particular, this statement: From a scientific realism perspective, Bayes Factors or Bayesian posteriors do not provide an answer to the main question of interest, which is the […]

Breaking the dataset into little pieces and putting it back together again

Alex Konkel writes: I was a little surprised that your blog post with the three smaller studies versus one larger study question received so many comments, and also that so many people seemed to come down on the side of three smaller studies. I understand that Stephen’s framing led to some confusion as well as […]

Don’t say “improper prior.” Say “non-generative model.”

[cat picture] In Bayesian Data Analysis, we write, “In general, we call a prior density p(θ) proper if it does not depend on data and integrates to 1.” This was a step forward from the usual understanding, which is that a prior density is improper if it has an infinite integral. But I’m not so thrilled with […]

Ride a Crooked Mile

Joachim Krueger writes: As many of us rely (in part) on p values when trying to make sense of the data, I am sending a link to a paper Patrick Heck and I published in Frontiers in Psychology. The goal of this work is not to fan the flames of the already overheated debate, but […]

Statistical Challenges of Survey Sampling and Big Data (my remote talk in Bologna this Thurs, 15 June, 4:15pm)

Statistical Challenges of Survey Sampling and Big Data Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University, New York Big Data need Big Model. Big Data are typically convenience samples, not random samples; observational comparisons, not controlled experiments; available data, not measurements designed for a particular study. As a result, it is […]

PhD student fellowship opportunity! in Belgium! to work with us! on the multiverse and other projects on improving the reproducibility of psychological research!!!

[image of Jip and Janneke dancing with a cat] Wolf Vanpaemel and Francis Tuerlinckx write: We at the Quantitative Psychology and Individual Differences, KU Leuven, Belgium are looking for a PhD candidate. The goal of the PhD research is to develop and apply novel methodologies to increase the reproducibility of psychological science. More information can […]

UK election summary

The Conservative party, led by Theresa May, defeated the Labour party, led by Jeremy Corbyn. The Conservative party got 42% of the vote, Labour got 40% of the vote, and all the other parties received 18% between them. The Conservatives ended up with 51.5% of the two-party vote, just a bit less than Hillary Clinton’s […]

The Publicity Factory: How even serious research gets exaggerated by the process of scientific publication and media exposure

The starting point is that we’ve seen a lot of talk about frivolous science, headline-bait such as the study that said that married women are more likely to vote for Mitt Romney when ovulating, or the study that said that girl-named hurricanes are more deadly than boy-named hurricanes, and at this point some of these […]

U.K. news article congratulates YouGov on using modern methods in polling inference

Mike Betancourt pointed me to this news article by Alan Travis that is refreshingly positive regarding the use of sophisticated statistical methods in analyzing opinion polls. Here’s Travis: Leading pollsters have described YouGov’s “shock poll” predicting a hung parliament on 8 June as “brave” and the decision by the Times to splash it on its […]

Another serious error in my published work!

Uh oh, I’m starting to feel like that pizzagate guy . . . Here’s the background. When I talk about my serious published errors, I talk about my false theorem, I talk about my empirical analysis that was invalidated by miscoded data, I talk about my election maps whose flaws were pointed out by an angry […]

Come to Seattle to work with us on Stan!

Our colleague Jon Wakefield in the Department of Biostatistics at the University of Washington is interested in supervising a 2-year postdoc through this training program. We’re interested in finding someone who would work with Jon and another faculty member (who is assigned on the basis of interests) on exciting projects in spatio-temporal modeling and the environmental […]

Static sensitivity analysis

After this discussion, I pointed Ryan Giordano, Tamara Broderick, and Michael Jordan to Figure 4 of this paper with Bois and Jiang as an example of “static sensitivity analysis.” I’ve never really followed up on this idea but I think it could be useful for many problems. Giordano replied: Here’s a copy of Basu’s robustness […]

Visualizing your fitted Stan model using ShinyStan without interfering with your Rstudio session

ShinyStan is great, but I don’t always use it because when you call it from R, it freezes up your R session until you close the ShinyStan window. But it turns out that it doesn’t have to be that way. Imad explains: You can open up a new session via the RStudio menu bar (Session […]

The Other Side of the Night

Don Green points us to this quantitative/qualitative meta-analysis he did with Betsy Levy Paluck and Seth Green. The paper begins: This paper evaluates the state of contact hypothesis research from a policy perspective. Building on Pettigrew and Tropp’s (2006) influential meta-analysis, we assemble all intergroup contact studies that feature random assignment and delayed outcome measures, […]

Some natural solutions to the p-value communication problem—and why they won’t work.

John Carlin and I write: It is well known that even experienced scientists routinely misinterpret p-values in all sorts of ways, including confusion of statistical and practical significance, treating non-rejection as acceptance of the null hypothesis, and interpreting the p-value as some sort of replication probability or as the posterior probability that the null hypothesis […]

A continuous hinge function for statistical modeling

This comes up sometimes in my applied work: I want a continuous “hinge function,” something like the red curve above, connecting two straight lines in a smooth way. Why not include the sharp corner (in this case, the function y=-0.5*x if x<0 or y=0.2*x if x>0)? Two reasons. First, computation: Hamiltonian Monte Carlo can trip […]
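As a rough illustration of the idea in this excerpt, here is a minimal Python sketch of one common way to smooth a hinge: replace the sharp corner with a softplus term. This is an assumption for illustration, not necessarily the function used in the post. The slopes -0.5 and 0.2 come from the excerpt; the function name `smooth_hinge`, the intercept `a`, the corner location `x0`, and the smoothing width `delta` are hypothetical.

```python
import math

def smooth_hinge(x, x0=0.0, a=0.0, b0=-0.5, b1=0.2, delta=0.1):
    """Smooth transition from slope b0 (left of x0) to slope b1 (right of x0).

    Assumed form (illustrative, not taken from the post):
        y = a + b0*(x - x0) + (b1 - b0) * delta * log(1 + exp((x - x0)/delta))
    Smaller delta gives a sharper corner; the curve is differentiable everywhere.
    """
    z = (x - x0) / delta
    # Numerically stable softplus, log(1 + exp(z)), valid for large |z|.
    softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return a + b0 * (x - x0) + (b1 - b0) * delta * softplus

if __name__ == "__main__":
    # Far from the corner the curve follows the two straight lines.
    for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
        print(x, smooth_hinge(x))
```

Far to the left the softplus term vanishes and the curve follows y = -0.5*x; far to the right it contributes (b1 - b0)*(x - x0), so the curve follows y = 0.2*x. Because the derivative is continuous, gradient-based samplers have no kink to deal with, at the cost of one extra tuning parameter, `delta`.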

Causal inference using Bayesian additive regression trees: some questions and answers

[cat picture] Rachael Meager writes: We’re working on a policy analysis project. Last year we spoke about individual treatment effects, which is the direction we want to go in. At the time you suggested BART [Bayesian additive regression trees; these are not averages of tree models as are usually set up; rather, the key is […]

Using Stan for week-by-week updating of estimated soccer team abilities

Milad Kharratzadeh shares this analysis of the English Premier League during last year’s famous season. He fit a Bayesian model using Stan, and the R markdown file is here. The analysis has three interesting features: 1. Team ability is allowed to continuously vary throughout the season; thus, once the season is over, you can see […]

Splines in Stan! (including priors that enforce smoothness)

Milad Kharratzadeh shares a new case study. This could be useful to a lot of people. And here’s the markdown file with every last bit of R and Stan code. Just for example, here’s the last section of the document, which shows how to simulate the data and fit the model graphed above: Location of […]