Donny Williams sends along this paper, with Philippe Rast and Paul-Christian Bürkner, and writes:

This paper is similar to the Chung et al. avoiding-boundary-estimates papers (here and here), but we use fully Bayesian methods, specifically the half-Cauchy prior. We show it performs as well as a fully informed prior based on tau values in psychology.

Further, we consider the KL divergence between the estimated meta-analytic distribution and the “true” meta-analytic distribution. Here we show a striking advantage for the Bayesian models, which has never been shown in the context of meta-analysis.
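For intuition on that comparison: when both the estimated and the “true” meta-analytic distributions are summarized as normals, the KL divergence has a closed form. A minimal sketch (an illustration under that normality assumption, not the paper's actual evaluation code):

```python
import math

def kl_normal(mu1, sd1, mu2, sd2):
    """KL( N(mu1, sd1^2) || N(mu2, sd2^2) ) in nats, via the closed form
    log(sd2/sd1) + (sd1^2 + (mu1 - mu2)^2) / (2 * sd2^2) - 1/2."""
    return (math.log(sd2 / sd1)
            + (sd1 ** 2 + (mu1 - mu2) ** 2) / (2 * sd2 ** 2)
            - 0.5)

# A fitted distribution that understates spread (e.g., tau estimated at zero,
# so only within-study variation remains) pays a positive KL penalty against
# a "true" distribution with real between-study heterogeneity:
print(kl_normal(0.3, 0.10, 0.3, 0.25))  # > 0: penalty for too-narrow fit
print(kl_normal(0.3, 0.25, 0.3, 0.25))  # identical distributions -> 0.0
```

The numbers here (means of 0.3, standard deviations 0.10 and 0.25) are made up for illustration; the point is only that collapsing the between-study variance to zero shows up as a positive divergence from the true distribution.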

Cool! I love to see our ideas making a difference.

And here’s the abstract to the Williams, Rast, and Bürkner paper:

Developing meta-analytic methods is an important goal for psychological science. When there are few studies in particular, commonly used methods have several limitations, most notable of which is underestimating between-study variability. Although Bayesian methods are often recommended for small sample situations, their performance has not been thoroughly examined in the context of meta-analysis. Here, we characterize and apply weakly-informative priors for estimating meta-analytic models and demonstrate with extensive simulations that fully Bayesian methods overcome boundary estimates of exactly zero between-study variance, better maintain error rates, and have lower frequentist risk according to Kullback-Leibler divergence. While our results show that combining evidence with few studies is non-trivial, we argue that this is an important goal that deserves further consideration in psychology. Further, we suggest that frequentist properties can provide important information for Bayesian modeling. We conclude with meta-analytic guidelines for applied researchers that can be implemented with the provided computer code.
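The “boundary estimates of exactly zero between-study variance” that the abstract mentions are easy to see with the classical DerSimonian–Laird moment estimator, whose raw estimate of tau² can come out negative and gets truncated at zero. A minimal sketch (illustrative, not the paper's code):

```python
def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of between-study variance tau^2.

    With few, homogeneous-looking studies the raw moment estimate
    (Q - (k - 1)) / denom is often negative and is truncated at 0.0 --
    exactly the boundary estimate that fully Bayesian methods avoid.
    """
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    ybar = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - ybar) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    denom = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - (k - 1)) / denom)

# Three similar-looking studies (hypothetical numbers): Q falls below k - 1,
# so the estimate hits the boundary at exactly zero.
print(dersimonian_laird_tau2([0.20, 0.22, 0.21], [0.04, 0.05, 0.04]))  # 0.0
```

A half-Cauchy prior on tau, by contrast, keeps the posterior off the boundary: the posterior mass for tau is spread over positive values even when the data alone would drive a point estimate to zero.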

I completely agree with this remark: “frequentist properties can provide important information for Bayesian modeling.”

In addition to simulations, I would like to see them evaluate the small-sample performance of the method by applying it to many small subsamples of studies in an area where a large number of studies has been performed, and then comparing the posteriors for between-study variability with the between-study variability estimated from the large sample.

The paper looks interesting from a quick scan, and I really like the clarification that the usual frequentist properties they evaluated reflect not their own preferences but rather what the community expects.

And it’s going in what I think is a more purposeful direction than the Rice paper I had earlier criticized. “I believe a better route forward, if there is knowledge of what to post-stratify on to get a stable-over-time-and-place population of interest, would be MRP with an informative prior on effect variation. If not, just a random-effects model with an informative prior on effect variation.” https://andrewgelman.com/2017/10/05/missing-will-paper-likely-lead-researchers-think/ and https://andrewgelman.com/2017/11/01/missed-fixed-effects-plural/

Using a collection of studies in an area to get an empirically informed prior for effect variation was discussed by me and others in the 1990s, but it took an awfully long time for it to start happening. I have been out of the meta-analysis area for a while and, like some mobster said, “I keep trying to get out but I always get pulled back in.” But for those who are interested, Kenneth Rice will be giving a talk at JSM 2018: http://ww2.amstat.org/meetings/jsm/2018/onlineprogram/ActivityDetails.cfm?SessionID=214991

Now, of course, comes the possible criticism (again, not having read the paper carefully, but recalling an email exchange with Ingram Olkin about meta-analysis of effect measures when the assessments used different instruments). My email search is not working this morning, but essentially the concern is that with effect measures based on different instruments it’s hard to be convinced that likelihoods can actually be discerned or that the parameters would reflect the same things in different studies. My first rule of meta-analysis is to discern what will likely be common. It was this concern that led me (and, I believe, Ingram) to simply combine p-values, whereas everywhere else I would use likelihood (and Bayesian methods if I was permitted to add a prior).
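Combining p-values across studies, as mentioned above, is classically done with Fisher's method: under the null, −2 Σ log pᵢ is chi-squared with 2k degrees of freedom, which needs no common likelihood or common effect scale across instruments. A self-contained sketch (pure Python, using the closed-form chi-squared survival function available for even degrees of freedom):

```python
import math

def fisher_combine(pvalues):
    """Combine k independent p-values via Fisher's method.

    X = -2 * sum(log p_i) ~ chi-squared with 2k df under the null.
    For even df (2k), the survival function has the closed form
    P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))

# Two modest, independent p-values reinforce each other:
print(fisher_combine([0.05, 0.05]))  # combined p well below 0.05
```

This is the appeal of p-value combination when “identicalness” across studies is doubtful: it asks only whether every study's null can jointly stand, not that the parameters mean the same thing in each study.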

When I find the emails, I will post the one I last sent to Ingram (he did not respond) here.

I remembered it being a lot longer – but if it’s of interest, here it is:

Ingram:

I had to leave your talk early but I really enjoyed it!

When I first read your book (after I had published the L’Abbe paper) I noticed that you were addressing something other than “essentially similar repeated experiments” that I had taken as my task in the randomised clinical trial research area where the diagnosis, treatment and outcome (usually mortality) seemed identical. Because of that, I continued with the Fisher/Cochran/Yates approach based on repeated agricultural trials.

I’m not so sure now about the “identicalness” in RCTs, but I did not realize that you were taking meta-analysis to mean situations where this “identicalness” was hopeless wishing and, for instance, trying to get defensible likelihoods was silly.

Now your approach makes much more sense to me – thanks.