## Stacking and multiverse

It’s a coincidence that there is another multiverse posting today.

Recently Tim Disher asked a question in the Stan discussion forum: “Multiverse analysis – concatenating posteriors?”

Tim refers to the paper “Increasing Transparency Through a Multiverse Analysis” by Sara Steegen, Francis Tuerlinckx, Andrew Gelman, and Wolf Vanpaemel. The abstract says:

Empirical research inevitably includes constructing a data set by processing raw data into a form ready for statistical analysis. Data processing often involves choices among several reasonable options for excluding, transforming, and coding data. We suggest that instead of performing only one analysis, researchers could perform a multiverse analysis, which involves performing all analyses across the whole set of alternatively processed data sets corresponding to a large set of reasonable scenarios. Using an example focusing on the effect of fertility on religiosity and political attitudes, we show that analyzing a single data set can be misleading and propose a multiverse analysis as an alternative practice. A multiverse analysis offers an idea of how much the conclusions change because of arbitrary choices in data construction and gives pointers as to which choices are most consequential in the fragility of the result.

In that paper the focus is on looking at the possible results from the multiverse of forking paths, but Tim asked whether it would “make sense at all to combine the posteriors from a multiverse analysis in a similar way to how we would combine multiple datasets in multiple imputation”? My answer:

• In multiple imputation, the different data sets are posterior draws from the missing data distribution and are thus usually weighted equally.
• I think multiverse analysis is similar to the case of having a set of models with different variables, variable transformations, interactions, and non-linearities, as in our Stacking paper (Yao, Vehtari, Simpson, Gelman), where we have different models for the arsenic well data (section 4.6). Stacking would then be a sensible way to combine *predictions* (predictions rather than parameters, as we may have different model parameters for differently processed data) with non-equal weights. Stacking is a good choice for model combination here because
1. we don’t need to assign prior probabilities to the different forking paths
2. stacking favors paths that give good predictions
3. it avoids the “prior dilution problem” when some processed data sets happen to be very similar to each other (see fig 2c in the Stacking paper)
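
To make the third point concrete, here is a minimal sketch of stacking weights for a toy multiverse. It assumes each forking path has already produced pointwise log predictive densities for the same held-out observations on the same outcome scale (in practice these might come from leave-one-out cross-validation). The numbers are made up for illustration, and the function name `stacking_weights` is hypothetical. The weights are found with a simple EM-style fixed-point update for mixture weights, which I chose only because it needs no dependencies; it is not necessarily the optimizer used in the Stacking paper or in any package.

```python
import math

def stacking_weights(lpd, max_iter=5000, tol=1e-12):
    """Weights on the simplex maximizing sum_i log(sum_k w_k * exp(lpd[i][k])).

    lpd[i][k] is the log predictive density of held-out observation i
    under forking path (model) k.
    """
    n, K = len(lpd), len(lpd[0])
    # Subtract each row's max before exponentiating; rescaling a row by a
    # constant does not change the maximizing weights.
    dens = [[math.exp(v - max(row)) for v in row] for row in lpd]
    w = [1.0 / K] * K  # start from equal weights
    for _ in range(max_iter):
        new_w = [0.0] * K
        for row in dens:
            mix = sum(wk * pk for wk, pk in zip(w, row))
            for k in range(K):
                new_w[k] += w[k] * row[k] / (mix * n)
        change = max(abs(a - b) for a, b in zip(w, new_w))
        w = new_w
        if change < tol:
            break
    return w

# Made-up pointwise log predictive densities: 4 observations, paths A and B.
lpd_a = [-1.0, -1.2, -3.0, -0.9]
lpd_b = [-2.0, -1.1, -1.0, -2.5]

w_two = stacking_weights([[a, b] for a, b in zip(lpd_a, lpd_b)])
# Add an exact copy of path A, mimicking two nearly identical processed data sets.
w_three = stacking_weights([[a, a, b] for a, b in zip(lpd_a, lpd_b)])

print("A, B:    ", [round(w, 3) for w in w_two])
print("A, A, B: ", [round(w, 3) for w in w_three])
```

In this toy setup the two copies of A split the weight that A received on its own, and the weight of B is unchanged. Equal prior probabilities over the three paths would instead have moved prior mass toward A simply because it appears twice.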