Jay Jones writes:
I recently came across your paper on average predictive comparisons (Gelman and Pardoe, 2007) and can see many applications for this in my work (I’m an applied statistician working for Weyerhaeuser Company at our R&D center near Seattle). At the moment, I am using APCs to help describe the results of a hierarchical multi-species model we fit to bird occupancy (presence/absence) data collected in the Oregon Coast Range.
A question that came up in our study led me to consider whether the APC framework can be used for post-hoc combinations of inputs. For example, let’s say that after calculating the APC for each individual input in our model, we would like to look at some linear function f of two inputs of interest, u1 and u2. Naively, I would like to be able to plug this into the APC framework. For example, equation 5 in your paper might look something like this (for brevity, I’m omitting the summations):
Numerator: w_ij * ( E(y | u1_j, u2_j, v_i, theta) - E(y | u1_i, u2_i, v_i, theta) ) * sign( f(u1_j, u2_j) - f(u1_i, u2_i) )
Denominator: w_ij * ( f(u1_j, u2_j) - f(u1_i, u2_i) ) * sign( f(u1_j, u2_j) - f(u1_i, u2_i) )
In my specific study, f is just the sum of the continuous inputs of interest (different types of forest cover). I would interpret this as follows: Our model says that the association between y and u1 + u2 depends on the values of both u1 and u2, and the above APC estimate is intended to average over the distribution of “paths” to u1 + u2 in our dataset to get an average association with the sum.
My questions are: Is such an approach valid within the APC framework? Do you see any obvious (technical) issues that this naïve approach ignores? Would this be expected to work for a general function f with multiple inputs of interest?
My reply: Yes, this seems reasonable. The key steps seem to me to be:
1. Be clear on what the comparisons are (here, pairs of observations that differ in f(u1, u2)).
2. Average over the v’s.
3. Average over uncertainty in the theta’s.
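To make those three steps concrete, here is a minimal Python sketch of the ratio Jay describes, with the summations over pairs (i, j) and over posterior draws of theta written out. It is an illustration under stated assumptions, not code from the paper: the function names `pairwise_weights` and `apc_of_function`, and the user-supplied `predict(u, v, theta)` that returns E(y | u, v, theta) for each row, are all hypothetical. The weights w_ij = 1 / (1 + squared Mahalanobis distance between v_i and v_j) are also an assumption, chosen so that pairs with similar v count more.

```python
import numpy as np


def pairwise_weights(v):
    """Assumed weights: w_ij = 1 / (1 + Mahalanobis^2 distance between v_i and v_j)."""
    v = np.asarray(v, dtype=float)
    cov = np.atleast_2d(np.cov(v, rowvar=False))
    prec = np.linalg.pinv(cov)
    diff = v[:, None, :] - v[None, :, :]              # shape (n, n, p)
    d2 = np.einsum("ijk,kl,ijl->ij", diff, prec, diff)
    return 1.0 / (1.0 + d2)


def apc_of_function(u, v, theta_draws, predict, f):
    """Average predictive comparison for a scalar function f of the inputs u.

    u: (n, k) inputs of interest; v: (n, p) other inputs;
    theta_draws: sequence of posterior draws;
    predict(u, v, theta) -> (m,) array of E(y | u, v, theta) for m rows;
    f(u) -> (m,) array, e.g. the sum of the columns of u.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    n = u.shape[0]

    w = pairwise_weights(v)

    # Step 1: the comparison is the change in f between observations i and j.
    fu = f(u)
    df = fu[None, :] - fu[:, None]                    # df[i, j] = f(u_j) - f(u_i)
    sgn = np.sign(df)

    # Step 2: pair every u_j with every v_i, so v is held at its observed values.
    i_idx, j_idx = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    u_j = u[j_idx.ravel()]
    v_i = v[i_idx.ravel()]

    # Step 3: average the numerator over the posterior draws of theta.
    num = 0.0
    for theta in theta_draws:
        base = predict(u, v, theta)                   # E(y | u_i, v_i, theta)
        moved = predict(u_j, v_i, theta).reshape(n, n)  # E(y | u_j, v_i, theta)
        num += np.sum(w * (moved - base[:, None]) * sgn)
    num /= len(theta_draws)

    den = np.sum(w * df * sgn)
    return num / den
```

For Jay's case, `f` would be something like `lambda u: u[:, 0] + u[:, 1]`. One sanity check on the sketch: if the model is linear with the same coefficient b on u1 and u2, every pairwise difference in E(y) equals b times the difference in f, so the ratio should return b whatever the weights are.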