How best to compare effects measured in two different time periods?

I received the following email from someone who wishes to remain anonymous:

My colleague and I are trying to understand the best way to approach a problem involving measuring a group of individuals’ abilities across time, and are hoping you can offer some guidance.

We are trying to analyze the combined effect of two distinct groups of people (A and B, with no overlap between A and B) who collaborate to produce a binary outcome, using a mixed logistic regression along the lines of the following.

Outcome ~ (1 | A) + (1 | B) + Other variables

What we’re interested in testing is whether the observed A random effects in period 1 are predictive of the A random effects in the following period 2. Our idea is to fit two models, each using a different period’s worth of data, to create two sets of A coefficients, then observe the relationship between the two. If the A’s have a persistent ability across periods, the coefficients should be correlated or show a linear-ish relationship. We can ignore learning effects.

There’s some difference of opinion on the best way to proceed, and neither of us is knowledgeable enough in this area to settle it. More specifically, should we compare

(a) Period 1 A random effects estimates vs. Period 2 A random effects estimates, or
(b) Period 1 A random effects estimates vs. Period 2 A fixed effects estimates

Option (a) uses the same type of model for each sample, seemingly making comparison of effects across periods more direct. And if balancing bias and variance in one sample is appropriate, it should be for the other as well. One can also look at the variance parameter for both periods to see if the population has changed characteristics.

Option (b) uses unbiased estimates for period 2, with the argument that (a) introduces bias in both samples that will artificially increase the correlation between the effects across the samples, making the relationship appear stronger than it truly is.

I have consulted your book, Data Analysis Using Regression and Multilevel/Hierarchical Models, but did not see this particular issue addressed (or perhaps it is in there, but I didn’t understand it for what it was).

My reply: Best would be to fit a varying-intercept, varying-slope model, where the intercept is the average coefficient across the two periods and the slope is the difference between period 2 and period 1. Thus,

Outcome ~ (1 + Period | A) + (1 + Period | B) + Period * Other variables,

where Period is coded as -0.5 for period 1 and +0.5 for period 2.

Then you can look at the sizes of the slopes for Period for the different people, or for a quick summary the standard deviations of the slopes.
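To spell out what this coding buys you, here is a short derivation. The symbols are notation introduced for this sketch, not taken from the email: $\alpha_j$ and $\beta_j$ are person $j$'s varying intercept and Period slope, and $\theta_{j,1}$, $\theta_{j,2}$ are that person's effects in periods 1 and 2.

$$
\begin{aligned}
\theta_{j,1} &= \alpha_j - \tfrac{1}{2}\beta_j, \qquad
\theta_{j,2} = \alpha_j + \tfrac{1}{2}\beta_j, \\
\alpha_j &= \tfrac{1}{2}\left(\theta_{j,1} + \theta_{j,2}\right), \qquad
\beta_j = \theta_{j,2} - \theta_{j,1}.
\end{aligned}
$$

So the intercept is the person's average effect and the slope is the period 2 minus period 1 change; under 0/1 coding the intercept would instead be the period-1 effect. If the intercepts and slopes have standard deviations $\sigma_\alpha$, $\sigma_\beta$ and correlation $r$, then the variances and covariance of the two period effects follow directly:

$$
\begin{aligned}
\operatorname{Var}(\theta_{j,1}) &= \sigma_\alpha^2 - r\,\sigma_\alpha\sigma_\beta + \tfrac{1}{4}\sigma_\beta^2, \\
\operatorname{Var}(\theta_{j,2}) &= \sigma_\alpha^2 + r\,\sigma_\alpha\sigma_\beta + \tfrac{1}{4}\sigma_\beta^2, \\
\operatorname{Cov}(\theta_{j,1}, \theta_{j,2}) &= \sigma_\alpha^2 - \tfrac{1}{4}\sigma_\beta^2, \\
\rho &= \frac{\operatorname{Cov}(\theta_{j,1}, \theta_{j,2})}{\sqrt{\operatorname{Var}(\theta_{j,1})\,\operatorname{Var}(\theta_{j,2})}}.
\end{aligned}
$$

When the slopes are small relative to the intercepts ($\sigma_\beta$ much smaller than $\sigma_\alpha$), $\rho$ is pushed toward 1, which is one way to read the standard deviation of the slopes as a summary of persistence.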

I recommend fitting the model using blmer, or bglmer for a binary outcome like this one (better than lmer and glmer because these avoid zero group-level variance estimates and degenerate group-level covariance matrix estimates), and displaying using display(). Just use the “arm” and “blme” packages in R.
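As a rough illustration, here is a minimal sketch of such a fit on simulated data. Everything about the data is an assumption made for the example: the column names (outcome, A, B, period, x), the group sizes, and the true parameter values are all invented, and x stands in for the "Other variables". Because the outcome is binary, the sketch uses bglmer, the logistic counterpart of blmer in the blme package. The last few lines, which turn the fitted covariance matrix into an implied correlation between the period-1 and period-2 effects, are one possible summary and not something stated in the reply above.

```r
# Sketch only: simulated data with made-up names and parameter values.
library(blme)  # bglmer(); also attaches lme4
library(arm)   # display()

set.seed(1)
n_A <- 20; n_B <- 20; n_obs <- 3000
A_id <- sample(n_A, n_obs, replace = TRUE)
B_id <- sample(n_B, n_obs, replace = TRUE)
period <- sample(c(-0.5, 0.5), n_obs, replace = TRUE)  # period 1 = -0.5, period 2 = +0.5
x <- rnorm(n_obs)                                       # stand-in for "Other variables"

# True person-level average effects and period-2-minus-period-1 changes
a_avg <- rnorm(n_A, 0, 0.8); a_diff <- rnorm(n_A, 0, 0.3)
b_avg <- rnorm(n_B, 0, 0.5); b_diff <- rnorm(n_B, 0, 0.2)
eta <- a_avg[A_id] + a_diff[A_id] * period +
       b_avg[B_id] + b_diff[B_id] * period + 0.3 * x
d <- data.frame(outcome = rbinom(n_obs, 1, plogis(eta)),
                A = factor(A_id), B = factor(B_id), period = period, x = x)

# Varying intercepts and varying period slopes for both A and B
fit <- bglmer(outcome ~ (1 + period | A) + (1 + period | B) + period * x,
              data = d, family = binomial)
display(fit)

# Per-person estimates for A: "(Intercept)" = average effect, "period" = change
re_A <- ranef(fit)$A
head(re_A)

# Implied correlation between period-1 and period-2 A effects:
# effect = intercept + slope * period, evaluated at period = -0.5 and +0.5
V <- VarCorr(fit)$A
L <- rbind(period1 = c(1, -0.5), period2 = c(1, 0.5))
rho <- cov2cor(L %*% V %*% t(L))[1, 2]
rho
```

With real data you would replace the simulation with your own data frame and keep only the model call and the summaries. The fit may emit convergence warnings on small or sparse data, so the estimates are worth checking before interpreting them.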

3 thoughts on “How best to compare effects measured in two different time periods?”

  1. If the correlation of group performance at two time periods is the primary outcome of interest, wouldn’t the idea be to model this relationship explicitly?

    The approach I am thinking of is to assume the random effects are drawn from a bivariate normal distribution, with hyperpriors for the std deviation of each random effects distribution but with an additional hyperprior for the correlation of the period 1 and period 2 random effects. The posterior distribution of this correlation coefficient would be the quantity they are trying to estimate, no? Having said that, I can see why they might want a solution that can be operationalized with existing routines (though I assume this would not be too difficult in Stan). One potential issue I can see with the original solution is that the investigators have to make a decision about what std deviation of the slope estimates is consistent with durable group effects (or however they are conceptualizing their effect of interest).

  2. Pingback: Some things I’ve read: How best to compare effects measured in two different time periods? | ayeimanol_every meal counts

  3. 1. Do you have any idea how to accommodate more than two time points (periods) in this type of model?
    2. Is there any reason to use +.5/-.5 coding for period instead of the more traditional 1/0 coding?
    Thanks in advance to anyone who shares their opinion on this.
