
Index or indicator variables

Someone who doesn’t want his name shared (for the perhaps reasonable reason that he’ll “one day not be confused, and would rather my confusion not live on online forever”) writes:

I’m exploring HLMs and stan, using your book with Jennifer Hill as my field guide to this new territory. I think I have a generally clear grasp on the material, but wanted to be sure I haven’t gone astray.

The problem I’m working on involves a multi-nation survey of students, and I’m especially interested in understanding the effects of country, religion, and sex, and the interactions among those factors (using IRT to estimate individual-level ability, then estimating individual, school, and country effects).

Following the basic approach laid out in chapter 13 for such interactions between levels, I think I need to create a matrix of indicator variables for religion and sex. Elsewhere in the book, you recommend against indicator variables in favor of a single index variable.

Am I right in thinking that this is purely a matter of convenience, and that the matrix formulation of chapter 13 requires indicator variables, but that the matrix of indicators or the vector of indices yield otherwise identical results? I can’t see why they shouldn’t be the same, but my intuition is still developing around multi-level models.

I replied:

Yes, models can be formulated equivalently in terms of index or indicator variables. If a discrete variable can take on a bunch of different possible values (for example, 50 states), it makes sense to use a multilevel model rather than to include indicators as predictors with unmodeled coefficients. If the variable takes on only two or three values, you can still do a multilevel model but really it would be better at that point to use informative priors for any variance parameters. That’s a tactic we do not discuss in our book but which is easy to implement in Stan, and I’m hoping to do more of it in the future.
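To make the equivalence concrete, here is a minimal sketch in plain Python. The data and coefficient values are made up for illustration, and `to_indicators` is a hypothetical helper, not something from the book. It shows only that an index vector and a matrix of indicators produce the same linear predictor; it says nothing about how the coefficients are modeled.

```python
# Hypothetical toy data: 6 respondents, one categorical variable with 3 levels.
religion_index = [0, 2, 1, 0, 1, 2]   # index coding: one integer per respondent
beta = [0.5, -1.0, 2.0]               # one coefficient per level (made-up values)


def to_indicators(index, n_levels):
    """Expand an index vector into an n x n_levels matrix of 0/1 indicators."""
    return [[1 if level == k else 0 for k in range(n_levels)] for level in index]


X = to_indicators(religion_index, len(beta))

# Index formulation: look up each respondent's coefficient directly.
eta_index = [beta[level] for level in religion_index]

# Indicator formulation: the matrix-vector product X * beta, row by row.
eta_indicator = [sum(x * b for x, b in zip(row, beta)) for row in X]

print(eta_index == eta_indicator)
```

The index version stores one integer per respondent, while the indicator version stores a full row of zeros and ones, so the choice between them is about convenience and storage rather than about the fitted model.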

To which my correspondent wrote:

The main difference that occurs to me as I work through implementing this is that the matrix of indicator variables loses information about what the underlying variable was. So, for instance, if the matrix mixes an indicator for sex and n indicators for religion and m indicators for schools, we’d have Sigma_beta be an m+n+1 x m+n+1 matrix, when we really want a 3×3 matrix.

I could set up the basic structure of Sigma_beta, separately estimate the diagonal elements with a series of multilevel loops by sex, religion, and school, and eschew the matrix formulation in the individual model. So instead of y~N(X_iB_j[i],sigma^2_y) it would be (roughly, I’m doing this on my phone):

y_i~N(beta_sex[i]+beta_sex_country[country[i]]+beta_religion[i]+beta_religion_country[i,country[i]]+beta_school[i]+beta_school_country[i,country[i]],sigma^2_y)

And the group-level formulation unchanged. Sigma_beta becomes a 3×3 matrix rather than an m+n+1 matrix, which seems both more reasonable and more computationally tractable.

My reply:

Now I’m getting tangled in your notation. I’m not sure what Sigma_beta is.

One-tailed or two-tailed?


Someone writes:

Suppose I have two groups of people, A and B, which differ on some characteristic of interest to me; and for each person I measure a single real-valued quantity X. I have a theory that group A has a higher mean value of X than group B. I test this theory by using a t-test. Am I entitled to use a *one-tailed* t-test? Or should I use a *two-tailed* one (thereby giving a p-value that is twice as large)?

I know you will probably answer: Forget the t-test; you should use Bayesian methods instead.

But what is the standard frequentist answer to this question?

My reply:

The quick answer is that different people will do different things. I would say the two-tailed p-value is more standard, but some people will insist on the one-tailed version, and it’s hard to take a big stand on this one, given all the other problems with p-values in practice:

http://www.stat.columbia.edu/~gelman/research/unpublished/p_hacking.pdf

http://www.stat.columbia.edu/~gelman/research/published/pvalues3.pdf
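For readers who want to see the factor of two from the question, here is a rough sketch. It assumes a normal approximation to the Welch t statistic (to keep it free of dependencies; a real t-test would use the t distribution), and the two samples are invented. The function name `tail_p_values` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tail_p_values(a, b):
    """Return (z, one_tailed_p, two_tailed_p) for the alternative mean(a) > mean(b).

    Assumption: the test statistic is treated as standard normal under the
    null, which is only an approximation to the t-test for small samples.
    """
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(a) - mean(b)) / se
    std_normal = NormalDist()
    one_tailed = 1.0 - std_normal.cdf(z)               # P(Z >= z)
    two_tailed = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # P(|Z| >= |z|)
    return z, one_tailed, two_tailed


# Invented measurements of X for groups A and B.
group_a = [5.1, 6.3, 5.8, 6.9, 6.0, 5.5, 6.4, 6.1]
group_b = [5.0, 5.4, 5.9, 5.2, 6.0, 5.1, 5.6, 5.3]

z, p_one, p_two = tail_p_values(group_a, group_b)
print(f"z = {z:.2f}, one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

When the observed difference goes in the hypothesized direction, the two-tailed value is exactly twice the one-tailed value. When it goes the other way, the one-tailed value exceeds 0.5 and the doubling relationship no longer holds.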

P.S. In the comments, Sameer Gauria summarizes a key point:

It’s inappropriate to view a low P value (indicating a misfit of the null hypothesis to data) as strong evidence in favor of a specific alternative hypothesis, rather than other, perhaps more scientifically plausible, alternatives.

This is so important. You can take lots and lots of examples (most notably, all those Psychological Science-type papers) with statistically significant p-values, and just say: Sure, the p-value is 0.03 or whatever. I agree that this is evidence against the null hypothesis, which in these settings typically has the following five aspects:
1. The relevant comparison or difference or effect in the population is exactly zero.
2. The sample is representative of the population.
3. The measurement in the data corresponds to the quantities of interest in the population.
4. The researchers looked at exactly one comparison.
5. The data coding and analysis would have been the same had the data been different.
But, as noted above, evidence against the null hypothesis is not, in general, strong evidence in favor of a specific alternative hypothesis, rather than other, perhaps more scientifically plausible, alternatives.

If you get to the point of asking, just do it. But some difficulties do arise . . .

Nelson Villoria writes:

I find the multilevel approach very useful for a problem I am dealing with, and I was wondering whether you could point me to some references about poolability tests for multilevel models. I am working with time series of cross sectional data and I want to test whether the data supports cross sectional and/or time pooling. In a standard panel data setting I do this with Chow tests and/or CUSUM. Are these ideas directly transferable to the multilevel setting?

My reply: I think you should do partial pooling. Once the question arises, just do it. Other models are just special cases. I don’t see the need for any test.

That said, if you do a group-level model, you need to consider including group-level averages of individual predictors (see here). And if the number of groups is small, there can be real gains from using an informative prior distribution on the hierarchical variance parameters. This is something that Jennifer and I do not discuss in our book, unfortunately.
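Here is a small sketch of what "other models are just special cases" means for a single group mean. It uses the standard precision-weighted shrinkage formula for a normal model and assumes, unrealistically, that the data-level and group-level standard deviations are known; in a real multilevel fit they are estimated. The function name and the numbers are hypothetical.

```python
def partial_pool(group_mean, n, grand_mean, sigma_y, sigma_group):
    """Precision-weighted compromise between a group's mean and the grand mean.

    sigma_y is the within-group sd and sigma_group the between-group sd,
    both assumed known for this illustration.
    """
    if sigma_group == 0:
        return grand_mean                 # complete pooling: groups are identical
    w_group = n / sigma_y ** 2            # precision of the group's own mean
    w_grand = 1.0 / sigma_group ** 2      # precision contributed by the group-level model
    return (w_group * group_mean + w_grand * grand_mean) / (w_group + w_grand)


# A group of 4 observations with mean 10, in a population with grand mean 6.
print(partial_pool(10.0, 4, 6.0, sigma_y=2.0, sigma_group=1.0))   # partial pooling
print(partial_pool(10.0, 4, 6.0, sigma_y=2.0, sigma_group=0.0))   # complete pooling
print(partial_pool(10.0, 4, 6.0, sigma_y=2.0, sigma_group=1e6))   # close to no pooling
```

Setting the group-level sd to zero recovers complete pooling, and letting it grow very large recovers the no-pooling estimate, so a poolability test amounts to choosing between two endpoints of a model that can instead be fit directly.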

Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials

Prakash Nayak writes:

I work as a musculoskeletal oncologist (surgeon) in Mumbai, India and am keen on sarcoma research.

Sarcomas are rare disorders, and conventional frequentist analysis falls short of providing meaningful results for clinical application.

I am thus keen on applying Bayesian analysis to a lot of trials performed with small numbers in this field.

I need advice from you on a good starting point for someone uninitiated in Bayesian analysis: what to read, what courses to take, and whether there is a way I could collaborate with any local or international statisticians dealing with these methods.

I have attached a recent publication [Optimal timing of pulmonary metastasectomy – is a delayed operation beneficial or counterproductive?, by M. Kruger, J. D. Schmitto, B. Wiegmann, T. K. Rajab, and A. Haverich] which is one amongst others I understand would benefit from some Bayesian analyses.

I have no idea who in India works in this area so I’m just putting this one out there in the hope that someone will be able to make the connection.

When you believe in things that you don’t understand


This would make Karl Popper cry. And, at the very end:

The present results indicate that under certain, theoretically predictable circumstances, female ovulation—long assumed to be hidden—is in fact associated with a distinct, objectively observable behavioral display.

This statement is correct—if you interpret the word “predictable” to mean “predictable after looking at your data.”

P.S. I’d like to say that April 15 is a good day for this posting because your tax dollars went toward supporting this research. But actually it was supported by the Social Sciences Research Council of Canada, and I assume they do their taxes on their own schedule.

P.P.S. In preemptive response to people who think I’m being mean by picking on these researchers, let me just say: Nobody forced them to publish these articles. If you put your ideas out there, you have to be ready for criticism.

Transitioning to Stan

Kevin Cartier writes:

I’ve been happily using R for a number of years now and recently came across Stan. Looks big and powerful, so I’d like to pick an appropriate project and try it out. I wondered if you could point me to a link or document that goes into the motivation for this tool (aside from the Stan user doc)? What I’d like to understand is, at what point might you look at an emergent R project and advise, “You know, that thing you’re trying to do would be a whole lot easier/simpler/more straightforward to implement with Stan.” (or words to that effect).

My reply: For my collaborators in political science, Stan has been most useful for models where the data set is not huge (e.g., we might have 10,000 data points or 50,000 data points but not 10 million) but where the model is somewhat complex (for example, a model with latent time series structure). The point is that the model has enough parameters and uncertainty that you’ll want to do full Bayes (rather than some sort of point estimate). At that point, Stan is a winner compared to programming one’s own Monte Carlo algorithm.

We (the Stan team) should really prepare a document with a bunch of examples where Stan is a win, in one way or another. But of course preparing such a document takes work, which we’d rather spend on improving Stan (or on blogging…)

On deck this week

Mon: Transitioning to Stan

Tues: When you believe in things that you don’t understand

Wed: Looking for Bayesian expertise in India, for the purpose of analysis of sarcoma trials

Thurs: If you get to the point of asking, just do it. But some difficulties do arise . . .

Fri: One-tailed or two-tailed?

Sat: Index or indicator variables

Sun: Fooled by randomness

“If you are primarily motivated to make money, you . . . certainly don’t want to let people know how confused you are by something, or how shallow your knowledge is in certain areas. You want to project an image of mastery and omniscience.”

A reader writes in:

This op-ed made me think of one of your recent posts. Money quote:

If you are primarily motivated to make money, you just need to get as much information as you need to do your job. You don’t have time for deep dives into abstract matters. You certainly don’t want to let people know how confused you are by something, or how shallow your knowledge is in certain areas. You want to project an image of mastery and omniscience.


“Schools of statistical thoughts are sometimes jokingly likened to religions. This analogy is not perfect—unlike religions, statistical methods have no supernatural content and make essentially no demands on our personal lives. Looking at the comparison from the other direction, it is possible to be agnostic, atheistic, or simply live one’s life without religion, but it is not really possible to do statistics without some philosophy.”

This bit is perhaps worth saying again, especially given the occasional trolling on the internet by people who disparage their ideological opponents by calling them “religious” . . . So here it is:

Sometimes the choice of statistical philosophy is decided by convention or convenience. . . . In many settings, however, we have freedom in deciding how to attack a problem statistically. How then do we decide how to proceed?

Schools of statistical thoughts are sometimes jokingly likened to religions. This analogy is not perfect—unlike religions, statistical methods have no supernatural content and make essentially no demands on our personal lives. Looking at the comparison from the other direction, it is possible to be agnostic, atheistic, or simply live one’s life without religion, but it is not really possible to do statistics without some philosophy. Even if you take a Tukeyesque stance and admit only data and data manipulations without reference to probability models, you still need some criteria to evaluate the methods that you choose.

One way in which schools of statistics are like religions is in how we end up affiliating with them. Based on informal observation, I would say that statisticians typically absorb the ambient philosophy of the institution where they are trained—or else, more rarely, they rebel against their training or pick up a philosophy later in their career or from some other source such as a persuasive book. Similarly, people in modern societies are free to choose their religious affiliation but it typically is the same as the religion of parents and extended family. Philosophy, like religion but not (in general) ethnicity, is something we are free to choose on our own, even if we do not usually take the opportunity to take that choice. Rather, it is common to exercise our free will in this setting by forming our own personal accommodation with the religion or philosophy bequeathed to us by our background.

For example, I affiliated as a Bayesian after studying with Don Rubin and, over the decades, have evolved my own philosophy using his as a starting point. I did not go completely willingly into the Bayesian fold—the first statistics course I took (before I came to Harvard) had a classical perspective, and in the first course I took with Don, I continued to try to frame all the inferential problems into a Neyman-Pearson framework. But it didn’t take me or my fellow students long to slip into comfortable conformity. . . .

Beliefs and affiliations are interesting and worth studying, going beyond simple analogies to religion.

P.S. See here for some similar thoughts from a few years ago. The key point is that a belief is not (necessarily) the same thing as a religion, and I don’t think it’s helpful for people to use “religion” as a generalized insult that is applied to beliefs that they disagree with.

“More research from the lunatic fringe”

A linguist sent me an email with the above title and a link to a paper, “The Effect of Language on Economic Behavior: Evidence from Savings Rates, Health Behaviors, and Retirement Assets,” by M. Keith Chen, which begins:

Languages differ widely in the ways they encode time. I test the hypothesis that languages that grammatically associate the future and the present, foster future-oriented behavior. This prediction arises naturally when well-documented effects of language structure are merged with models of intertemporal choice. Empirically, I find that speakers of such languages: save more, retire with more wealth, smoke less, practice safer sex, and are less obese. This holds both across countries and within countries when comparing demographically similar native households. The evidence does not support the most obvious forms of common causation. I discuss implications for theories of intertemporal choice.

I ran this by another linguist who confirmed the “lunatic fringe” comment and pointed me to this post from Mark Liberman and this followup from Keith Chen. My friend also wrote:

I think it’d be well-nigh impossible to separate the effect of speaking West Greenlandic from living in West Greenland, or more reasonably, speaking Finnish from living in Finland. Who else speaks Finnish (maybe some Swedes?)

My reply:

B-b-but . . . the paper is scheduled to appear in the American Economic Review! Short of Science, Nature, and Psychological Science, that’s probably the most competitive and prestigious journal in the universe.

More seriously, this is an interesting case because I have no intuition about the substance of the matter (unlike various examples in psychology and political science). The theoretical microeconomic model in the paper seems ridiculous to me, that’s for sure, but I have no good way to think about the cross-country comparisons, one way or another.