After I gave my talk at an econ seminar on “Why We (Usually) Don’t Care About Multiple Comparisons,” I got the following comment:
One question that came up later was whether your argument is really with testing in general, rather than only with testing in multiple comparison settings.
Yes, my argument is with testing in general. But it arises with particular force in multiple comparisons. With a single test, we can just say we dislike testing so we use confidence intervals or Bayesian inference instead, and it’s no problem—really more of a change in emphasis than a change in methods. But with multiple tests, the classical advice is not simply to look at type 1 error rates but more specifically to make a multiplicity adjustment, for example to make confidence intervals wider to account for multiplicity. I don’t want to do this! So here there is a real battle to fight.
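To make concrete what that classical multiplicity adjustment looks like, here is a minimal sketch of one standard version, a Bonferroni correction, which widens each interval by splitting the error rate across the number of comparisons. The effect estimates and standard errors below are made-up illustrative numbers, not from the paper:

```python
# Bonferroni-style widening of confidence intervals: the classical
# multiplicity adjustment discussed (and resisted) above.
# All numbers are hypothetical, for illustration only.
from statistics import NormalDist

estimates = [0.8, 1.2, -0.3, 2.1, 0.5]  # hypothetical effect estimates
ses = [0.5, 0.6, 0.4, 0.9, 0.7]         # hypothetical standard errors
alpha = 0.05
m = len(estimates)                       # number of comparisons

z_unadj = NormalDist().inv_cdf(1 - alpha / 2)        # ~1.96
z_bonf = NormalDist().inv_cdf(1 - alpha / (2 * m))   # wider: alpha split m ways

for est, se in zip(estimates, ses):
    lo_u, hi_u = est - z_unadj * se, est + z_unadj * se
    lo_b, hi_b = est - z_bonf * se, est + z_bonf * se
    print(f"{est:+.1f}: unadjusted ({lo_u:.2f}, {hi_u:.2f}), "
          f"Bonferroni ({lo_b:.2f}, {hi_b:.2f})")
```

With five comparisons the Bonferroni critical value grows from about 1.96 to about 2.58, so every interval gets roughly 30% wider regardless of the data. That mechanical widening is exactly the move the post objects to; the alternative argued for in the paper is multilevel (partial-pooling) modeling rather than interval inflation.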
P.S. Here’s the article (with Jennifer and Masanao), to appear in the Journal of Research on Educational Effectiveness. (Sounds like an obscure outlet but according to Jennifer it’s read by the right people. Education researchers are very interested in multiple comparisons.)