I think the book by Westfall and Young (1993) on resampling-based methods for multiple testing is an excellent starting point.

http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0471557617.html

Aside from the many problems they cite with incorrect applications of p-values, they give a number of humorous references, including a letter to the editor of The Lancet on the Münchhausen framework for making all tests significant (which fits exactly with the XKCD cartoon on significance).

W-Y is also a very well-cited work in the econometrics and finance literature, with papers by White, by Romano and Wolf, and by Hansen, among others (including Campbell Harvey in his various addresses, equating the problem in science to finding “Jesus on toast”-style apophenia). These are serious attempts to use the bootstrap to estimate correlations between tests in order to make more powerful corrections than the standard Bonferroni, Holm, and Benjamini–Hochberg–Yekutieli (BHY) adjustments. This is one of several strands of literature that attempt to make reasoned corrections to the standard, much-abused p-value framework.
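The core idea can be sketched in a few lines. Below is a minimal single-step “max-T” permutation version; the book’s step-down refinement and studentized statistics are omitted, and the data layout and function names are my own illustration, not theirs:

```python
import random
import statistics


def _stat(values, order, n_a):
    """|Difference in group means| under a given subject ordering."""
    group_a = [values[i] for i in order[:n_a]]
    group_b = [values[i] for i in order[n_a:]]
    return abs(statistics.mean(group_a) - statistics.mean(group_b))


def westfall_young_maxt(data, n_a, n_perm=2000, seed=0):
    """Single-step max-T adjusted p-values via permutation of group labels.

    data: one list of per-subject values per hypothesis; the first n_a
    subjects form group A.  Every hypothesis is re-evaluated under the
    SAME relabeling, which is what carries the correlation between tests
    into the null distribution.
    """
    rng = random.Random(seed)
    n = len(data[0])
    identity = list(range(n))
    observed = [_stat(values, identity, n_a) for values in data]
    exceed = [0] * len(data)
    for _ in range(n_perm):
        order = identity[:]
        rng.shuffle(order)  # one joint relabeling for all hypotheses
        max_stat = max(_stat(values, order, n_a) for values in data)
        for j, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[j] += 1
    # Compare each observed statistic to the permutation distribution of
    # the *maximum* statistic; correlated tests shrink that maximum, so
    # the adjustment is less conservative than Bonferroni.
    return [(count + 1) / (n_perm + 1) for count in exceed]
```

Because every hypothesis is evaluated under the same relabeling, dependence among the tests flows into the null distribution automatically, and that is where the power gain over Bonferroni comes from.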

http://journals.sagepub.com/doi/abs/10.1177/0959354314529616

If you do any philosophy of probability, Alan Hajek is great. I think you can do better than Jaynes – philosophers (including Hajek, Forster, and Sober) have more nuanced views on the subjects Jaynes tackles. (Though MaxEnt seems worth knowing about.)

Maybe some de Finetti.

I’m not totally sure whether, when you say you “wouldn’t typically teach the methods…”, you mean that you haven’t traditionally taught them but plan to, or that you won’t in any case. I took your original post to mean that you would need to work through the logic of some methods, even while avoiding the procedural and mathematical bits.

But as they say, once you notice your own confirmation bias, you start seeing it everywhere. :-)

[All: this will be my last post on this hobbyhorse.]

https://www.extension.harvard.edu/academics/courses/inquiries-probability-statistics/15472

There is an online option. If you click through, his syllabus does not (yet?) list the readings, but I imagine that will change.

https://xkcd.com/882/

My primary guide to the Neyman-Pearson/Fisher split is Michael Lew (e.g., nice commentary and links to approachable articles here: https://stats.stackexchange.com/a/4567/10506). Deborah Mayo, Jaynes, and Richard Morey et al. (https://learnbayes.org/papers/confidenceIntervalsFallacy/CItheory.html) also come to mind as interesting sources.

https://www.jstor.org/stable/pdf/1803924.pdf

He talks about economics but the ideas translate broadly. There are some technical parts that require a basic familiarity with regression, but these are couched in real world examples. He also makes some claims about the relationship between “fact” and “opinion” that plenty of readers will take issue with – but that’s a good thing for a philosophy class.

For history, Polya’s early work is interesting:

https://archive.org/details/Induction_And_Analogy_In_Mathematics_1_

https://archive.org/details/Patterns_Of_Plausible_Inference_2_

There’s plenty of room for philosophy in the traditional intro stat course, especially once you’ve removed the tedious and pointless bits, such as calculating everything by hand, looking up numbers in a table, and distinguishing between (for example) 1.96 standard errors and 2.04 standard errors.

I like the idea of including some modern resources on philstats alongside the usual epistemology, ontology, etc. Trying to teach p-values in that context (just to smack them down?) sounds tricky, though. For instance, I quite like the Gelman/Carlin paper on p-value communication, but I’m not sure it would have meant much to me if I had _just_ learned what a p-value even was. Maybe spending more time introducing and discussing the relation between probability distributions and “reality” would be more effective.

Breiman’s ‘Two Cultures’ paper

Dawid’s ‘Beware of the DAG’

http://www.sas.rochester.edu/psc/clarke/405/Freedman91.pdf

Unlike most contributions to the “be wary of statistical significance” arguments, this one is grounded in observational thinking rather than experimental thinking. And since much (most?) social science is and will remain observational, I think it is an important bridge between the purely mathematical p-value-based critiques of Ioannidis and the p-curve folks, on the one hand, and the kind of papers people interested in social science will actually read, on the other.

I think that one of the most important philosophical concepts in causal inference in social science at the moment is probably the idea of quasi-experimental variation that is “as good as random”. And I think that the Shoe Leather paper does a good job of linking the “as good as random” idea to actual research. Beyond that, I suspect that there are real contributions to be made by philosophers linking the concept of “as good as random” to causal inference (or maybe I just suspect that we are sweeping more under the rug than we think).

–Or Vice Versa, published in J Am Stat Assoc.

“One thing is clear, however. The author’s stated risk cannot be accepted at its face value once the author’s conclusions appear in print.” In other words, the experimenter does not need to condition on publication when she updates her beliefs — but the reader does need to condition thusly.
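That asymmetry between experimenter and reader is easy to see in a tiny simulation. The numbers and function names below are my own illustration of the point, not anything from the letter: we generate many study estimates around a true effect, “publish” only the significant ones, and compare the two averages:

```python
import random
import statistics


def run_studies(true_effect, se, n_studies, seed=0):
    """Simulate study estimates; 'publish' only those significant at 5%."""
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_studies)]
    # A result clears the conventional filter when |estimate| > 1.96 * se.
    published = [e for e in estimates if abs(e) > 1.96 * se]
    return estimates, published


all_est, published = run_studies(true_effect=0.2, se=0.1, n_studies=5000)
# The experimenter, who sees every result, gets a roughly unbiased average...
unconditional = statistics.mean(all_est)
# ...but the reader, who sees only what survived the filter, gets an
# inflated one: conditioning on publication selects the large estimates.
conditional = statistics.mean(published)
```

The gap between the two averages is exactly the correction the reader has to make and the experimenter does not.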

Going back another 32 years, this very short letter is going around Twitter right now: http://jamanetwork.com/journals/jama/article-abstract/244730

Cheers,

Carl

Andrew: I’ll bet if we taught it together, there’d be lots of interested students! Maybe next year.
