Archive of posts filed under the Miscellaneous Statistics category.

The next Lancet retraction? [“Subcortical brain volume differences in participants with attention deficit hyperactivity disorder in children and adults”]

Someone who prefers to remain anonymous asks for my thoughts on this post by Michael Corrigan and Robert Whitaker, “Lancet Psychiatry Needs to Retract the ADHD-Enigma Study: Authors’ conclusion that individuals with ADHD have smaller brains is belied by their own data,” which begins: Lancet Psychiatry, a UK-based medical journal, recently published a study titled […]

Teaching Statistics: A Bag of Tricks (second edition)

Hey! Deb Nolan and I finished the second edition of our book, Teaching Statistics: A Bag of Tricks. You can pre-order it here. I love love love this book. As William Goldman would say, it’s the “good parts version”: all the fun stuff without the standard boring examples (counting colors of M&M’s, etc.). Great stuff […]

My proposal for JASA: “Journal” = review reports + editors’ recommendations + links to the original paper and updates + post-publication comments

Whenever they’ve asked me to edit a statistics journal, I say no thank you because I think I can make more of a contribution through this blog. I’ve said no enough times that they’ve stopped asking me. But I’ve had an idea for a while and now I want to do it. I think that journals […]

My talk this Friday in the Machine Learning in Finance workshop

This is kinda weird because I don’t know anything about machine learning in finance. I guess the assumption is that statistical ideas are not domain specific. Anyway, here it is: “What can we learn from data?” Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University. The standard framework for statistical inference leads […]

The Efron transition? And the wit and wisdom of our statistical elders

Stephen Martin writes: Brad Efron seems to have transitioned from “Bayes just isn’t as practical” to “Bayes can be useful, but EB is easier” to “Yes, Bayes should be used in the modern day” pretty continuously across three decades. http://www2.stat.duke.edu/courses/Spring10/sta122/Handouts/EfronWhyEveryone.pdf http://projecteuclid.org/download/pdf_1/euclid.ss/1028905930 http://statweb.stanford.edu/~ckirby/brad/other/2009Future.pdf Also, Lindley’s comment in the first article is just GOLD: “The last example […]

Beyond subjective and objective in statistics: my talk with Christian Hennig tomorrow (Wed) 5pm in London

Christian Hennig and I write: Decisions in statistical data analysis are often justified, criticized, or avoided using concepts of objectivity and subjectivity. We argue that the words “objective” and “subjective” in statistics discourse are used in a mostly unhelpful way, and we propose to replace each of them with broader collections of attributes, with objectivity […]

Probability and Statistics in the Study of Voting and Public Opinion (my talk at the Columbia Applied Probability and Risk seminar, 30 Mar at 1pm)

Probability and Statistics in the Study of Voting and Public Opinion. Elections have both uncertainty and variation and hence represent a natural application of probability theory. In addition, opinion polling is a classic statistics problem and is featured in just about every course on the topic. But many common intuitions about probability, statistics, and voting […]

Some natural solutions to the p-value communication problem—and why they won’t work

Blake McShane and David Gal recently wrote two articles (“Blinding us to the obvious? The effect of statistical training on the evaluation of evidence” and “Statistical significance and the dichotomization of evidence”) on the misunderstandings of p-values that are common even among supposed experts in statistics and applied social research. The key misconception has nothing […]

Lady in the Mirror

In the context of a report from a drug study, Stephen Senn writes: The bare facts they established are the following: The International Headache Society recommends the outcome of being pain free two hours after taking a medicine. The outcome of being pain free or having only mild pain at two hours was reported by […]

“Beyond Heterogeneity of Effect Sizes”

Piers Steel writes: One of the primary benefits of meta-analytic syntheses of research findings is that researchers are provided with an estimate of the heterogeneity of effect sizes. . . . Low values for this estimate are typically interpreted as indicating that the strength of an effect generalizes across situations . . . Some have […]

How is preregistration like random sampling and controlled experimentation

In the discussion following my talk yesterday, someone asked about preregistration and I gave an answer that I really liked, something I’d never thought of before. I started with my usual story that preregistration is great in two settings: (a) replicating your own exploratory work (as in the 50 shades of gray paper), and (b) […]

How to do a descriptive analysis using regression modeling?

Freddy Garcia writes: I read your post Vine regression?, and your phrase “I love descriptive data analysis!” made me wonder: How to do a descriptive analysis using regression models? Maybe my question could be misleading to a statistician, but I am an economics student. So we are accustomed to think in causal terms when we […]

Advice when debugging at 11pm

Add one feature to your model and test and debug with fake data before going on. Don’t try to add two features at once.
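Here is a minimal sketch of that workflow in Python, assuming a plain linear regression fit by least squares with NumPy. The function name `fake_data_check`, the coefficient values, and the tolerance are all made up for illustration; the point is only the pattern of simulating from known parameters, fitting, and checking recovery before adding the next feature.

```python
import numpy as np

def fake_data_check(true_coefs, sigma=0.5, n=2000, seed=0):
    """Simulate fake data from known coefficients, fit, and return the estimates.

    true_coefs[0] is the intercept; the remaining entries are slopes for
    independent standard-normal predictors (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    true_coefs = np.asarray(true_coefs, dtype=float)
    k = len(true_coefs) - 1                      # number of predictors
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    y = X @ true_coefs + rng.normal(0.0, sigma, size=n)
    est, *_ = np.linalg.lstsq(X, y, rcond=None)  # ordinary least squares fit
    return est

# Step 1: the current model, one predictor. Check that the fit recovers the truth.
step1 = fake_data_check([1.0, 2.0])
assert np.allclose(step1, [1.0, 2.0], atol=0.1)

# Step 2: add exactly one feature, then test again before going further.
step2 = fake_data_check([1.0, 2.0, -0.5])
assert np.allclose(step2, [1.0, 2.0, -0.5], atol=0.1)
```

If the check fails after step 2, the bug is almost certainly in the one thing that changed, which is the reason for not adding two features at once.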

Checkmate

Sandro Ambuehl writes: As an avid reader of your blog, I thought you might like (to hate) the attached PNAS paper with the following findings: (i) sending two flyers about the importance of STEM fields to the parents of 81 kids improves ACT scores by 12 percentile points (intent-to-treat effect… a bit large, perhaps?) and […]

Yes, it makes sense to do design analysis (“power calculations”) after the data have been collected

This one has come up before but it’s worth a reminder. Stephen Senn is a thoughtful statistician and I generally agree with his advice but I think he was kinda wrong on this one. Wrong in an interesting way. Senn’s article is from 2002 and it is called “Power is indeed irrelevant in interpreting completed […]

Theoretical statistics is the theory of applied statistics: how to think about what we do (My talk Wednesday—today!—4:15pm at the Harvard statistics dept)

Theoretical statistics is the theory of applied statistics: how to think about what we do. Andrew Gelman, Department of Statistics and Department of Political Science, Columbia University. Working scientists and engineers commonly feel that philosophy is a waste of time. But theoretical and philosophical principles can guide practice, so it makes sense for us to […]

Ethics and the Replication Crisis and Science (my talk Tues 6pm)

I’ll be speaking on Ethics and the Replication Crisis and Science tomorrow (Tues 28 Feb) 6-7:30pm at room 411 Fayerweather Hall, Columbia University. I don’t plan to speak for 90 minutes; I assume there will be lots of time for discussion. Here’s the abstract that I whipped up: Busy scientists sometimes view ethics and philosophy […]

Forecasting mean and sd of time series

Garrett M. writes: I had two (hopefully straightforward) questions related to time series analysis that I was hoping I could get your thoughts on: First, much of the work I do involves “backtesting” investment strategies, where I simulate the performance of an investment portfolio using historical data on returns. The primary summary statistics I generate […]

Is Rigor Contagious? (my talk next Monday 4:15pm at Columbia)

Is Rigor Contagious? Much of the theory and practice of statistics and econometrics is characterized by a toxic mixture of rigor and sloppiness. Methods are justified based on seemingly pure principles that can’t survive reality. Examples of these principles include random sampling, unbiased estimation, hypothesis testing, Bayesian inference, and causal identification. Examples of uncomfortable reality […]

He wants to know what book to read to learn statistics

Tim Gilmour writes: I’m an early 40s guy in Los Angeles, and I’m sort of sending myself back to school, specifically in statistics — not taking classes, just working through things on my own. Though I haven’t really used math much since undergrad, a number of my personal interests (primarily epistemology) would be much better […]