Archive of posts filed under the Miscellaneous Statistics category.

If you leave your datasets sitting out on the counter, they get moldy

I received the following in the email: I had a look at the dataset on speed dating you put online, and I found some big inconsistencies. Since a lot of people are using it, I hope this can help to fix them (or hopefully I did a mistake in interpreting the dataset). Here are the […]

“We can keep debating this after 11 years, but I’m sure we all have much more pressing things to do (grants? papers? family time? attacking 11-year-old papers by former classmates? guitar practice?)”

Someone pointed me to this discussion by Lior Pachter of a controversial claim in biology. The statistical content has to do with a biology paper by M. Kellis, B. W. Birren, and E.S. Lander from 2004 that contains the following passage: Strikingly, 95% of cases of accelerated evolution involve only one member of […]

Ira Glass asks. We answer.

The celebrated radio quiz show star says: There’s this study done by the Pew Research Center and Smithsonian Magazine . . . they called up one thousand and one Americans. I do not understand why it is a thousand and one rather than just a thousand. Maybe a thousand and one just seemed sexier or […]

Measurement is part of design

The other day, in the context of a discussion of an article from 1972, I remarked that the great statistician William Cochran, when writing on observational studies, wrote almost nothing about causality, nor did he mention selection or meta-analysis. It was interesting that these topics, which are central to any modern discussion of observational studies, […]

Survey weighting and regression modeling

Yphtach Lelkes points us to a recent article on survey weighting by three economists, Gary Solon, Steven Haider, and Jeffrey Wooldridge, who write: We start by distinguishing two purposes of estimation: to estimate population descriptive statistics and to estimate causal effects. In the former type of research, weighting is called for when it is needed […]

Don’t do the Wilcoxon

The Wilcoxon test is a nonparametric rank-based test for comparing two groups. It’s a cool idea because, if data are continuous and there is no possibility of a tie, the reference distribution depends only on the sample size. There are no nuisance parameters, and the distribution can be tabulated. From a Bayesian point of view, […]
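
To make the "can be tabulated" point concrete, here is a minimal sketch that enumerates the exact null distribution of the rank-sum statistic for two small groups with no ties. The function name `rank_sum_null_distribution` and the group sizes 3 and 4 are illustrative assumptions, not anything from the post. Brute-force enumeration is only practical for small samples.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def rank_sum_null_distribution(n1, n2):
    """Exact null distribution of the rank sum of group 1 (hypothetical helper).

    Assumes continuous data with no ties, so under the null hypothesis every
    assignment of n1 of the ranks 1..(n1 + n2) to group 1 is equally likely.
    """
    ranks = range(1, n1 + n2 + 1)
    # Count how many of the equally likely rank subsets give each rank sum.
    counts = Counter(sum(subset) for subset in combinations(ranks, n1))
    total = sum(counts.values())
    return {w: Fraction(c, total) for w, c in sorted(counts.items())}


# Illustrative group sizes; the table depends on nothing but these two numbers.
dist = rank_sum_null_distribution(3, 4)
for w, p in dist.items():
    print(w, p)
```

The function takes only the two sample sizes as input, which is the sense in which there are no nuisance parameters: nothing about the location, scale, or shape of the data enters the table.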

Inauthentic leadership? Development and validation of methods-based criticism

Thomas Basbøll writes: I need some help with a critique of a paper that is part of the apparently growing retraction scandal in leadership studies. Here’s Retraction Watch. The paper I want to look at is here: “Authentic Leadership: Development and Validation of a Theory-Based Measure” By F. O. Walumbwa, B. J. Avolio, W. L. […]

Discreteland and Continuousland

Roy Mendelssohn points me to this paper by Jianqing Fan, Qi-Man Shao, and Wen-Xin Zhou, “Are Discoveries Spurious? Distributions of Maximum Spurious Correlations and Their Applications.” I never know what to think about these things because I don’t work in a discrete world in which there are zero effects (see our earlier discussion of the […]

“Menstrual Cycle Phase Does Not Predict Political Conservatism”

Someone pointed me to this article by Isabel Scott and Nicholas Pound: Recent authors have reported a relationship between women’s fertility status, as indexed by menstrual cycle phase, and conservatism in moral, social and political values. We conducted a survey to test for the existence of a relationship between menstrual cycle day and conservatism. 2213 […]

God is in every leaf of every probability puzzle

Radford shared with us this probability puzzle of his from 1999: A couple you’ve just met invite you over to dinner, saying “come by around 5pm, and we can talk for a while before our three kids come home from school at 6pm”. You arrive at the appointed time, and are invited into the house. […]

What’s So Fun About Fake Data?

Our first Daily Beast column is here.

Our new column in the Daily Beast

Kaiser Fung and I have a new weekly column for the Daily Beast. After much deliberation, we gave it the title Statbusters (the runner-up choice was Dirty Data; my personal preference was Statboyz in the Hood, but, hey, who ever listens to me on anything?). The column will appear every Saturday, and Kaiser and I […]

“When more data steer us wrong: replications with the wrong dependent measure perpetuate erroneous conclusions”

Evan Heit sent in this article with Caren Rotello and Chad Dubé: There is a replication crisis in science, to which psychological research has not been immune: Many effects have proven uncomfortably difficult to reproduce. Although the reliability of data is a serious concern, we argue that there is a deeper and more insidious problem […]

Statistics Be

This modern statistics got me confused, To tell you friends I’m quite unenthused. This modern statistics got me confused, To tell you friends I’m quite unenthused. I like Pee Wee Fisher or the great Jerzy But can’t make head nor tail of this Robby Tibsh’rani With his Oop-pop-a-da Be-a-ba-du-la-be-plee Ple-oobly-oobly-oobly-oobie Chum-cheeree-a-bah Oop-pop-a-dee-de-doom ah-ah! Robby Tibsh’rani […]

In which a complete stranger offers me a bet

Piotr Mitros wrote to Deb and me: I read, with pleasure, your article about the impossibility of biasing a coin. I’m curious as to whether researchers believe what they write. Would you be willing to place some form of iterated bet? For example: I provide a two-sided coin and a table. The table looks like […]

You can crush us, you can bruise us, yes, even shoot us, but oh—not a pie chart!

Byron Gajewski pointed me to this several-years-old article from the Onion, which begins: According to a groundbreaking new study published Monday in The Journal Of The American Statistical Association, somewhere on the planet someone is totally doing it at this very moment. “Of the 6.7 billion inhabitants of Earth, approximately 3.5 billion have reached sexual […]

The language of insignificance

Jonathan Falk points me to an amusing post by Matthew Hankins giving synonyms for “not statistically significant.” Hankins writes: The following list is culled from peer-reviewed journal articles in which (a) the authors set themselves the threshold of 0.05 for significance, (b) failed to achieve that threshold value for p and (c) described it in […]

Does your time as a parent make a difference?

A colleague writes: Thought you might be interested in this front page data journalism take down of an article. I don’t know the article but this amounts to a journalist talking with someone who didn’t like the piece and ripping it based on a measurement detail. How bad though is this measurement detail? Are you […]

Postdoc in psychometrics in Cardiff

Richard Morey writes: I have a PhD position available at Cardiff University that I was hoping you might be able to publicise on your blog. It is for UK/EU students, and the project is negotiable but should be methods or statistical cognition. Here’s the link:

What to do to train to apply statistical models to political science and public policy issues

Taylor Good writes: I am a graduate of a state school with a BS in Math and a BA in Political Science, and I was wondering if you could give me some career advice. Knowing how you got to where you are now, what path would you advise someone to take to get to where […]