Archive of posts filed under the Miscellaneous Statistics category.

Hey—here’s a tip from the biology literature: If your correlation is .02, try binning your data to get a correlation of .8 or .9!

Josh Cherry writes: This isn’t in the social sciences, but it’s an egregious example of statistical malpractice: Below the abstract you can find my [Cherry’s] comment on the problem, which was submitted as a letter to the journal, but rejected on the grounds that the issue does not affect the main conclusions of the article […]
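
To see why binning can inflate a correlation, here is a minimal sketch in Python. It does not use the data from the article Cherry criticizes; the sample size, slope, noise level, and number of bins are all assumptions chosen for illustration. The idea is that averaging within bins removes most of the individual-level noise, so the correlation between bin means can be large even when the correlation between individual points is close to zero.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def binned_means(xs, ys, n_bins):
    """Sort points by x, split into equal-size bins, return per-bin means."""
    pairs = sorted(zip(xs, ys))
    size = len(pairs) // n_bins
    bin_x, bin_y = [], []
    for i in range(n_bins):
        chunk = pairs[i * size:(i + 1) * size]
        bin_x.append(mean([p[0] for p in chunk]))
        bin_y.append(mean([p[1] for p in chunk]))
    return bin_x, bin_y


# Hypothetical data: y depends very weakly on x (slope 0.02, unit noise).
rng = random.Random(0)
n = 200_000
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [0.02 * x + rng.gauss(0, 1) for x in xs]

raw_r = pearson(xs, ys)              # correlation of individual points
bin_x, bin_y = binned_means(xs, ys, n_bins=10)
binned_r = pearson(bin_x, bin_y)     # correlation of the 10 bin means

print(f"raw correlation:    {raw_r:.3f}")
print(f"binned correlation: {binned_r:.3f}")
```

The binned number describes how well the bin averages line up, not how well x predicts y for any individual observation, which is why reporting it as if it were the raw correlation is misleading.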

Racial classification sociology controversy update

The other day I posted on a controversy in sociology where Aliya Saperstein and Andrew Penner analyzed data from the National Longitudinal Survey of Youth, coming to the conclusion “that race is not a fixed characteristic of individuals but is flexible and continually negotiated in everyday interactions,” but then Lance Hannon and Robert DeFina […]

“What is a good, convincing example in which p-values are useful?”

A correspondent writes: I came across this discussion of p-values, and I’d be very interested in your thoughts on it, especially on the evaluation in that thread of “two major arguments against the usefulness of the p-value:” 1. With large samples, significance tests pounce on tiny, unimportant departures from the null hypothesis. 2. Almost no […]

Taking responsibility for your statistical conclusions: You must decide what variation to compare to.

A couple people pointed me to a recent paper by Josh Terrell, Andrew Kofink, Justin Middleton, Clarissa Rainear, Emerson Murphy-Hill, and Chris Parnin, “Gender bias in open source: Pull request acceptance of women versus men.” The term “bias” seems a bit loaded given the descriptive nature of their study. That said, it’s good for people […]

“The Natural Selection of Bad Science”

That’s the title of a new paper by Paul Smaldino and Richard McElreath which presents a sort of agent-based model that reproduces the growth in the publication of junk science that we’ve seen in recent decades. Even before looking at this paper I was positively disposed toward it for two reasons. First because I do […]

“Replication initiatives will not salvage the trustworthiness of psychology”

So says James Coyne, going full Meehl. I agree. Replication is great, but if you replicate noise studies, you’ll just get noise, hence the beneficial effects on science are (a) to reduce confidence in silly studies that we mostly shouldn’t have taken seriously in the first place, and (b) to provide a disincentive for future […]

All that really important statistics stuff that isn’t in the statistics textbooks

Kaiser writes: More on that work on age adjustment. I keep asking myself where in the Stats curriculum do we teach students this stuff? A class session focused on that analysis teaches students so much more about statistical thinking than anything we have in the textbooks. I’m not sure. This sort of analysis […]

Should he major in political science and minor in statistics or the other way around?

Andrew Wheeler writes: I will be a freshman at the University of Florida this upcoming fall and I am interested in becoming a political pollster. My original question was whether I should major in political science and minor in statistics or the other way around, but any other general advice would be appreciated. My reply: […]

“99.60% for women and 99.58% for men, P < 0.05.”

Gur Huberman pointed me to this paper by Tamar Kricheli-Katz and Tali Regev, “How many cents on the dollar? Women and men in product markets.” It appeared in something called Science Advances, which seems to be some extension of the Science brand, i.e., it’s in the tabloids! I’ll leave the critical analysis of this paper to […]

Now that’s what I call a power pose!

John writes: See below for your humour file or blogging on a quiet day. . . . Perhaps you could start a competition for the wackiest real-life mangling of statistical concepts (restricted to a genuine academic setting?). On 15 Feb 2016, at 5:25 PM, [****] wrote: Pick of the bunch from tomorrow’s pile of applications […]

Happy talk, meet the Edlin factor

Mark Palko points us to this op-ed in which psychiatrist Richard Friedman writes: There are also easy and powerful ways to enhance learning in young people. For example, there is intriguing evidence that the attitude that young people have about their own intelligence — and what their teachers believe — can have a big impact […]

“Null hypothesis” = “A specific random number generator”

In an otherwise pointless comment thread the other day, Dan Lakeland contributed the following gem: A p-value is the probability of seeing data as extreme or more extreme than the result, under the assumption that the result was produced by a specific random number generator (called the null hypothesis). I could care less about p-values […]
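
Lakeland's definition can be taken literally and turned into a simulation. The sketch below is only an illustration under assumptions that are not in the original post: the observed data are a hypothetical 60 heads in 100 coin flips, the "specific random number generator" is a fair coin, and "as extreme or more extreme" is measured by absolute distance from 50 heads. The function name `simulated_p_value` is made up for this example.

```python
import random


def simulated_p_value(observed_heads, n_flips, n_sims=10_000, seed=0):
    """Estimate a two-sided p-value by running the null random number generator.

    The null hypothesis here is a fair coin: each flip is heads with
    probability 0.5. The p-value is the share of simulated datasets whose
    head count is at least as far from n_flips / 2 as the observed count.
    """
    rng = random.Random(seed)
    expected = n_flips / 2
    observed_dev = abs(observed_heads - expected)
    extreme = 0
    for _ in range(n_sims):
        # One fake dataset produced by the null random number generator.
        heads = sum(rng.random() < 0.5 for _ in range(n_flips))
        if abs(heads - expected) >= observed_dev:
            extreme += 1
    return extreme / n_sims


# Hypothetical observation: 60 heads in 100 flips.
p = simulated_p_value(observed_heads=60, n_flips=100)
print(f"simulated two-sided p-value: {p:.3f}")
```

Changing the generator (a different null) or the definition of "extreme" changes the p-value, which is the point of the definition: the number is a statement about one particular random number generator, not about the hypothesis the researcher actually cares about.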

Gary Venter’s age-period-cohort decomposition of US male mortality trends

Following up on yesterday’s post on mortality trends, I wanted to share with you a research note by actuary Gary Venter, “A Quick Look at Cohort Effects in US Male Mortality.” Venter produces this graph: And he writes: Cohort effects in mortality tend to be difficult to explain. Often strings of coincidences are invoked – […]

If Yogi Berra could see this one, he’d spin in his grave: Regression modeling using a convenience sample

Kelvin Leshabari writes: We are currently planning to publish a few manuscripts on the outcome of treatment of some selected cancers occurring in children. The current dataset was derived from the natural admission process of those children with cancer found at a selected tertiary cancer centre. To the best of our understanding, our data are […]

“Cancer Research Is Broken”

Michael Oakes pointed me to this excellent news article by Daniel Engber, subtitled, “There’s a replication crisis in biomedicine—and no one even knows how deep it runs.” Engber suggests that the replication problem in biomedical research is worse than the much-publicized replication problem in psychology. One reason, which I didn’t see Engber discussing, is financial […]

“if you add a few more variables, you can do a better job at predictions”

Ethan Bolker points me to this news article by Neil Irwin: Robert J. Gordon, an economist at Northwestern University, has his own version that he argues explains inflation levels throughout recent decades. But it is hardly simple. Its prediction for inflation relies not just on joblessness but also on measures of productivity growth, six shifts […]

Black Box Challenge

Georgy Cheremovskiy writes: I’m one of the organizers of an unusual reinforcement learning competition named Black Box Challenge. The concept is simple — one needs to program an agent that can play a game with unknown rules. At each time step the agent is given an environment state vector and has a few possible actions. The […]

These celebrity photos are incredible: Type S errors in use!

Kaveh sends along this, from a recent talk at Berkeley by Katherine Casey: It’s so gratifying to see this sort of thing in common use, only 15 years after Francis and I introduced the idea (and see also this more recent paper with Carlin).

Best Disclaimer Ever

Paul Alper sends this in, from the article, “Ovarian cancer screening and mortality in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial,” by Ian J Jacobs, Usha Menon, Andy Ryan, Aleksandra Gentry-Maharaj, Matthew Burnell, Jatinderpal K Kalsi, Nazar N Amso, Sophia Apostolidou, Elizabeth Benjamin, Derek Cruickshank, Danielle N Crump, Susan […]

A question about software for an online survey

Michael Smith writes: I have a research challenge and I was hoping you could spare a minute of your time. I hope it isn’t a bother—I first came across you when I saw your post on how psychology researchers can learn from statisticians. I figure even if you don’t know the answer to this question, […]