Archive of posts filed under the Miscellaneous Statistics category.

The inevitable problems with statistical significance and 95% intervals

I’m thinking more and more that we have to get rid of statistical significance, 95% intervals, and all the rest, and just come to a more fundamental acceptance of uncertainty. In practice, I think we use confidence intervals and hypothesis tests as a way to avoid acknowledging uncertainty. We set up some rules and then [...]

Convenient page of data sources from the Washington Post

Wayne Folta points us to this list.

Chris Schmid on Evidence Based Medicine

Chris Schmid is a statistician at New England Medical Center who is an expert on evidence-based medicine. I invited him to present an introductory overview lecture on the topic at last year’s Joint Statistical Meetings, and here are his slides. All 123 of them. I don’t know how he expected to go through all of [...]

Difficulties in publishing non-replications of implausible findings

Eric Tassone points me to this news article by Christopher Shea on the challenges of debunking ESP. Shea writes: Earlier this year, a major psychology journal published a paper suggesting that there was some evidence for “pre-cognition,” a form of ESP. Stuart Ritchie, a doctoral student at the University of Edinburgh, is part of a [...]

Advice on do-it-yourself stats education?

Dustin Palmer writes: I am a recent graduate looking for a bit of advice. While I took intro classes on math and statistics in my undergraduate degree as a political science major, I find myself university-less and seeking to develop my statistics toolkit. I work for an NGO in the international development field. I think [...]

Excellence in Statistical Reporting Award

The American Statistical Association is seeking nominations for its annual Excellence in Statistical Reporting Award. The award was created in 2004 to encourage and recognize members of the communications media who have best displayed an informed interest in the science of statistics and its role in public life. The award can be given for a [...]

What are the important issues in ethics and statistics? I’m looking for your input!

I’ve recently started a regular column on ethics, appearing every three months in Chance magazine. My first column, “Open Data and Open Methods,” is here, and my second column, “Statisticians: When we teach, we don’t practice what we preach” (coauthored with Eric Loken) will be appearing in the next issue. Statistical ethics is a wide-open [...]

Jobs in statistics research! In New Jersey!

Kenny writes: The Statistics Research group in AT&T Labs invites applications for full time research positions. Applicants should have a Ph.D. in Statistics (or a related field), and be able to make major, widely-recognized contributions to statistics research: theory, methods, computing, and data analysis. Candidates must demonstrate a potential for excellence in research, a knowledge [...]

Intro to splines—with cool graphs

Ido Rosen pointed me to this page by Mike Kamermans.

Unconvincing defense of the recent Russian elections, and a problem when an official organ of an academic society has low standards for publication

Last month we reported on some claims of irregularities in the recent Russian elections. Just as a reminder, here are a couple graphs: Yesterday someone pointed me to two online articles: Mathematical proof of fraud in Russian elections unsound and US elections are as ‘non-normal’ as Russian elections. I know nothing about Russian elections and [...]

What are the standards for reliability in experimental psychology?

An experimental psychologist was wondering about the standards in that field for “acceptable reliability” (when looking at inter-rater reliability in coding data). He wondered, for example, if some variation on signal detectability theory might be applied to adjust for inter-rater differences in criteria for saying some code is present. What about Cohen’s kappa? The psychologist [...]

Martin and Liu: Probabilistic inference based on consistency of model with data

What better way to start the new year than with some hard-core statistical theory? Ryan Martin and Chuanhai Liu send along a new paper on inferential models: Probability is a useful tool for describing uncertainty, so it is natural to strive for a system of statistical inference based on probabilities for or against various hypotheses. [...]

Using factor analysis or principal components analysis or measurement-error models for biological measurements in archaeology?

Greg Campbell writes: I am a Canadian archaeologist (BSc in Chemistry) researching the past human use of European Atlantic shellfish. After two decades of practice I am finally getting a MA in archaeology at Reading. I am seeing if the habitat or size of harvested mussels (Mytilus edulis) can be reconstructed from measurements of the [...]

“The difference between . . .”: It’s not just p=.05 vs. p=.06

The title of this post by Sanjay Srivastava illustrates an annoying misconception that’s crept into the (otherwise delightful) recent publicity related to my article with Hal Stern, The difference between “significant” and “not significant” is not itself statistically significant. When people bring this up, they keep referring to the difference between p=0.05 and p=0.06, making [...]
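To see that the point is not limited to p-values sitting just on either side of a threshold, here is a minimal sketch with made-up numbers (the estimates and standard errors below are hypothetical, and the two studies are assumed independent). One estimate is clearly "significant" at the conventional 5% level and the other clearly is not, yet the difference between them is not close to significant.

```python
import math

# Hypothetical estimates and standard errors from two independent studies.
est_a, se_a = 25.0, 10.0
est_b, se_b = 10.0, 10.0

# z-scores for each estimate taken on its own.
z_a = est_a / se_a  # 2.50, beyond the 1.96 cutoff
z_b = est_b / se_b  # 1.00, well inside the 1.96 cutoff

# z-score for the difference; variances add under independence.
diff = est_a - est_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
z_diff = diff / se_diff  # about 1.06

print(f"study A:    z = {z_a:.2f}")
print(f"study B:    z = {z_b:.2f}")
print(f"difference: z = {z_diff:.2f}")
```

Comparing the two studies by their significance labels would suggest they disagree, while the direct comparison says the data are consistent with no difference at all.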

Mr. Pearson, meet Mr. Mandelbrot: Detecting Novel Associations in Large Data Sets

Jeremy Fox asks what I think about this paper by David N. Reshef, Yakir Reshef, Hilary Finucane, Sharon Grossman, Gilean McVean, Peter Turnbaugh, Eric Lander, Michael Mitzenmacher, and Pardis Sabeti which proposes a new nonlinear R-squared-like measure. My quick answer is that it looks really cool! From my quick reading of the paper, it appears [...]

CrossValidated: A place to post your statistics questions

Seth Rogers writes: I [Rogers] am a member of an online community of statisticians where I burn a great deal of time (and a recovering cog sci researcher). Our community website is a peer-reviewed Q and A spanning stats topics ranging from applications to mathematical theory. Our online community consists of mostly university faculty, grad [...]

More frustrations trying to replicate an analysis published in a reputable journal

The story starts in September, when psychology professor Fred Oswald wrote me: I [Oswald] wanted to point out this paper in Science (Ramirez & Beilock, 2010) examining how students’ emotional writing improves their test performance in high-pressure situations. Although replication is viewed as the hallmark of research, this paper replicates implausibly large d-values and correlations [...]

I Am Too Absolutely Heteroskedastic for This Probit Model

Soren Lorensen wrote: I’m working on a project that uses a binary choice model on panel data. Since I have panel data and am using MLE, I’m concerned about heteroskedasticity making my estimates inconsistent and biased. Are you familiar with any statistical packages with pre-built tests for heteroskedasticity in binary choice ML models? If not, [...]

Kaiser Fung on how not to critique models

In the context of a debate between economists Brad DeLong and Tyler Cowen on the “IS-LM model” [no, I don't know what it is, either!], Kaiser writes: Since a model is an abstraction, a simplification of reality, no model is above critique. I [Kaiser] consider the following types of critique not deserving: 1) The critique [...]

Three hours in the life of a statistician

Kaiser Fung tells what it’s really like. Here’s a sample: As soon as I [Kaiser] put the substring-concatenate expression together with two lines of code that generate data tables, it choked. Sorta like Dashiell Hammett without the broads and the heaters. And here’s another take, from a slightly different perspective.

Chi-square FAIL when many cells have small expected values

William Perkins, Mark Tygert, and Rachel Ward write: If a discrete probability distribution in a model being tested for goodness-of-fit is not close to uniform, then forming the Pearson χ² statistic can involve division by nearly zero. This often leads to serious trouble in practice — even in the absence of round-off errors . . [...]
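The division-by-nearly-zero problem is easy to reproduce. The following is a rough sketch, not the authors' analysis: the three-cell model, the sample size, and the helper name `pearson_chi2` are all assumptions chosen for illustration. One cell has a tiny model probability, so a single observation landing there dominates the statistic.

```python
def pearson_chi2(observed, model_probs):
    """Pearson chi-square statistic: sum over cells of (O - E)^2 / E."""
    n = sum(observed)
    total = 0.0
    for o, p in zip(observed, model_probs):
        expected = n * p  # tiny when p is tiny, so dividing by it inflates the term
        total += (o - expected) ** 2 / expected
    return total


# Hypothetical model with one very rare cell (expected count 0.01 when n = 100).
model_probs = [0.5, 0.4999, 0.0001]

no_rare_draw = [50, 50, 0]   # nothing observed in the rare cell
one_rare_draw = [50, 49, 1]  # a single observation moved into the rare cell

stat_without = pearson_chi2(no_rare_draw, model_probs)
stat_with = pearson_chi2(one_rare_draw, model_probs)

print(f"statistic without a rare draw: {stat_without:.4f}")
print(f"statistic with one rare draw:  {stat_with:.4f}")
```

Moving one observation out of 100 changes the statistic from roughly 0.01 to roughly 98, almost all of it from the rare cell's (1 - 0.01)² / 0.01 term.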

Geophysicist Discovers Modeling Error (in Economics)

Continuing “heckle the press” month here at the blog, I (Bob) found the following “discovery” a little overplayed by David H. Freedman, who was writing for Scientific American in the following article and blog post. Blog: Why Economic Models are Always Wrong. Article: A Formula for Economic Calamity. The article’s paywalled, but the blog entry [...]

Apply now for Earth Institute postdoctoral fellowships at Columbia University

The economy isn’t going so well, but there are some interesting possibilities here at Columbia University. One such option that you should be thinking about is the Earth Institute Fellowship, which pays well, includes a research stipend, and puts you in an exciting interdisciplinary community of faculty and postdoctoral researchers. The Earth Institute at Columbia [...]

How do you interpret standard errors from a regression fit to the entire population?

David Radwin asks a question which comes up fairly often in one form or another: How should one respond to requests for statistical hypothesis tests for population (or universe) data? I [Radwin] first encountered this issue as an undergraduate when a professor suggested a statistical significance test for my paper comparing roll call votes between [...]

Bell Labs

Sining Chen told me they’re hiring in the statistics group at Bell Labs. I’ll do my bit for economic stimulus by announcing this job (see below). I love Bell Labs. I worked there for three summers, in a physics lab in 1985-86 under the supervision of Loren Pfeiffer, and by myself in the statistics group [...]