More bad news: The (mis)reporting of statistical results in psychology journals

Another entry in the growing literature on systematic flaws in the scientific research literature.

This time the bad tidings come from Marjan Bakker and Jelte Wicherts, who write:

Around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers’ expectations.

Their research also had a qualitative component:

To obtain a better understanding of the origins of the errors made in the reporting of statistics, we contacted the authors of the articles with errors in the second study and asked them to send us the raw data. Regrettably, only 24% of the authors shared their data, despite our request being quite specific and our assurances that the authors would remain anonymous. . . .

The paper by Bakker and Wicherts features a truly ugly graph (Figure 2) and also breaks a rule by reporting percentages to inappropriate precision (no, you don’t have to categorize 33/113 as “29.2%”), but I’ll forgive them because I like this sort of work. It’s important and represents a lot of effort. Personally, I think Jelte Wicherts, E. J. Wagenmakers, and John Ioannidis are much more deserving of the ASA Founders Award than is, say, I dunno, Ed Wegman?
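
To give a sense of what this sort of recalculation involves, here's a minimal sketch in R (this is not Bakker and Wicherts's actual procedure, and the numbers are made up for illustration):

```r
## Recompute the p-value implied by a reported test statistic and degrees of
## freedom, and compare it with the p-value printed in the article.
## All numbers here are hypothetical.
reported_t  <- 2.20    # reported t statistic
reported_df <- 28      # reported degrees of freedom
reported_p  <- 0.03    # p-value as printed in the article

recomputed_p <- 2 * pt(abs(reported_t), df = reported_df, lower.tail = FALSE)
round(recomputed_p, 3)                    # 0.036
abs(recomputed_p - reported_p) > 0.005    # TRUE: flag as a possible reporting inconsistency
```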

Articles on the philosophy of Bayesian statistics by Cox, Mayo, Senn, and others!

Deborah Mayo, Aris Spanos, and Kent Staley edited a special issue on the philosophy of Bayesian statistics for the journal Rationality, Markets and Morals.

Here are the contents:

David Cox and Deborah G. Mayo, “Statistical Scientist Meets a Philosopher of Science: A Conversation”

Deborah G. Mayo, “Statistical Science and Philosophy of Science: Where Do/Should They Meet in 2011 (and Beyond)?”

Stephen Senn, “You May Believe You Are a Bayesian But You Are Probably Wrong”

Andrew Gelman, “Induction and Deduction in Bayesian Data Analysis”

Jan Sprenger, “The Renegade Subjectivist: Jose Bernardo’s Objective Bayesianism”

Aris Spanos, “Foundational Issues in Statistical Modeling: Statistical Model Specification and Validation”

David F. Hendry, “Empirical Economic Model Discovery and Theory Evaluation”

Larry Wasserman, “Low Assumptions, High Dimensions”

For some reason, not all the articles are yet online, but it says they’re coming soon. In the meantime, you can check out what Senn and I have to say.

Once all the articles are up, I’ll read them and write something in response.

Wiley Wegman chutzpah update: Now you too can buy a selection of garbled Wikipedia articles, for a mere $1400-$2800 per year!

Someone passed on a message from his university library announcing that the journal “Wiley Interdisciplinary Reviews: Computational Statistics” is no longer free.

Librarians have to decide what to do, so I thought I’d offer the following consumer guide:

                                               | Wiley Computational Statistics journal | Wikipedia
Frequency                                      | 6 issues per year                      | Continuously updated
Includes articles from Wikipedia?              | Yes                                    | Yes
Cites the Wikipedia sources it uses?           | No                                     | Yes
Edited by recipient of ASA Founders Award?     | Yes                                    | No
Articles are subject to rigorous review?       | No                                     | Yes
Errors, when discovered, get fixed?            | No                                     | Yes
Number of vertices in n-dimensional hypercube? | 2n                                     | 2^n
Easy access to Brady Bunch trivia?             | No                                     | Yes
Cost per year (North America)                  | $1400-$2800                            | $0
Cost per year (UK)                             | £986-£1972                             | £0
Cost per year (Europe)                         | €1213-€2426                            | €0

The choice seems pretty clear to me!

It’s funny for the Wiley journal to start charging now for access. Unless they can convince Wikipedia to (a) charge at least $1401/year and (b) introduce errors into their articles to level the playing field, I think Wegman’s journal is going to have difficulty competing in the free market.

Visual diagnostics for discrete-data regressions

Jeff asked me what I thought of this recent AJPS article by Brian Greenhill, Michael Ward, and Audrey Sacks, “The Separation Plot: A New Visual Method for Evaluating the Fit of Binary Models.” It’s similar to a graph of observed vs. predicted values, but using color rather than the y-axis to display the observed values. It seems like it could be useful, and it could also be applied more generally to discrete-data regressions with more than two categories.
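
I haven't tried to reproduce their figure exactly, but here's a rough sketch of the idea as I understand it, in base R; this is not the authors' implementation, and y and p are hypothetical vectors of binary outcomes and fitted probabilities:

```r
## Sort the cases by predicted probability, draw a thin bar for each one
## (dark if the observed outcome was 1, light if 0), and overlay the sorted
## predictions. A well-fitting model piles the dark bars up at the right.
separation_plot <- function(y, p) {
  ord <- order(p)
  n <- length(y)
  plot(NULL, xlim = c(0, n), ylim = c(0, 1), bty = "n", yaxt = "n",
       xlab = "Cases, ordered by predicted probability", ylab = "")
  rect(0:(n - 1), 0, 1:n, 1,
       col = ifelse(y[ord] == 1, "darkred", "grey90"), border = NA)
  lines(1:n, p[ord], lwd = 2)
}
```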

When it comes to checking the model fit, I recommend binned residual plots, as discussed in this 2000 article with Yuri Goegebeur, Francis Tuerlinckx, and Iven Van Mechelen.
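
One way to make such a plot in R is with the binnedplot() function in the arm package; here's a minimal sketch, where the data frame d and its variables are hypothetical:

```r
## Fit a logistic regression, compute raw residuals on the probability scale,
## and plot the average residual within bins of the predicted probability,
## with approximate +/- 2 standard-error bounds for each bin.
library(arm)

fit  <- glm(y ~ x1 + x2, family = binomial, data = d)
pred <- fitted(fit)        # predicted probabilities
res  <- d$y - pred         # observed minus predicted

binnedplot(pred, res, xlab = "Predicted probability")
```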

R and Google Visualization

Eric Tassone writes:

Here’s something that may be of interest and useful to your readers, and which I [Tassone] am just now checking out myself. It links R and the Google Visualization API/Google Chart Tools to make Motion Charts (as used in the well known Hans Rosling TED talk) easier to create directly in R.

The website is here, and here’s a blog about how to use it, including some R code that actually works (if the user has all the requisite libraries, of course) in your own browser.
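
The package being described appears to be googleVis; assuming that's the one, a minimal example looks something like this (Fruits is a demo dataset that ships with the package):

```r
## Build a motion chart from the bundled Fruits data and open it in a browser.
library(googleVis)

motion <- gvisMotionChart(Fruits, idvar = "Fruit", timevar = "Year")
plot(motion)   # launches the interactive chart in your default browser
```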

NYC

Our downstairs neighbor hates us. She looks away from us when we see her on the street; if we’re coming into the building at the same time, she doesn’t hold the door open; and if we’re in the elevator when it stops on her floor, she refuses to get on.

On the other hand, if you’re a sociology professor in Chicago, one of your colleagues might try to run you over in a parking lot. So I guess I’m getting off easy.

Ethnicity and Population Structure in Personal Naming Networks

Aleks pointed me to this recent article by Pablo Mateos, Paul Longley, and David O’Sullivan on one of my favorite topics.

The authors produced a potentially cool naming network of the city of Auckland, New Zealand. I say “potentially cool” because I have such difficulty reading the article (I speak English, statistics, and a bit of political science and economics, but this one is written in heavy sociologese) that I can’t quite be sure what they’re doing. However, despite my (perhaps unfair) disdain for the particulars of their method, it’s probably good that they’re jumping in with this analysis. Others can take their data (and similar datasets from elsewhere) and do better. Ya gotta start somewhere, and the basic idea (to cluster first names that are associated with the same last names, and to cluster last names that are associated with the same first names) seems good.
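
Here's a rough sketch of that basic idea in R, nothing like the authors' actual method, just to make the clustering notion concrete. The data frame names_df (columns first and last) is hypothetical, and the sketch assumes no string occurs as both a first and a last name:

```r
## Build a bipartite graph linking first names to the last names they appear
## with, project it onto each side, and run a community-detection algorithm:
## first names cluster together when they share last names, and vice versa.
library(igraph)

edges <- unique(names_df[, c("first", "last")])
g <- graph_from_data_frame(edges, directed = FALSE)
V(g)$type <- V(g)$name %in% edges$last   # TRUE = last name, FALSE = first name

proj <- bipartite_projection(g)
first_name_clusters <- cluster_fast_greedy(proj$proj1)
last_name_clusters  <- cluster_fast_greedy(proj$proj2)
```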

I have to admit, though, that I was amused by the following line, which, amazingly, led off the paper:

Personal naming practices exist in all human groups and are far from random.

Far from random, huh? Who’d a thunk it?

And also this:

Researchers have automatically classified the 2.5 million users of a mobile phone operator in Belgium into French and Flemish speaking communities based exclusively on the topological network structure of their 800 million phone calls and texts interactions [9]. In doing so they have demonstrated the enduring importance of linguistic and geographical barriers in the age of global mobile communications, and more importantly, that they can automatically be detected using network analysis.

OK, sure, any analysis of 2.5 million users is impressive on computational grounds alone, but . . . it’s hard to be impressed that you can automatically partition phone calls and texts from two different languages, right? It’s fine to do, but it’s hardly news that people like to talk in their own language.

This is partly what goes into the “sociologese” style of writing: a sort of flattening of affect, in which seemingly strange behaviors or findings are presented deadpan, while unremarkable observations can be touted as important.

P.S. [just added] This was just a coincidence (the above post was written about a month ago and was waiting its turn in the queue, whereas the item from yesterday was more recent), but it’s funny that I slammed economists one day and sociologists the next. I’m just full of stereotypes this week, I guess!

Economists don’t think like accountants—but maybe they should

Joseph Delaney quotes Frances Woolley:

In other words, the reason we care about inequality is that it reduces the happiness achievable from a given amount of income. How much depends upon the happiness/income relationship. Does the marginal utility of income fall rapidly? Or is the happiness from the 100,000th dollar almost as great as the happiness from the 100th?
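
To make the quoted question concrete, here's how it works out under one purely illustrative assumption, logarithmic utility (not something Woolley or Delaney commits to):

```latex
% Illustrative assumption only: utility is logarithmic in income.
\[
  u(x) = \log x, \qquad u'(x) = \frac{1}{x}, \qquad
  \frac{u'(100{,}000)}{u'(100)} = \frac{1/100{,}000}{1/100} = \frac{1}{1000}
\]
% Under this assumption, the 100,000th dollar yields about one one-thousandth
% of the happiness of the 100th dollar; a flatter utility curve would narrow
% that gap.
```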

Delaney goes on to discuss the marginal utility of income, but I have a different point to make.

Woolley is a professor of economics.

Economists seem to rely heavily on a sort of folk psychology, a relic of the 1920s-1950s, in which people calculate utilities (or act as if they are doing so) in order to make decisions. A central tenet of economics is that inference or policy recommendations be derived from the first principles of this folk-psychology model.

This just seems silly to me, as if astronomers justified all their calculations with an underlying appeal to Aristotle’s mechanics. Or maybe the better analogy is the Stalinist era in which everything had to be connected to Marxist principles (followed, perhaps, by an equationful explanation of how the world can be interpreted as if Marxism were valid).

“Income can’t be used to predict political opinion”

What really irritates me about this column (by John Steele Gordon) is not how stupid it is (an article about “millionaires” that switches within the very same paragraph between “a nest egg of $1 million” and “a $1 million annual income” without acknowledging the difference between these concepts) or the ignorance it displays (no, it’s not true that “McCain carried the middle class” in 2008—unless by “middle class” you mean “middle class whites”).

No, what really ticks me off is that, when the Red State Blue State book was coming out, we pitched a “5 myths” article for the Washington Post, and they turned us down! Perhaps the rule is: if it’s in the Opinions section of the paper, it can’t contain any facts? Or, to be more precise, any facts it contains must be counterbalanced by an equal number of inanities?

Grrrrr . . . I haven’t been so annoyed since reading that New York Times article that argued that electoral politics is just like high school. Who needs political science or economics when you can resolve all confusion with an appropriate Simpsons reference?

P.S. I’m not opposed to someone arguing for upper-income tax breaks. But couldn’t the Post have found a conservative who could make the case in an intelligent way?

That odd couple, “subjectivity” and “rationality”

Nowadays “Bayesian” is often taken to be a synonym for rationality, and I can see how this can irritate thoughtful philosophers and statisticians alike: To start with, lots of rational thinking—even lots of rational statistical inference—does not occur within the Bayesian formalism. And, to look at it from the other direction, lots of self-proclaimed Bayesian inference hardly seems rational at all. And in what way is “subjective probability” a model for rational scientific inquiry? On the contrary, subjectivity and rationality are in many ways opposites! [emphasis added]

The goal of this paper is to break the link between Bayesian modeling (good, in my opinion) and subjectivity (bad). From this perspective, the irritation of falsificationists regarding exaggerated claims of Bayesian rationality are my ally. . . .

See here for the full article, to appear in the journal Rationality, Markets and Morals.

Avoiding boundary estimates in linear mixed models

Pablo Verde sends in this letter he and Daniel Curcio just published in the Journal of Antimicrobial Chemotherapy. The authors they criticize had published a meta-analysis with a boundary estimate that, Verde said, gave nonsense results. Here’s Curcio and Verde’s key paragraph:

The authors [of the study they are criticizing] performed a test of heterogeneity between studies. Given that the test result was not significant at 5%, they decided to pool all the RRs by using a fixed-effect meta-analysis model. Unfortunately, this is a common practice in meta-analysis, which usually leads to very misleading results. First of all, the pooled RR as well as its standard error are sensitive to the estimation of the between-studies standard deviation (SD). SD is difficult to estimate with a small number of studies. On the other hand, it is very well known that the significance test of heterogeneity lacks statistical power to detect values of SD greater than zero. In addition, the statistically non-significant results of this test cannot be interpreted as evidence of the homogeneity of the results among all RCTs included.

How can you generally avoid boundary estimates of multilevel variance parameters? Using our cute little trick, implemented in blmer/bglmer in the blme package in R.
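
Here's a minimal sketch of what that looks like in practice; the data frame studies and its variables are hypothetical, and a real meta-analysis would also weight by each study's standard error, so this shows only the boundary-avoidance idea:

```r
## blmer() has the same interface as lme4::lmer() but puts a weak default
## prior on the variance parameters, which keeps the estimated between-group
## standard deviation away from the boundary at zero.
library(blme)

## A plain lmer fit can return an estimated between-study SD of exactly 0:
## fit_ml <- lme4::lmer(effect ~ 1 + (1 | study), data = studies)

## The same model with blme's default covariance prior avoids that boundary:
fit_b <- blmer(effect ~ 1 + (1 | study), data = studies)
summary(fit_b)
```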

Another Wegman plagiarism, er, copying-without-attribution, and further discussion of why scientists cheat

Copying from Wikipedia but introducing an error in the process . . . how tacky is that??

I’ll discuss another minor outrage and then consider the more general question of what motivates researchers to plagiarize and otherwise break the rules of scholarship.

If you’re gonna steal from Wikipedia, remember to preserve formatting or you might end up embarrassing yourself

John Mashey pointed me to this: