Phil Harris and Bruce Smith write:
We have only recently chanced upon your blog while we were looking for responses to the “decline effect” in medical and scientific studies. We were especially taken by your comment that there is something wrong with the scientific method “if this method is defined as running experiments and doing data analysis in a patternless way and then reporting, as true, results that pass a statistical significance threshold.” To us, that seems to be standard operating procedure in much of social science, including our own field of education. Indeed quasi-experimental designs are the stock in trade of those who attempt to use “science” — we dare not say haruspicy, but you can if you like — to influence the course of public policy.
Now, a new entrant into the cherry-pickers' sweepstakes seems to have emerged: a report on Charter School Performance in Indiana Schools. We are by no means professional statisticians or data analysts, but we have some background in the area and have been long-time skeptical consumers of this kind of research. We were impressed by the clarity of your explanations of the problems with the “decline effect,” and we were wondering if you might give us and others who follow your blog an opinion on this latest effort to make small differences carry great weight.
We’re troubled by a number of features of the “design” of this Indiana study, as well as by the weight given to some pretty small differences with some pretty small numbers (in the cases where we can even tell what the numbers are). For instance, when you take a sample of roughly 7,500 (84% of the total) and divide it into grade levels and then further divide it by subject (reading or math), you can see the size of the sample dwindle before your eyes. We’re also troubled by the effort to standardize the scores of charter students and “virtual” students in an effort to make direct comparison possible. And we wonder just how large a fractional gain in the standard deviation would have to be to be meaningful (we’re avoiding “significant”) in these comparisons. We’ve noted a few other problems, and we’re sure you’ll see more than we do. In the document we’re attaching, there is a link to the “technical report” [I think they're talking about this---ed.] which you will probably want to examine as well. It seems to us that technical reports and public releases are sometimes only distant kin.
My reply:

The key step in the analysis is the matching of students in the charter schools to comparable students in the public schools. If you think there’s a problem here, I’d guess it would be in the matching: that the students in the two groups are not truly comparable. The variables they match on seem reasonable, but maybe there are some important factors they’re missing. If so, perhaps there’s some way to measure (directly or indirectly) the differences between the groups.
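To make the matching concern concrete, here is a toy version of greedy 1-to-1 nearest-neighbor matching on observed covariates. This is a minimal sketch with simulated data; the two covariates and the greedy algorithm are my assumptions for illustration, not necessarily what the report's authors did:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical covariates (e.g., prior test score, a demographic index),
# standardized. 50 charter students, 500 potential public-school matches.
charter = rng.normal(size=(50, 2))
public = rng.normal(size=(500, 2))

# Greedy 1-to-1 matching: each charter student takes the closest
# still-available public-school student (Euclidean distance).
available = np.ones(len(public), dtype=bool)
matches = []
for i, x in enumerate(charter):
    d = np.linalg.norm(public - x, axis=1)
    d[~available] = np.inf  # already-used matches are off the table
    j = int(np.argmin(d))
    available[j] = False
    matches.append((i, j))
```

The worry the letter raises is exactly what this sketch cannot fix: if an important covariate is left out of `charter` and `public`, the matched pairs can look comparable on paper while differing systematically on the omitted factor.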
Regarding your other points:
- A difference of 0.05 standard deviations is not huge, but it’s not trivial either. In other studies, teacher effects have been estimated to be in the range of 0.15 standard deviations. It seems plausible to me that charter schools have slightly better teachers and thus do a better job of educating students.
- I’m not so worried about the researchers breaking the data into subgroups. I agree that they aren’t getting much out of the subgroup analysis—basically, they have a large main effect and then they see similar effects when they break up the data into pieces—but there’s no real harm done either.
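To give a feel for what a 0.05-standard-deviation difference looks like, here is a quick simulation computing the standardized mean difference (the sample sizes and distributions are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated test scores with a true 0.05-sd charter advantage.
public = rng.normal(0.0, 1.0, 10_000)
charter = rng.normal(0.05, 1.0, 10_000)

# Standardized mean difference (Cohen's d with pooled sd).
pooled_sd = np.sqrt((public.var(ddof=1) + charter.var(ddof=1)) / 2)
d = (charter.mean() - public.mean()) / pooled_sd
```

With samples this large the estimate sits close to the true 0.05; with a few hundred students per grade-by-subject cell, the same estimator bounces around a good deal, which is the letter-writers' "dwindling sample" worry in a nutshell.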
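The subgroup point can also be illustrated by simulation: if there is one common effect, the per-grade estimates are just noisier versions of the overall estimate, so reporting them does little harm but adds little information (the grade range and effect size here are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6_000

grade = rng.integers(3, 9, n)        # grades 3 through 8
is_charter = rng.integers(0, 2, n)   # 1 = charter student
# One common effect of 0.05 sd across all grades.
score = 0.05 * is_charter + rng.normal(0.0, 1.0, n)

# Overall effect estimate (difference in means).
overall = score[is_charter == 1].mean() - score[is_charter == 0].mean()

# Per-grade subgroup estimates of the same quantity.
by_grade = {
    g: score[(grade == g) & (is_charter == 1)].mean()
       - score[(grade == g) & (is_charter == 0)].mean()
    for g in range(3, 9)
}
```

The subgroup estimates scatter around the main effect with larger standard errors; that is what "similar effects when they break up the data into pieces" looks like when the underlying effect is common.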