The $900 kindergarten teacher

Paul Bleicher writes:

This simply screams “post-hoc, multiple comparisons problem,” though I haven’t seen the paper.

A quote from the online news report:

The findings revealed that kindergarten matters–a lot. Students of kindergarten teachers with above-average experience earn $900 more in annual wages than students of teachers with less experience than average. Being in a class of 15 students instead of a class of 22 increased students’ chances of attending college, especially for children who were disadvantaged . . . Children whose test scores improved to the 60th percentile were also less likely to become single parents, more likely to own a home by age 28, and more likely to save for retirement earlier in their work lives.

I haven’t seen the paper either. $900 doesn’t seem like so much to me, but I suppose it depends where you stand on the income ladder.

Regarding the multiple comparisons problem: this could be a great example for fitting a multilevel model. Seriously.
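To give a flavor of what I mean, here is a toy simulation (all numbers made up for illustration; this is a sketch, not an analysis of the actual study). Instead of testing each teacher comparison separately, a multilevel model partially pools the noisy per-class means toward the grand mean, with the amount of shrinkage determined by the variance components:

```python
# Hedged sketch: partial pooling of per-teacher estimates, the multilevel
# alternative to many separate comparisons. All parameters are invented.
import random

random.seed(1)

J, n = 40, 20             # 40 teachers, 20 students per class (made up)
tau, sigma = 0.5, 2.0     # sd of true teacher effects; sd of student noise

# Simulate true teacher effects and the resulting noisy class means.
true_effects = [random.gauss(0, tau) for _ in range(J)]
class_means = [
    sum(random.gauss(effect, sigma) for _ in range(n)) / n
    for effect in true_effects
]

grand_mean = sum(class_means) / J
# Shrinkage weight on each class's own mean: tau^2 / (tau^2 + sigma^2/n).
w = tau**2 / (tau**2 + sigma**2 / n)
pooled = [grand_mean + w * (m - grand_mean) for m in class_means]

def rmse(estimates):
    """Root mean squared error against the true teacher effects."""
    return (sum((e - t) ** 2 for e, t in zip(estimates, true_effects)) / J) ** 0.5

print(f"shrinkage weight:        {w:.2f}")
print(f"RMSE of raw class means: {rmse(class_means):.3f}")
print(f"RMSE partially pooled:   {rmse(pooled):.3f}")
```

The point of the sketch: the raw class means overstate how different the teachers are, and the partially pooled estimates recover the true effects with smaller error, which is exactly the behavior that addresses the multiple-comparisons worry.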

9 thoughts on “The $900 kindergarten teacher”

  1. At the risk of opening up old wounds: "Q: How much is a good kindergarten teacher worth? A: If you have to ask, you can't afford one."

    (c.f. http://www.stat.columbia.edu/~cook/movabletype/ar… )

    In any case, I've heard about this new Kindergarten result a few times (on the radio, etc.), but I suspect it just means that certain positive outcomes/advantages in Kindergarten correlate well with certain other advantages that will help you for the rest of your life too. E.g. supportive parents, a community that values education, easy access to nutritious food, and so on.

    -Ken

  2. ^ It's the same study. I don't know why multiple comparisons would be an obvious problem – income is exactly what economists want to look at.

  3. I agree with John. It's not clear why this is a multiple comparisons problem. The obvious first choice is income vs randomized class size. They find positive effects there. They then look at all the standard teacher characteristics, given randomized teacher assignment, and find pretty uniform positive results.

  4. Ken: They are relying on an experiment that assigned peers, teachers and class size randomly in kindergarten. This gets confused because some reporters also summarize their descriptive results.

  5. I have seen the paper. They use the randomization in the Tenn. STAR experiment plus the universe of IRS tax records to look at the earnings of students randomly assigned to better/worse classrooms. The identification is credible, and there is no multiple comparisons problem: income is income. They also show the plausible channel for the effect: non-cognitive skills that are not picked up by later tests.

  6. This sounds credible. I teach in the public school sector and can tell a dramatic difference between students who've started in the school system and students who have moved into it. I'm actually working with the school system now to validate some predictive assessments. I believe my next project will be to find a significant transfer age for students graduating/not graduating in the school system. I'm sure there's some significance with those who've received scholarships etc.

  7. The problem, I suspect, is that they are defining a "good kindergarten teacher" using a value added method. If the average child in the class went up ten percentile points in test score from the beginning of the year to the end of the year, that teacher is defined as a good teacher and the success in life of those students is attributed to the teacher.

    Perhaps, but I wonder how much randomness is involved. We're talking about measuring students as they enter kindergarten. How accurate is the test? How much random flux is there in the score? We're dealing with young, pre-literate children. Test results aren't that stable or predictive the younger you go.

    They then test the kids again at the end of the school year. Say a kid scores at a higher percentile. Did the teacher accomplish that? Or do we now just have two tests of how smart the kid is, and we should probably weight the second one more heavily because the kid is older and more mature? If the kid scores higher on the second attempt, isn't it likely the first score was something of an underestimate?

    I suspect that both teacher quality and more accurate test results play a role here. I don't know how to divvy up their weight, however.

  8. Well, anecdotally, from my experience with my kids in elementary school over the past several years, most testing is a joke and close to meaningless. Most "achievement" tests consist of 20 to 25 multiple-choice questions. So, yes, there is probably a great deal of randomness involved, especially because they actively teach test-taking skills and try to teach kids how to make better guesses. What's worse is that it's not just teacher effectiveness that gets measured with some of those tests; important decisions are made about what "track" to place children on or whether a child qualifies as 'GATE'. At least a teacher being measured in this manner benefits from 20 to 30 different students being factored into her score; the kids get no such benefit.

    By the way, if you want to have fun, try to convince an elementary school teacher that the conclusions he or she has drawn about her students based on a 20-question test might be completely wrong. That an 'A' student might have had a bad day and only gotten 15 out of 20, while a bad student might have been unusually lucky and scored 19 out of 20.
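The randomness the last two commenters describe is easy to quantify with a binomial calculation (my own illustrative numbers, not from the comments): on a 20-question test, pure sampling noise moves a score by about two questions, and even a student who truly knows 90% of the material scores 15/20 or worse a few percent of the time.

```python
# Hedged illustration of luck on a 20-question test; the 70% and 90%
# "true ability" figures are assumptions for the sake of the example.
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): k or fewer correct out of n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n = 20
# Standard deviation of the score for a student who truly knows 70%:
sd_questions = (n * 0.7 * 0.3) ** 0.5
# Chance a student who truly knows 90% still scores 15/20 or worse:
p_bad_day = prob_at_most(15, n, 0.90)

print(f"sd of a 70% student's score:    {sd_questions:.2f} questions")
print(f"P(90% student scores <= 15/20): {p_bad_day:.3f}")
```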
