Education could use some systematic evaluation

David Brooks writes:

There’s an atmosphere of grand fragility hanging over America’s colleges. The grandeur comes from the surging application rates, the international renown, the fancy new dining and athletic facilities. The fragility comes from the fact that colleges are charging more money, but it’s not clear how much actual benefit they are providing. . . .

This is an unstable situation. At some point, parents are going to decide that $160,000 is too high a price if all you get is an empty credential and a fancy car-window sticker.

One part of the solution is found in three little words: value-added assessments. Colleges have to test more to find out how they’re doing.

I agree with that last paragraph. Eric Loken and I said as much in the context of statistics teaching, but the principle of measuring outcomes makes sense more generally. (Issues of measurement and evaluation are particularly salient to statisticians, given that we strongly recommend formal quantitative evaluation in fields other than our own.)

I don’t have anything to add on the substance (beyond again expressing my agreement on the desirability of empirical measures of student performance) but I do want to hypothesize on the sources of Brooks’s doomy impressions.

After all, on first impression, top public and private colleges and universities are doing well, and there’s a lot of demand for their services. Just for example, when I was growing up in the D.C. suburbs, it was my impression that lots of students who were in the middle of the pack academically in high school could graduate and go to the University of Maryland in nearby College Park. In the decades since, the University of Maryland has become more competitive, reflecting the increasing demand (not matched by increasing supply) for high-quality college education.

So why is Brooks so sure that universities are in trouble? Why paint their current success as an “it’s always brightest just before the dark” situation (to borrow the words of Jim Thompson), rather than offering the more conventional picture of universities as a shining success?

I see three reasons. I don’t know that any of these are conscious on Brooks’s part, but I suspect they all went into his reasoning.

1. Politics: Brooks is a political conservative. Universities are bastions of liberalism, so it is pleasant for Brooks to see them as struggling institutions in need of radical change.

2. Personal experience: Brooks has worked all his adult life at newspapers, and newspapers twenty years ago were where universities are now. Newspapers were making tons of money (from the business press, I recall that “Wall Street” was demanding 15 percent annual returns, and newspapers were delivering); in many cases they were close to local monopolies (consider the Washington Post), successful both financially and in the sense of doing their job well (delivering lots of news). But there was a sense that this was all going to end. Not too many people were starting new newspapers, which was a bad sign, and many people were (correctly) worried that the social and economic basis for newspapers was disappearing. We no longer need to spend that quarter a day to read Peanuts, Art Buchwald, and the story of the latest Redskins game, and advertisers can reach us from other directions.

In short, Brooks has seen it happen in slow motion right in front of him, so he’s primed to see the same pattern of gradual and then panicked decline elsewhere.

3. Economics: Universities are indeed doing well intellectually and financially, but much of that comes from government support. Consider three leading sectors of the economy in the past twenty years: education, health care, and government (including the military). What do they all have in common? Guaranteed or near-guaranteed flows of tax money.

Just for example, one small contribution to the prominence of Columbia University is my research and this blog. I have time to do all this (rather than, for example, spending 40 hours a week grading papers) partly because I have millions of dollars of government grants. (And the granting agencies give extra funding to the university, so my grants also help support the work of my Columbia colleagues who are not externally funded.) I think this is a good use of tax dollars—but of course I’d say that, just as Gen. Ripper supports the use of taxation to pay for expensive bombers. My point here is not to argue the merits of the case, just to point out that much of the financial success of universities relies on public funding for research, student loan guarantees, etc.

14 thoughts on “Education could use some systematic evaluation”

  1. Option 4:
    David Brooks finds a bunch of articles (which, no offense, he is not sufficiently trained to evaluate), bombards his readers with this information, and reaches a conclusion that suits his interests or will get clicks.

    And this is coming from an avid Brooks reader.

  2. Expanding on your second point: Technology. Early days, but it seems to me that projects like http://www.udacity.com/ represent a significant threat to fixed-location, elite institutions, at least in some disciplines. The NYT appears to be following the story closely, I’m sure with their own business interests in mind– http://nyti.ms/Ai9B07. I also thought the article in Wired was pretty good: http://www.wired.com/wiredscience/2012/03/ff_aiclass/.

    • Dale-

      I doubt it. Unless udacity and friends come up with an equivalent means of testing mastery of material, these online schools won’t threaten the elite institutions. The obvious example is higher math: You can read pure math, but it’s not the same as actually doing it and being examined on it. Multiple choice is not as good as asking students to derive a solution by hand. I think this would hold for all math starting as early as Lin. Alg.

      Also, I start to feel a little uncomfortable when it’s reported that an “elite education” can be had simply online. The experience of going through a grueling “elite college” does not seem like something that can be sufficiently well replicated online.

  3. Those of us in public universities (hello, California!) would dispute the “guaranteed or near-guaranteed flows of tax money.”

    The challenge with assessment of higher education is that curricula vary enormously (in contrast to high school). How are you going to test the relative learning of a pre-med, a business major, a graphic design student, and a physics major? Given the widely divergent goals of those majors, does it even make sense to?

  4. Is there evidence that (lifetime wage premium for a college degree) – (size of student loan debt) has dropped over the last few decades? That would be a good reason for worry, since one interpretation would be that the universities are moving from education to rent-seeking. Yes, he didn’t say that directly, but from the discussion among conservative economics bloggers for the past couple of years, it seems to be widely believed, and I’m sure he’s got that idea in the back of his head when he writes about the state of higher education.

  5. The three little words “value-added assessments” frighten me a great deal, in particular because I know how hard this is to do correctly (even without the inevitable political overlay). To start, one would need to define what the value of the education is, and that will almost certainly include hard-to-assess concepts such as motivation, curiosity, and creativity. Even if we stick to content domains, this can be difficult. Take introductory statistics as an example: do we require an understanding of credibility intervals or just confidence intervals? Is Bayes’ theorem taught in the intro stat class or only in more advanced classes? How much emphasis should be placed on identifying and dealing with outliers? It would be extremely difficult to come to a consensus on any of those; the process in K-12 education (where the content is much better defined) has taken decades.

    Once you define the “value,” the assessment part is a lot of work in and of itself. This is especially problematic for small classes, as many of the statistical procedures used to calculate the reliability and validity of an assessment require large sample sizes. This might be possible for the intro stat course, but it would be difficult for my seminar on Bayesian networks. It becomes even harder for subjects like bassoon performance, which are just plain hard to quantify.

    The third problem is that “value-added” is usually estimated with a conditioning model, and different modelers make different choices about which variables to condition on when setting the baseline. The choice of model can make a big difference in the estimate of the “value added,” and the estimates also tend to be unstable from year to year (the toy simulation at the end of this comment illustrates both problems). Is it really a good idea to base high-stakes decisions on models with such high levels of uncertainty? Another problem is that the estimates coming out of these models are often called “teacher effects” when really they are classroom effects, with large peer-to-peer and teacher-to-peer effects mixed in. Sorting that out takes a complex model, which again leads to model uncertainty.

    The experience with value-added modeling in K-12 has not been uncontroversial. There has been a tendency to focus on things that are easy to measure (Reading and Math) rather than subjects that are hard to measure (e.g., Art and P.E.) or non-academic skills that may be as important as the academic ones (e.g., motivation and study habits). Even when there has been an acknowledgment that the assessments do not measure the entirety of the construct of interest, there has been a tendency to emphasize the things that are easy to measure over the things that are important. A second problem is that producing high-quality measures is expensive, and that expense has for the most part been diverted from funds that would otherwise have gone to direct instruction.

    It is also not true that university education goes unevaluated. I remember going through a large and painful process of reaccreditation last year, in which we documented a large number of things about our academic program. It was a lot of meta-work: work that is about the real work rather than the real work itself. A certain amount of meta-work is good in that it promotes introspection, which leads to improvement. But it can reach a point at which the meta-work becomes more time-consuming and important than the work itself, and the whole enterprise collapses in on itself.

    Finally, as has been pointed out elsewhere (I can’t find the link), much of the rise in tuition at public universities has mostly gone to offset cuts in tax support for those universities. I think a major problem here is that our educational methods don’t have economies of scale: if I have larger class sizes, I need to spend more time grading papers and less time doing other work (like the research necessary to support graduate students). Thus, to serve more people we need more seats in the universities, but the money required to hire new faculty just isn’t there.
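
    A minimal toy sketch of those two problems, in Python: everything here (the numbers, the classroom/SES/peer structure, the numpy-only least-squares fit) is an illustrative assumption of mine, not any real value-added model. It simply fits the same regression with and without an SES covariate and compares the resulting classroom “effects” within one year and across two years.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    n_classrooms = 20
    students_per_class = 25            # small classes -> noisy classroom estimates

    # Persistent "teacher" component: what we would like value-added to recover.
    true_teacher = rng.normal(0, 2, n_classrooms)

    def simulate_year(rng):
        """One year of data; columns are (classroom, pre score, SES, post score)."""
        peer = rng.normal(0, 2, n_classrooms)        # year-specific peer/classroom component
        class_ses = rng.normal(0, 1, n_classrooms)   # classrooms differ in average SES
        rows = []
        for c in range(n_classrooms):
            pre = rng.normal(50, 10, students_per_class)
            ses = class_ses[c] + rng.normal(0, 0.5, students_per_class)
            post = (0.8 * pre + 3.0 * ses + true_teacher[c] + peer[c]
                    + rng.normal(0, 8, students_per_class))
            rows += [(c, pre[i], ses[i], post[i]) for i in range(students_per_class)]
        return np.array(rows)

    def value_added(data, adjust_for_ses):
        """OLS of post on pre (optionally SES) plus classroom dummies; returns centered dummy coefficients."""
        c = data[:, 0].astype(int)
        covariates = data[:, 1:3] if adjust_for_ses else data[:, 1:2]
        dummies = np.eye(n_classrooms)[c]            # classroom indicators (they absorb the intercept)
        X = np.hstack([covariates, dummies])
        beta, *_ = np.linalg.lstsq(X, data[:, 3], rcond=None)
        effects = beta[covariates.shape[1]:]
        return effects - effects.mean()              # "value added" relative to the average classroom

    year1, year2 = simulate_year(rng), simulate_year(rng)

    va_y1_simple   = value_added(year1, adjust_for_ses=False)
    va_y1_adjusted = value_added(year1, adjust_for_ses=True)
    va_y2_simple   = value_added(year2, adjust_for_ses=False)

    print("same year, two conditioning models:",
          round(float(np.corrcoef(va_y1_simple, va_y1_adjusted)[0, 1]), 2))
    print("same model, two consecutive years:",
          round(float(np.corrcoef(va_y1_simple, va_y2_simple)[0, 1]), 2))
    ```

    The classroom-level SES term is what makes the two conditioning choices disagree, and the year-specific peer term is what makes the estimates move around even though the underlying teacher component never changes; a real analysis would have to model both, which is exactly the complexity and uncertainty described above.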

  6. Why are American and English universities so much more prestigious today than Continental European universities, when 100 years ago French, Italian, and (especially) German universities were comparably famous? Why do rich families in China want to send their scions to Harvard or Oxford instead of to Gottingen, Parma, or the Sorbonne?

    I think the most fundamental reason is: because we won the War.

    There’s a more subtle reason inside the general prestige issue of who won. The famous old Continental universities, as bastions of elitism, were neutered in the name of anti-Fascism after WWII, turned into giant open-admissions non-elitist institutions. Because America and Britain won, however, we kept our elitist institutions and only made them even more elitist. Practically the only famous American college to go the Continental route and convert to open admissions during the 1960s was CCNY, and we all know what a disaster that was.

  7. “The exact numbers are disputed, but the study suggests that nearly half the students showed no significant gain in critical thinking, complex reasoning and writing skills during their first two years in college.”

    The value-added testing that Brooks is talking about is done on things that the colleges don’t explicitly teach. Students still learn the actual content of the classes taught. Brooks is really complaining that colleges don’t improve students’ g-loaded skills. He is setting the schools up to fail.

    • I’m not sure I know what “critical thinking” in the abstract means. I might be able to define a construct of critical thinking in statistics, covering the ways we reason about data (e.g., how the logic of a hypothesis test or of estimation works). I’m not sure critical thinking is possible or even useful to define in the abstract.
