Grade inflation: why weren’t the instructors all giving all A’s already??

My upstairs colleague Blattman writes:

The trend is unsurprising. Schools have every incentive to move to the highest four or five piles [grades] possible. . . . Then grade inflation will stop because . . . there will be nowhere to go. . . . So why resist the new equilibrium?

I don’t have any argument for resisting, but I don’t think everything’s quite so simple as Chris is saying.

First, you can easily get compression so that almost everyone gets the same grade (“A”). So, no four or five piles.

Second, the incentives for grading high have been there for a while. To me, the more interesting question is: why hasn’t grade inflation gone faster? Why would it take so many decades to reach a natural and obvious equilibrium? Here’s what I wrote last year:

As a teacher who, like many others, assigns grades in an unregulated environment (that is, we have no standardized tests and no rules on how we should grade), all the incentives point toward giving only A’s: When I give A’s, students are happier and complain less, I get to feel like a nice person, and I give my own students (whom I generally have somewhat warm feelings toward) a benefit in their future lives. Back when I used to organize a class with several different section leaders, each instructor wanted to give his or her students higher grades. We had common assignments and a common final exam; even so, each instructor had a reason why his or her students deserved some exemption from the grading cutoffs.

So the real question is, why have grades been going up so slowly? I assume that back in the 1940s, a prof couldn’t really just give all A’s to his or her classes: someone would probably notice and say something. But now we really can, and it’s been that way for a while.

The fact that profs don’t give all A’s, even though they can, is interesting to me. My explanation for this behavior is as follows: college professors typically got high grades themselves in college. Getting high grades is part of how we defined ourselves when we were students. So, now that we’re giving out the grades, we don’t want to devalue this currency. It’s not a matter of self-interest–if I give out a bunch of A’s to my students, it’s not going to retroactively tarnish my college grade-point average. Rather, I think it’s just that profs see grades as important in themselves. Sort of like rich people who don’t want to debase the currency, just as a matter of principle.

I remember looking at grading records for undergraduate classes back when I taught at Berkeley in the early 1990s. There was lots of variation in average grades by instructor, even for different sections of the same class. I didn’t do a formal study, but I remember when flipping through the sheets that average grade seemed to be correlated with niceness. The profs who were generally pleasant people tended to give lots of A’s, while the jerks were giving lower grades. Again, no standardized tests so no way to judge whether the average grades were informative, but I doubt it.

At the institutional level, these problems with grades would be fixed using standardized tests or with some sort of statistical correction such as proposed by statistician Val Johnson, who writes:

There are two approaches that might be taken in reforming our grading system. The first is to encourage faculty to modify their grading practices and adhere to a “common” grading standard. The second is to make post-hoc adjustments to assigned grades to account for differences in faculty grading policies.

The beauty of Val’s approach is that it does three things:

1. By statistically correcting for grading practices, Val’s method produces adjusted grades that are more informative measures of student ability.

2. Since students know their grades will be adjusted, they can choose and evaluate their classes based on what they expect to learn and how they expect to perform; they don’t have to worry about the extraneous factor of how easy the grading is.

3. Since instructors know the grades will be adjusted, they can assign grades for accuracy and not have to worry about the average grade. (They can still give all A’s but this will no longer be a benefit to the individual students after the course is over.)
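To make the idea of a post-hoc adjustment concrete, here is a minimal sketch in Python. It is not Val’s method, which this post does not spell out; it is a much cruder stand-in that simply subtracts each section’s mean grade from the grades given in that section. The records, the section labels, and the function name `adjust_for_section` are all made up for illustration, and the sketch assumes that sections draw comparable students, which is a strong assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (student, section, grade points on a 4.0 scale).
records = [
    ("ann", "lenient", 4.0),
    ("bob", "lenient", 4.0),
    ("cat", "lenient", 3.7),
    ("dan", "strict", 3.3),
    ("eve", "strict", 2.7),
    ("fay", "strict", 2.0),
]

def adjust_for_section(records):
    """Subtract each section's mean grade from the grades given in it.

    This is a simple centering adjustment chosen for illustration,
    not the statistical correction proposed by Val Johnson.
    """
    by_section = defaultdict(list)
    for _, section, grade in records:
        by_section[section].append(grade)
    section_mean = {s: mean(g) for s, g in by_section.items()}
    return {
        student: round(grade - section_mean[section], 2)
        for student, section, grade in records
    }

adjusted = adjust_for_section(records)
for student, value in adjusted.items():
    print(student, value)
```

Under this toy adjustment, a 3.3 from the strict grader ends up above a 4.0 from the lenient one, which is the sense in which giving all A’s stops being a benefit to the individual student.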

36 thoughts on “Grade inflation: why weren’t the instructors all giving all A’s already??”

  1. Another, rather different approach: Don’t have grades, just a pass/fail. A pass/fail line is much easier to define, hold and compare than a number of grades above the pass line. Most of my courses as an undergraduate in Sweden were pass/fail only, or with an extra “pass with distinction” that didn’t actually make a material difference to your transcript.

    • Disagree. The concept of passing or failing seems more subjective than saying “you scored in the bottom 2/5 of your class, therefore you get a C.” Also, for me it is personally difficult to assign a student a failing grade, suspect same for others.

  2. In my ideal teaching world, each student is in the class in order to learn as much as he or she possibly can, and I am there to help. There are no grades at all. If a student decides not to learn anything, that’s just his or her loss. I assign homeworks and give exams only to teach the material, not to test it.

    That’s the last paragraph from my grading policy, at http://www.cs.umb.edu/~eb/114/grading.html

    • That’s an ideal situation for a teacher, you take no responsibility at all for the learning.
      But what about an employer who is expecting that a graduate has actually learned how to read, write, compute and think?

  3. Andrew,

    The link before the last one is not working. I think it should be this:
    http://www.dukechronicle.com/article/johnson-questions-sincerity-grade-inflation-discussion

    It is interesting to read that variation across sections of the same course happens in places such as Berkeley.

    I also like the first approach suggested by Val Johnson. I may be naive, but creating a grading standard (rubric?) and then performing some kind of quality assurance (e.g., a peer reviews a random sample) could work. Although, it is likely that the statistical correction may be cheaper.

    Francisco

  4. Not an expert, but in Canada, there often seems to be the opposite challenge of stopping Profs from giving low marks especially in the first year. There is one university, in particular, where more than a few alumni dissuade their own kids from going there for that reason (the shock of 1st year mark deflation).

    Maybe the students were not mature enough to accept an important improvement that Val made available. Maybe Val should have taken the methods to funding agencies who surely will put better allocation of scarce research resources above their organizational/personal concerns ;-)

  5. I suppose this depends heavily on the idea that profs will respond to incentives and not be over-generous with grades. Otherwise, there’s the issue that grade compression is “lossy,” i.e., if the only grades handed out are A and A-, it is difficult to deflate the grades using any automatic scheme, as presumably _some_ of the A’s should actually have been A’s. And it isn’t obvious that grade compression is always bad; for instance there are various upper-level electives, special topics courses, etc. where the students self-select, and because of the self-selection there isn’t _really_ a wide distribution of abilities, they’re all peaked at the high end. (So there might be a perverse incentive 2′, for good students to take courses where they’re much smarter than average, and/or deliberately seek unpleasant profs, to maximize the chances that their A’s won’t be downgraded.)

  6. Grade inflation is very easy to fix. Both my undergraduate institution and the business school where I TAed during my PhD have a strict rubric. X + or – a few percent As, + or – a few percent B, etc. Very small classes have some limited leeway.

    • That doesn’t “fix” anything – a percentage score is meaningless without knowing the difficulty of the questions, even on something like a math test where the assignment of marks is fairly unambiguous.

  7. There is a pretty simple solution to the grading inflation problem: report not only the grade, but also the median grade for the group (either for the current course or over several incarnations, if the course has been roughly the same for a while).

    If you always report both, as if they are glued together, this is an incentive for both teacher and student to counteract or avoid inflation. Suppose you see that someone had an A/A, this looks very different (and less impressive) than an A/C. With an A/A, everybody knows that the teacher gives everybody a high grade, and also knows that the grade contains no real information about the student. (NB: Precisely for those reasons, both the students and the teachers are usually against this procedure.)

    • This is actually a really awesome idea! If this became standard, ambitious students would gravitate towards the hardest classes, because an A/C- would look so impressive on their transcripts. Another advantage is that though this system seems novel, anyone can understand a final transcript with an overall 3.74/2.87 GPA. If I were an administrator, I’d really be pushing to standardize this sort of grade reporting.

      • Except that if the hard class is only taken by the best students, they’ll all score A/A and will look the same as the “gimme” class down the hall. All these forms of grade normalization are fine as long as your populations are the same, but get demolished by selection bias for everything beyond the general classes that everyone takes.

    • Columbia actually does something like this with some classes on its transcripts. They report the percent of the class receiving A-, A and A+.

      So you might see something like:

      A/25% (well done! only a quarter got As)
      A/ (blank, this class doesn’t report)
      B/50% (ouch, half the class got As, but not this person)
      A-/16%

    • The notion that somehow our grade distributions should correspond to a normal curve is unfair and discriminatory to both students and teachers. It takes no account of the varied populations of each course, nor of the teacher’s commitment to innovative and effective pedagogical methods – including, for example, the use of clear behavioral objectives, active learning techniques, rubrics, and criterion grading. Some teachers are actually committed to student success and do things to ensure it. Why should these teachers and students be punished because they are in a class that is widely successful? It is high time that professors who take pride in handing out grades on a normal curve are called to account for their role in failing to help all students learn. Admittedly, students are ultimately responsible for their own success, and they sometimes fail in spite of our best efforts. But sometimes high rates of success are the mark of excellent teaching and support, and should be recognized and emulated, not denigrated with arbitrary statistical measures. Otherwise, what’s a teacher for? (A computer can hand out grades on a normalized distribution.)

  8. Not all of this is necessarily grade inflation, as the student population, competition for seats, grading of peers, and depth and range of knowledge have all changed considerably over the years.

  9. “Sort of like rich people who don’t want to debase the currency, just as a matter of principle.”

    But of course rich people don’t want you to debase the currency because it would devalue their pile of money. A better example would be poor people with fixed-rate debt who still dislike inflation because of the principle.

  10. Personally, I think the letter-grade boundaries are more of a problem. You get this odd effect where only a few marks separate a first from a 2:2, and grading tends to cluster at the thresholds. I’ve never seen why we give letter grades or degree classifications rather than numbers. If you really are worried about distinguishing different A-grade students, well, Ms. 91 and Mr. 69 + exit velocity are obviously different.

    In the UK, there’s also a gap between how the letter grades are constructed and how they are used. In theory, a GCSE or A-level is meant to be a qualification with varying grades, and this is how they are statistically reported. In practice, anything less than a C might as well be a fail. Similarly, a third or, god forbid, a pass degree might as well be a fail. (To be honest I think I’d be more ashamed to get to the end of the course and get a terrible result than flunk out – it suggests you didn’t even realise how badly you were doing.)

    • So many posts on this blog presume detailed understanding of U.S. circumstances (not to mention New York or Columbia U) that it’s bizarre to see a post that presumes detailed understanding of U.K. university degree classes and secondary school (high school) examinations.

      For those curious, a little explanation may help. The rest of this post assumes reference to the U.K. and more specifically England.

      1. GCSE is an examination widely taken at age 16 (sometimes earlier). A-level is an examination less widely taken at age 18 (sometimes earlier), especially by most of those with aspiration to university entrance.

      2. British universities use Honours classes, which are (from the top) 1 (First), 2:1, 2:2, 3 (Third). Below that there are pass or ordinary degrees, and people who leave the university system with nothing, or possibly a Certificate or Diploma. This system goes back to the 19th century at least.

      3. It is generally true now that individual pieces of work are marked on a percentage scale, so the degree classification is a reduction of lots of percent marks. Typically the outcome is that (provided nothing is Failed) an average of 70+, 60+, 50+, 40+ maps to First, 2:1, 2:2, Third. However, there are variations between universities beyond that in how far any lousy marks are ignored or downweighted, and complicated rules typically kick in to the benefit of students with results near the margins. Thus someone with 69 average but better marks in final (usually third, sometimes fourth) year than previously would be regarded more favourably than the converse (this is what “exit velocity” means).

      4. Typically a student is provided with a transcript with their individual module percent marks, but this remains private. So that is finer detail than a single degree class, typically about 6 marks for each year of study.

      I think the best explanation for why a percent-based system is not universal is that only recently have many humanities and qualitative social sciences moved fully away from a letter grade system for individual assignments. I refer to marks assigned during a course. I encountered (or think that I recall) a few decades ago systems in Britain that started from A, B, C (or alpha, beta, gamma) but with one or more of the following extras

      a. borderlines are allowed and (e.g.) B/A differs from A/B

      b. minor nuances allowed, indeed expected, with one or more pluses or minuses

      c. uncertainty could be signalled by ?

      Thus the geography teacher who once gave me B++(?+) was ahead of his colleagues in physics and chemistry in being explicit about measurement error.

      In practice these grading systems were thus often just as nuanced as fully numeric systems.

  11. Pingback: Assorted links

  12. Does a statistical averaging penalize students wanting to take more interesting or difficult courses? If I know that a course is difficult or interesting, then it will be harder to get an A because smarter students will take the course. Conversely, if I know a course has few pre-reqs and is taken by many people or if I have a background that will make the course easier for me, I’d be better off taking that course and getting an A. As people will likely only see my GPA, then there is a real disincentive to taking a more challenging course.

    While that argument is usually bandied about by engineering or science grads to play up their major, the concern is very real. Why would a philosophy major take computational logic, as opposed to an easier philosophy course? Why would a math major take a sociology course, when she can take econometrics?

    I tend to favor the pass/fail, if only because it leaves the individual with the initiative to make the most of his/her education. A student can’t play games with courses and is forced to be creative in distinguishing himself/herself. Hopefully, there is a correlation between intellectual independence and the type, self-actualization and the ability to successfully draw attention to oneself.

    • “Why would a philosophy major take computational logic, as opposed to an easier philosophy course? ”

      Why should computational logic be a particularly hard course? It has very definite right answers, which always made that sort of course easy for me.

  13. > I didn’t do a formal study, but I remember when flipping through the sheets that average grade seemed to be correlated with niceness. The profs who were generally pleasant people tended to give lots of A’s, while the jerks were giving lower grades. Again, no standardized tests so no way to judge whether the average grades were informative, but I doubt it.

    It’s been argued for a while that the increasing bureaucratization post-WWII of academia, and the clampdown on tenure and pay in the last 2 decades, have had the effect of massively selecting for people who are high on Big 5 personality factors Conscientiousness & Agreeableness (and so, less for raw IQ etc). Because those who are low on Conscientiousness burn out on the drudgework from undergrad to grad to postgrad to (maybe) tenure-track & don’t churn out papers regularly, while the low on Agreeableness are bad at networking and self-sabotage by rocking boats.

    Could this explain it?

    It seems quite testable: first, look at current professors & adjuncts to see if those high on Agreeableness are also the easy scorers; second, look for historical personality datasets for professors (they must exist) which also have surviving grade information permitting easiness inference, and see both whether there are fewer highly Conscientious & Agreeable faculty and whether the correlation with easy grading survives.

  14. The nice professors often enjoy lecturing and their students learn more. The jerks give bad grades and hate to lecture. So it’s a feedback effect as well.

  15. I really like the idea of having either the median or the distribution of grades on the transcript. I think it has multiple benefits. One, it helps counter the challenge students face in taking a more math based course with harder grades vs. a liberal arts course.

    Second, it should help those who are at the top of their class who studied liberal arts. There are employers who no longer look at B.A. applicants, but if someone can really show they are a standard deviation better than the rest of the B.A. pool, they may be able to break through that barrier.

  16. I went to a great uni for undergrad, but there was no grade inflation. Class averages were (and are) always reported together with the marks and generally ranged from C to B except in some higher level courses. However, entry cutoffs and scholarship or research positions never account for this. Getting B+ and A- across numerous different fields at a world class university is worth less to most cutoffs for these purposes than As at a middling school that gives As to everyone. I think reporting class averages should be standard. Doesn’t stress me now because things worked out, but boy did this piss me off for a while because I was a single grade away from the minimum cutoff for lots of stuff that I wanted to do, and once you meet the cutoff you can say “see, microbiology, political theory, environmental sciences, agriculture, chemistry, international relations, economics and more … these are the classes that the average came from, and I managed to work above average when working with the best” …. but no, a cutoff is a cutoff, where literally a single wrong answer on some test was the difference between meeting the cutoff or not. Unless you want to learn lots and have to compete hard, I would truly recommend going to a crappy school because fakely high marks will, for altogether too many purposes, do a lot more for you. No regrets though. Learning is so much more interesting when you’re surrounded by geniuses :)

  17. As a student at NYU Stern, we are graded on what they call the “Stern Curve”…

    Basically each course has a cap on the number of A’s that can be handed out at 35% of the students. That is at the MBA level, it is more strict for undergrads and in finance courses.

    As a student I appreciate this, because I don’t want to see students who put in half the effort that I do to get the same reward. However it does create a deflationary effect when our GPAs are judged against those from other business schools.

    • “A cap on the number of A’s that can be handed out at 35% of the students.”

      What a ridiculously arbitrary standard! Why not 36.7 or 33.3 or 39.2? The grade distribution under this system then, doesn’t reflect students’ real learning, but rather the “needs” of the grade distribution. Once again, the efforts of committed teachers to help everyone in the class excel are ignored and denigrated. And so are the efforts of students. You say “As a student I appreciate this, because I don’t want to see students who put in half the effort that I do to get the same reward.” But what happens when you put in the same effort as other students – and achieve the same result (it’s not just about effort, after all) – only to get a lower grade than the others because you’re beyond the arbitrary 35%! This makes no sense. Grades should reflect actual performance. When there’s a clear objective and a solid rubric for evaluating it, then anyone who achieves excellent performance should be recognized. And if everyone’s performance is excellent – or most, or a lot – shouldn’t all those who earned A’s get A’s???

  18. I try to make the case that at the graduate level, the only two grades should be A and Incomplete. One has either learned and (more or less) mastered the material covered in the course, or else one has not completed the work and does not deserve a grade.

    I do not often succeed with my argument.

  19. That declining period starting in the late ’70s exactly matches the years when I was being graded in school. My experience suggests the limit on grade inflation is the generational backlash that happens when a particularly cohesive cohort (Boomers) exits the system. They then “pull the ladder up behind them”, and reform grading while a more individualistic, less organized, group (Generation X, can I get a shout-out?) is getting grades. Then another gregarious generation appears, and inflation resumes.

  20. If we assume that, in the past several decades, we have actually learned how to teach students better, would we expect changes in the average grade awarded, even if we held our standards constant? I think grade inflation is a serious area worth studying, but I really don’t think the issue is as simple as many want to make it out to be.

  21. Speaking of non sequiturs, this is definitely not a uniform trend. Is it known (or knowable) why some of the shorter trends were seen? Was the very fast increase of the late sixties due to pressure to not have students fail and get drafted? The mild decline in F’s compared to the swapping of A’s and C’s argues against that. Was the decline in the 80’s due to cultural relativism wars?

  22. Pingback: Grade Inflation in U.S. Colleges | Decisions Based on Evidence

  23. I agree with Aaron about “having either the [median] or the distribution of grades on the transcript”. An alternative or supplement is to post the student’s rank in each course taken. If the professor gives 10 A’s and 5 B’s and does not make finer distinctions, each “A” would correspond to a rank of (1-10)/15 and each B to (11-15)/15.
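
Two of the transcript ideas from the thread, the grade/median pairing from comment 7 and the rank ranges from comment 23, are simple enough to sketch in Python. This is only an illustration: the 4.0 grade-point mapping, the class roster, and the names `POINTS`, `rank_ranges`, and `grade_with_median` are assumptions, not anything a commenter or registrar specified. The sketch uses the low median so that the reported median is always a grade someone actually received.

```python
from statistics import median_low

# Assumed letter-to-points mapping (a common 4.0 convention).
POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}
LETTER = {points: letter for letter, points in POINTS.items()}

def rank_ranges(grades):
    """Map each letter grade to the tied rank range it occupies in the class."""
    n = len(grades)
    ranges = {}
    next_rank = 1
    # Walk the letters from best to worst, handing out blocks of ranks.
    for letter in sorted(set(grades), key=lambda g: -POINTS[g]):
        count = grades.count(letter)
        ranges[letter] = f"({next_rank}-{next_rank + count - 1})/{n}"
        next_rank += count
    return ranges

def grade_with_median(own_grade, grades):
    """Pair a student's grade with the class's (low) median grade."""
    class_median = median_low(POINTS[g] for g in grades)
    return f"{own_grade}/{LETTER[class_median]}"

# The example from comment 23: 10 A's and 5 B's.
class_grades = ["A"] * 10 + ["B"] * 5
print(rank_ranges(class_grades))
print(grade_with_median("A", class_grades))
```

With this roster an A is reported as A/A, the case comment 7 describes as carrying little information, and the rank ranges match the ones worked out by hand in comment 23. Neither report addresses the selection-bias objection raised in the replies.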
