Every year, the best players (or at least many of the best players) from Major League Baseball’s American League play their counterparts in the National League in the All-Star Game. They played last night; the American League won in the 15th inning. Here’s who won, from 1965 (when I was born) to the present, with 1965 at the left and 2008 at the right.

NNNNNNNNNNNNNNNNNNANNANAAAAAANNNAAAAATAAAAAA

The “T” indicates a tie (in 2002): unlike regular games, there is no requirement that the All-Star Game continue until somebody wins, and pitchers are reluctant to pitch too many innings and potentially hurt themselves.

I was born into an era in which the National League won every game. Now, the American League wins (or, at least, doesn’t lose) every game. This is happening in a sport where even bad teams beat good teams occasionally, so it’s really mystifying. It would be possible to explain a small edge for one league or the other that persists for a few years (the league with the best pitcher will have an advantage, for example, and that pitcher can play year after year), but these effects can’t come close to explaining the long runs in favor of one league or the other. Predicting next year’s winner to be the same as this year’s winner would have correctly predicted about 80% of the games in my lifetime…and that’s if we pretend the National League won the tie game in 2002. (If we pretend the American League won it, it’s 84%.)
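For anyone who wants to check the "same as last year" figures, here is a minimal sketch that scores that prediction rule against the string above. The function name `persistence_hits` and the `tie_as` parameter are made up for illustration; the only data is the string of results as printed in this post.

```python
# Results from 1965 (left) to 2008 (right), copied from the string in the post.
RESULTS = "NNNNNNNNNNNNNNNNNNANNANAAAAAANNNAAAAATAAAAAA"

def persistence_hits(results, tie_as):
    """Score the rule 'next year's winner = this year's winner'.

    The tie ('T') is replaced by `tie_as` ('N' or 'A') before scoring.
    Returns (number of correct predictions, number of predictions made).
    """
    seq = results.replace("T", tie_as)
    hits = sum(prev == cur for prev, cur in zip(seq, seq[1:]))
    return hits, len(seq) - 1  # the first year has no prior year to predict from

for tie_as in ("N", "A"):
    hits, n = persistence_hits(RESULTS, tie_as)
    print(f"tie counted as {tie_as}: {hits}/{n} correct ({hits / n:.1%})")
```

A fair coin would make this rule right about half the time, so the interesting quantity is how far above 50% the observed fraction sits.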

What would be a reasonable statistical model for baseball All-Star games, and why isn’t it something close to coin flips?

1. Matt says:

It's fun to pronounce "NNNNNNNNNNNNNNNNNNANNANAAAAAANNNAAAAATAAAAAA".

Sorry, not at all productive. I'm not very good at time series.

2. John S. says:

Did the DH rule have something to do with this?

3. See WSJ article of July 11th. Not a statistical model, but some things to consider …

They play the same game. They pick from the same pool of players. For some reason, though, they don't get the same results.

By just about every measure, the 16 teams in Major League Baseball's National League are inferior to the 14 in the American League. The AL has won 11 of the last 16 World Series, including three of the last four. The annual All-Star Game, to be played Tuesday, has practically become a farce: Not counting a 2002 tie, the AL has won 10 straight.

Since baseball began interleague play in 1997 — where teams from the two leagues play a handful of regular-season games against each other — the AL is increasingly dominating. This year has been the second-most lopsided ever, with the AL winning 59% as of Thursday afternoon.

The plight of the NL seems rooted in a chain of events that began in 1973 when the AL adopted the designated-hitter rule — which allows for the pitchers to be replaced in the batting order by a full-time hitter who doesn't play in the field. The disparity was spurred by new ballpark construction; an unprecedented crop of young power hitters who, for various reasons, almost all fell to the AL; a series of disastrous trades and free-agent signings by NL teams; and a tradition of innovation in the AL that began in the mid-1990s with the Oakland A's….

4. efrqiue says:

Advantages will tend to stick around not just because careers of individuals will be long term, but because teams that perform well will attract more things that will tend to continue to help them perform (like stuff that generates income, which improves the resources available in all kinds of ways).

Also, demographic trends that help particular teams will tend to continue for long periods.

Without these effects one might imagine the process to be like a binomial whose logit(p(success)) moves something like a random walk, but with them I think there are a lot of forces pushing the probability away from 1/2 when it's not really close to the ends.

(when it does get too uneven for too long, there will tend to be some countervailing pushes that will work the other way though)
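A rough sketch of the simpler version of this idea, the one without the extra forces: each game is a Bernoulli draw, and logit(p) takes a Gaussian random-walk step between games. Everything here is an assumption for illustration, including the function names, the starting point p = 1/2, and the step sizes tried; it is not fitted to the actual results. The output is the average fraction of games won by the previous year's winner.

```python
import math
import random

def simulate_series(n_games, step_sd, rng):
    """Simulate winners when logit(P(AL wins)) follows a Gaussian random walk."""
    logit_p = 0.0  # assumed start: an even matchup, p = 1/2
    winners = []
    for _ in range(n_games):
        p_al = 1.0 / (1.0 + math.exp(-logit_p))
        winners.append("A" if rng.random() < p_al else "N")
        logit_p += rng.gauss(0.0, step_sd)  # drift in league strength
    return "".join(winners)

def repeat_fraction(seq):
    """Fraction of games whose winner matches the previous game's winner."""
    return sum(a == b for a, b in zip(seq, seq[1:])) / (len(seq) - 1)

def mean_repeat_fraction(step_sd, n_sims=2000, n_games=44, seed=0):
    """Average repeat fraction over many simulated 44-game histories."""
    rng = random.Random(seed)
    total = sum(repeat_fraction(simulate_series(n_games, step_sd, rng))
                for _ in range(n_sims))
    return total / n_sims

# step_sd = 0 is the pure coin-flip model; larger values let p wander further.
for step_sd in (0.0, 0.3, 1.0):
    print(step_sd, round(mean_repeat_fraction(step_sd), 3))
```

Comparing these averages with the observed repeat rate gives a feel for how much drift in p such a model would need, though it says nothing about where the drift comes from.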

5. prisonrodeo says:

The only bit of this that makes any sense is:

"…and a tradition of innovation in the AL that began in the mid-1990s with the Oakland A's…"

Teams prep for their division and league rivals. I believe that moneyball made a big difference in the relative quality of the two leagues.

Admittedly, it's not a statistical model, but we're talking about history here…

6. Barry says:

I live in Ann Arbor, and see a recurring example of this. When the University of Michigan dominated Big 10 football, it was standard to go to the Rose Bowl, and get soundly beaten. A guy from down South once figured that it was simply due to the Big 10 not being a good conference compared to the Pac 10.

This would lead to two things:
1) Being the best team from the Big 10 isn't a good predictor of being better than the best from the Pac 10.
2) (possibly the most important) The best team from the Big 10, when playing the best from the Pac 10, would suddenly be playing a competitor for which they are not prepared. And given one game, there's very little time to adjust.

Given the All-Star Game string above, it looks like something changed in the mid-1980s to end the National League's 18-year string.

7. jfalk says:

To answer your question: the model that explains this is any model with substantial autocorrelation of the errors.

8. Phil says:

Even the worst team in either league can expect to win 1/3 of its games against the best team in its league, so to suggest that the Americans dominate the Nationals due to a skill disparity in the players is to say that the gap between (the best players in the American League) and (the best players in the National League) is far bigger than the gap between the best team in a league and the worst. I'm not buying, nor even renting, this explanation. In fact, there has been inter-league play since 1997, and the record over that period is 1,387-1,317…in favor of the American League, yes, but only by 51%-49%. In short, we know that the _teams_ of the American League are very close to the _teams_ of the National League, which is hard to square with the suggestion that the best _players_ of the American League are vastly superior to their National counterparts.
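To put rough numbers on this, here is a back-of-the-envelope sketch using only the interleague record quoted above. It treats every game as an independent draw with the same win probability, which is an assumption and surely not exactly true. The 11-game streak length is read off the string in the original post (the American League wins since the last National League win, leaving out the tie), and the variable names are just for illustration.

```python
import math

# Interleague record since 1997, as quoted in this comment.
al_wins, nl_wins = 1387, 1317
n_games = al_wins + nl_wins

# Observed AL win fraction and its distance from 1/2 in standard errors,
# assuming independent games with a common win probability.
p_hat = al_wins / n_games
std_err = math.sqrt(0.5 * 0.5 / n_games)  # standard error under p = 1/2
z_score = (p_hat - 0.5) / std_err

# If All-Star games were independent draws at the interleague rate,
# the chance of the AL winning 11 decided games in a row would be:
streak_length = 11  # AL wins since the last NL win, tie excluded
p_streak = p_hat ** streak_length

print(f"AL interleague win fraction: {p_hat:.3f}")
print(f"standard errors above 1/2:   {z_score:.2f}")
print(f"P(11 straight at that rate): {p_streak:.5f}")
```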

9. jfalk says:

Sure, Phil, but teams play three-game series in which they have the ability to maneuver their pitching staffs for maximal advantage, etc. Indeed, superior teams will often not put their best team on the field in order to maximize some future advantage. Bad teams, for example, try to avoid putting their best pitcher against a good team's best pitcher, knowing that they can match him against a somewhat worse pitcher later or earlier in the series, raising their probability of winning at least one game while hurting their (already poor) chances of a sweep. I don't think you can draw the inference you're drawing from three-game series head-to-head competition. Suppose, for example, that the World Series consisted of a one-game playoff matching the best team against the worst team. Don't you think your 1/3 figure would be overstated?

10. Phil says:

jfalk, the short answer to your question "Don't you think your 1/3 figure would be overstated?" is "No, or at least, not by much." I think if the worst team in the league plays their best pitcher and fields their best players, and the best team in the league does the same, there is approximately a 30% chance that the worse team will win.

And although I've focused on win-loss record, we have a lot more data available, including the scoring in each game. In the 12 years since the last National League victory, there have been a lot of close games. Here are the scoring margins and scores of the seven closest games, thus covering more than half of the last 12 years. (And by the way, the Nationals won all three games before the start of the American run, the last of which was a 6-0 shutout).

0: 7-7
1: 7-6, 3-2, 5-4, 4-3
2: 3-1, 7-5

These scores just aren't consistent with the American League being dominant over the Nationals to such an extent that victory is nearly assured. For cryin' out loud, this year's game went 15 innings, with the score tied 3-3 for most of it — you can't really claim that the game couldn't have gone the other way (or that the Nationals couldn't have won the 7-7 tie or any of those other 1-run games)! If you think the American League is that much better than the National, then how do you explain that more than half the time, they win by 2 runs or fewer?

I seem to be arguing that the coin flip model is right, and that the long runs (first in favor of the Nationals, now the Americans) just represent a hugely improbable occurrence, but that's not what I'm saying: I think there is something that is causing the odds of an American League win to be higher than a National win. I just don't think it's the skill of the players. I think it's probably autocorrelation: recall that before the American League was winning all the games, the National League was winning even more of them! I don't think there's an intrinsically much higher win probability for the American League. Maybe the umps take bribes, maybe the league that has been losing (the American when I was young, National now) wilts under pressure, I dunno, that's why I'm asking. But it's not just player skill, of that I'm sure. The American League might be better, but they're not THAT much better.

11. luke says:

the DH rule might factor in, in the sense that the pitchers from the National League are under more strain in the regular season.

my inclination, however, is money. i don't have all the figures handy, but i know the Red Sox and the Yankees (American League teams) are the two highest-spending teams in MLB. in that case, they tend to buy the best pitchers, best hitters, and such.

12. Ubs says:

I agree with Phil. Disparity in talent level isn't enough to explain the streaks. Unless it's just a huge coincidence, there's something else going on that we haven't identified. I think we need to look at ways in which the All-Star Game isn't like a normal game.

A few observations:

(1) The All-Stars really aren't the best players in their leagues. The procedure for selecting players gives a lot of weight to popularity, with various weird voting biases. Even the players who are chosen don't all play. A few might decline for one reason or another.

(2) The teams in the All-Star Game don't try as hard to win as in a regular game. This is more true on the team level than the individual level. There's a lot of non-strategic swapping of players in and out for the sake of giving everyone a chance to play.

(3) The National League has 16 teams to the American League's 14. On the one hand, this makes for a larger pool to draw from, which you might expect would yield a better collection of top players. On the other hand, the requirement that each team provide a player means the NL will have a slightly larger number of players who wouldn't have made the team otherwise.

(4) The roster in an All-Star game is considerably larger than in a regular game. Combined with the desire to give everyone a chance to play, this means each player plays a lot less. Among other things, this makes a DH type player more valuable than in the regular season. Even if the home field is an NL park, no pitcher is ever going to hit, and plenty of hitters will get only one at-bat and never have to play defense.

13. Josh says:

If you think about the way All-Star games generally go, the starters are usually all gone by the 5th or 6th inning, and the late innings are played by the second or third teamers on each side. If one league is two or three deep with real stars and the other isn't, then the difference in player quality during the crucial innings could be very large. The same goes for the comparison between the 6th, 7th, 8th best pitchers in each league.

In addition, in regular season games teams are much more likely to stick with a shaky pitcher. Even if it might mean a lower chance of winning that one game, the cost of burning through the entire bullpen is too severe. This isn't a factor in All-Star games. You're thus much less likely to see big innings, probably the single biggest (somewhat) random element favoring the inferior team.

Add all that up, and I don't think it's too hard to believe that player skill would be much more dispositive in All-Star games than in the regular season.

14. Eric says:

Maybe it's the parks?

15. JohnP says:

I'd love to say it was binomial but I think it's more likely that finances and closers are the disparities in the All Star game.

Take a pitcher like K-Rod who had noticeably lost mph on his fastball. It's in his best interest to sign with an NL team. At this point the best closers are in the AL, because the AL needs them and is willing to pay more for them.

The closer's role now represents a serious shift from late-60s, early-70s baseball. At that point in time no one was paying a one-inning pitcher 7 figures, let alone the 8 that some AL and NL teams will pay now for a closer. But these guys are sorta worth it. They come in, and for the most part are successful at shutting down the opposition. I'm not saying the NL closers are horrible. I believe the All-Star orders are relatively similar. I contend that the late-inning rotations are better now and because of this the AL is better suited to win games even if they are close.