Will Tiger Woods catch Jack Nicklaus? And a discussion of the virtues of using continuous data even if your goal is discrete prediction

I know next to nothing about golf. My mini-golf scores typically approach the maximum of 7 per hole, and I’ve never actually played macro-golf. I did publish a paper on golf once (A Probability Model for Golf Putting, with Deb Nolan), but it’s not so rare for people to publish papers on topics they know nothing about. Those who can’t, research.

But I certainly have the ability to post other people’s ideas. Charles Murray writes:

I [Murray] am playing around with the likelihood of Tiger Woods breaking Nicklaus’s record in the Majors. I’ve already gone on record two years ago with the reason why he won’t, but now I’m looking at it from a non-psychological perspective. Given the history of the majors, how far above the average _for other great golfers_ does Tiger have to perform?

Here’s the procedure I’ve been working on:

1. For all golfers who have won at least one major since 1934 (the year the Masters began), create 120 lines: one for each Major for each year from the year the golfer turned 20 through the year he turned 49. Here’s the draft I use to explain this:

In trying to estimate how well golfers do after their mid-thirties, we’re not interested in the numbers of wins per attempts, but the number of wins divided by the numbers of majors that occurred during the remaining years when a professional golfer might plausibly win a major championship. Seve Ballesteros is an example of why this distinction is important. Ballesteros won five major championships, the last at age 31. Then he developed back problems, and his ability to win majors (or any other tournament) was effectively ended by the time he was in his late thirties. But golfers tend to develop physical problems as they age. The failure of a championship golfer to compete in major championships is an extremely good indicator of not being able to win if he had competed.

Therefore the data are based on the total number of majors that occurred from the year that the golfer turned 20 (when a golfer might plausibly win a major championship, given that Tiger Woods did so at 21) through the year that the golfer turned 49 (a plausible upper bound, because Julius Boros won a PGA championship at age 48). Operationally, this means that the database contains 30×4=120 lines for each subject over the entire course of his career. Years that were age-eligible but predate 1934 or postdate 2012 are deleted from the database.

For the analysis, I will look at the results for several subsamples. The one that appeals to me a priori consists of golfers who were born after the beginning of 1910 (a way of defining modern golf–it barely gets in both Hogan and Snead, the earliest golfers who intuitively seem to belong) and won at least two Majors (winnowing out the flukes). From now on, that’s what I mean by “the sample.”

2. Create a binary variable WIN for each line scored 0 if the subject did not win and 1 if he did.

3. Create a variable FLOORAGE that is the floor of the age of the subject at the time the tournament occurred.

4. Create SUCCESSRATE for each FLOORAGE. In Stata, I created this with

tabstat win if majors>=2 & born2>=3654,by(floorage) stat(mean sum count)

These were the results:

floorage mean sum N

20 0 0 159
21 .0061728 1 162
22 .0116959 2 171
23 .0282486 5 177
24 .0277778 5 180
25 .0552486 10 181
26 .0540541 10 185
27 .0597826 11 184
28 .048913 9 184
29 .0382514 7 183
30 .0621469 11 177
31 .0584795 10 171
32 .0982659 17 173
33 .0738636 13 176
34 .0594595 11 185
35 .0860215 16 186
36 .0376344 7 186
37 .0326087 6 184
38 .048913 9 184
39 .0434783 8 184
40 .0274725 5 182
41 .0222222 4 180
42 .0115607 2 173
43 .0180723 3 166
44 .0060976 1 164
45 .0060976 1 164
46 .0062893 1 159
47 0 0 154
48 .0065789 1 152
49 0 0 80
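
For readers without Stata, the tabulation in step 4 can be sketched in Python. This is only an illustration of what `tabstat ... by(floorage) stat(mean sum count)` computes. The function name `success_rate_by_age` and the `(floorage, win)` row layout are assumptions, not Murray's actual data structure, and the rows are placeholders built to reproduce two lines of the table above.

```python
from collections import defaultdict

def success_rate_by_age(rows):
    """Tabulate (mean, wins, n) of WIN for each FLOORAGE.

    rows: iterable of (floorage, win) pairs, with win coded 0 or 1.
    """
    wins = defaultdict(int)
    counts = defaultdict(int)
    for floorage, win in rows:
        wins[floorage] += win
        counts[floorage] += 1
    return {age: (wins[age] / counts[age], wins[age], counts[age])
            for age in sorted(counts)}

# Placeholder rows, built only to reproduce two lines of the table above
# (age 21: 1 win in 162 lines; age 36: 7 wins in 186 lines).
rows = [(21, 1)] + [(21, 0)] * 161 + [(36, 1)] * 7 + [(36, 0)] * 179
table = success_rate_by_age(rows)
for age, (rate, total, n) in table.items():
    print(age, round(rate, 7), total, n)
```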

So among that sample, at age 36, we have an n of 186, 7 wins, for a SUCCESSRATE of .0376. I think my shaky grasp of probabilities tells me that the probability of one of these golfers winning at least one major at age 36 was 14.2%. But that grasp doesn’t extend to calculating the probability of winning an aggregate number of majors over several years. Specifically, in the case of Tiger, I’d like a way of expressing his relationship to the sample from age 20-33 (from his debut to the Thanksgiving catastrophe), and a way of expressing how far above the experience of the sample he has to perform to get at least 5 more Majors from now through age 49 (5 is the number he needs to break Jack’s record).

Any thoughts on how to do what I [Murray] have planned and what might be a better strategy would be appreciated.
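
The two probability calculations Murray asks about can be sketched as follows. The 14.2% figure is 1 - (1 - .0376)^4, the chance of at least one win in four majors at the age-36 rate. For the aggregate question, one simple option is to treat every major from age 36 through 49 as an independent win/lose trial at the sample's rate for that age and build up the distribution of total wins. The independence assumption, the choice of four majors in every one of those years, and the function names are mine, not Murray's. The result describes a typical golfer from the sample, not Tiger.

```python
# (wins, N) by FLOORAGE for ages 36-49, copied from the table above.
wins_n = {
    36: (7, 186), 37: (6, 184), 38: (9, 184), 39: (8, 184), 40: (5, 182),
    41: (4, 180), 42: (2, 173), 43: (3, 166), 44: (1, 164), 45: (1, 164),
    46: (1, 159), 47: (0, 154), 48: (1, 152), 49: (0, 80),
}

def prob_at_least_one(rate, n_events=4):
    """P(at least one win in n_events independent tries at a common rate)."""
    return 1 - (1 - rate) ** n_events

def wins_distribution(probs):
    """Distribution of the number of wins over independent trials with
    possibly different success probabilities (a Poisson-binomial),
    built one trial at a time. Returns dist where dist[k] = P(k wins)."""
    dist = [1.0]
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1 - p)      # this major is lost
            new[k + 1] += mass * p        # this major is won
        dist = new
    return dist

rates = {age: w / n for age, (w, n) in wins_n.items()}
p36 = prob_at_least_one(rates[36])

# Assumption: four majors per year, each an independent trial.
probs = [rates[age] for age in sorted(rates) for _ in range(4)]
dist = wins_distribution(probs)
expected_wins = sum(probs)
p_five_or_more = sum(dist[5:])

print(f"P(at least one major at 36) = {p36:.3f}")
print(f"Expected majors, ages 36-49 = {expected_wins:.2f}")
print(f"P(5 or more majors, ages 36-49) = {p_five_or_more:.4f}")
```

Comparing `p_five_or_more` for the sample with the same calculation run on rates scaled up by some factor is one way to express "how far above the sample" a golfer would have to perform.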

Here are my suggestions, as a statistician:

1. Don’t just look at “majors.” There’s information in every tournament that a player competes in. By looking only at a small subset, you’re just reducing your sample size. To put it another way: it’s fine to make inference about that subset but it’s best to use all available data to fit the model that you’ll use to make those inferences. If you want, you can throw in a predictor for the importance of the tournament, although I doubt that will be necessary.

2. Don’t just look at win/loss. Instead I think it makes more sense to code the player’s position (rank among the winners, or maybe score compared to the top score in the tournament). Again, even if all you care about is predicting wins, you’ll do better to include more information in your analyses.

Points 1 and 2 (which are really just two cases of the same general point) are surprisingly (to me) difficult for people to grasp. Even respected researchers will, for example, study elections by looking at win/loss rather than vote share. There is a logical appeal to this sort of “reduced-form” model but all the logical appeal in the world pales beside the imperative of data.
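
A toy simulation makes the point concrete. Suppose each tournament produces a continuous "margin" for a player (say, his score relative to what was needed to win), and he wins when the margin is above zero. We can estimate his win probability either from the win/loss record alone or by fitting a distribution to the margins and reading off the tail. Everything here is a made-up setup for illustration: the normal distribution, the parameter values, and the function name `simulate` are assumptions, and the advantage of the continuous approach depends on the fitted model being roughly right.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def simulate(n_tournaments=50, n_reps=2000, mu=-1.5, sigma=1.0, seed=1):
    """Compare two estimators of P(win), where win means margin > 0.

    Returns (true_p, rmse_binary, rmse_continuous).
    """
    rng = random.Random(seed)
    true_p = 1 - NormalDist(mu, sigma).cdf(0.0)
    sq_err_binary = sq_err_continuous = 0.0
    for _ in range(n_reps):
        margins = [rng.gauss(mu, sigma) for _ in range(n_tournaments)]
        # Discrete approach: observed fraction of wins.
        p_binary = sum(m > 0 for m in margins) / n_tournaments
        # Continuous approach: fit a normal to the margins, take the tail area.
        fitted = NormalDist(mean(margins), stdev(margins))
        p_continuous = 1 - fitted.cdf(0.0)
        sq_err_binary += (p_binary - true_p) ** 2
        sq_err_continuous += (p_continuous - true_p) ** 2
    return true_p, sqrt(sq_err_binary / n_reps), sqrt(sq_err_continuous / n_reps)

true_p, rmse_binary, rmse_continuous = simulate()
print(f"true P(win) = {true_p:.4f}")
print(f"RMSE using win/loss only = {rmse_binary:.4f}")
print(f"RMSE using margins       = {rmse_continuous:.4f}")
```

In this setup the estimate based on margins has the smaller error, even though the target is a discrete outcome.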

17 thoughts on “Will Tiger Woods catch Jack Nicklaus? And a discussion of the virtues of using continuous data even if your goal is discrete prediction”

  1. There is an interesting issue with your suggestion, Andrew. There are those who believe that (a) majors are different from other tournaments, and (b) winning is different from almost-winning. (a) is to some extent trivially true, since the fields are considerably stronger than those for the average tournament. It’s easy enough to adjust for that, as you mention, but I think such an adjustment is necessary and not at all superfluous. (b) is the “Johnny Miller” school of thought: that there are lots of guys who can score really well but lack the “intangibles” necessary to win. I’m not a proponent of this theory at all, but it’s why people model wins, not tournament ranks. Lee Westwood has been a top 5 player in the world by any reasonable measure for at least 7 years, but he’s “lost” 28 majors in that time. I think it’s just bad luck, but the Johnny Millers of this world disagree with me.

    The Murray analysis above is largely inadequate because it takes a sample of golfers who won two or more majors. While this is a very talented group of golfers, it is not nearly as talented as the group of players who won ten or more majors. Unfortunately, that group contains only three players; one of them is Woods and another didn’t win a major after 1929. I’m just not sure that including the great-but-not-godlike golfers who only won three or four majors tells you much about Tiger Woods.

    Overall, though, modeling major rank as a constrained variable with a distribution taken from some empirical dataset adjusted for aging effects sounds like a promising strategy.

    • Jonathan, can you give me a few cites for analogous analyses to help me into the literature?

      As for the sample, I will of course replicate the analysis for a variety of subsamples of Majors winners. But I’m not at all sure that Woods’s appropriate comparison group is Nicklaus any more. I think the changes in his psyche since 11/09 put him closer to the average of those who won a smaller number of Majors. At 36, Nicklaus was still making eerie proportions of 8-foot putts with everything on the line, something that Woods used to do at least as well, and that he appears unlikely to do ever again (the history of golf has vanishingly few cases of aging players whose putting stopped being magical and then returned to being magical).

      • Well, I’m not sure I have a great set of cites on this, but the recent Connolly and Rendleman paper http://mba.tuck.dartmouth.edu/pages/faculty/richard.rendleman/docs/WHAT%20IT%20TAKES%20TO%20WIN%20final_005.pdf gives a bottom-up method for doing this that I find very persuasive. All you need to do (!) is append an aging function onto it and I think you could use this method. Closer to what I was proposing is Croson, Fishman, and Pope in Chance, 1988, comparing stability of rankings in golf and poker.

        More to the point, though, I think your psychological analysis of Tiger Woods is Kahneman-Tversky pattern-matching at its most misleading. Tiger Woods has been intensely studied, and the stories of his swing remakes and psychological problems are heuristically available, so we have a strong tendency to make his success follow those narratives. I’m just not sure they exist statistically. Just to take a simple example: if you estimate a Poisson regression with major wins per year as the dependent variable (ignoring the fact that it can’t ever go higher than 4) and simply estimate a time trend, you get almost no significance and Tiger’s mean expected major wins decline by a gigantic 0.02 per year. (He’s now down to an expected 0.6 majors per year, down from the 1 major per year he had in 1995 in the Poisson model.) Looking at that curve, I’d give him a mean of about 5 majors in the next ten years of golf. This analysis breaks Andrew’s rule of using more data rather than less, but when I tried a ranking model I got positive dependence on time, not negative! (But similarly insignificant statistically.)
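
A Poisson regression with a time trend of the kind described in this comment can be sketched in a few lines. The model is log(mu_t) = a + b*t, where mu_t is the expected number of major wins in year t, fitted by Newton's method on the Poisson log-likelihood. The year window (Woods's complete professional seasons, 1997 through 2011) is my assumption; the comment appears to start from 1995, so the numbers will not match its figures exactly. The function name `fit_poisson_trend` is also just illustrative.

```python
import math

def fit_poisson_trend(years, wins, n_iter=50):
    """Fit log(mu_t) = a + b * (year - first year) by Newton's method.

    Returns (a, b, se_b), where se_b is the usual asymptotic standard
    error of the slope from the inverse information matrix.
    """
    ts = [y - years[0] for y in years]
    a, b = math.log(sum(wins) / len(wins)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(a + b * t) for t in ts]
        # Score (gradient of the log-likelihood).
        g0 = sum(y - m for y, m in zip(wins, mu))
        g1 = sum(t * (y - m) for t, y, m in zip(ts, wins, mu))
        # Information matrix entries.
        h00 = sum(mu)
        h01 = sum(t * m for t, m in zip(ts, mu))
        h11 = sum(t * t * m for t, m in zip(ts, mu))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b, math.sqrt(h00 / det)

# Woods's major wins per season, 1997-2011 (14 in total).
years = list(range(1997, 2012))
wins = [1, 0, 1, 3, 1, 2, 0, 0, 2, 2, 1, 1, 0, 0, 0]

a, b, se_b = fit_poisson_trend(years, wins)
fitted = [math.exp(a + b * (y - years[0])) for y in years]
print(f"slope per year (log scale) = {b:.3f}, standard error = {se_b:.3f}")
```

Comparing the slope with its standard error gives a rough sense of whether the trend is distinguishable from zero. As the comment notes, the model ignores the cap of four majors per year.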

        • Numerous slips in typing the previous comment (c’mon Andrew! implement editing!) but you’ll find the Chance article in 2008 (Vol 21, #4, pp 25-28), not 1998.

      • Hi Charles. I am an amateur statistician and sports enthusiast so I too think of these issues. I agree with all who say to not just focus on majors winners, although it really won’t make that much of a difference because each sample size is probably large enough. The reality, however, is that both Jack and Tiger have won about the same percentage of majors as they have over all tournaments. I would be very surprised if that were not true overall. That is, if we looked at the gross winning percentage (total tournament wins of all majors winners divided by total of all their tournaments) of all majors winners it would be statistically the same as that group’s total majors winning percentage.

        I think your premise of Tiger’s mental state causing him to be less likely to win majors (as opposed to other tournaments) is virtually impossible to demonstrate, and I am more inclined to believe injuries and body wear are catching up to him as they do all golfers (even as I admit to thinking the same as you about his mental state). If you assume his mental state has changed just for majors, you might as well pick a number out of a hat to guess his probability of winning 5 more majors. If you assume his mental state has changed for all tournaments, then you also may as well pick a number out of a hat. You cannot model that which you cannot measure. The reason is you are presupposing an unseen exogenous event (his mental state’s change) which does not appear in your or any sample set. So if you want to model his probability, that must be ignored.

        What are the odds of Tiger winning 5 more majors, all after the age of 36? Ben Hogan won all 9 of his majors between age 35-40 (when you could only play 3 a year at times as British Open overlapped with PGA). He lost the famous “Jack Fleck” Open at Olympic at age 43 or so. Nicklaus won 5 between 35-40 and 1 at 46. Gary Player, one of 5 who won 9 majors after 1934, won 4 after age 35. I may be missing some, but I believe these are the only 3 guys to have won that many after age 35 (I am giving Tiger the benefit of the doubt). The point of this is that your problem is in the “Outlier Zone” and not subject to normal statistical methods, which are relatively plain vanilla to calculate.

        I believe you are making this way too complex. The odds of any golfer winning 5 majors after age 36 are simply 2 divided by the total number of golfers who ever played (adjusting for minimum number of appearances). When Ben Hogan was playing, his odds, based on history, were very close to zero, except he was known as the best golfer not to have won a championship. But the Bayesian adjusted question becomes, what are the odds of a player who has already won 14 majors winning 5 after the age of 36? Well, no one has ever been in that position before. Therefore, we must resort to something like the following question: of all the players who won at least 5 majors in their lifetime, how many have won 5 after the age of 35? Fourteen have won 5 and 2 have won 5 after age 35. One has won 18 and he won 6 after age 35. Nicklaus and Tiger had very similar overall winning percentages at their peaks, always between 25% and 35%.

        My point is we have enough samples in history so that we do not need to use relatively complex methods of inference. We have all the data. You can pick the odds, but I will say 50%.

        • Continuation of previous post—a more traditional approach.

          One could measure various rates of decline post age 36. But since Woods has the highest number of Major wins, still, at this young an age, whose rate of decline do you use as a comparative sample set? I could argue one should use the best golfers at the age of 36 as a proxy. What does rate of decline mean? One could go crazy here because there are many possibilities to create rate of decline standards. Is it winning percentage? Is it average score per round? Is it average score per round adjusted by course played? Average finish in a tournament adjusted for course? Who are the best golfers at age 36? Any golfer in any year who is 36 and ranked in the top 25? The top 50? Or is it all golfers? These are all judgment calls to make to try and make a prediction of the number of Tiger majors.

          Statistically, we need to estimate Tiger’s winning percentage and his presumed rate of decline (Hogan was best from age 35-40, so among the greats there are anomalies). We already know that Tiger’s projected number of majors is very likely higher than 18, unless we assume an enormous rate of decline relative to other great players. Having said that, given his recent missing years, his rate of decline probably looks accelerated.

          The group I would compare him to is any golfer who was ranked in the top 25 for at least 3 years from age 36 on. I am making an assumption that the better golfers have a similar rate of decline, plus these golfers are likely to have played the longest. This will also capture guys who did well after 40 but poorly in their late 30s and give us a large enough sample. I would then create an age/tournament finish regression with these golfers from age 36 on to retirement. We would expect to see this average finish decline over time (we could adjust by using the same tournaments each year), along with number of wins per year, and so on up to age 49. What we would measure is the absolute total number of wins over time for this cohort, rather than winning rate, to factor in early retirements due to poor performance or injury. We could use absolute number of top 10 finishes and compare that over time to get more data, but we are looking for wins and there should be enough data to make that meaningful. No reason not to do both. We also could have Charles resample Majors to test his hypothesis.

          For example, pretend 15% of all tournaments per year are Majors. That is about 26-27 tournaments overall to measure per year. That is about 400 total tournaments for a 15 year period. We would resample 60 tournaments (15% of 400), say, 10,000 times. This would create an expected distribution of the mean number of wins for 60 tournaments out of 400 for this predefined cohort of players. Then we can compare that to the number of actual wins in the 60 majors by this cohort. I would be surprised if it fell outside of one standard error, with no expected bias in either direction. I believe if one did that for all Majors at all ages we would find the same result. But back to Tiger and his decline rate.
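
The resampling check in the paragraph above can be sketched like this. Each of the 400 tournaments is coded 1 if a cohort member won it and 0 otherwise; we repeatedly draw 60 tournaments at random without replacement and record how many the cohort won. The outcome vector below is a placeholder (120 cohort wins out of 400 is a hypothetical number, not real data), and the names `resample_win_counts` and `z_score` are illustrative.

```python
import random
from statistics import mean, stdev

def resample_win_counts(cohort_won, n_draw, n_reps=10_000, seed=0):
    """Draw n_draw tournaments without replacement, n_reps times, and
    return the number of cohort wins in each draw."""
    rng = random.Random(seed)
    return [sum(rng.sample(cohort_won, n_draw)) for _ in range(n_reps)]

# Placeholder outcomes: 400 tournaments, 1 = won by a cohort member.
# The figure of 120 cohort wins is hypothetical.
cohort_won = [1] * 120 + [0] * 280

counts = resample_win_counts(cohort_won, n_draw=60)
center, spread = mean(counts), stdev(counts)

def z_score(observed_major_wins):
    """How far the cohort's actual wins in the 60 majors sit from the
    resampled distribution, in units of its standard deviation."""
    return (observed_major_wins - center) / spread

print(f"resampled mean = {center:.2f}, standard deviation = {spread:.2f}")
```

With real data, `z_score` applied to the cohort's actual number of major wins would show whether majors look different from a random 60 of the 400 tournaments.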

          The rate of decline curve for absolute wins may not be sufficient as it has a floor of zero. It may overstate rate of decline for better players relative to great players as the absolute win method would simply hit a wall. Perhaps simply rate of decline of number of top ten finishes is better. We could create a rate of decline curve for the whole time period, or use subsets of time for different but sequential curves. We can also look at the residuals for each player, relative to the average of all players, to create a standard error of decline, adjusted for number of tournaments each plays.

          Then we could apply this average rate of decline to Tiger’s current average win rate (say for last 50 tournaments) and project forward. We can also resample his win rate in majors compared to his overall win rate, but again I predict no statistical significance.

          We could also compare Tiger’s climb rate with that same cohort from age 21. Or use all players 36 and over who were ranked in the top 25 at least 3 times prior to age 36. Perhaps his climb rate is better than theirs and we might be able to project that his decline rate will also be better. Again, we would use the same tournaments each year and the same cohort every year. If we change who is in the cohort, we lose the impact of how a typical golfer changes through time. We need to factor in the probability that individual golfers can self-destruct (e.g., David Duval, Johnny Miller).

          But what would this kind of analysis really accomplish? It may be useful as a barebones outline of a general arc of a golfer’s career (Bill James used to do this for baseball players. He found out that the younger a player entered the league, the more likely his arc would rise faster, last longer near the top, and decline slower. Perhaps the same is true for golfers.). But if we really are just looking at Tiger Woods and the probability he will win 5 more majors we can look at his projected arc and almost any reasonable set of assumptions will likely say 60% (because he has 14 at the same age as Jack had 12).

          If we start putting subjective judgements into the equation, then in this instance it really is not inferential statistics but an educated guess using assumptions. My educated guess using assumptions is if he wins one in the next two years he breaks the record; he needs to get the ball rolling. But I still prefer my previous note.

  2. Andrew, the Majors are radically different from other tournaments for two reasons: the strength of their fields (others have only a portion of the top players), but most of all because the top golfers want to win them so much more intensely than any other tournament. Many top players design their schedule for the entire year in an effort to peak on the weeks of the Majors. Add to that the choke factor in golf, which is arguably greater than in any other sport (the individual, standing in the open all by himself, after lots of time to think about what he is about to do, drowning in adrenalin in a sport where adrenalin is problematic). Jim Furyk’s duck hook on a late hole of the final round last Sunday, which cost him the tournament, is a classic example: a mistake he probably doesn’t make more than once a year, maybe less than that, in any other tournament. Luck is an issue (Greg Norman lost two majors solely because his opponent holed out long shots from off the green), but the win is the decisive variable for the issue I’m addressing: how far above other Majors winners will he have performed if he wins at least 5 more majors?

  3. The points. Majors are different, particularly the US Open: the courses are set up to be considerably more difficult (much greater penalty for inaccuracy of shot) than standard PGA events; the winning score tells it all. Thus, relying on winning, overall, tells one little about the probability of winning a Major. Among the majors, the Masters is held on the same course, sort of. After Tiger torched it, it was modified in ways to *explicitly* hinder Tiger. The Masters isn’t a Major by skill demand, only convention. The PGA varies in locale, while the British Open rotates among a (relatively) fixed number of courses, and prior to Palmer (and the 707 making it easier to get there) making a big deal of it, wasn’t played all that much by top tier American golfers. The British Open courses are more difficult for American golfers, often, because the courses are very different from American, and require a different style of play.

    Tiger has been (in)famous, especially among some professional golfers, for tailoring his schedule to fit the Majors, thus presumably giving him an advantage. Whether that advantage sustains from now on is another question not dealt with by historical data.

    In all, this is not a stat problem at all (redundant grammar, I know) with the given data. It is about relative physical/psychological factors at play (so to speak) in future. I was living in Pennsylvania when Tiger blew away the Masters, and a friend, an avid golfer and bankster, allowed as how Tiger was very unique and would win as no one before him. Smart guy, that bankster. Eventually, Tiger decided to mess with his swing, and it was downhill from there. Thanksgiving or not, by then he was spraying his drives all over Hell’s Half Acre (it’s only in the last few tournaments that he’s stopped over-swinging). If you look here: http://en.wikipedia.org/wiki/Tiger_woods#Major_championships you can see that 8 were won early on. He’d already slacked off by 2003 (or 2002, depending how you’re counting).

    The factors determining winning golf tournaments, like income or wealth, can only be measured relatively. There is no absolute measure of any aspect, unlike degrees Kelvin; previous wins alone tell one nothing material. If they did, 2003-2004 would be a bit different. So, in order to “bet” on Tiger beating Jack one merely needs to know one thing (not doable through stats, IMHO): is Tiger materially better than The Field and The Course at each Major for the next decade? Right now, clearly he is not. There are clearly young players with very good skills, although perhaps not so many older players to challenge him. In order to understand the meaning of his 1999-2002 record, one needs to quantify that relative quality and then establish how that quality will change over the next decade or so; the given data doesn’t do that nor do I have any idea whether such data can be constructed. My bankster friend believed that Tiger was far and away better than his peers, but never had anything to say about the quality of The Field during that time.

    In sum, to model the winning probability one needs to know:
    – relative skill level of Tiger versus The Field in previous events (has he won only against weak fields?)
    – relative difficulty of Major courses in previous events (has he won only on courses suited to his swing?)
    – those two quantities for known future Major events (who and where will he play?); the schedules are known some years in advance

    As with all athletes, the issue isn’t that the player is worse or The Field is better, but both.

  4. Trying to come up with a model of this sort in order to predict how many majors Tiger Woods will win is absolutely ridiculous. The last post asks the relevant questions and also points out to what extent Woods has excelled in very few years and on specific courses. Any professional gambler betting on this outcome (and I would do that, if the odds were appealing) would ask and try to answer the 3 questions in the post above. The model the post is referring to adds way too much irrelevant data on way too many players. Tiger Woods was for a few years not only the greatest golfer ever, but maybe the greatest sportsman ever – why the heck would one want to compare such an individual to lesser golfers? Would you compare Manchester United with my local soccer team? Of course not.

    In addition to the above 3 questions the main one is: Can Tiger Woods achieve his previous years’ level? This is not something that can be extrapolated based on historical data. How motivated is he, how broken is he, and how much has the competition improved…Time will tell.

    Btw., concerning choice of tournaments. One could include The Players Championship, e.g., a tournament that in some ways is tougher to win than any major, since most of the highest-ranking golfers participate. It’s not a course Tiger likes, so any inclusion will influence the “result” of any simple form of modelling significantly.

    • Are you saying, anonymous, that no conceivable set of historic information could provide any helpful information about this probability? While that’s of course fine from a frequentist perspective, it seems an odd position to be espousing on this blog. See http://www.stat.columbia.edu/~gelman/research/published/augie4.pdf. And I agree on including the Players, but then you might usefully include the Memorial Tournament as well, which would then shift things back the other way. In any case, there are a number of near-majors which it would be the task of the analyst to include or exclude, defending their decision.

      • I tried to be quite clear about how it was information about other/lesser players that would not inform TW’s future. This is a point about sports, real life, talent and whatever Tiger has been through.

        And concerning Memorial: My point exactly. Count the number of major-courses that Tiger could be expected to win on – that’s interesting. St. Andrews pops up a few times, e.g.

  5. I agree with both Robert and the anonymous poster. The data do not exist to properly answer this question. I do not see how the achievements of lesser golfers could logically predict anything about the potential achievements of Tiger Woods. He represents an outlier, in terms of his ability. Maybe not as much now as in the past, but that very relationship (his present vs. his past ability) contains more uncertainty and likely has a greater predictive power than the past achievements of other golfers.

    The other major flaw in this analysis is the assumption that athletes across history (1934 to 2012) have individually deteriorated with age at the same rate. That seems highly unlikely, given modern medicine. Maybe the differences in golf are not quite so apparent as other sports, but the modern athlete has a training regimen and diet options that were not available for athletes even 20 years ago.

    • I’m not sure I know what you mean by “properly answer this question.” Any method you use will come up with a probability that this proposition is true, even simple introspection. If I make use of data that has some marginal relevance to the estimation of this proposition, won’t I expect to do better (no, I don’t have a definition of that either, really) than if I don’t make use of the data? Isn’t that how all probabilistic models work? Now if all you’re saying is that some model you’ve never seen won’t convince you because you’ve already made your mind up that the data can’t help answer the question, that’s an interesting fact about you, but seems a little… prejudiced… until you’ve actually seen the analysis someone has done.

      • “the data do not exist to properly answer this question” == “the data can’t help answer the question”

        Like I said, Woods is/was an outlier. I don’t see how performances from the population of pro golfers over the past 80 years could be used to predict his performance.

        If you’re asking what’s the probability of any golfer winning the Masters at age 36, then the data exist to answer it. We have a population of golfers and we can make inferences about the average golfer. We don’t have a population of Tiger Woods – we have one man. So how can we make inference about a unique individual from statistics on a population?

        • Maybe the bigger Bayesians than me can begin to answer this question, but let me start off… (1) So what would you say Woods’s probability is? Is there absolutely no data that you can even imagine about what any set of golfers has ever done in the history of golf that would lead you to change that assessment by 0.000001? (2) Hierarchical models borrow information from other, arguably not-directly-comparable situations to more precisely estimate outliers. Are you saying these methods are stupid? (3) Usain Bolt is an outlier among runners. Is there no way of predicting his probability of winning a race? (4) Have you looked at Connolly and Rendleman (cited above)? Are they full of it? Why?

  6. Jonathan – there might be data that could move the assessment by 0.000001 – but it might be that it moves in the wrong direction!! There is no clear standard, in this situation, to know whether that is the case. To quote Dan: “We don’t have a population of Tiger Woods – we have one man. So how can we make inference about a unique individual from statistics on a population?”

    And again. Of course we can try to predict and assess probabilities. This point is made very clearly by both Dan and me.

    Btw., the current odds available for Bolt to win the Olympics seem quite fair to me, actually.

    • (1) What is “the wrong direction?” If I estimate that the probability is 0.25, and I change that assessment to 0.2, and it turns out to happen, that doesn’t mean it changed in “the wrong direction.” 0.24 might be more correct than 0.25 even if the event turns out to happen.

      (2) I agree that we have only one Tiger Woods. That’s what makes assessment of the probability almost impossible. It will turn out to be 0 or 1, so the “error” will inevitably be either p or 1-p. But I completely reject the frequentist assumption that that means we can’t make inferences from data on Woods and others just because Woods is only one guy. Statistics is about asking the question what inferences one can draw from data. It certainly doesn’t require “balls in an urn-like” frequentist properties.

      (3) Usain Bolt is just one guy. You don’t know what he’s going to have for breakfast that morning, or, for that matter, whether or not he’ll even compete. He is also an outlier. Your comfort in calculating odds for him in one event in the near future is different only in the number of compound predictions you have to make.
