Josh Miller writes:
I came across your paper in the Journal of Management on unreplicable research, and in it you illustrate a point about the null hypothesis via the hot hand literature.
I am writing you because I’d like to move your current prior (even if our work uses a classical approach). I am also curious to hear your thoughts about what my co-author and I have done.
We have some new experimental and empirical work showing that the hot hand phenomenon can be substantial in individual players. We think our measures are more tightly related to hot hand shooting (rather than cold hand shooting).
Also, we find clear evidence of hot hand shooting in Gilovich, Vallone & Tversky’s original data set.
Our new paper, “A Cold Shower for the Hot Hand Fallacy,” is on SSRN, here.
We have three comments on your discussion in Journal of Management:
1. The earlier reported small effect sizes come about for three main reasons: (1) pooling data across players means the guys who don’t get hot (or who fall apart) attenuate the average effect, so you don’t see the hot guys; (2) the measurement-error story of D. Stone (who I see commented on your blog once); (3) not every streak is a hot streak, so the real, infrequent but persistent hot hands get diluted; you would need to measure something else in conjunction with shot outcomes to pick this up.
2. We overturn the basic findings of GVT: it is not a fallacy to believe that some players can get substantial hot hands. We have the proof of concept in 3 separate controlled shooting studies (GVT’s, an earlier one, and our own). We have more discussion on this in the paper.
3. Now, while it is no longer the case that believing in the hot hand is a fallacy, there remains a question which you pose as answered: to what extent do players, coaches, or fans overestimate hot hand effects based on shot outcomes alone? An important first point: this overestimation wasn’t the main point of GVT, because it is really hard to show that players and coaches are overestimating the impact via the decisions they make (stated beliefs would be a little silly, but these haven’t been asked cleanly). GVT did something cleaner: show no effect, and then you know any belief in the hot hand must be fallacious.
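Miller’s pooling point in (1) is easy to see in a toy simulation. The sketch below is purely illustrative (the 50% base rate, the 15-point boost, and the 2-streaky-out-of-20 mix are assumed numbers, not from his paper): a couple of genuinely streaky shooters mixed in with many non-streaky ones produce a pooled "boost after 3 makes" far smaller than the streaky players’ individual boosts.

```python
import random

def simulate_player(n_shots, base_rate, hot_boost):
    """Simulate a shooter whose hit probability rises by hot_boost
    after making 3 shots in a row (a toy hot-hand mechanism)."""
    shots = []
    for _ in range(n_shots):
        hot = len(shots) >= 3 and all(shots[-3:])
        p = base_rate + (hot_boost if hot else 0.0)
        shots.append(random.random() < p)
    return shots

def streak_counts(shots):
    """Count hits/attempts after a 3-make streak vs. all other attempts."""
    hot_hits = hot_att = other_hits = other_att = 0
    for i in range(3, len(shots)):
        if shots[i - 3] and shots[i - 2] and shots[i - 1]:
            hot_att += 1
            hot_hits += shots[i]
        else:
            other_att += 1
            other_hits += shots[i]
    return hot_hits, hot_att, other_hits, other_att

random.seed(1)
n = 200_000  # very long sessions, so sampling noise is negligible

# 2 streaky players (15-point boost) pooled with 18 non-streaky players.
players = [simulate_player(n, 0.50, 0.15) for _ in range(2)]
players += [simulate_player(n, 0.50, 0.00) for _ in range(18)]

totals = [0, 0, 0, 0]
for shots in players:
    for j, c in enumerate(streak_counts(shots)):
        totals[j] += c
hh, ha, oh, oa = totals
pooled_boost = hh / ha - oh / oa

# One streaky player's individual boost, for comparison.
ihh, iha, ioh, ioa = streak_counts(players[0])
individual_boost = ihh / iha - ioh / ioa

print(f"individual streaky player's boost: {individual_boost:.3f}")
print(f"pooled boost across all 20 players: {pooled_boost:.3f}")
```

The pooled estimate comes out at only a couple of percentage points even though the streaky players’ true boost is 15 points, which is the attenuation story in miniature.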
The question I have for you: do you think Bill Russell, thought to be the greatest team player of all time, was wrong when he said this (in his retirement letter to SI)?
People didn’t give us credit for being as good as we were last season. Personally, I think we won because we had the best team in the league. Some guys talked about all the stars on the other teams, and they quote statistics to show other teams were better. Let’s talk about statistics. The important statistics in basketball are supposed to be points scored, rebounds and assists. But nobody keeps statistics on other important things—the good fake you make that helps your teammate score; the bad pass you force the other team to make; the good long pass you make that sets up another pass that sets up another pass that leads to a score; the way you recognize when one of your teammates has a hot hand that night and you give up your own shot so he can take it. All of those things. Those were some of the things we excelled in that you won’t find in the statistics. There was only one statistic that was important to us—won and lost.
Because if you read the GVT 1985 and GT 1989 papers (the latter in Chance), that is the message you get.
Here’s the relevant passage from my recent article in the Journal of Management:
As an example, consider the continuing controversy regarding the “hot hand” in basketball. Ever since the celebrated study of Gilovich, Vallone, and Tversky (1985) found no evidence of serial correlation in the successive shots of college and professional basketball players, people have been combing sports statistics to discover in what settings, if any, the hot hand might appear. Yaari (2012) points to some studies that have found time dependence in basketball, baseball, volleyball, and bowling, and this is sometimes presented as a debate: Does the hot hand exist or not?
A better framing is to start from the position that the effects are certainly not zero. Athletes are not machines, and anything that can affect their expectations (for example, success in previous tries) should affect their performance—one way or another. To put it another way, there is little debate that a “cold hand” can exist: It is no surprise that a player will be less successful if he or she is sick, or injured, or playing against excellent defense. Occasional periods of poor performance will manifest themselves as a small positive time correlation when data are aggregated.
However, the effects that have been seen are small, on the order of 2 percentage points (for example, the probability of a success in some sports task might be 45% if a player is “hot” and 43% otherwise). These small average differences exist amid a huge amount of variation, not just among players but also across different scenarios for a particular player. Sometimes if you succeed, you will stay relaxed and focused; other times you can succeed and get overconfident.
Whatever the latest results on particular sports, we cannot see anyone overturning the basic finding of Gilovich et al. (1985) that players and spectators alike will perceive the hot hand even when it does not exist and dramatically overestimate the magnitude and consistency of any hot-hand phenomenon that does exist. In short, this is yet another problem where much is lost by going down the standard route of null hypothesis testing. Better to start with the admission of variation in the effect and go from there.
And here is my response to Miller:
What is your estimated difference in probability of successful shot in pre-chosen hot and non-hot situations? I didn’t see this number in your paper, but my impression from earlier literature is that any effect is on the order of magnitude of 2 percentage points, which is not zero but is small compared to people’s subjective perceptions. My own experience, if this helps any, is that I do feel that I have a hot hand when I’m making a basketball shot, but that feeling of hotness is coming as a consequence of the pleasant but largely random event of my shot happening to fall into the hoop. To me, the hot hand fallacy is not such a surprise; it is consistent with the “illusion of control” (to use another psychology catchphrase).
The Bill Russell quote is interesting, but given the findings of the classic hot hand paper, it is not surprising to me that a player would view the hot hand as a major factor, whether or not it is indeed important. Players can believe in all sorts of conventional wisdom. Of course I agree with Russell’s statement that all that matters is wins and losses. I’d guess that points scored for and against is a pretty important statistic too. All the other statistics we see are just imperfect attempts to better understand point-scoring.
To which Miller replied:
On your first question, I’ll give you the quick measure, but it depends on the player. Let’s compare the hit rate after making 3+ shots in a row to the hit rate in any other shooting situation. For the player RC, because he was nearly significant in the first session, we followed up with him 6 months later to see if this predicted a hot hand out of sample, and it did: on average his boost was 8-9 percentage points across all sessions (see p. 25 for the difference). The hottest player in the JNI data set of 6 players, with 9 different shooting sessions (he had periods of elevated performance in all sessions), had a boost of around 13 percentage points (see p. 28 for the difference). In GVT’s data, 8 out of 26 shooters had a boost of 10+ percentage points, and 4 had a boost of more than 20 percentage points (see page 29 for a brief report). It’s clear some players have substantial boosts in performance, but yes, the average effect is modest as in previous studies, around a 3-5 percentage point boost. I think the important point is not that the hot hand is some big average effect, but that some players have a tendency to be streaky.
On Russell. Players receive information beyond sequential shot-outcome data; they have long experience playing with teammates, and they can get cues on a player’s underlying mental and physical state to use in conjunction with sequential outcome data, so in that sense outcome data may be more informative for them than it would be for a fan. Further, the mechanism for getting hot doesn’t always mean you have positive feedback of shot outcomes into a player’s ability; another mechanism is endogenous fluctuation in mental and physiological state, or exogenous input such as energy from fans, teammates, etc. In this case, for teammates and coaches, the cues on mental and physical state are more important than the shot outcomes. Notice that if you take the original Cognitive Psychology paper and the two Chance papers, the message is against both mechanisms.
Now, just to clarify: in my personal view, the tendency for spectators to attach too much meaning to streaks is clearly there; we can see it any time we watch the 3-point contest (the videos are on YouTube). Anytime a player hits three shots, he is “heating up.” This is the pattern of intuitive judgment that GVT identified; it is interesting psychologically, and it was predicted by previous lab experiments. Instead, we approach it from the perspective of whether there is strong evidence that players and coaches are wildly off. Our evidence suggests their belief can be justified, but we don’t demonstrate that it is in any particular game circumstance (no one has shown that it isn’t!). If you look at some recent interesting work from Matthew Goldman and Justin Rao, on average, players do a surprisingly good job allocating their shots.
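A quick check of the “heating up” intuition, as a supplement to the exchange above (an illustrative sketch; the 50% hit rate and 100-shot session are assumed numbers): even a shooter with no hot hand at all will produce a three-in-a-row streak in a session almost surely, so the mere appearance of such a streak carries very little evidence.

```python
def prob_streak(n_shots, streak_len, hit_rate):
    """Probability that n_shots i.i.d. attempts contain at least one
    run of streak_len consecutive makes (dynamic program over the
    current run length; there is no hot hand anywhere in this model)."""
    alive = [0.0] * streak_len  # probability mass by current run length
    alive[0] = 1.0
    reached = 0.0               # mass that has already completed a full run
    for _ in range(n_shots):
        nxt = [0.0] * streak_len
        for run, mass in enumerate(alive):
            nxt[0] += mass * (1 - hit_rate)      # miss: run resets to 0
            if run + 1 == streak_len:
                reached += mass * hit_rate       # streak completed
            else:
                nxt[run + 1] += mass * hit_rate  # run extends by one
        alive = nxt
    return reached

# A 50% shooter taking 100 shots: how often does "he's heating up" appear?
p = prob_streak(100, 3, 0.5)
print(f"P(at least one 3-make run in 100 shots): {p:.4f}")
```

The probability comes out extremely close to 1, which is why streaks alone cannot settle the question and why Miller and Sanjurjo condition on the shot *after* the streak instead.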
Good stuff. I’ll just say this: I’m terrible at basketball. But every time I take a shot, I have the conviction that if I really really focus, I’ll be able to get it in.