Where exactly in your MS paper are the results for computing the bias when k>1?

I gather there’s no neat formula except when k=1, but you can easily calculate it for any specific k, n, p. Is that right?
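For what it’s worth, here is a minimal Monte Carlo sketch of such a calculation. The function name `streak_bias` is my own, not from the paper, and this is a simulation rather than their exact enumeration:

```r
# Monte Carlo estimate of the streak-selection bias for general k.
# `streak_bias` is a hypothetical helper, not Miller & Sanjurjo's code.
streak_bias <- function(n, p, k, reps = 1e4) {
  est <- replicate(reps, {
    x <- rbinom(n, 1, p)
    # positions i where shots i..(i+k-1) are all hits and a shot i+k exists
    idx <- which(sapply(1:(n - k), function(i) all(x[i:(i + k - 1)] == 1)))
    if (length(idx) == 0) NA else mean(x[idx + k])  # est. P(hit | k hits in a row)
  })
  mean(est, na.rm = TRUE) - p   # average estimate minus the true p
}
set.seed(1)
streak_bias(100, 0.5, 3)   # negative: the naive estimate understates p
```

Sequences with no run of k hits are dropped, which is exactly the selection that produces the bias.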

This belief becomes a fallacy in an environment in which good outcomes arrive as if they have a constant rate of occurrence.

[…]

Upon arrival at the scheduled time the shooter (subject) was given several minutes to warm up by shooting however he liked.

[…]

In order to minimize the possibility of a fatigue effect from over-exertion, we settled on 300 shots after running pilots with ex-basketball players who were not in basketball shape.

[…]

The null distribution is generated under the assumption of exchangeability within sessions (but not across, to avoid good day/bad day effects).

http://asfee2015.sciencesconf.org/61541/document

So the hot hand is defined as something other than warm-up/fatigue and good-day/bad-day effects. If we believe there is a fallacy here, then, it amounts to saying the performance level *does* change, but only at the beginning and end of a session. It just can’t change (much) in the middle of a session… I didn’t expect this.
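For concreteness, here is how I would sketch that within-session permutation null in R. All names are mine and the paper’s implementation may differ:

```r
# Shuffle shots within each session, never across sessions: good-day/bad-day
# differences survive under the null, while within-session order is destroyed.
perm_null <- function(shots, session, stat, B = 1000) {
  replicate(B, stat(unlist(lapply(split(shots, session), sample))))
}
# Toy statistic: P(hit | previous hit) - P(hit | previous miss)
phh_minus_phm <- function(x) {
  prev <- x[-length(x)]; nxt <- x[-1]
  mean(nxt[prev == 1]) - mean(nxt[prev == 0])
}
set.seed(1)
shots   <- rbinom(600, 1, 0.5)   # two 300-shot sessions, constant p
session <- rep(1:2, each = 300)
null_dist <- perm_null(shots, session, phh_minus_phm)
mean(abs(null_dist) >= abs(phh_minus_phm(shots)))  # two-sided permutation p-value
```

(For simplicity the toy statistic also includes the one transition across the session boundary; a stricter version would compute it per session.)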

Also, before I looked closer at that paper I made a little sim of warm-up/fatigue. At least the statistics p(H_i | M_{i-1}) and p(H_i | H_{i-1}) seem pretty indistinguishable even when the probability of success varies by ~12%: https://image.ibb.co/gP4yJG/hothand.png

Trying out the “code” tag this time:

n = 100                      # number of simulated sessions
x = 150
k = .5
t = -x:x                     # 301 shots per session
p = 1/(1 + exp(k*(t/x)^2))   # p peaks mid-session: warm-up at the start, fatigue at the end
# p = rep(.5, length(t))     # constant-p alternative for comparison
# simulate some data assuming warm-up and fatigue
dat = replicate(n, sapply(p, function(x) sample(0:1, 1, prob = c(1 - x, x))))
out = matrix(nrow = ncol(dat), ncol = 2)
colnames(out) = c("pHH", "pHM")
for (i in 1:ncol(dat)) {
  hits = which(dat[, i] == 1)
  miss = which(dat[, i] == 0)
  hits = hits[hits != nrow(dat)]  # drop the last shot: it has no successor
  miss = miss[miss != nrow(dat)]
  pHH = mean(dat[hits + 1, i])    # P(hit | previous hit)
  pHM = mean(dat[miss + 1, i])    # P(hit | previous miss)
  out[i, ] = cbind(pHH, pHM)
}
par(mfrow = c(2, 2))
plot(p, ylim = c(0, .5), main = paste0("k = ", k))
plot(cumsum(dat[, 1]))
lines(cumsum(sample(0:1, nrow(dat), replace = TRUE)), col = "Red", lwd = 2)
hist(out[, "pHH"], breaks = seq(0, 1, by = .01))
hist(out[, "pHM"], breaks = seq(0, 1, by = .01))

And current academic processes (anything that converts uncertainty into confident claims with poor expected sign and magnitude errors) enthusiastically throw these seeds of truth into a fire while reciting, “If you love me, pop and fly; if you hate me, lay and die.”

your point: “People can’t tell the difference between short random binary sequences with p=.45 and p=.50. Neither, it turns out, can statisticians without tons of data. We knew that before we started, didn’t we?”

nice one. this is a point we like to make (but in a different way). The conceptual distinction between these cases is that statisticians see 0s and 1s, which are not that diagnostic of the underlying probability of success, whereas people (i.e., players and coaches) see much more (e.g., mechanics, mood), which *may* be diagnostic of the underlying probability. There is some suggestive evidence that this is the case. For example, even in GVT’s data, when re-analyzed, players can predict shot outcomes at rates better than chance: https://ssrn.com/abstract=3032826

You say that this “has become a complex scientific explanation of something that seems to be really, really uninteresting.” I guess it depends on what your goal is, and where it goes from here. Some people think the idea of flow, the zone, or whatever is interesting to study. Others think it is uninteresting because it is completely obvious that confidence, focus, motivation and motor control will vary sufficiently so as to affect the ability of a professional athlete. Still others think (thought) that both parties are misguided because there is nothing there, they are just interpreting patterns in randomness. I’d put my money on everyone’s point having some seed of a truth that is often underestimated but also often blown out of all proportion.

The idea of the hot hand is that, ceteris paribus, players who made the last shot are more likely to make the next one, but it is only one of many influences on whether a player makes a shot.

Thanks, but if you are warming up you will be more likely to make more shots in a row. If you are getting fatigued you are more likely to miss more shots in a row. Wouldn’t this look like hot/cold hands in the sequence data? Perhaps there is a data aggregation/reduction issue here.

Yes… which is why no one thinks there’s a hot hand in free throws. Probably because they’re correct.

I don’t know much about this but a quick search shows it is a rather common belief:

68% of the fans expressed essentially the same belief for free throws, claiming that a player has “a better chance of making his second shot after making his first shot than after missing his first shot”;

The hot hand in basketball: On the misperception of random sequences

https://pdfs.semanticscholar.org/f472/0326b81d5528c0458510cd87ea7b57418c54.pdf

I find evidence for the “hot hand” in that making the first free throw is associated with a significantly higher probability of making the second free throw.

Revisiting the Hot Hand Theory with Free Throw Data in a Multivariate Framework

https://econpapers.repec.org/article/bpjjqsprt/v_3a6_3ay_3a2010_3ai_3a1_3an_3a2.htm

Arkes (2010) found evidence consistent with hot hand shooting in free-throw (dead ball) data, observing that a player’s success rate is higher following a hit than a miss. Yaari and Eisenmann (2011) then replicated this result with a slightly different analysis, and more data.

A Cold Shower for the Hot Hand Fallacy

http://asfee2015.sciencesconf.org/61541/document

Also, something isn’t clear to me. Is it a corollary of the hot hand fallacy that warming up doesn’t work and fatigue plays no role (i.e., the claim that p(success) doesn’t increase or decrease during the session/game)?

So the real question is why players are missing at all when undefended. How often do you miss when throwing the trash out (I bet p~1)? Same for some people and free throws.

The probability of making a shot could well change by much more than 5 percentage points, comparing a player when he is hot to when he is cold. The point of all the statistical analysis by Miller, Sanjurjo, and many others is that the simple estimate is severely biased, both because of that weird probability thing that Miller and Sanjurjo identified, and also because of attenuation when “hotness” is measured in a highly noisy way based on the success or failure of the previous shot.
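The attenuation point is easy to see in a toy simulation (my own model, not from any of the papers): make “hotness” a persistent hidden state with a 10-point effect on p, and the previous shot picks up only a sliver of it:

```r
set.seed(1)
n <- 2e5
flips <- rbinom(n, 1, 0.1)     # state flips hot <-> cold with prob .1 per shot
state <- cumsum(flips) %% 2    # persistent hidden hot/cold state
x <- rbinom(n, 1, ifelse(state == 1, 0.55, 0.45))  # true gap: 10 percentage points
prev <- x[-n]; nxt <- x[-1]
d <- mean(nxt[prev == 1]) - mean(nxt[prev == 0])
d   # far smaller than 0.10: the previous shot is a noisy proxy for the state
```

Analytically, with these numbers the previous shot shifts P(hot) by only a few percentage points, so the measured conditional gap is well under a point even though the true state effect is 10 points.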

What started as a simple example of the fact that human minds impose patterns on random events (with the observation that to many people roulette wheels have hot hands) has become a complex scientific explanation of something that seems to be really, really uninteresting. (And unlike Shravan, I love sports examples.) Do we really want a model of the probability that player X makes a shot in state of the world Z1,…,Zn? And we want that model not to prove we can model probabilities, but to ask the question of whether adding history variables adds significantly to our understanding of p?

I mean, I get Miller and Sanjurjo, and it was definitely useful as a statistical teaching tool. But the real animating question here is whether the probability changes *enough* so that the naive pattern-making cerebral apparatus’ impression of a change in p is indicative of a real-world change in p. And it absolutely isn’t. People can’t tell the difference between short random binary sequences with p=.45 and p=.50. Neither, it turns out, can statisticians without tons of data. We knew that before we started, didn’t we?
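A back-of-the-envelope power calculation backs up the “tons of data” point (one-sided binomial test, normal approximation; the numbers are my own, not from any of the papers):

```r
# Shots needed to distinguish p = .45 from p = .50 at alpha = .05, power = .80
alpha <- 0.05; power <- 0.80
p0 <- 0.50; p1 <- 0.45
n <- ((qnorm(1 - alpha) * sqrt(p0 * (1 - p0)) +
       qnorm(power)     * sqrt(p1 * (1 - p1))) / (p0 - p1))^2
ceiling(n)   # about 617 shots, for one comparison, before any streak conditioning
```

And conditioning on streaks throws away most of those shots, so the effective sample is far smaller still.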

The hot hand phenomenon is (to me) a certainty after n shots that shot n+1 will be good. It is not refuted when shot n+1 faces good defense. That’s just called “good defense broke up the hot hand.” Irrefutable.

https://www.gsb.stanford.edu/insights/jeffrey-zwiebel-why-hot-hand-may-be-real-after-all

https://fivethirtyeight.com/features/baseballs-hot-hand-is-real/
