Hangman tips

Posted on April 7, 2012 9:43 AM by Andrew

Jeff pointed me to this article by Nick Berry. It’s kind of fun but of course if you know your opponent will be following this strategy you can figure out how to outwit it. Also, Berry writes that ETAOIN SHRDLU CMFWYP VBGKQJ XZ is the “ordering of letter frequency in English language.” Indeed this is the conventional ordering but nobody thinks it’s right anymore. See here (with further discussion here). I wonder what corpus he’s using.

P.S. Klutz was my personal standby.

P.P.S. Jazzy is a good option too. People will guess the “a” right away but get stuck on all the other letters.

10 thoughts on “Hangman tips”

Brett Keller on April 7, 2012 10:33 AM at 10:33 am said:

Somewhat relevant: Marek Hlavac wrote up a version of Hangman for STATA. You just download the .do flies at https://sites.google.com/site/marekhlavac/stata_games
MAYO on April 7, 2012 12:41 PM at 12:41 pm said:

It’s not the frequency of a letter that matters most for hangman, but rather how easy it will be to guess the word once you find out it includes/excludes a given letter. That’s how I play, at any rate.
Mark Palko on April 7, 2012 5:09 PM at 5:09 pm said:

“Rhythm” is my favorite.
Phil on April 7, 2012 8:09 PM at 8:09 pm said:

“Queueing” is good too.
Rick Wicklin on April 9, 2012 10:33 AM at 10:33 am said:

“Coffee”
konrad on April 9, 2012 2:12 PM at 2:12 pm said:

I agree with MAYO: even if you ignore that the opponent is trying to pose words that will defeat this type of strategy, and even if you decide that a greedy algorithm (which is almost certainly suboptimal) is practical from a computational point of view, you should take into account how much information will be obtained if your guess is wrong – an approach based on decision trees might be more convincing. And the best strategy is also a function of the number of guesses remaining.
Bob Carpenter on April 9, 2012 2:54 PM at 2:54 pm said:

Agreeing on a legal dictionary ahead of time is trickier than you might think, even in English. New words can be produced by adding suffixes and prefixes to existing words. For instance, check out a blog entry where I list dozens of words found in mainstream English newswire derived from the stem “author”, including words like “multiauthored” and “authoritativeness”. This is even trickier in languages like Russian (highly inflected) or Turkish (highly agglutinative).

Next, suppose you know the probability distribution of the words making up puzzles. Again, a really big step.

Then, you have an obvious near optimal greedy strategy. At each guess, look at the set of legal remaining words and find the letter that most neatly divides the probability mass in two. Guess that letter and continue.

Even this isn’t optimal. You need to overcome the greediness bias and unfold the entire game tree to choose an optimal set of guesses leading to a guaranteed minimal number of steps. All very finite, but combinatorially prohibitive for longer words (though there are probably heuristics, just like for Sudoku).
Pingback: Suboptimal Hangman Strategy » Source-Filter
konrad on April 10, 2012 2:13 PM at 2:13 pm said:

@Bob: That’s more or less what I had in mind, except remember that you what matters is not the number of guesses but the number of _wrong_ guesses. So instead of roughly halving the probability mass you want to make guesses that are very likely correct and provide lots of information when wrong. The amount of information gained when guessing right doesn’t matter much, because there is no cost. The original article is closer to this than halving the prob mass.

What may be a tricky detail (making the greedy approach less suitable) is that one should not try to minimize the number of wrong guesses but rather maximize the probability of completing the word without exceeding N wrong guesses. Seems to me this encourages building the tree out to N levels (remembering that only wrong guesses increase the level – this makes the branching factor huge) rather than being greedy – the greedy approach may often be bad by risking a wrong guess in order to gain information which ends up being unimportant (e.g. strongly correlated with informaton you have to take a risk to gain later anyway).

Where the problem becomes interesting is when N is large enough for full tree exploration to be intractable. It is not obvious what evaluation function to use when terminating the search at a non-leaf node. Number of possible words remaining is one option, but not necessarily best.
Andrew on April 10, 2012 2:19 PM at 2:19 pm said:

I’d say that we should connect this thread to our research discussions on the death penalty, but that would be in poor taste.

Comments are closed.