According to a recent article, a computer program can now play a decent game of Go on a 9×9 grid. I really wasn’t sure this would ever happen. Andrew and I went to a talk about programming computers to play Go, about twelve years ago, and I remember that the programmers of two of the best computer Go programs were proud simply because, when their programs played each other, the programs usually fought over the same portion of the board: apparently just figuring out what part of the board is worth fighting over is a hard problem, or at least hard to program.
I guess I should explain that in Go, two players take turns placing colored stones at the intersections of a grid, trying to control as much space as possible. If a group of one player’s stones is completely surrounded by the other player’s stones, the surrounded stones are removed from the board. You basically play until the board is full (well, not quite, but close enough). For more details, you can check Wikipedia. For various reasons that were fairly clear when the programmers explained them, this is not an “easy” problem like chess.
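The capture rule, at least, is easy to program. A group of stones is captured exactly when it has no adjacent empty points (“liberties”), which you can check with a simple flood fill. Here’s a toy sketch of that check; the board representation and function names are my own invention, not anything from the article:

```python
# Sketch: detecting a captured Go group with a flood fill. A group is
# captured when no stone in it touches an empty point (a "liberty").
# The board is a dict mapping (row, col) -> 'B' or 'W'; absent points
# are empty. This is illustrative, not from any real Go program.

def group_and_liberties(board, start, size=9):
    """Return (the set of stones in start's group, its number of liberties)."""
    color = board[start]
    group, liberties, frontier = set(), set(), [start]
    while frontier:
        r, c = frontier.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue           # off the edge of the board
            if (nr, nc) not in board:
                liberties.add((nr, nc))          # an empty neighbor: a liberty
            elif board[(nr, nc)] == color:
                frontier.append((nr, nc))        # same color: part of the group
    return group, len(liberties)

# A black stone in the corner, surrounded by white on both neighbors:
board = {(0, 0): 'B', (0, 1): 'W', (1, 0): 'W'}
group, libs = group_and_liberties(board, (0, 0))
print(libs)   # 0 -- the group {(0, 0)} would be removed
```

Zero liberties means the group comes off the board. (A real program also has to handle the order of capture checks and the ko rule, which I’m ignoring here.)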
In chess, there are maybe 20 possible moves at a given point in the game, and each leads to 20 choices for the other player, and so on. Each “I do this, he does that” pair is called a “move”; the “I do this” part or the “he does that” part alone is called a “half move.” So each move creates about 400 branches. A dedicated chess computer can evaluate several billion positions per second (isn’t modern technology amazing?), so in under an hour it can investigate every single branch to a point 5 moves down the road (that’s 10 half-moves, or roughly 10^13 positions). At that point it can throw out 99.9999999% of the possibilities, like the ones that leave it down a piece with no compensation, or the ones that leave its opponent down a piece with no compensation (because its opponent wouldn’t choose those branches either), and can then spend more time looking farther down the promising alleys, going as far as 20 half-moves in some cases. Back when computers were a lot slower, in the late 1980s for example, programmers put a lot of effort into figuring out how to evaluate each position so they could “trim the tree” earlier, and there was a general sense that we would learn about chess, and about “artificial intelligence,” by working on the evaluation function. But “speed kills”: although the evaluation function of a top chess computer is indeed fairly sophisticated, the main improvement in the programs has come from simply looking deeper.
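The standard way to “trim the tree” is alpha-beta pruning: if you’ve already found a line worth 3, and an opponent reply in some other line caps it at 2, you can abandon that line without finishing it. Here is a minimal sketch on a toy game tree given as nested lists of leaf scores (my own example, nothing like a real chess engine):

```python
# Sketch of chess-style tree search with alpha-beta pruning: search every
# line to a fixed depth, but discard branches that provably can't change
# the answer. The "game" is just a tree of leaf evaluation scores.

def alphabeta(node, maximizing, alpha=float('-inf'), beta=float('inf')):
    if isinstance(node, (int, float)):       # a leaf: its evaluation score
        return node
    best = float('-inf') if maximizing else float('inf')
    for child in node:
        score = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if beta <= alpha:                    # the opponent would never allow this line
            break                            # "trim the tree"
    return best

# Textbook example: the maximizer's best guaranteed outcome here is 3,
# and pruning skips several leaves without changing that answer.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, maximizing=True))   # 3
```

The pruning is safe because it only discards branches one player would never enter; the catch, as the next paragraph explains, is that it still needs a cheap way to score the leaves.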
With Go, the biggest problem is that there is no obvious way to “trim the tree.” Oh, sure, there are a few situations in which, looking several half-moves down the road, you can tell that one player has “wasted” moves because those stones have been captured…but there aren’t very many of these. If players just plunk down stones at random for a while, even a long while, the odds are pretty good that neither player will have any stones actually captured, so the number of stones on the board will still be equal. That means that the approach that was so successful in chess (look at every possibility as far down the road as you can, and pick the one that gives you a material advantage, or some other easily measured advantage, even if the other player plays optimally too) can’t work in Go, because there is no “easily measured advantage.”
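The Monte-Carlo idea, as I understand it, is a way around the missing evaluation function: instead of scoring a position directly, play many completely random games to the end from it and count how often the player to move wins. Here’s a sketch of that idea on a toy game (take 1 or 2 stones from a pile, last stone wins); the game and the names are mine, purely for illustration:

```python
import random

# Sketch of Monte-Carlo position evaluation: when there's no cheap way to
# score a position, estimate its value by playing random games to the end
# and counting wins. The game here is a toy pile game, not Go.

def random_playout(pile, to_move):
    """Play random legal moves until the pile is empty; return the winner (0 or 1)."""
    player = to_move
    while True:
        pile -= random.randint(1, min(2, pile))   # take 1 or 2 stones at random
        if pile == 0:
            return player                         # this player took the last stone
        player = 1 - player

def monte_carlo_value(pile, to_move=0, playouts=1000):
    """Estimated probability that `to_move` wins, from random playouts."""
    wins = sum(random_playout(pile, to_move) == to_move for _ in range(playouts))
    return wins / playouts

# A pile of 1 is a forced win for the player to move, and the estimate agrees:
print(monte_carlo_value(1))   # 1.0
```

The appeal for Go is that random playouts need no domain knowledge at all: counting territory at the end of a finished game is easy, even though scoring a half-finished position is not.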
I haven’t had a chance to read the paper describing the approach taken by the new Go program, “Bandit based Monte-Carlo Planning” by Levente Kocsis and Csaba Szepesvari, but it seems like it should be worth at least a glance.
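Not having read the paper, I can only guess that “bandit based” refers to the multi-armed bandit problem, where you balance replaying moves that have paid off so far against trying moves you know little about. The standard rule for that trade-off is UCB1; here is a sketch of UCB1 itself (a background idea, not the paper’s actual algorithm):

```python
import math

# Sketch of the UCB1 bandit rule: pick the move with the best upper
# confidence bound, which favors moves with good average results but
# also moves that haven't been tried much. Names are mine.

def ucb1_choice(wins, plays, total_plays, c=math.sqrt(2)):
    """Index of the move with the highest upper confidence bound."""
    def score(i):
        if plays[i] == 0:
            return float('inf')            # always try an untried move first
        exploit = wins[i] / plays[i]       # average payoff so far
        explore = c * math.sqrt(math.log(total_plays) / plays[i])
        return exploit + explore
    return max(range(len(plays)), key=score)

# Two candidate moves: one tried 10 times with 7 wins, one never tried.
# UCB1 insists on trying the unexplored move first.
print(ucb1_choice(wins=[7, 0], plays=[10, 0], total_plays=10))   # 1
```

Presumably the paper’s contribution is how to apply something like this inside the playout tree, but I’ll have to read it to find out.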