Wednesday, May 22, 2019

What do chess engines understand about chess?

[This post assumes a familiarity with terminology from previous posts.  Rather than try to gloss everything, I'm going to punt and respectfully ask the confused reader to at least skim through the previous few posts.]

It's clear from watching a bunch of games that the two main types of chess engines have different styles, though not all that different.  It's not as though anyone suddenly decided 1. g4 was a great opening.  In broad strokes:

AB engines play very precisely and their evaluations of positions are fairly conservative.  It's quite common for an NN engine to give a large advantage to one side in a position that an AB engine rates as fairly even.  The two evaluations eventually come into line, but the convergence can happen in either direction.

If the NN engine is right, the AB eval comes up to meet it as it exploits an advantage that the AB engine missed.  If it's wrong, the NN eval comes down and the result is often a draw, either because the NN engine failed to play precisely enough to convert a winning advantage, or the winning advantage was never really there to begin with.  It would be interesting to track down statistics on this, and any interesting exceptions, but I don't have easy access to the data.

When NN engines first hit the scene, the initial take was that neural nets had captured some subtlety of positional understanding that the AB engines, with their relatively simple hard-coded evaluation rules, couldn't fathom, even given the ability to look many moves ahead.  This is mostly accurate, I think, but incomplete.

In the Blitz Bonanza, there were any number of NN wins over AB engines, at least the weaker ones, that started with the advance variation of the French Defense (1. e4 e6 2. d4 d5 followed soon after by white's e5).  This opens up a lot of space for white on the kingside and tends to push the black pieces over to the queenside -- or maybe "lure" is a better word, since there are plausible things for them to do there, including attacking white's pawn center.  The problem is that white can now launch a fairly potent kingside attack and black doesn't have time to get enough pieces over to defend.

Apparently none of this looks too bad to an AB engine.  Yes, white has an advanced pawn in the center, but there are chances to counterattack and weaken that pawn.  Once the white attack really gets going, often aggressively pushing the g and h pawns, its king may start to look correspondingly weak -- except black doesn't have any pieces well-located to attack it.  A typical AB engine's eval function doesn't try to account for that -- looking ahead it will either find an attack or it won't.
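To make "looking ahead it will either find an attack or it won't" concrete, here is a toy sketch of alpha-beta search over a hand-built game tree.  Nothing here comes from a real engine: the tree, the node names and the static scores are all made up for illustration, and a real engine generates moves and evaluates actual chess positions.  The point is only that the static eval of the starting position can say "even" while a two-ply search says otherwise.

```python
# Toy negamax alpha-beta. TREE and STATIC are invented for illustration;
# a real engine would generate legal moves and score real positions.

TREE = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}

# Static evaluation of each node, from the point of view of the side to move
# at that node (hypothetical numbers, in pawns).
STATIC = {
    "start": 0.0,
    "a": -0.5, "b": 0.0,
    "a1": 3.0, "a2": -1.0, "b1": 1.0, "b2": 2.0,
}


def children(pos):
    """Positions reachable in one move (empty for leaves of the toy tree)."""
    return TREE.get(pos, [])


def evaluate(pos):
    """Hard-coded static eval: no lookahead, just a table lookup here."""
    return STATIC[pos]


def alpha_beta(pos, depth, alpha, beta, children, evaluate):
    """Score of pos for the side to move, searching depth plies ahead."""
    kids = list(children(pos))
    if depth == 0 or not kids:
        return evaluate(pos)
    best = float("-inf")
    for child in kids:
        # The opponent's best score, negated, is our score for this move.
        score = -alpha_beta(child, depth - 1, -beta, -alpha, children, evaluate)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # the opponent already has a better option elsewhere
    return best


inf = float("inf")
print(alpha_beta("start", 0, -inf, inf, children, evaluate))  # 0.0: static eval only
print(alpha_beta("start", 2, -inf, inf, children, evaluate))  # 1.0: found by lookahead
```

In this toy tree the static eval calls the starting position dead even, and only the search turns up the better line.  If the good line were deeper than the search could reach, the engine would simply never see it.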

Probably this disconnect of white seeing good chances and black finding plusses and minuses to the same moves continues until black can see actual tactical threats coming, at which point it's toast since its pieces are now bottled up on the wrong side of the board.  The natural explanation is that the NN engine understands that there are good attacking prospects on the kingside and black's pieces are poorly placed to defend.

What's actually going on is probably a bit different.  In its training phase, the NN found that positions with, say, more black pieces on the queenside, lots of open squares on the kingside, and pawns pushed up near the black king tend to lead to wins.  A human playing in the same style would probably say "My plan was to open up some space, get black's pieces bottled up on the queenside and launch a kingside attack".  But here there is no plan.  White is playing moves that tended to work well in similar situations in training.

As I mentioned in a previous post, this lack of a plan becomes blindingly obvious in endgames.  Many endgames have an almost mathematical feel to them.  White is trying to hold a draw.  Black is up a bishop for two pawns.  Picking off one of those pawns would allow one of black's own pawns to promote unimpeded.  But white's pawns are all on the wrong color square for the bishop to attack and they're blocked against black's pawns (and vice versa), so all white has to do is keep black's king away.

Conversely, though, if black can draw white's king away from where it's blocking black's king, it can sneak its own king through and pick up the undefended pawns.  And there just happens to be a tactical threat that will accomplish that ...

A human player can figure positions like this out logically and win or hold the draw, as the case may be, even against perfect play.  An AB engine can often look ahead far enough to find the winning variation.  An NN engine might or might not stumble on the right move.

Likewise, it's common for an NN engine to see a large advantage in a position that's clearly a draw.  Suppose in the previous scenario there's a way for white to safely shuffle its king back and forth between two squares and keep the black king from getting through.  Quite likely the AB engine will evaluate the position as dead even, and the human kibitzers will agree. The NN engine, if it's playing black in this example, will continue to see a sizable advantage almost up to the 50-move limit, even though all it's doing is rearranging its pieces in order to avoid threefold repetition.
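For readers who want the two draw rules spelled out, here is a minimal sketch of the bookkeeping involved.  The class and method names are hypothetical, and the position "keys" are assumed to be strings that capture everything that matters for repetition (piece placement, side to move, castling and en passant rights).  The 50-move rule is counted in plies: 100 consecutive half-moves with no capture and no pawn move.

```python
from collections import Counter


class DrawTracker:
    """Hypothetical helper tracking the 50-move rule and threefold repetition."""

    def __init__(self, start_key):
        self.halfmove_clock = 0           # plies since last capture or pawn move
        self.seen = Counter([start_key])  # how often each position has occurred

    def record(self, key, irreversible):
        """Record the position reached after a move.

        irreversible is True for captures and pawn moves, after which no
        earlier position can ever recur.
        """
        if irreversible:
            self.halfmove_clock = 0
            self.seen.clear()
        else:
            self.halfmove_clock += 1
        self.seen[key] += 1

    def threefold(self, key):
        return self.seen[key] >= 3

    def fifty_move(self):
        return self.halfmove_clock >= 100


# Shuffling between just two positions repeats quickly...
shuffle = DrawTracker("A")
for key in ["B", "A", "B", "A"]:
    shuffle.record(key, irreversible=False)
print(shuffle.threefold("A"))  # True

# ...whereas finding a fresh arrangement each ply only delays the draw.
stall = DrawTracker("p0")
for i in range(1, 101):
    stall.record(f"p{i}", irreversible=False)
print(stall.threefold("p100"), stall.fifty_move())  # False True
```

The second case is the behavior described above: by never repeating a position the side that thinks it is winning dodges threefold repetition, but the half-move clock keeps running regardless.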

If understanding chess means having a feel for what kind of positions have good or bad prospects without trying to calculate every possible variation, that is, playing like a human, then NN engines are clearly doing something similar.

On the other hand, if understanding chess means being able to find the right moves to win or draw a difficult endgame, that is, playing like a human, then AB engines clearly have the upper hand.

But if understanding chess means being able to reason about why an endgame is won, lost or drawn, or being able to put together a general plan of attack without knowing in advance exactly what moves need to happen in what order, that is, playing like a human, then neither type of engine is even trying.
