Antichess Stockfish vs. Wizard 3.1: a comparison • Lichess Feedback • lichess.org

Zapmeister edited

firstly well done to the devs on here for successfully creating a version of antichess for stockfish.

i thought i might share a comparison of it to goldovski's wizard 3.1 program for antichess released in 1998. so i made stockfish level 8 play wizard 3.1 in 12 games, alternating colours each time, to see what happens.

wizard 3.1 is available for download here http://poincare.matf.bg.ac.rs/~andrew/suicide/StanGold/wizard3_1.htm

it apparently has a rating of 2620 on the FICS server: http://poincare.matf.bg.ac.rs/~andrew/suicide/StanGold/top20.htm#programs

the program describes itself as having "free" and "paid" versions. the free version is the one i'll be using, which (usually) caps processing time per move at 3 seconds. (in the readme, it says that the paid version would have been accessible by sending $10 to the guy who created the program. but he died in 1999, so i don't think the paid version of this program exists anywhere.)

--------

result: Wizard 3.1 won, with a score of Wizard 7.5 - 4.5 Stockfish Level 8

--------

some comments. stockfish and wizard appear to be about equal strength in middlegame battles, but stockfish is TERRIBLE (i mean it) at endgames. in particular, wizard won a lot of games by reducing stockfish to just a bare king, while keeping a lot of material, especially rooks, and then restraining the king so it eventually has to take all of wizard's remaining pieces, so wizard wins while stockfish only has a king left.

stockfish seems to **REALLY** overvalue the king in antichess, especially in the endgame. in antichess, the king is the most useful piece as its safe moves avoid zugzwangs, but it's not nearly as valuable as stockfish says it is. promoting pawns to king is a good strategy if you think you're losing and trying to draw, but it doesn't work well if you want to win. rooks are the piece of choice for promotion if you want to win.

there are KvR endgame positions where stockfish gives +1 or more to the side with the king, even though KvR is a very basic endgame, always won for the rook. this is a serious issue that really ought to get fixed immediately.

--------

as for opening theory, i can't say much because both computers tried to take each other out of book early on. white's best move 1.e3 was NEVER played in any of these 12 games after the first one. the computers both preferred 1.c4 followed by a tactical queen battle in the middlegame

--------

here are the games

game 1: Stockfish 0-1 Wizard
en.lichess.org/YpE5w6B5

this is an example of stockfish overvaluing the king in the endgame. after 63.Kc6, stockfish says +1.3 to white even though the position is completely lost.

game 2: Wizard 1-0 Stockfish
en.lichess.org/28bZHF5mOX2P

after 5.bxa6, wizard gave +6.14 to white, but stockfish says -1.8. wizard found a forced win after 22.d4 but stockfish only gives +0.7 to white

game 3: Stockfish 1-0 Wizard
en.lichess.org/eddkAGk45NeJ

after move 13, stockfish says +0.1 for white, but wizard says -6.23 for black. but wizard misjudges the position after move 23 (Ke7 appears to be a mistake, it says -2.56 for black but a move later it says white mates in 8)

game 4: Wizard 1-0 Stockfish
en.lichess.org/9E9gIdUMFLP4

again, stockfish is awful at endgames. after 52.a8=R stockfish says -1.5 for black, but RvK is a win for the rook

game 5: Stockfish 0-1 Wizard
en.lichess.org/RObRTleSAq08

after 28.Kd1, stockfish says +3.7 for white, but wizard says -2.79 for black (it was at +4 for white a few moves ago)

game 6: Wizard 0-1 Stockfish
en.lichess.org/OxPrFCjYEtDf

wizard misjudges a middlegame position and stockfish gets a quick win

game 7: Stockfish 1-0 Wizard
en.lichess.org/xkhbts0mJWfF

11...Nh6?? was a disaster for wizard

game 8: Wizard 1-0 Stockfish
en.lichess.org/lWr6pSnF7h7F

after move 14, stockfish, evaluating the position at -2.3 (in its own favour as black), decides to get rid of all its pieces except 1 pawn, one after another. wizard had all this planned out, and as soon as stockfish reduced itself to just a pawn, announced a forced win in 12, while stockfish still said -2.9 for black. oh dear.

game 9: Stockfish 0.5-0.5 Wizard
en.lichess.org/etdLEAHj6s1A

this was a weird one. if you click the link it says stockfish resigned? WTF? maybe it was getting bored of all the move shuffling and decided to resign instead of prolonging the boredom. but anyway, the position is a dead draw. the kings set up some sort of fortress that the rooks can't invade, and the kings have no chance of winning themselves. however, both computers think they're winning; stockfish by +2.5 and wizard by -6.3, so while there were many opportunities for either side to take a draw by repetition beforehand, neither side took it. the monstrous amount of shuffling is basically the 50 move rule in action, then black gives up a rook then a knight to prolong the game, thinking it's still ahead. but then stockfish resigned for some reason... yeah i don't know what the hell happened. but i'll score it as a draw because the position is a stone cold draw.

game 10: Wizard 1-0 Stockfish
en.lichess.org/LkUVJqDvOlTJ

wizard finds a path to victory in a complicated middlegame. this was wizard's first non-endgame victory

game 11: Stockfish 1-0 Wizard
en.lichess.org/HBvqJnU24rr0

basically the same as game 10, except stockfish wins the middlegame queen battle

game 12: Wizard 1-0 Stockfish
en.lichess.org/eEZj5Bc0tEn0

this game definitely showed a huge difference in the evaluation functions of wizard and stockfish. after 15.Qxg7 wizard said it finds a forced win in 15 moves, but stockfish still said -2.5 in favour of black! even after move 20 stockfish still gives a score of 0.

--------

conclusion: stockfish antichess is nice, but it still has a lot to learn. goldovski's wizard program, despite being 18 years old, easily crushes stockfish, especially at endgames. well done to the devs for training stockfish at antichess, but i think i'll stick with wizard for now. oh and give it an endgame tablebase, and fix the bug where it resigns a drawn position.

FM ubdip

Thanks for comparing the engines and analyzing the strengths and weaknesses. I will read the quite long text more thoroughly later. Could you please just add which time control you used for Wizard and on which hardware, so that we can better compare the playing strength?

Antichess is so different from other variants that I had to experiment a lot with the search and evaluation function to get it to a decent level. However, the evaluation function is still not very sophisticated, so it is not surprising that it misevaluates some positions, especially when it only has 400ms per move. Hopefully, I will find some improvements based on your observations, but it won't happen "immediately".

Zapmeister

hey ubdip, thanks for responding.

wizard's readme file says that the time it takes is 3 seconds per move, and that this setting can't be increased unless you pay him $10. however in practice what the program actually does is calculate to 3 seconds (or just before), and then finish calculating at the ply depth it was in the middle of doing (usually like 11 or 12), and then play that move. so like for instance, if it calculated at depth 11 ply starting at 1.8 seconds, it would complete the 11 ply calculation even if it went past the 3 second limit. so in practice it takes like between 2.5 and 5 seconds per move.
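
The timing behaviour described above looks like iterative deepening with a soft limit: a new depth is only started while the clock is under the limit, but a depth that has been started is never interrupted. Here is a minimal Python sketch of that idea. It is not Wizard's actual code, and `pick_move`, `search_at_depth`, `soft_limit` and `clock` are hypothetical names chosen for illustration.

```python
import time


def pick_move(search_at_depth, soft_limit=3.0, max_depth=64, clock=time.monotonic):
    """Iterative deepening with a soft time limit (illustrative sketch).

    search_at_depth(d) is assumed to run a complete depth-d search and
    return the best move it found. Returns (move, depth reached, seconds used).
    """
    start = clock()
    best_move = None
    depth = 0
    while depth < max_depth:
        depth += 1
        # Once a depth has been started it is always searched to completion,
        # even if that carries us past soft_limit.
        best_move = search_at_depth(depth)
        if clock() - start >= soft_limit:
            break  # limit reached: do not start another depth
    return best_move, depth, clock() - start
```

If each extra ply costs a lot more than the previous one, the last depth can start before the limit and finish well after it, which would fit moves taking up to about 5 seconds. The occasional move under 3 seconds suggests the real program sometimes declines to start a depth it does not expect to finish, which this sketch does not model.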

in VERY rare cases - it happened twice in the entire 12 games - it may be the case that a search at depth N takes a really long time compared to depth N-1, in which case it might "hang" for up to a few minutes. the guy says this is a bug. (the software does have a few bugs in it... it's 18 years old and written by ONE GUY, who was only a hobbyist programmer!)

my pc specs are windows 7, 64-bit, 2.7ghz processor, 3gb ram, 300gb hard drive

if i feel like it i might do the same thing for sjeng or nilatac as well (two other strong antichess engines). but those ones work differently, they do it by proof number search and return a score based on proof node / disproof node ratio.
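
For readers who have not met proof number search: every node carries a proof number (roughly, how many unknown leaves still have to turn out as wins to prove the position) and a disproof number (the same for refuting it). The Python sketch below only shows the standard rules for backing those two numbers up a small, fixed tree. It leaves out the loop that picks and expands the most promising leaf, and it is not taken from sjeng or nilatac. The `Node` class and its field names are assumptions made for this example.

```python
import math
from dataclasses import dataclass, field

INF = math.inf


@dataclass
class Node:
    is_or: bool                      # True: the side trying to prove a win is to move
    children: list = field(default_factory=list)
    value: str = "unknown"           # leaves only: "win", "loss" or "unknown"


def proof_numbers(node):
    """Return (proof number, disproof number) for a node of a fixed tree."""
    if not node.children:
        if node.value == "win":
            return (0, INF)          # already proven
        if node.value == "loss":
            return (INF, 0)          # already disproven
        return (1, 1)                # one unknown leaf left to resolve
    pairs = [proof_numbers(child) for child in node.children]
    pns = [pn for pn, _ in pairs]
    dns = [dn for _, dn in pairs]
    if node.is_or:
        # Proving needs one winning child; disproving needs all of them refuted.
        return (min(pns), sum(dns))
    # AND node: proving needs every child proven; disproving needs just one.
    return (sum(pns), min(dns))
```

A score built from these two numbers (for example their ratio) says how close the search is to a proof, not how much material or position one side has, so it is not directly comparable to the pawn-style evaluations quoted for stockfish and wizard.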

FM ubdip edited

So the time per move differs by about an order of magnitude (not taking into account differences in hardware), which, at this level, might correspond to about 200-300 Elo, I guess.

In Antichess, the results of engine matches probably heavily depend on the opening choice, so I think we can only roughly estimate the actual playing strength from such a match without an opening book.
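
As a rough point of reference, the match score can be turned into an Elo gap with the standard logistic Elo formula, D = 400 * log10(s / (1 - s)), where s is the fraction of points scored. The Python sketch below applies it to the twelve results listed in this thread. `elo_difference` is a name made up for this example, and with only 12 games and no opening book the figure has very wide error bars.

```python
import math


def elo_difference(score, games):
    """Elo gap implied by a match score under the standard logistic Elo model."""
    s = score / games
    if not 0.0 < s < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(s / (1.0 - s))


# Results from Wizard's point of view, games 1 to 12 as listed in the thread
# (game 9 counted as a draw, which is how the original poster scored it).
wizard_points = [1, 1, 0, 1, 1, 0, 0, 1, 0.5, 1, 0, 1]
total = sum(wizard_points)
gap = elo_difference(total, len(wizard_points))
print(f"Wizard {total} - {len(wizard_points) - total} Stockfish, about {gap:.0f} Elo")
```

A 7.5 - 4.5 score comes out at a little under 90 Elo in Wizard's favour, which is smaller than the 200-300 Elo guessed above for the difference in thinking time.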

I am re-tuning piece values to see whether they change, especially the king and rook values.

FM ubdip

@Zapmeister When tuning, the piece values do indeed move in the direction you suspected; the endgame rook value in particular has changed a lot.

tolius

@Zapmeister I guess it's mostly because of calculation time. If you look at the analysis of the first game, you will find that Stockfish level 8 had a winning position and checkmate in 11 (after black's move 35). And even more: every game, Stockfish level 8 made a lot of blunders and mistakes (according to the analysis with Stockfish level 8++). I don't know what thinking time they use for Analysis, but it shows Stockfish certainly can play better with more thinking time.
My experience: I used Stockfish on my local PC (even the version of antichess-Stockfish in which all pieces had the same value) and yes, sometimes the endgames were strange. But when I increased the thinking time, it reached depth 40-50 (within a few seconds when only a few pieces were on the board) and found a better solution. So tuning the piece values does improve the calculation, of course (I mean Stockfish can find the right solution in a shorter time), but the computation time is also very important. As the Stockfish code is highly optimized, I believe Stockfish could even win the match against Goldovski's Wizard 3.1 under the same conditions (even though the tuning of the piece values is not finished yet).