Feb 28, 2007

[New Version] v0.231b - Bug fix and depth/new replacement scheme

Changes:
  • A bug in the new 'game over' check for winboard that made the engine crash was fixed
  • The transposition tables now use depth/new replacement scheme
Note: This 'hotfix' was nescessary since I had forgotten a simple out of bounds check in the winboard version. To atleast add something more in this 'new' version I added depth/new replacement scheme for the transposition tables, I have not had any problems with out of memory errors due to this (and there should not be any if I counted right), but please send me a note if you run into any. The next version will include a possibility to change the size of the transposition tables from the interface.

mediocre_v0.231b

Feb 26, 2007

[New Version] v0.23b - Evaluation

Changes:
  • Completely new evaluation, now accounts for king attacks, pawn evaluation and more
  • Check extension added
  • Draw repetition is now accounted for in the search tree as well
  • A bug concerning contempt factor was fixed
  • Piece lists added, no more 120 loops
  • Now gives correct mate count (and not just a high number) for both uci and winboard, but evaluations from the transposition tables gives the count when the mate was found (this is on the todo-list)
  • The broken 'force' mode in winboard should now be fixed
  • Winboard protocol now sends the score when the game is finished (still needs some work)
Note: The evaluation is an ongoing process and not a one time implementation. There are still a lot of adjustments to be done.

I decided to include jlaunch in this release since people have been complaining about Mediocre not working in the Shredder and Chessbase interfaces. I might not include this in the future but simply link to it.

mediocre_v0.23b

Feb 25, 2007

[Other] Adjustments to evaluation

I rewrote the greater part of the evaluation for some better nodes per second performance.

For example I only calculate king attacks if the queens are still on the board. There are different opinions about this. I think most attacks without queen are pretty harmless, and the occassional dangerous position without queens is either handled by the search or had a queen in the attack up to a point so we already should have 'seen it coming'.

Anyway this rewrite regained the end game playing strength. I think the king attack code occassionally interfered with the general play (not only with worse nodes per second), especially in the endgame.

Here is a test tournament versus the old version of Mediocre. Mediocre v0.23b has piece lists, new evaluation and check extension.
1: Mediocre v0.23b uci 64,5/100 
2: Mediocre v0.22b uci 35,5/100
So Mediocre has gotten a bit stronger.

There are still a few things to work out (connected passed pawns are handled wrong at the moment, and backwards pawns are not handled at all), and also some adjustments need to be made so we can avoid what happened in the following game:


On move 30 Mediocre v0.23b played 30. ... Re7 and evaluates the position to slightly worse. It sees the following moves where it loses the bishop, but considers the passed h-pawn to be enough compensation.

After a few weird moves it completely misses the 35. f5.

These kinds of errors are quite common for some reason, but usually the new evaluation is put to good use like here:


Both king safety, pawn evaluation and rook positioning helped winning this game.

I also need to take another look at the development in the openings, the new version is quite fond of leaving pieces undeveloped even though it gets a penalty for leaving them on the first rank. Perhaps some special code is needed here.

Feb 21, 2007

[Other] Playing around with the evaluation

I have implemented king safety, pawn evaluation, some trapped piece detection and a bunch more piece positioning evaluations. They work without bugs, but are in need of some serious tweaking.

In its current state the new evaluation takes the average calculated nodes per second from about 150.000 to 75.000. This is a bit too much, but I think I can cut it down some with better written code. Also the piece tables I have been talking about should increase the overall speed some.

Here are a couple games where the new evaluation works well.










Keep in mind that the new version searches about half as many nodes, so logically any tactical combinations should be due to the evaluation (and not out-calculating). Of course it could be luck as well.

I especially like the last game where the new version evaluated the position after 12. Bxh7 to +3.0, seeing the bishop getting trapped but figuring it would be ahead after the exchanges. This might not be sound at all, but it does indeed have a strong attack.

Unfortunately not all games work out this well, when v0.22b 'accidently' manages to castle and avoid attacks it can usually work out the position to its advantage.

This is probably both due to the slightly deeper depth it searches to, but mainly I think some parts of the evaluation actually hurts v0.23b. It seems to play a bit weird in endings and can get very careless in some positions.

So as I said there is still a lot of tweaking and optimizations to be done, but once we get there this should really improve Mediocre.

I have used ideas from TSCP (pawn eval) and Toga (king safety and trapped pieces) and once I feel the evaluation works as it should I will write a guide or two.

Feb 19, 2007

[Other] Running Mediocre in Shredder and Chessbase

The Shredder Classic and Chessbase interfaces do not like the .bat-file needed to run Mediocre, like we do when using Arena for example.

Shredder requires an .exe-file to recognize the engine as UCI (it recognize it as Winboard with the .bat-file but can not play games due to weird behavior) and Chessbase requires an .exe-file to even try install Mediocre.

The solution is to use a small program called jlaunch by Manfred Rosenboom.

For Mediocre you simply place the program in the main directory (where the .bat-files are located) and create a file called jlaunch.properties and place it in the same directory.

The properties file should look like this:
mainClassName=Mediocre
classPath=bin
app.arg0=u
You need to have paths set to the Java bin-directory on your system for this to work (try typing 'java' in a console window, if you do not get an error you have it set).

Now when installing Mediocre in Shredder Classic or Chessbase you simply use the jlaunch.exe as engine file and it should work.

This works in Arena as well.

Now I just need to track down the problems wbec-ridderkerk had with running Mediocre. That has nothing to do with the above solution since they run it as a winboard engine. I suspect it has something to do with a preset opening book.

Feb 18, 2007

[Other] A third of the results

Of course the tournament did not finish as it should, only a third of the games were played before a power outage stopped it. But anyway, all the engines played 35 matches against each other.

The time control was 3 minutes for each side and the results were:
1: Kingsout            120,5/140
2: Tscp181 84,5/140
3: Mediocre v0.23b dev 54,0/140
4: Roce350 47,5/140
5: Mediocre v0.22b 43,5/140
The only real conclusions we can draw from this is King's Out clearly being the strongest and Tscp second, the 3-5 places are too close to call.

Roce takes way too many unnescessary draws so it probably places above Mediocre v0.23b if you take those out.

The 'individual' result between the two versions of Mediocre was this:
3: Mediocre v0.23b dev 20,5/35 101101111101===0=1001001=10101=011=
5: Mediocre v0.22b 14,5/35 010010000010===1=0110110=01010=100=
Quite close but the check extension seems to give a slight edge.

Many many more games needs to be played for any real conclusions, this is mainly for fun. :)

Feb 17, 2007

[Plan] Many things to do

I have been out of town since thursday (and still is) so I left the computer running a 1000 game tournament between TSCP, King's Out, Roce, Mediocre v0.22b and Mediocre v0.23b (really only check extension added). Getting back tomorrow and looking forward to the results.

I do not know what the statistical value of such a tournament is, but atleast we get 100 matches between each of the engines which should give us some idea of the strength differences.

My guess would be King's Out as clear first with only a few losses, and Mediocre v0.23b on fourth place but with quite some wins against TSCP and Roce. We will see tomorrow though. :)

Still some problems with basic running

Mediocre was signed up on the wbec-ridderkerk site and they had some problems running it. Apparently Mediocre did not like playing black.

I have not had any problems at all with this however.

There is also a french site (which I forget the name of) where Mediocre is listed as 'faulty', giving errors when played in the shredder and chessbase software.

To be honest I have no idea what Mediocre is doing trying to play tournaments on those sites. :) I do not consider Mediocre to be ready for any kind of 'official' tournaments yet. But the problems they had with it were not expected since I have never run into them.

From what I have gathered the shredder and chessbase software does something Mediocre does not recognize. I do have a copy of chessbase somewhere, and shredder has a 30 day trial period so I should take a look at those and see what I can do to fix it.

Outline for the next version

I have been reading some on the Talk chess forums and picked up a few simple things I want to implement (more below).

But the main goal of the next version is a complete rewrite of the evaluation, this is where the most can be done to improve Mediocre at the moment.

Mediocre runs quite fast right now, with an acceptable amount of nodes, and an equally acceptable amount of time used for each node.

This does not help however if we actively search out bad positions. When letting Mediocre play games against other engines, and especially against older versions of itself I see this happen over and over again.

The newer version searches 3-4 plies as deep and decides a line where it ends up with trippled pawns and an open king is the best position.

As the search depth gets deeper the tactical tricks are not so important anymore. If one version searches 7 plies and the other 10, the faster version of course has an edge, but it is not as noticeable as 3 vs. 6 plies. So more plies is not the solution anymore.

Anyway, here are a few things I am going to work on.
  • Dynamic time control check

    Currently we set a fixed amount of alpha-beta nodes to search before we check if the time is up or if the user told us to stop thinking. For a few versions this number has been 2000. If we set this number too high we get overruns, that is we keep searching beyond the given time by a certain amount. This is of course not good, especially if we are very low on time.

    I have had some bugs with this that I do not really understand, but the number needs to be well adjusted.

    A better solution is to keep this number dynamic, meaning it changes depending on how much time is actually used between the nodes. We could start with say 2000 nodes per time check, but if those 2000 nodes take 10 seconds to calculate we will get in trouble. Also if those 2000 nodes take 1 millisecond to calculate we will run the check for time way too often.

    So we keep track of how much time we use between the nodes and adjust the time check accordingly. This is a quite simple thing to implement, and should make Mediocre rely less on the hardware it is running on.

  • Piece tables

    A piece table simply keeps track of the pieces positions of both sides so we do not have to look for them in a big loop.

    This has been suggested to me for quite some time. In v0.22b I let the Board-class keep track of the kings' positions to eliminate the loop for finding them in the inCheck methods. I am sure this saved some time, but we should extend this to include all the pieces.

    There are three important places where this would be very useful.

    The move generation where we loop over the whole board to find a maximum of 16 pieces to generate moves for (the pieces of one side).

    The isAttacked method where we do the same loop to find the pieces of the attacking side (this is one place where we used to loop to find the king as well).

    And in the evaluation algorithm where we loop to find the pieces for evaluation.

    What I plan to do is keep a 32 slot array in the Board-class which keeps track of the piece positions of both sides. 0-15 is black pieces and 16-32 white pieces. 0-7 black pawns and 16-23 white pawns, 15 black king, 32 white king and so on.

    The array is filled at the start of the game and with every makeMove we update it for the new positions. If a piece gets captured we put -1 there to indicate the piece no longer exists.

    Now when we want to generate moves for white pieces we only need to loop 16 times, index 16-32 in the piece-array.

    This should save quite some time and is very easy to implement.

  • PickSort or similar

    Currently Mediocre has a very messy move-sorting procedure. First we call sort() which actually does not sort the moves but only assigns them a sorting value based on being a hash-move, killer-move, capture etc. Then we call bubbleSort that sorts the moves in the assigned order.

    This is a very slow way to go about it. With some redesigning we could easily avoid the sort()-method altogether, but even better we could even avoid the bubbleSort.

    Instead of sorting all the moves we could do a 'PickSort', we simply go over the moves one time and grab the move with the highest ordering value. First time around this should be the hash move, most of the time this move will cause a cutoff so we do not have to bother checking the rest of the moves, hence sorting them beforehand is a waste of time.

    If the first move does not cause a cutoff we do the same thing again and grab the move with the second highest ordering value (should be primary killer move).

    And so on. We have then eliminated unnescessary sorting and should speed up the search quite a bit.

    The only problem I can see with this is the problematic assigning of ordering values. Where should we award the values if we do not use the sort()-method? I will come up with something. :)

  • Evaluation

    This is the main subject of this version. I will do a complete rewrite of the evaluation algorithm. And there are three major areas I will concentrate on.
    • Piece tables - We need a lot better piece placement tables, involving all the pieces
    • King safety - I have looked alot at Rebel's king safety evalution and it makes a lot of sense. Hopefully I can find a way to implement this efficiently. Maybe we even get a SEE (static exchange evaluation) out of it which we can use for quiescent search
    • Pawn evaluation - This along with king safety is where Mediocre lacks the most at the moment. We need to spot passed, doubled, isolated, and backwards pawns. There is more but I think I will start with this

  • Interface problems

    As mentioned above. I need to get Mediocre working correctly on all types of software. Perhaps we need to implement a hash size function as well to cover everything, we will see

That is about it. There are a lot of things to do, but really it is only the evaluation that should challenging. The evaluation will continue to grow over time, but the things listed above is a good start.

Feb 15, 2007

[Bug] Wrong node count

In some complicated positions Mediocre seemed to drop its node count way below the average. I never really figured out why since the type of position should have nothing to do with the number of nodes visited per second. It could affect the number of plies, but that is another matter.

One position I had this problem with was this one:
2rq1rk1/p5bp/bp2pnp1/n1ppNp2/2PP4/PP1BPN2/1B3PPP/2RQ1RK1 b - - 3 14

It is a fairly complicated position with many captures going on. Mediocre searches it like this:








Ply Eval  Time  Nodes Line
1 -15 140 36 Bb7
2 -25 219 179 Bb7 Qc2
3 -15 875 678 Bb7 Qc2 Nc6
4 -15 1156 2040 Bb7 Qc2 Nc6 Rb1
5 -10 4906 9434 Bb7 Qc2 Nc6 cxd5 exd5
6 -15 11359 29397 Bb7 Qc2 Nc6 Nxc6 Bxc6 Ne5
7 -10 53578 152535 Bb7 cxd5 Qxd5 b4 Qa2 Qe2 Nb3
As you can see it takes quite some time to get to 7 plies depth. But the node count is only 152535. That kind of node count takes about a second in most positions, and not 53 seconds like here.

I could never really figure out why this was. Of course it has something to do with quiescent search since that is the only place the engine could spend that kind of time without getting farther in the search.

And then today I was looking through the source code of some other engines and found a missing line.

Mediocre only increases its node count at the beginning of every alphaBeta()-call. But even though quiescent search has a limited number of moves to choose from, every position it finds is also a node, so of course the number of nodes should be increased there as well.

So I made quiescent search nodes add to the total nodes searched as well and the result was this in the same position.
Ply Eval  Time   Nodes Line
1 -15 140 7484 Bb7
2 -25 218 19992 Bb7 Qc2
3 -15 687 97129 Bb7 Qc2 Nc6
4 -15 953 140802 Bb7 Qc2 Nc6 Rb1
5 -10 4234 674731 Bb7 Qc2 Nc6 cxd5 exd5
6 -15 10390 1675138 Bb7 Qc2 Nc6 Nxc6 Bxc6 Ne5
7 -10 50812 8200247 Bb7 cxd5 Qxd5 b4 Qa2 Qe2 Nb3
Exactly the same evaluation and line, and time used is about the same (this differs some from search to search of course). But the node count is fifty times as high.

Conclusion

This does not affect the search speed, or evaluation, or anything else. But it does give us a much better view of how much time Mediocre uses to go through a certain amount of nodes.

The pruning of Mediocre was appearing to be quite awesome, and the nodes per second terrible. With this change we can see that Mediocre is not so bad when it comes to pruning (but not -that- good :), and it is also not as slow calculating nodes as it would have appeared.

As said this is only a matter of presentation and has nothing to do with the search itself. However comparisons to other programs will become much easier now.

Feb 13, 2007

[Other] A little tournament

I ran a tournament between all the versions since 0.12b (since that is when time management was implemented).
   Engine              Score   v.22 v.23 v.21  v.2 v.12
1: Mediocre v0.22b 13,5/16 ···· 1101 11== 1=11 1111
2: Mediocre v0.23b dev 11,0/16 0010 ···· 01== 1111 1111
3: Mediocre v0.21b 7,0/16 00== 10== ···· 1110 1000
4: Mediocre v0.2b 5,5/16 0=00 0000 0001 ···· 1111
5: Mediocre v0.12b 3,0/16 0000 0000 0111 0000 ····
The v0.23b dev is the development version, it has some adaptive null move pruning and check extensions but it is not working like it should yet.

Most of the draws were due to unnescessary repetitions.

One interesting aspect is the amount of repeated games. Even in this few games there were a few repeated games, i.e. games with the exact same moves from start to end.

Medioce has no random factor and I do not intend to add one, it plays what it considers best in every position. The way to get more variation in the games is improving the opening book.

I might add the posibility to use a common opening book through Arena (or other interfaces that support it), however I like Mediocre having its own book as well. Perhaps it is time to start building a bigger one.

[Other] Letting Mediocre loose on the ICC

I decided to try Mediocre against some human opponents so I dug up my computer account on the Internet Chess Club and let it have a go.

The name of the account is Independence, so if you have a user at ICC you can give Mediocre a try there, I will leave it up for a while.

The result after 12 hours was this:
       rating win loss draw total best
Bullet 2051 146 19 22 187 2085
Time control for pretty much all games was 2 minutes without increment. Among the beaten opponents were two FMs (fide master) and a WIM (woman international master).

Mediocre was pretty stable during the whole time, no crashes and a total of 2 illegal moves.

Both illegal moves came after a draw by repetition. Also draw by repetition is recognized only if Mediocre initiates the repetition. I.e. it does the first 'repeating' move.

Basically the repetition detection needs a checkup, but for everything else Mediocre is looking just fine. :)

Feb 12, 2007

[New Version] v0.22b - New move and transposition table representation, and general optimizations

Changes:
  • Fixed a bug in the isAttacked method that made it do unnescessary checks
  • Changed the way unmake move works. The Board-class now has a history array where it keeps track of passed positions instead of using the Move-objects
  • The Board-class now keeps track of passed Zobrist keys so we do not have to 'unmake' them when unmaking a move
  • The alphaBeta() does not generate moves at leaf nodes anymore
  • Changed ordering values for hash moves and killer moves so they always come first and second (at high depths the history moves sometimes came first)
  • Fixed an issue with reporting time using the UCI protocol
  • The Board-class now keeps track of king positions so we do not need to loop through the boardArray to find them
  • Changed move generation to use arrays instead of Vectors. generateMoves() now fills a given array from a certain index and returns the number of moves that were added
  • Alpha-beta and quiescent search now uses a common big array for storing generated moves instead of creating many small local arrays
  • Moves are now represented by an integer instead of a Move-object
  • The Move-class was completely rewritten and now only contains static methods for analyzing the new move integer
  • Fixed yet another issue with en passants (hopefully they work now)
  • The transposition table now uses an integer array instead of using Hashentry objects. Many new methods for retrieving different values were added

mediocre_v0.22b

[Other] A quick comparison v0.21b vs v0.22b

I finally finished with the new move and transposition table representation along with a number of other optimizations. I will put the new version up for download tonight.

I am still not sure what the optimal size for the transposition tables is, and it can not be changed by the interface yet. I will leave this for a later version though.

A quick comparison

I made a simple comparison to see how much speed Mediocre has gained from these changes. By simply letting the both versions calculate the starting position for 9 plies we get a decent idea of the differences, note however that the new Mediocre can use a larger transposition table and will probably be even faster further into the game.

Mediocre v0.21b
Ply Eval   Time   Nodes Line
1 20 109 21 Nc3
2 0 140 107 Nc3 Nc6
3 20 156 223 Nc3 Nc6 Nf3
4 0 203 728 Nc3 Nc6 Nf3 Nf6
5 7 281 2066 Nc3 Nc6 Nf3 Nf6 d4
6 0 625 5940 Nc3 Nc6 Nf3 Nf6 d4 d5
7 15 1922 25719 Nc3 Nc6 Nf3 Nf6 d4 d5 Be3
8 0 7015 81963 Nc3 Nc6 Nf3 Nf6 d4 d5 Be3 Be6
9 19 136609 1085895 e4 Nf6 e5 Nd5 Nf3 Nc6 Bc4 Nb6 Na3

Total time: 2:16.609
Mediocre v0.22b
Ply Eval Time   Nodes Line
1 20 47 21 Nc3
2 0 79 107 Nc3 Nc6
3 20 79 223 Nc3 Nc6 Nf3
4 0 94 728 Nc3 Nc6 Nf3 Nf6
5 7 125 2066 Nc3 Nc6 Nf3 Nf6 d4
6 0 172 5940 Nc3 Nc6 Nf3 Nf6 d4 d5
7 15 329 25406 Nc3 Nc6 Nf3 Nf6 d4 d5 Be3
8 0 938 79321 Nc3 Nc6 Nf3 Nf6 d4 d5 Be3 Be6
9 19 19235 1176165 e4 Nf6 e5 Nd5 Nf3 Nc6 Bc4 Nb6 Na3
Total time: 19.250
The new version is almost 2 minutes faster to 9 ply. A remarkable difference really.

Notice how the number of nodes differ slightly, this evens out over time though, and the changes I made to hash and killer move values makes the new version visit less moves later in the game.

The expected lines are exactly the same though, so we know I did not mess up anything when implementing this.

The important thing is the nodes per second here though. Clearly Mediocre has improved tremendously in that area. :)

Looking ahead

I think it is time we start working on the evaluation now. What is the use to search deep if you do not know what you are searching for? :)

Also a few extensions, like check and singular response extension, would be nice.

Further on I will start working with the repetition detection again, we should be able to gain some speed there as well from cutting off drawn repetition lines.

Feb 11, 2007

[Guide] The memory in the hash tables

Having optimized some more, Mediocre is now searching much faster than the previous version while keeping the node count for each ply intact. Meaning it generates the same moves for a certain fixed ply, only much faster. Just like my goal was.

For the first time Mediocre now searches so many nodes in a (short) game that the transposition table fills up, this was never an issue in the old versions.

How much memory do we use

To be honest I had no idea how much memory the transposition table would take when filled. So some counting is needed.

A hash entry in Mediocre consists of:
long zobrist       -> 64 bits
int depth -> 32 bits
int flag -> 32 bits
int eval -> 32 bits
int ancient -> 32 bits
Move move
int pieceMoving -> 32 bits
int fromIndex -> 32 bits
int toIndex -> 32 bits
int capture -> 32 bits
int moveType -> 32 bits
int[] prevpos
int en passant -> 32 bits
int wcastling -> 32 bits
int bcastling -> 32 bits
int halfmoves -> 32 bits

Total -> 480 bits
And if we have an entry at each slot:
Slots (2^20)       -> 1048576
2^20 * 480 -> 503316480 bits
Total -> 60 mb
So at full load the transposition table with 2^20 slots (like Mediocre has) would take 60mb of memory.

Similarly the repetition detection table (with 2^16 slots) takes 3.75mb memory.

The standard max memory allocation for a Java application (that is without setting a -xmx at startup) is 64mb. So as you can see when adding some basic memory usage for the application itself we exceed the memory.

This resulted in severe slowdowns at the end of the games and Mediocre crashing due to heap size being exceeded afterwards.

To resolve this issue the easy way I simply scaled the transposition down one 'size' to 2^19. This makes the transposition table take less than 32mb and we will never be close to exceeding the total (64mb) memory.

Optimizing the memory usage

While we can solve the issue like mentioned above it is of course not optimal. A larger transposition table means more stored positions and a faster search.

We can use the -xms and -xmx arguments when starting Mediocre, this sets the minimum and maximum memory usage the application gets allocated. This would allow us to use the larger transposition table, but that has really nothing to do with Mediocre itself but rather the hardware it is running on (if we have 2gb free RAM to use, we might as well use it).

What is more important is how we use the memory we get allocated.

As I said above one hash entry currently takes 480 bits of memory. This is by far too much.

This is what I plan for one hash entry:
long zobrist                -> 64 bits
int move -> 32 bits
int depth/flag/eval/ancient -> 32 bits

Total -> 128 bits
And if we have an entry at each slot:
Slots (2^20)                -> 1048576
2^20 * 128 -> 134217728 bits
Total -> 16 mb
Now we only use 16mb for the same table that took 64mb before. If we go up one 'size' to 2^21 slots we use 32mb and 2^22 uses 64mb.

So a bigger table is possible for the same amount of memory.

I explained the move integer in an earlier post, the same routine will be used for the 'depth/flag/eval/ancient' integer. That is certain bits will be used to represent certain information. For instance I will use 18 bits to represent the evaluation (18 bits can hold a number up to 262143 which can hold -100000/100000 which a mate value is worth).

I could reduce the memory usage even more by only using half the zobrist as lock (this is what we use to make sure we have the right position). This would make the zobrist key an integer (32 bits) instead and a hash entry would then take 96 bits instead of 128.

However, as I have mentioned a few times already the Zobrist keys are not entirely unique. There is a risk that two Zobrist keys matches two different positions and if we only use half the key that risk increases.

While the risk is still very small, and even if we accidently use a faulty score somewhere in the search it does not nescessarily mean disaster, it is still a risk and you would have to weigh the risk against better memory usage. I decided to use the full Zobrist.

What is left for the next version

I have not yet started to implement the new move representation, which in turn is needed for the new transposition table entry representation. On the other hand I am pretty much done will all the preparations for a smooth transition.

Hopefully I will avoid all the nasty bugs I got last time around and the new version of Mediocre should soon be here. :)

Edit:
17 bits will not be enough for evaluation since we need negative values as well. 18 bits is needed.

Feb 10, 2007

[Other] Going about it the wrong way

When starting to work on this version of Mediocre my goal was to keep the node count intact while improving the nodes per second. That is not changing the actual calculation at all, only speed it up.

I wanted to implement the new move representation and remove all slow Vector calls.

When I started looking through the code I found a whole bunch of badly coded details so I changed them as I went along.

I also had to change things like unmake move to make the new move generation possible.

The result was a version filled with bugs which actually was not running faster since the node count was much higher (for some reason) and made weird moves and crashed from time to time.

I managed to track down a few of the bugs, for example when running the UCI version the history array in the Board-object kept overflowing, this was due to UCI replaying the moves on the board, but it did not reset the history index.

But new bugs kept showing up.

So I decided to start from the beginning again and wait with the new move representation until everything was in place for it to work.

I have now fixed the issue with the isAttacked-method, written a better capture generation algorithm, implemented the new history array for the Board-class, changed the sort values for hash and killer moves, and moved the move generation in alpha-beta so it does not generate moves on leaf-nodes.

These simple changes has made a tremendous difference. And most importantly they work without bugs.

The next step is replacing the Vectors with arrays and make them work flawlessly, and finally, when everything else is in place, implement the new move representation.

While the move representation might not speed up the engine quite as much as I had thought (the other changes are more important) it -will- be very important for the size of the transposition table, so sooner or later I will have to do it.

Lesson to be learnt; keep your changes structured and do not do it like I did. :)

Feb 8, 2007

[Other] Good and bad

Searching from the starting position the new Mediocre reaches 7 ply in about 7 seconds, having calculated about 835,000 nodes.

Mediocre v0.21b reaches 7 plies in 1.5 seconds having calculated only 22,000 nodes. After 7 seconds v0.21b has reached a bit more than 8 plies and calculated about 100,000 nodes.

The conclusion is the new Mediocre is lightyears faster than the old version, perhaps not 10x faster (like the numbers suggest since there are other factors than the raw node count) but atleast 7-8 times faster.

Unfortunately somewhere along the way I severly messed up the move ordering, so the new Mediocre has to go through hundreds of thousands more nodes to reach the same depth.

This is obviously due to a bug somewhere in the code, I just have to find it.

But these results still look very promising.

Feb 7, 2007

[Guide] Another unmake move scheme

I am currently hunting bugs in the new version of Mediocre. It is actually quite fun since while going through the old makeMove/unmakeMove and move generations methods I notice tons of things to optimize.

To not leave the blog idle like I did when struggling with the transposition tables I thought I would explain one of the new features I implemented, namely the new handling of unmaking moves.

Why a new unmake

To unmake a move we need a bunch of information. Things like what piece was moving from where to where, and what piece it captured (if any) we already have in the new move integer (explained earlier).

Previously I also stored information about the previous position's en passant square, castling rights and number of moves without pawn push or capture (for fifty move rule) in the Move-object itself.

But with my new move representation I could not fit the nescessary information about passed positions in the move itself (as mentioned earlier) so I had to come up with something else.

I got the following idea from Tom Kerrigan's TSCP.

The new scheme

Since we can not store the nescessary information in the move anymore we need to store it somewhere else.

The logical place is in the Board-object, but since we use one Board-object throughout the whole search, how should we store information about passed positions so we can keep track of what to 'remake' for a certain move?

If you think about it is not so easy. We could have variables like 'previousEnpassant', 'previousWhitecastling' and so on, but this will only keep track of the position one move back.

So if we make 10 moves, and then start unmaking them one by one, the 'previous'-variables will be reused several times before we come back to the original position and we would have no way of retrieving the original variables.

The solution is keeping an array of passed position variables.

At creation time the Board-object's 'history'-array will be empty.

If we make one move we add the values we need stored (en passant, castling and half-moves) to the history-array at index '0' and keep track of where we currently are (e.g. increment a 'historyIndex' integer with 1).

When we unmake the move we decrement the historyIndex with 1 and look at that index (in this case '0') and we find the values we wanted.

The benifit of this solution is we can make 10 moves in a row, which will result in 10 indexes being filled in the array, and every time we unmake a move we will find the right values in the array.

This also works if we make 10 moves, unmake one and make another. As long as we are only unmaking one move we will be alternating between index 8 and 9 here, which is of course what we want (we want the earlier values intact until we get to unmaking them).

How to store the information

I used the same scheme for storing information in the history-array as I did with the move integer.

That is use an integer and update the right bits in it with the particular information. This way we only need one array (and not one for each type of information) and we do not have to create some sort of history-object.

The code is very simple and looks like this:

In the Board-class:
public int[] history;
public int historyIndex;
In the makeMove()-method:
history[historyIndex] = 0;
if(enPassant != -1)
{
history[historyIndex] =
enPassant;
}

history[historyIndex] = history[historyIndex] |
(white_castle << 7)
| (black_castle << 9)
| (movesFifty << 16);

historyIndex++;
And in the unmakeMove()-method:
historyIndex--;

if(((history[historyIndex]) & 127) == 0)
{
enPassant = -1;
}
else
{
enPassant = ((history[historyIndex]) & 127);
}
white_castle = ((history[historyIndex] >> 7) & 3);
black_castle = ((history[historyIndex] >> 9) & 3);
movesFifty = ((history[historyIndex] >> 16) & 127);
Notice how my constant for 'no' enPassant (-1) is giving me trouble again. We can not store -1 in the integer since it would result in a weird value. I really have to do something about this soon.

But all in all it is very easy and works like a charm.

I also added a separate array for the zobrist keys that works the same way (but with 'long' type to fit the keys). This is not mandatory since we can reset the zobrist key 'manually' while unmaking the move. But it saves some time and makes the code in unmakeMove() clearer (thanks Patrice for pointing this out).

Well that was a short idea of how to make unmaking a bit easier. Now it is back to bug hunting for me. Wish me good luck. :)

Edit:
While writing this guide I noticed one of the bugs I was hunting. When making a move we need to set the history value at the current index to '0' before attempting to update it. If we do not we might be doing bit operations to an existing key which will result in weird results.

Feb 6, 2007

[Other] Compilations and logo

At this homepage Jim Ablett has been compiling my releases of Mediocre for Windows and Linux for quite some time. I decided to add these to my blog for download since they do not require Java to run.

Be aware that there can be some problems with them, since there are some tricks for compiling Java like that. Anyway there they are and if they work great, if they do not you will just have to use my usual Java binaries.

Jim also created a nifty logo for Mediocre which I rather like, so I put it up as well. :)

[Other] Good suggestions and protected e-mail

I have received a few good suggestions to how I can improve the speed of Mediocre. I have also learned a lot about optimizing Java while researching myself.

Many of the things I have changed are very program specific. I do not know how much use it can be for other programmers other then the general things like not allocate too many objects etc. I might try to put together a short list of suggestions what to look for in the code, but generally it is up to every programmer to find these things in his own code.

A thing like the isAttacked()-method not checking for color of the piece before doing the calculations, is more of a bug than an actual general improvement.

Anyway I am going to finish up with the improvements (right now quiescent search and move sorting are broken due to the new move representation) and then release a (hopefully) much faster version of Mediocre.

On another note I have been receiving many very good suggestions through e-mail, along with unwanted spam that is starting to add up. So I for now use a thing called 'spambutcher' instead of printing my e-mail address in clear text. Just follow the link and enter the phrase and you get my e-mail address. This is to prevent automatic gathering of e-mail addresses.

Hope it is not too much of a hassle. :)

[Other] Finally something that made a difference

I kept reading through the move generation code over and over and finally I noticed something significant. In the isAttacked()-method Mediocre was not checking if the piece on the square was of the right color before determining if it could attack the square.

This was caught further down in the code when pieces of the wrong color never matched the attack pattern, but the harm in extra calculating time was already done by then.

This line of code:
if(pieceOnSquare <= 0) continue;
cut another 10-20% off the generation time.

The new times for perft are still not great, but they are atleast getting better:

Depth    Nodes     Time
1 20 0.000
2 400 0.000
3 8902 0.047
4 197281 0.312
5 4865609 7.750
6 119060324 3:23.094

Feb 5, 2007

[Other] Mediocre improvements

I got the new move representation in place, but the gains seems quite limited (atleast in terms of move generation speed, but it will do wonders in memory conservation for the hastables). The times for my perft for the first few depths are as follows:
Depth    Nodes     Time
1 20 0.000
2 400 0.000
3 8902 0.047
4 197281 0.422
5 4865609 10.219
6 119060324 4:24.047
It is about 10% faster than the old routine, but still almost a full ply slower compared to other engines. I do not know what I might be doing wrong. I have removed all slow 'new' commands, that is replaced pretty much every object creation there was. Also I switched all Vector objects to arrays, and even stopped creating a local array with moves for every ply and instead used one larger array used for all depths (with limited benefits).

I have no idea what is going on. The problem is not the Java code, since other Java engines are faster. And this is just a huge difference.

Any ideas of what can be done to speed up the move generation are greatly appreciated.

Feb 3, 2007

[Guide] New move representation

It is time to start looking at ways to improve the general performance of Mediocre.

All the changes I will be doing for the next version will be fixing general performance issues. Basically Mediocre will return the exact same moves for a certain ply, only that it (hopefully) will reach the ply much faster.

Here are a few things I will be looking at.

Vectors

I have been using the Vector-class (located in the Java library) for storing moves in different forms. While the Vector-class is very easy to use, it is also slow. Much slower than using an array.

I will be changing the code wherever I have used Vector and use an array instead. This will make the code a bit more awkward, but speed it up a notch or two.

A new move representation

This is where I think the main speed gains can be found. Up until now I have used the Move-class to create objects representing moves. Since Java is an object-oriented language this is the preferred way for simplicity and good encapsulated code (it is not like I encapsulate anything anyway :) ).

But a chess program needs speed more than good looking code, and creating an object takes time.

I was actually warned about this in a comment back in december when I started writing Mediocre, but I think I chose the right path. First create a working program that is easy to understand, and then (now) start making optimizations to the working code.

I plan to represent the moves with an integer. An integer has 32 bits (ones and zeros) and I will use different bits to represent different things about the move.

To represent a square on the 0x88 board we need 7 bits since that can hold number between 0 and 127. So the 'to' and 'from' indexes needs 7 bits each.

To represent a piece we need 4 bits, which can hold a number between 0 and 15. (6 white and 6 black pieces + empty square = 13 so it is enough) This will be used to hold what piece is moving and what piece is captured.

We also need 3 bits for the movetype. 3 bits can hold a number between 0 and 7, and that is exactly the number of move types we have (ordinary, en passant, long and short castling, promotion to queen, rook, knight and bishop).

So we need; 7+7+4+4+3 = 25 bits which fits in the integer's 32 bits. Which I illustrate in the image.

Storing valuesin the int

To store values we use the shift << and or | operators. This is done like this:
int toShift = 7;
int pieceShift = 14;
int captureShift = 18;
int typeShift = 22;

int move = -1;

move =
(from)
| (to << toShift)
| (piece << pieceShift)
| (capture << captureShift)
| (type << typeShift);
We 'shift' the values to the right position and then use 'or' which changes the right bits to 1 instead of 0. It could look like this:

Let us say we want to add to=54 to our 32-bit integer.
54 = 110110 (in binary from)
And we have a long row of zeros we want to add it to.
...0000000000000000000000000
Now we shift the 54 to the right position. That is we add seven zeros behind it like this:
1101100000000
And now we 'or' it to the row of zeros:
   ...0000000000000000000000000
OR ...0000000000001101100000000
= ...0000000000001101100000000
If we want to add type=5 (101). Shift it to the right position (i.e. add 22 zeros behind it):
1010000000000000000000000
And then 'or' it to the integer:
   ...0000000000001101100000000
OR ...1010000000000000000000000
= ...1010000000001101100000000
So now we have the type in the right position and so on for all the other values. The 'from'-position is in the beginning so we do not need to shift it.

Retrieving values from the int

Now we have a long row of 1's and 0's. In decimal form it looks cryptic (a high random number like 9698192 or so), but we do not need to worry about that.

To retrieve a value we do like this:
int squareMask = 127;
int pieceMask = 15;
int typeMask = 7;

from = (move & squareMask);
to = ((move >> toShift) & squareMask);
piece =((move >> pieceShift) & pieceMask);
capture = ((move >> captureShift) & pieceMask));
type = ((move >> typeShift) & pieceMask));
First we shift the piece of the code we want to the beginning. Then we apply a 'mask' and 'and' with it, it could look like this like this to retrieve the to-value:
...10010110110001101011001011
Shift:
...00000001001011011000110101 (the to-values are now first)

And with the mask:

...00000001001011011000110101
AND ...00000000000000000001111111
= ...00000000000000000000110101
We now have the to-value.

It is worth it

This system can be very cryptic at first, and it is easy to get wrong. But on the other hand bitwise operations are extremely fast since the computer is working directly with the bits and do not have to convert them first.

Since we replace the Move-objects with a very fast scheme, the gains should be very good.

The time difference in creating the different types of moves is huge, the scheme I have now explained is almost 20 times faster than using an object.

And that does not include every little time gain we will get from not having to access the values through an object, but instead doing an extremely fast bitwise operation.

The 'downside'

Until now Mediocre has handled unmaking moves through the Move-object itself. The move stored information about the previous position and made unmaking possible.

Unfortunately we can not fit that information in a 32 bit integer, it would require about 49 bits total, slightly less with some optimization.

To fit that information we would have to use the type 'long' which has 64 bits. But firstly, handling the long type is much slower than handling integers, and secondly 64 bits takes twice as much memory which will become important when we start looking at optimizing and resizing the transposition table.

So we will have to find another way to store information about passed positions. There is an easy way to do this which I will explain later.

We have 7 bits to spare

As I showed above it takes 25 bits to represent a move, so we have 7 unused bits. I am thinking about using them to store the ordering value. 7 bits can represent a value between 0 and 127 so it should be enough, but I would have to tweak the move ordering values slightly to fit them (especially from killer moves and the history heuristic).

Work that does not change the engine

I will start making these changes right away. Large parts of the code has to be rewritten, especially the make and unmake-move methods, and also the move generation. I hope to make them more efficient as well.

Hopefully this will speed up Mediocre by a great deal.

Feb 2, 2007

[New Version] v0.21b - Contempt factor and better interface support

Changes:
  • Added support for 'infinite' search in uci (not applicable in winboard)
  • Added support for fixed time per move search for both uci and winboard
  • Added support for fixed depth search for both uci and winboard
  • Added support for the '?' (winboard) and 'stop' (uci) move-now commands, the 'move now' command in certain interfaces now makes Mediocre move immediately
  • Made null-moves only possible if the game is not in a pawn ending
  • Introduced contempt factor to draw values (dependant on game phase)
  • Added a game phase check that decides if the game is in the opening, middle game, ending or pawn ending
  • Added final INFINITY variable instead of using 200000 in the code
  • Cleaned up the line input, currently it can only set up a board with FEN and do perft/divide, might be enhanced again in later releases
  • Increased the number of nodes before time or interrupt command are checked, hopefully this should take care of the bug where Mediocre stops searching
Note: I have commented out a few lines in the alpha-beta method that enables in-check extension. It should be working ok, but I want to do some further testing before releasing it.

mediocre_v0.21b

Feb 1, 2007

[Other] Running into the wall

Up until now I have not considered technical issues when writing the code. Things like allocating a new object takes alot of time.

I started with a rather slow code in general, with many objects getting stored and moving back and forth, and every change I made so far has tried to speed up the code with things like move ordering and reduction of the search tree (transposition tables, killer moves, null-move pruning etc. etc.).

Now when it is time to add things like move extensions, pawn structure evaluation and attack tables, we are starting to slow down the search again.

So before doing that I think we need to speed up the core of the engine.

When doing a 'perft 4' (finding all positions after 4 plies from start position), Mediocre allocates 207064 Move-objects. These objects also have to be collected by the Java garbage collector when they are not used anymore. If we were to use an integer to represent those moves instead we would speed up this process tremendously.

Compared to other engines I have tried, the 'perft' command is about 20-30 times slower in Mediocre on average. Perft only uses generateMoves, makeMove and unmakeMove so it is easy to conclude there are some massive time saving to be found there.

I think it is about time we do something about this. I will finish up with the new changes to the evaluation, so castling and better piece development is rewarded, and also fix a few issues with time management and support for analyzing in winboard and uci. And release v0.21b.

Then start working on optimizing the code, removing allocation of unnescessary Move-objects and overall tweak the performance. This is where real gains can be found at the moment I believe.