
Thread: EGTB compression

  1. #1
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22

    EGTB compression

    Hi everybody :)

    For several years I've been developing a checkers program as a hobby. I also created an 'End Game Table Base' (EGTB) that
    holds the theoretical value (win/loss/draw) of every possible position with 8 or fewer pieces on the board.
    I use 2 bits to represent a win/draw/loss value, so one byte holds 4 positions.
    This 8-piece database is 110GB uncompressed, and I managed to compress it to 4GB (plus 2 index files) with a simple form of RLE.
    But to achieve a high compression ratio I had to replace the values of positions where either side has a capture with the dominant value of the given slice (subdatabase) of the database.
    During gameplay my checkers engine skips probing these capture positions, since they no longer hold the correct win/draw/loss values, and simply searches past them.
    The database is divided into subdatabases according to the number and type of pieces on the board, and these replaced values, which help compression, make up anywhere from 50% to more than 90% of the positions in a subdatabase; that's why RLE works reasonably well.
    Probing speed is also of paramount importance: checkers and chess programs visit the nodes of the search tree at rates of millions of nodes per second, and if that speed comes down the playing strength drops severely.
    So blindingly fast decompression is a must.
    This 8-man database fits comfortably in an 8GB computer, but adding a single piece makes the database leap to 1,187GB uncompressed; that's over a terabyte of data.
    No run-length encoding scheme will compress that enough to fit the 9-piece db in less than 24GB (which is my PC's available RAM).
    I estimate the 9-piece database would be around 40GB compressed, plus a couple of GB for the index files.
    Mind you, I'm no compression guru, not even a 'newborn' data compression programmer.
    RLE was as far as I could go.
    This is why I'm asking for your help. I wonder if you could help me find a compression scheme that would cut the database roughly in half:
    the 8-piece db to around 2.5GB and the 9-man db to less than 24GB.
    I already found a paper on this subject; the problem is that I'm ignorant about data compression.
    The paper is:
    http://www.ru.is/~yngvi/pdf/SchaefferBBLLS03.pdf
    Pages 10-12 explain the compression algorithm in general terms.
    Sadly, I don't even understand what Huffman coding is, much less the entire algorithm.
    I wonder if you guys could explain the details of this algorithm to me as if I were five.
    Or maybe you have better and more advanced ideas to solve this compression problem.
    One specific thing: the database never changes, so maybe the decompressor can be tuned to it in ways I don't know about; maybe fixed-dictionary decompressors could help.
    I would really appreciate the help of the gurus of this forum.
    Remember, compression time is of no importance, but decompression time is of extreme importance.
    In my RLE-compressed 8-piece db I even use a block size of 512 bytes (2048 2-bit positions), which in some cases compresses to just a few bytes.
    The db is intended to be loaded into RAM for the speediest access possible.
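
    A minimal C sketch of the 2-bit packing described above. Only the four-positions-per-byte layout comes from the post; the particular value assignments are an assumption for illustration:

    Code:
    #include <stdint.h>

    enum { WDL_UNKNOWN = 0, WDL_WIN = 1, WDL_LOSS = 2, WDL_DRAW = 3 };  /* assumed encoding */

    /* db[] holds 4 positions per byte; index selects one 2-bit slot */
    static inline void set_wdl(uint8_t *db, uint64_t index, int value) {
        int shift = (int)(index & 3) * 2;
        db[index >> 2] = (uint8_t)((db[index >> 2] & ~(3 << shift)) | (value << shift));
    }

    static inline int get_wdl(const uint8_t *db, uint64_t index) {
        return (db[index >> 2] >> ((index & 3) * 2)) & 3;
    }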

    best regards
    Alvaro

  2. #2
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22
    I'm including 2 sample files; they are 2 of the many that compose the 110GB database, and they are in uncompressed form so you can get an idea of the data:
    https://drive.google.com/file/d/0Bya...ew?usp=sharing

    Alvaro

  3. #3
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    I'm not sure I fully understand the problem. Have you considered trying to keep the database smaller?

    Since there are only 2^32 possible combinations of positions for your pieces (no matter how many you have), it seems like it might be possible to use a database based on the 2^32 possible combinations for a total size of 2^30 bytes (1GB) instead of using a database based on 32 possibilities for each piece or 32^N for N=8, N=9, etc. (since that's what gives 110GB for N=8). The problem with using 32^N is that you get a lot of entries that mean the same thing. It doesn't matter whether piece #1 is at position 8 and piece #2 is at position 30 or piece #1 at 30 and piece #2 at 8. If you are satisfied with a 1GB database, then you don't even need to worry about compressing it, so that will give good speed.
    Last edited by Kennon Conrad; 9th November 2014 at 03:14.

  4. #4
    Member · Join Date: Feb 2013 · Location: San Diego · Posts: 1,057
    Quote Originally Posted by Kennon Conrad View Post
    I'm not sure I fully understand the problem. Have you considered trying to keep the database smaller?

    Since there are only 2^32 possible combinations of positions for your pieces (no matter how many you have), it seems like it might be possible to use a database based on the 2^32 possible combinations for a total size of 2^30 bytes (1GB) instead of using a database based on 32 possibilities for each piece or 32^N for N=8, N=9, etc. (since that's what gives 110GB for N=8). The problem with using 32^N is that you get a lot of entries that mean the same thing. It doesn't matter whether piece #1 is at position 8 and piece #2 is at position 30 or piece #1 at 30 and piece #2 at 8. If you are satisfied with a 1GB database, then you don't even need to worry about compressing it, so that will give good speed.
    Bear with me, I haven't played in a while. There are 32 squares that can hold pieces, right? If there were exactly 8 pieces on the board, each player had at least 1 piece on the board, and we simplify a bit by ignoring kings, then (if I got this right) the number of possibilities would be
    Code:
    (32 choose 8) * (2^8 - 2)

    * First choose which 8 squares on the board are occupied; then, for each occupied square, pick which player owns the piece (and subtract the cases in which one or the other player has zero pieces).

    Following along similar lines, you could find an expression for each possible number of occupied squares and sum them, forming one big expression. I haven't actually attempted this, but AFAIK it would work. There could be shortcuts that give shorter combinatorial formulas.

    Once you had counted all the possibilities, you could in principle come up with a scheme by which you could rank them (map a board configuration to an integer) and unrank them (map an integer back to a board configuration), and that way you could ensure that no bits at all are ever wasted -- it would be theoretically optimal (for configurations chosen randomly).

    But being perfect wrt storage might complicate and/or slow things down and, unless there happens to be a clever way to do it, you'd usually seek a good compromise for the sake of simplicity and execution speed.
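
    For a quick sanity check of the counting above, a small C program (illustrative only):

    Code:
    #include <stdio.h>
    #include <stdint.h>

    /* n choose k; the values used here are small enough that uint64_t never overflows */
    static uint64_t choose(int n, int k) {
        uint64_t r = 1;
        for (int i = 1; i <= k; i++)
            r = r * (uint64_t)(n - k + i) / (uint64_t)i;
        return r;
    }

    int main(void) {
        /* boards with exactly 8 pieces, kings ignored, both players owning at least one piece */
        uint64_t boards = choose(32, 8) * ((1u << 8) - 2);
        printf("%llu\n", (unsigned long long)boards);   /* prints 2671648200 */
        return 0;
    }
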
    Last edited by nburns; 9th November 2014 at 05:40.

  5. #5
    biject.bwts (Member) · Join Date: Jun 2008 · Location: Texas · Posts: 448
    In many ways the problem of checkers is no longer a mystery. It was completely solved years ago: the perfect game is a draw, nobody wins.

    I'm saying this from the perspective that you never want to lose. So instead of using three states for each position you really only need two, LOSS or NO-LOSS, so only one bit is needed. Now, if you want to play for the win against a bad player, you only need to look a few moves ahead, of which there are not many, and possibly pick a path where the opponent has a chance of going astray.

    If you're playing against the best machine in the world, you will never have a LOSS, and the god-like machine will never beat you. Just a thought.

  6. #6
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    I was sleepy and didn't understand the problem so my first post is worthless. The pieces can be for either player....

    The 110 GB sounds about right. You would have to add up all the combinations for the black/red distributions for 1/7 to 7/1, using something like 32*31*30*29*28*27*26*25/(4! * 4!) for 4 black and 4 red pieces (! = factorial).

    In terms of compression, I would still start by looking at the database. The more the addressing scheme (data order) correlates with the win/loss/draw result, the easier it will be for the compressor to compress the database. I think it would help to compress the subsets for each number of pieces per side separately, and to try to organize the subsets by favorable vs. unfavorable positions if a simple heuristic is available.

    For compression, you could try distance coding the wins and losses, or even just the losses if you count draws as wins. I have no idea if that would be better than RLE, or exactly what your RLE scheme is. Compression may not be ideal without some preprocessing, because most compressors are not oriented toward 2-bit data values.
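
    A tiny sketch of the distance-coding idea (counting draws as wins and storing only the gaps between losses). The one-symbol-per-byte layout is just for illustration:

    Code:
    #include <stdint.h>
    #include <stddef.h>

    enum { WIN = 0, DRAW = 1, LOSS = 2 };   /* illustrative values only */

    /* Emit the gap (number of non-loss symbols) before each loss.
       The caller also needs the total count n to reconstruct trailing non-losses. */
    size_t encode_loss_gaps(const uint8_t *wdl, size_t n, uint32_t *gaps) {
        size_t ngaps = 0, last = 0;
        for (size_t i = 0; i < n; i++) {
            if (wdl[i] == LOSS) {
                gaps[ngaps++] = (uint32_t)(i - last);
                last = i + 1;
            }
        }
        return ngaps;
    }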

  7. #7
    Matt Mahoney (Expert) · Join Date: May 2008 · Location: Melbourne, Florida, USA · Posts: 3,268
    Quote Originally Posted by Kennon Conrad View Post
    I was sleepy and didn't understand the problem so my first post is worthless. The pieces can be for either player....

    The 110 GB sounds about right. You would have to add up all the combinations for the black/red distributions for 1/7 to 7/1, using something like 32*31*30*29*28*27*26*25/(4! * 4!) for 4 black and 4 red pieces (! = factorial).
    I get 2,671,648,200, which ought to be a lot smaller than 110 GB. Anyway for fast decompression you probably want to use LZ77. Huffman codes are explained in http://mattmahoney.net/dc/dce.html#Section_31 but you might find it faster to use fixed length codes that only approximate a Huffman code.

    Edit: interesting article on solving checkers in 2007 and proving that the game ends in a draw. There are 5 x 10^20 possible positions for all pieces. http://spectrum.ieee.org/computing/s...heckers-solved
    Last edited by Matt Mahoney; 9th November 2014 at 20:49.

  8. #8
    Member · Join Date: Feb 2013 · Location: San Diego · Posts: 1,057
    Quote Originally Posted by Kennon Conrad View Post
    I was sleepy and didn't understand the problem so my first post is worthless. The pieces can be for either player....

    The 110 GB sounds about right. You would have to add up all the combinations for the black/red distributions for 1/7 to 7/1, using something like 32*31*30*29*28*27*26*25/(4! * 4!) for 4 black and 4 red pieces (! = factorial).
    n choose r = n!/(r! * (n-r)!)

    32 choose 8 = 32!/(8! * 24!) = 32*31*30*29*28*27*26*25/(8!) ...

    So your formula bears some definite resemblance to mine, but I can't quite follow your thinking all the way through (granted, I could have tried harder, but you didn't explain it too well).

  9. #9
    Member · Join Date: Feb 2013 · Location: San Diego · Posts: 1,057
    Quote Originally Posted by biject.bwts View Post
    In many ways the problem of checkers is no longer a mystery. It was completely solved years ago: the perfect game is a draw, nobody wins.
    Hence why I don't bother playing.

  10. #10
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    Quote Originally Posted by Matt Mahoney View Post
    I get 2,671,648,200, which ought to be a lot smaller than 110 GB. Anyway for fast decompression you probably want to use LZ77. Huffman codes are explained in http://mattmahoney.net/dc/dce.html#Section_31 but you might find it faster to use fixed length codes that only approximate a Huffman code.

    Edit: interesting article on solving checkers in 2007 and proving that the game ends in a draw. There are 5 x 10^20 possible positions for all pieces. http://spectrum.ieee.org/computing/s...heckers-solved
    Yes, you are right. That's what I get for not following through on the math. 736,281,000 for the 4 black/4 red combinations, 589,024,800 for each of the 5/3 combinations, 294,512,400 for each 6/2 combination, and 84,146,400 for each 7/1 combination. 2,671,648,200 combinations would only be about 668 million bytes, so the reason for 110 GB for the described scheme is not clear. It seems like the unbalanced combinations should be very compressible. I imagine the 7/1 combinations are all wins for the color with 7 pieces.

    The 5 x 10^20 "situations" the paper mentions must include more information than just the black/red/vacant status of each square. There are only 3^32 combinations with unlimited black and red pieces, and that's smaller than 5 x 10^20. I imagine the "situations" the paper describes include whose turn it is and which pieces are kings.

    Edit: Here's a link to the "Checkers Is Solved" paper: http://disi.unitn.it/~montreso/asd/docs/checkers.pdf. It shows the number of "positions" for each number of pieces and the 5 x 10^20 total. The sum for 1 - 8 pieces is about 440 billion, which would be 110 GB if each consumed 2 bits. They are clearly including the king status, but I don't trust their numbers. The paper shows 120 positions for 1 piece. There are two possibilities for color, 32 possible positions for kings, and 24 possible positions for non-kings, giving a total of 112 positions for 1 piece.
    Last edited by Kennon Conrad; 9th November 2014 at 22:12.

  11. #11
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    Quote Originally Posted by nburns View Post
    n choose r = n!/(r! * (n-r)!)

    32 choose 8 = 32!/(8! * 24!) = 32*31*30*29*28*27*26*25/(8!) ...

    So your formula has some definite resemblance to mine, but I can't seem to quite follow your thinking all the way through (granted, I could have tried harder, but you didn't explain too well).
    I think you are calculating the number of possibilities for 8 pieces of a single color to fit on a board. The formula I gave was for 4 black and 4 red pieces, or (32 choose 4) * (28 choose 4).

  12. #12
    Member · Join Date: Feb 2013 · Location: San Diego · Posts: 1,057
    Quote Originally Posted by Matt Mahoney View Post
    I get 2,671,648,200, which ought to be a lot smaller than 110 GB. Anyway for fast decompression you probably want to use LZ77. Huffman codes are explained in http://mattmahoney.net/dc/dce.html#Section_31 but you might find it faster to use fixed length codes that only approximate a Huffman code.

    Edit: interesting article on solving checkers in 2007 and proving that the game ends in a draw. There are 5 x 10^20 possible positions for all pieces. http://spectrum.ieee.org/computing/s...heckers-solved
    Recreating formulas and solving from first principles and gameplay rules -> smart

    Looking up already-solved answer and citing paper -> smarter

    To frame this discussion a little better: There are two kinds of compression in play here. I'm not sure how formal I can make this, but one you could classify loosely as coding and the other you could describe as entropy-based compression. The coding part assumes that all board configurations have equal weight and the goal is simply to design a code that assigns each one a short code-word. The entropy-based compression part might use, for example, standard compression algorithms like LZ77 or Huffman, and from a formal, technical standpoint what you're doing is trying to give shorter representations to board configurations that you consider likely and longer representations to ones you consider unlikely.

    Given your background, some of what I wrote might be hard to make sense of, or to see how it applies to this situation. E.g., why talk about board configurations having different probabilities? Aren't you listing them all anyway? Well, yeah (IIRC). The short explanation is that entropy-based algorithms talk about probabilities because they weren't designed for this situation. But they are still a possibility here and you could try them.

  13. #13
    Member · Join Date: Feb 2013 · Location: San Diego · Posts: 1,057
    Quote Originally Posted by Kennon Conrad View Post
    I think you are calculating the number of possiblities for 8 pieces of a single color to fit on a board. The formula I gave was for 4 black and 4 red pieces, or (32 choose 4) * (28 choose 4).
    32 choose 8 is the number of ways to place 8 pieces of one color on a board, yes. My original post has expressions for counting things that are more interesting. I was trying to find the intersection between your math and mine.

    You're right -- what you gave is the number of boards with exactly 4 red and 4 black pieces.

    In my post I was trying to count the number of boards with exactly 8 pieces, with both red and black having at least one piece.

  14. #14
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22
    Thanks all for your interest.
    biject.bwts:
    The problem is that checkers has several variants; mine is the Spanish/Portuguese variant, in which the kings are 'flying' kings that can move many squares.
    So this variant isn't solved, and I won't even attempt such an enormous undertaking.
    The combinatorial math is all in place and the sizes of the files are correct. Also, you have to account for the fact that for the same position you need two win/draw/loss results, one for each side to move (red and black), which means two files are needed for the same set of pieces on the board (subdatabase).
    In my separate EGTB generator I have 2 functions:

    Code:
    void index2position(uint64_t index, position *p);  /* transforms a number into a board position */
    uint64_t position2index(position *p);              /* takes a board and gives the unique index
                                                          number where I store the w/d/l value */

    In this case the board is represented by 4 32-bit integers, each bit corresponding to squares 1-32 of the board:

    Code:
    struct position {
        uint32_t bk;
        uint32_t wk;
        uint32_t bm;
        uint32_t wm;
    };

    So I only need 2 bits to store the game-theoretical result of a position, which is very compact.
    (I have also generated another kind of database which takes one byte per position; instead of storing win/draw/loss information it stores the number of moves of perfect play from the position on the board to the final winning/losing position (or a draw), so I can have perfect playing lines of up to 254 moves. These are called DTM EGTBs (Distance To Win) and are used in chess. The storage is 4 times that of the WDL databases, so my 6-piece DTM dbs are around 2.5GB uncompressed. But that is another story, and since these databases are highly incompressible I won't invest much time in them.)

    The uncompressed sizes of my EGTBs match the work of others in this field. The sizes are the same for every 8x8 board game with 24 starting pieces, independent of the rules; what matters is piece placement. So my uncompressed sizes match the ones for American checkers that Jonathan Schaeffer and Martin Fierz computed independently.
    The two files I provided are only 2 of the files of the 8-piece database. I name the files according to the pieces on the board.
    (BTW, I append a CRC32 for every 1024 bytes at the end of each subdatabase file, for resistance to hardware failure during the weeks/months it takes to generate the database.)
    The format is Nbk Nwk Nbm Nwm:
    Nbk = number of black kings
    Nwk = number of white kings
    Nbm = number of black men
    Nwm = number of white men
    So 2222.db is for 2bk+2bm versus 2wk+2wm,
    and 4400.db is for 4 black kings versus 4 white kings.
    Just adding one piece increases the size of the db exponentially; here are the approximate sizes for different piece counts:
    4-piece db: 1.5MB
    5-piece db: 35MB
    6-piece db: 650MB
    7-piece db: 9.3GB
    8-piece db: 113GB
    9-piece db: 1,189GB
    10-piece db: 10,764GB

    The RLE compression algorithm I use already compresses the 110GB to a bit more than 4GB, but it won't be enough for the 9-piece database, which is a bit more than 1TB in size.
    So I need a new compression algorithm.

    regards,
    Alvaro
    Last edited by Alvaro; 9th November 2014 at 23:42.

  15. #15
    Skymmer (Member) · Join Date: Mar 2009 · Location: Russia · Posts: 707
    Hi! A very interesting task, I must say, so I started experimenting with different compression schemes. Currently, for example, I can compress 2222.db to 103 MB and it can be decompressed at a 600 MB/s rate, but it requires 10 GB of memory.
    Or I can compress it to 82 MB with decompression running at 1029 MB/s and even lower memory, but it takes a lot of time to compress. Actually I have a 67 MB result, but it's kind of unacceptable due to slow decompression.
    So could you please describe the technical requirements: possible transfer caps, size targets, etc.?
    Thanks.

  16. #16
    Matt Mahoney (Expert) · Join Date: May 2008 · Location: Melbourne, Florida, USA · Posts: 3,268
    I forgot to count kings in my last calculation. The total number of positions for 8 pieces should be (32 choose 8) * (2^8 - 2) * (2^8) = 683,941,939,200. The first term counts the number of occupied positions. The second term marks each as red or black and excludes all the same color. The third term marks each piece as a king or not. That would be 170 GB at 2 bits each, but I guess not all of the board positions are reachable. How do you exclude the unreachable positions? Or is my math wrong again?
    Last edited by Matt Mahoney; 10th November 2014 at 02:48.

  17. #17
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    Quote Originally Posted by Matt Mahoney View Post
    I forgot to count kings in my last calculation. The total number of positions for 8 pieces should be (32 choose 8) * (2^8 - 2) * (2^8) = 683,941,939,200. The first term counts the number of occupied positions. The second term marks each as red or black and excludes all the same color. The third term marks each piece as a king or not. That would be 170 GB at 2 bits each, but I guess not all of the board positions are reachable. How do you exclude the unreachable positions? Or is my math wrong again?
    Your math is fine, but we haven't come up with the right algorithm yet. I think the main thing we are still missing is that each player's pieces on the row furthest from that player must be kings. It seems like this is probably where the 120 one-piece combinations come from, since (32 + 28) * 2 = 120.

    With 8 pieces, an average of 1 piece would be on the back row, so the number of combinations would be half as much compared to our (flawed) baseline if the distribution of kings was even among the combinations and we only looked at the 8 piece combinations. But since the distribution of kings is not even and some combinations have fewer than 8 pieces, the impact from adding this restriction is somewhat less than a 50% reduction in combinations. It does appear to be in the ballpark of explaining the size difference between 170 GB and the 110 GB database.

    Alvaro, I am now imagining you compress each database separately, which seems like a good way to go. It seems like a (custom) adaptive 2 bit-centric entropy encoder would give the best results, but maybe that would be too slow. Other than that, I still wonder if you can put the data in a more compressible order. Perhaps writing the data in ranked order would help, where the rank represents the average distance of a player's pieces from the player. I think the more you can get the W's and L's into groups, the more compressible your data will be.
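
    As a sketch of the reordering idea, a key like the following could be computed per position and used to group positions before compression. The square-to-row mapping and orientation here are assumptions, not Alvaro's actual layout:

    Code:
    #include <stdint.h>

    /* Same layout as the struct position Alvaro posts above; squares 1-32 assumed on bits 0-31. */
    struct position { uint32_t bk, wk, bm, wm; };

    /* Rough "advancement" key: black men score their row, white men the mirrored row
       (4 squares per row assumed). Positions with similar keys tend to have correlated
       w/d/l values, so sorting or grouping a subdatabase by such a key could lengthen
       the runs the compressor sees. */
    static int advancement_key(const struct position *p) {
        int key = 0;
        for (int sq = 0; sq < 32; sq++) {
            int row = sq / 4;
            if (p->bm & (1u << sq)) key += row;
            if (p->wm & (1u << sq)) key += 7 - row;
        }
        return key;
    }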

  18. #18
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22
    Many thanks for your replies.
    Skymmer, I just got up during the night; at the end of the day, after work, I'll post my RLE-compressed sizes for the 2 subdbs I uploaded.
    Matt, please don't forget that men don't occupy all 32 squares; they can only occupy 28 squares.
    Only kings can occupy any of the 32 squares. Also, I don't use the original perfect gapless indexing scheme of Jonathan Schaeffer (I lost that paper); instead I follow Martin Fierz's indexing scheme, which is faster but has gaps, in which white men and black men positions can overlap. That overlap is detectable during EGTB generation and those positions are skipped as illegal. This gives slightly more positions than necessary, but since their w/d/l values are set to the dominant win/draw/loss value of the given subdatabase, there is very little size overhead after compression. The reason for doing it this way is a faster position2index() calculation during real-time gameplay.
    Here's my subdatabase size calculation function:

    Code:
    dbint getdatabasesize_p(int Nbk, int Nwk, int Nbm, int Nwm) {
        // needs binomial coefficients in the array bicoef[][] = choose from n, k
        dbint  dbsize = 1;
    
        // number of bm configurations: choose squares for the Nbm black men
        // from the 28 squares a man can occupy
    
        if(Nbm) dbsize *= bicoef[28][Nbm];
     
        if(Nwm) dbsize *= bicoef[28][Nwm];
    
        // number of bk configurations
        if(Nbk) dbsize *= bicoef[32-Nbm-Nwm][Nbk];
    
        // number of wk configurations
        if(Nwk) dbsize *= bicoef[32-Nbm-Nwm-Nbk][Nwk];
    
        return dbsize;
    }
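
    The bicoef[][] factors above are the sizes produced by a standard combinatorial ("colex") ranking of each group of identical pieces. This is not Alvaro's or Fierz's actual indexing code, just a minimal sketch of how such a rank and its inverse can be computed:

    Code:
    #include <stdint.h>

    static uint64_t bicoef[33][33];          /* bicoef[n][k] = n choose k */

    static void init_bicoef(void) {
        for (int n = 0; n <= 32; n++) {
            bicoef[n][0] = 1;
            for (int k = 1; k <= n; k++)
                bicoef[n][k] = bicoef[n - 1][k - 1] + (k <= n - 1 ? bicoef[n - 1][k] : 0);
        }
    }

    /* Rank k occupied squares (strictly increasing square numbers 0..m-1)
       into a single index in [0, bicoef[m][k]). */
    static uint64_t rank_squares(const int *sq, int k) {
        uint64_t r = 0;
        for (int i = 0; i < k; i++)
            r += bicoef[sq[i]][i + 1];
        return r;
    }

    /* Inverse: recover the k squares from the index. */
    static void unrank_squares(uint64_t r, int m, int k, int *sq) {
        for (int i = k; i >= 1; i--) {
            int s = m - 1;
            while (bicoef[s][i] > r) s--;    /* largest s with C(s,i) <= r */
            sq[i - 1] = s;
            r -= bicoef[s][i];
        }
    }

    For example, 8 kings on an otherwise empty board map to indices 0 through bicoef[32][8]-1 = 10,518,299.
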
    Also, here's a list of all subdatabase sizes for 8 or fewer pieces:

    Computing 1100 Size = 992 Size/4 = 248
    Computing 1001 and it's mirror 0110 Size = 868 x 2 Size/4 = 217 x 2
    Computing 0011 Size = 784 Size/4 = 196
    Computing 2100 and it's mirror 1200 Size = 14,880 x 2 Size/4 = 3,720 x 2
    Computing 2001 and it's mirror 0210 Size = 13,020 x 2 Size/4 = 3,255 x 2
    Computing 1110 and it's mirror 1101 Size = 26,040 x 2 Size/4 = 6,510 x 2
    Computing 1011 and it's mirror 0111 Size = 23,520 x 2 Size/4 = 5,880 x 2
    Computing 0120 and it's mirror 1002 Size = 11,340 x 2 Size/4 = 2,835 x 2
    Computing 0021 and it's mirror 0012 Size = 10,584 x 2 Size/4 = 2,646 x 2
    Computing 3100 and it's mirror 1300 Size = 143,840 x 2 Size/4 = 35,960 x 2
    Computing 3001 and it's mirror 0310 Size = 125,860 x 2 Size/4 = 31,465 x 2
    Computing 2110 and it's mirror 1201 Size = 377,580 x 2 Size/4 = 94,395 x 2
    Computing 2011 and it's mirror 0211 Size = 341,040 x 2 Size/4 = 85,260 x 2
    Computing 1120 and it's mirror 1102 Size = 328,860 x 2 Size/4 = 82,215 x 2
    Computing 1021 and it's mirror 0112 Size = 306,936 x 2 Size/4 = 76,734 x 2
    Computing 0130 and it's mirror 1003 Size = 95,004 x 2 Size/4 = 23,751 x 2
    Computing 0031 and it's mirror 0013 Size = 91,728 x 2 Size/4 = 22,932 x 2
    Computing 2200 Size = 215,760 Size/4 = 53,940
    Computing 2101 and it's mirror 1210 Size = 377,580 x 2 Size/4 = 94,395 x 2
    Computing 2002 and it's mirror 0220 Size = 164,430 x 2 Size/4 = 41,107 x 2
    Computing 1111 Size = 682,080 Size/4 = 170,520
    Computing 1012 and it's mirror 0121 Size = 306,936 x 2 Size/4 = 76,734 x 2
    Computing 0022 Size = 142,884 Size/4 = 35,721
    Computing 4100 and it's mirror 1400 Size = 1,006,880 x 2 Size/4 = 251,720 x 2
    Computing 4001 and it's mirror 0410 Size = 881,020 x 2 Size/4 = 220,255 x 2
    Computing 3110 and it's mirror 1301 Size = 3,524,080 x 2 Size/4 = 881,020 x 2
    Computing 3011 and it's mirror 0311 Size = 3,183,040 x 2 Size/4 = 795,760 x 2
    Computing 2120 and it's mirror 1202 Size = 4,604,040 x 2 Size/4 = 1,151,010 x 2
    Computing 2021 and it's mirror 0212 Size = 4,297,104 x 2 Size/4 = 1,074,276 x 2
    Computing 1130 and it's mirror 1103 Size = 2,660,112 x 2 Size/4 = 665,028 x 2
    Computing 1031 and it's mirror 0113 Size = 2,568,384 x 2 Size/4 = 642,096 x 2
    Computing 0140 and it's mirror 1004 Size = 573,300 x 2 Size/4 = 143,325 x 2
    Computing 0041 and it's mirror 0014 Size = 573,300 x 2 Size/4 = 143,325 x 2
    Computing 3200 and it's mirror 2300 Size = 2,013,760 x 2 Size/4 = 503,440 x 2
    Computing 3101 and it's mirror 1310 Size = 3,524,080 x 2 Size/4 = 881,020 x 2
    Computing 3002 and it's mirror 0320 Size = 1,534,680 x 2 Size/4 = 383,670 x 2
    Computing 2210 and it's mirror 2201 Size = 5,286,120 x 2 Size/4 = 1,321,530 x 2
    Computing 2111 and it's mirror 1211 Size = 9,549,120 x 2 Size/4 = 2,387,280 x 2
    Computing 2012 and it's mirror 0221 Size = 4,297,104 x 2 Size/4 = 1,074,276 x 2
    Computing 1220 and it's mirror 2102 Size = 4,604,040 x 2 Size/4 = 1,151,010 x 2
    Computing 1121 and it's mirror 1112 Size = 8,594,208 x 2 Size/4 = 2,148,552 x 2
    Computing 1022 and it's mirror 0122 Size = 4,000,752 x 2 Size/4 = 1,000,188 x 2
    Computing 0230 and it's mirror 2003 Size = 1,330,056 x 2 Size/4 = 332,514 x 2
    Computing 0131 and it's mirror 1013 Size = 2,568,384 x 2 Size/4 = 642,096 x 2
    Computing 0032 and it's mirror 0023 Size = 1,238,328 x 2 Size/4 = 309,582 x 2
    Computing 5100 and it's mirror 1500 Size = 5,437,152 x 2 Size/4 = 1,359,288 x 2
    Computing 5001 and it's mirror 0510 Size = 4,757,508 x 2 Size/4 = 1,189,377 x 2
    Computing 4110 and it's mirror 1401 Size = 23,787,540 x 2 Size/4 = 5,946,885 x 2
    Computing 4011 and it's mirror 0411 Size = 21,485,520 x 2 Size/4 = 5,371,380 x 2
    Computing 3120 and it's mirror 1302 Size = 41,436,360 x 2 Size/4 = 10,359,090 x 2
    Computing 3021 and it's mirror 0312 Size = 38,673,936 x 2 Size/4 = 9,668,484 x 2
    Computing 2130 and it's mirror 1203 Size = 35,911,512 x 2 Size/4 = 8,977,878 x 2
    Computing 2031 and it's mirror 0213 Size = 34,673,184 x 2 Size/4 = 8,668,296 x 2
    Computing 1140 and it's mirror 1104 Size = 15,479,100 x 2 Size/4 = 3,869,775 x 2
    Computing 1041 and it's mirror 0114 Size = 15,479,100 x 2 Size/4 = 3,869,775 x 2
    Computing 0150 and it's mirror 1005 Size = 2,653,560 x 2 Size/4 = 663,390 x 2
    Computing 0051 and it's mirror 0015 Size = 2,751,840 x 2 Size/4 = 687,960 x 2
    Computing 4200 and it's mirror 2400 Size = 13,592,880 x 2 Size/4 = 3,398,220 x 2
    Computing 4101 and it's mirror 1410 Size = 23,787,540 x 2 Size/4 = 5,946,885 x 2
    Computing 4002 and it's mirror 0420 Size = 10,359,090 x 2 Size/4 = 2,589,772 x 2
    Computing 3210 and it's mirror 2301 Size = 47,575,080 x 2 Size/4 = 11,893,770 x 2
    Computing 3111 and it's mirror 1311 Size = 85,942,080 x 2 Size/4 = 21,485,520 x 2
    Computing 3012 and it's mirror 0321 Size = 38,673,936 x 2 Size/4 = 9,668,484 x 2
    Computing 2220 and it's mirror 2202 Size = 62,154,540 x 2 Size/4 = 15,538,635 x 2
    Computing 2121 and it's mirror 1212 Size = 116,021,808 x 2 Size/4 = 29,005,452 x 2
    Computing 2022 and it's mirror 0222 Size = 54,010,152 x 2 Size/4 = 13,502,538 x 2
    Computing 1230 and it's mirror 2103 Size = 35,911,512 x 2 Size/4 = 8,977,878 x 2
    Computing 1131 and it's mirror 1113 Size = 69,346,368 x 2 Size/4 = 17,336,592 x 2
    Computing 1032 and it's mirror 0123 Size = 33,434,856 x 2 Size/4 = 8,358,714 x 2
    Computing 0240 and it's mirror 2004 Size = 7,739,550 x 2 Size/4 = 1,934,887 x 2
    Computing 0141 and it's mirror 1014 Size = 15,479,100 x 2 Size/4 = 3,869,775 x 2
    Computing 0042 and it's mirror 0024 Size = 7,739,550 x 2 Size/4 = 1,934,887 x 2
    Computing 3300 Size = 18,123,840 Size/4 = 4,530,960
    Computing 3201 and it's mirror 2310 Size = 47,575,080 x 2 Size/4 = 11,893,770 x 2
    Computing 3102 and it's mirror 1320 Size = 41,436,360 x 2 Size/4 = 10,359,090 x 2
    Computing 3003 and it's mirror 0330 Size = 11,970,504 x 2 Size/4 = 2,992,626 x 2
    Computing 2211 Size = 128,913,120 Size/4 = 32,228,280
    Computing 2112 and it's mirror 1221 Size = 116,021,808 x 2 Size/4 = 29,005,452 x 2
    Computing 2013 and it's mirror 0231 Size = 34,673,184 x 2 Size/4 = 8,668,296 x 2
    Computing 1122 Size = 108,020,304 Size/4 = 27,005,076
    Computing 1023 and it's mirror 0132 Size = 33,434,856 x 2 Size/4 = 8,358,714 x 2
    Computing 0033 Size = 10,732,176 Size/4 = 2,683,044
    Computing 6100 and it's mirror 1600 Size = 23,560,992 x 2 Size/4 = 5,890,248 x 2
    Computing 6001 and it's mirror 0610 Size = 20,615,868 x 2 Size/4 = 5,153,967 x 2
    Computing 5110 and it's mirror 1501 Size = 123,695,208 x 2 Size/4 = 30,923,802 x 2
    Computing 5011 and it's mirror 0511 Size = 111,724,704 x 2 Size/4 = 27,931,176 x 2
    Computing 4120 and it's mirror 1402 Size = 269,336,340 x 2 Size/4 = 67,334,085 x 2
    Computing 4021 and it's mirror 0412 Size = 251,380,584 x 2 Size/4 = 62,845,146 x 2
    Computing 3130 and it's mirror 1303 Size = 311,233,104 x 2 Size/4 = 77,808,276 x 2
    Computing 3031 and it's mirror 0313 Size = 300,500,928 x 2 Size/4 = 75,125,232 x 2
    Computing 2140 and it's mirror 1204 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 2041 and it's mirror 0214 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 1150 and it's mirror 1105 Size = 68,992,560 x 2 Size/4 = 17,248,140 x 2
    Computing 1051 and it's mirror 0115 Size = 71,547,840 x 2 Size/4 = 17,886,960 x 2
    Computing 0160 and it's mirror 1006 Size = 9,795,240 x 2 Size/4 = 2,448,810 x 2
    Computing 0061 and it's mirror 0016 Size = 10,548,720 x 2 Size/4 = 2,637,180 x 2
    Computing 5200 and it's mirror 2500 Size = 70,682,976 x 2 Size/4 = 17,670,744 x 2
    Computing 5101 and it's mirror 1510 Size = 123,695,208 x 2 Size/4 = 30,923,802 x 2
    Computing 5002 and it's mirror 0520 Size = 53,867,268 x 2 Size/4 = 13,466,817 x 2
    Computing 4210 and it's mirror 2401 Size = 309,238,020 x 2 Size/4 = 77,309,505 x 2
    Computing 4111 and it's mirror 1411 Size = 558,623,520 x 2 Size/4 = 139,655,880 x 2
    Computing 4012 and it's mirror 0421 Size = 251,380,584 x 2 Size/4 = 62,845,146 x 2
    Computing 3220 and it's mirror 2302 Size = 538,672,680 x 2 Size/4 = 134,668,170 x 2
    Computing 3121 and it's mirror 1312 Size = 1,005,522,336 x 2 Size/4 = 251,380,584 x 2
    Computing 3022 and it's mirror 0322 Size = 468,087,984 x 2 Size/4 = 117,021,996 x 2
    Computing 2230 and it's mirror 2203 Size = 466,849,656 x 2 Size/4 = 116,712,414 x 2
    Computing 2131 and it's mirror 1213 Size = 901,502,784 x 2 Size/4 = 225,375,696 x 2
    Computing 2032 and it's mirror 0223 Size = 434,653,128 x 2 Size/4 = 108,663,282 x 2
    Computing 1240 and it's mirror 2104 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 1141 and it's mirror 1114 Size = 402,456,600 x 2 Size/4 = 100,614,150 x 2
    Computing 1042 and it's mirror 0124 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 0250 and it's mirror 2005 Size = 34,496,280 x 2 Size/4 = 8,624,070 x 2
    Computing 0151 and it's mirror 1015 Size = 71,547,840 x 2 Size/4 = 17,886,960 x 2
    Computing 0052 and it's mirror 0025 Size = 37,149,840 x 2 Size/4 = 9,287,460 x 2
    Computing 4300 and it's mirror 3400 Size = 117,804,960 x 2 Size/4 = 29,451,240 x 2
    Computing 4201 and it's mirror 2410 Size = 309,238,020 x 2 Size/4 = 77,309,505 x 2
    Computing 4102 and it's mirror 1420 Size = 269,336,340 x 2 Size/4 = 67,334,085 x 2
    Computing 4003 and it's mirror 0430 Size = 77,808,276 x 2 Size/4 = 19,452,069 x 2
    Computing 3310 and it's mirror 3301 Size = 412,317,360 x 2 Size/4 = 103,079,340 x 2
    Computing 3211 and it's mirror 2311 Size = 1,117,247,040 x 2 Size/4 = 279,311,760 x 2
    Computing 3112 and it's mirror 1321 Size = 1,005,522,336 x 2 Size/4 = 251,380,584 x 2
    Computing 3013 and it's mirror 0331 Size = 300,500,928 x 2 Size/4 = 75,125,232 x 2
    Computing 2320 and it's mirror 3202 Size = 538,672,680 x 2 Size/4 = 134,668,170 x 2
    Computing 2221 and it's mirror 2212 Size = 1,508,283,504 x 2 Size/4 = 377,070,876 x 2
    Computing 2122 and it's mirror 1222 Size = 1,404,263,952 x 2 Size/4 = 351,065,988 x 2
    Computing 2023 and it's mirror 0232 Size = 434,653,128 x 2 Size/4 = 108,663,282 x 2
    Computing 1330 and it's mirror 3103 Size = 311,233,104 x 2 Size/4 = 77,808,276 x 2
    Computing 1231 and it's mirror 2113 Size = 901,502,784 x 2 Size/4 = 225,375,696 x 2
    Computing 1132 and it's mirror 1123 Size = 869,306,256 x 2 Size/4 = 217,326,564 x 2
    Computing 1033 and it's mirror 0133 Size = 279,036,576 x 2 Size/4 = 69,759,144 x 2
    Computing 0340 and it's mirror 3004 Size = 67,076,100 x 2 Size/4 = 16,769,025 x 2
    Computing 0241 and it's mirror 2014 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 0142 and it's mirror 1024 Size = 201,228,300 x 2 Size/4 = 50,307,075 x 2
    Computing 0043 and it's mirror 0034 Size = 67,076,100 x 2 Size/4 = 16,769,025 x 2
    Computing 7100 and it's mirror 1700 Size = 84,146,400 x 2 Size/4 = 21,036,600 x 2
    Computing 7001 and it's mirror 0710 Size = 73,628,100 x 2 Size/4 = 18,407,025 x 2
    Computing 6110 and it's mirror 1601 Size = 515,396,700 x 2 Size/4 = 128,849,175 x 2
    Computing 6011 and it's mirror 0611 Size = 465,519,600 x 2 Size/4 = 116,379,900 x 2
    Computing 5120 and it's mirror 1502 Size = 1,346,681,700 x 2 Size/4 = 336,670,425 x 2
    Computing 5021 and it's mirror 0512 Size = 1,256,902,920 x 2 Size/4 = 314,225,730 x 2
    Computing 4130 and it's mirror 1403 Size = 1,945,206,900 x 2 Size/4 = 486,301,725 x 2
    Computing 4031 and it's mirror 0413 Size = 1,878,130,800 x 2 Size/4 = 469,532,700 x 2
    Computing 3140 and it's mirror 1304 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 3041 and it's mirror 0314 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 2150 and it's mirror 1205 Size = 862,407,000 x 2 Size/4 = 215,601,750 x 2
    Computing 2051 and it's mirror 0215 Size = 894,348,000 x 2 Size/4 = 223,587,000 x 2
    Computing 1160 and it's mirror 1106 Size = 244,881,000 x 2 Size/4 = 61,220,250 x 2
    Computing 1061 and it's mirror 0116 Size = 263,718,000 x 2 Size/4 = 65,929,500 x 2
    Computing 0170 and it's mirror 1007 Size = 29,601,000 x 2 Size/4 = 7,400,250 x 2
    Computing 0071 and it's mirror 0017 Size = 33,153,120 x 2 Size/4 = 8,288,280 x 2
    Computing 6200 and it's mirror 2600 Size = 294,512,400 x 2 Size/4 = 73,628,100 x 2
    Computing 6101 and it's mirror 1610 Size = 515,396,700 x 2 Size/4 = 128,849,175 x 2
    Computing 6002 and it's mirror 0620 Size = 224,446,950 x 2 Size/4 = 56,111,737 x 2
    Computing 5210 and it's mirror 2501 Size = 1,546,190,100 x 2 Size/4 = 386,547,525 x 2
    Computing 5111 and it's mirror 1511 Size = 2,793,117,600 x 2 Size/4 = 698,279,400 x 2
    Computing 5012 and it's mirror 0521 Size = 1,256,902,920 x 2 Size/4 = 314,225,730 x 2
    Computing 4220 and it's mirror 2402 Size = 3,366,704,250 x 2 Size/4 = 841,676,062 x 2
    Computing 4121 and it's mirror 1412 Size = 6,284,514,600 x 2 Size/4 = 1,571,128,650 x 2
    Computing 4022 and it's mirror 0422 Size = 2,925,549,900 x 2 Size/4 = 731,387,475 x 2
    Computing 3230 and it's mirror 2303 Size = 3,890,413,800 x 2 Size/4 = 972,603,450 x 2
    Computing 3131 and it's mirror 1313 Size = 7,512,523,200 x 2 Size/4 = 1,878,130,800 x 2
    Computing 3032 and it's mirror 0323 Size = 3,622,109,400 x 2 Size/4 = 905,527,350 x 2
    Computing 2240 and it's mirror 2204 Size = 2,515,353,750 x 2 Size/4 = 628,838,437 x 2
    Computing 2141 and it's mirror 1214 Size = 5,030,707,500 x 2 Size/4 = 1,257,676,875 x 2
    Computing 2042 and it's mirror 0224 Size = 2,515,353,750 x 2 Size/4 = 628,838,437 x 2
    Computing 1250 and it's mirror 2105 Size = 862,407,000 x 2 Size/4 = 215,601,750 x 2
    Computing 1151 and it's mirror 1115 Size = 1,788,696,000 x 2 Size/4 = 447,174,000 x 2
    Computing 1052 and it's mirror 0125 Size = 928,746,000 x 2 Size/4 = 232,186,500 x 2
    Computing 0260 and it's mirror 2006 Size = 122,440,500 x 2 Size/4 = 30,610,125 x 2
    Computing 0161 and it's mirror 1016 Size = 263,718,000 x 2 Size/4 = 65,929,500 x 2
    Computing 0062 and it's mirror 0026 Size = 142,407,720 x 2 Size/4 = 35,601,930 x 2
    Computing 5300 and it's mirror 3500 Size = 589,024,800 x 2 Size/4 = 147,256,200 x 2
    Computing 5201 and it's mirror 2510 Size = 1,546,190,100 x 2 Size/4 = 386,547,525 x 2
    Computing 5102 and it's mirror 1520 Size = 1,346,681,700 x 2 Size/4 = 336,670,425 x 2
    Computing 5003 and it's mirror 0530 Size = 389,041,380 x 2 Size/4 = 97,260,345 x 2
    Computing 4310 and it's mirror 3401 Size = 2,576,983,500 x 2 Size/4 = 644,245,875 x 2
    Computing 4211 and it's mirror 2411 Size = 6,982,794,000 x 2 Size/4 = 1,745,698,500 x 2
    Computing 4112 and it's mirror 1421 Size = 6,284,514,600 x 2 Size/4 = 1,571,128,650 x 2
    Computing 4013 and it's mirror 0431 Size = 1,878,130,800 x 2 Size/4 = 469,532,700 x 2
    Computing 3320 and it's mirror 3302 Size = 4,488,939,000 x 2 Size/4 = 1,122,234,750 x 2
    Computing 3221 and it's mirror 2312 Size = 12,569,029,200 x 2 Size/4 = 3,142,257,300 x 2
    Computing 3122 and it's mirror 1322 Size = 11,702,199,600 x 2 Size/4 = 2,925,549,900 x 2
    Computing 3023 and it's mirror 0332 Size = 3,622,109,400 x 2 Size/4 = 905,527,350 x 2
    Computing 2330 and it's mirror 3203 Size = 3,890,413,800 x 2 Size/4 = 972,603,450 x 2
    Computing 2231 and it's mirror 2213 Size = 11,268,784,800 x 2 Size/4 = 2,817,196,200 x 2
    Computing 2132 and it's mirror 1223 Size = 10,866,328,200 x 2 Size/4 = 2,716,582,050 x 2
    Computing 2033 and it's mirror 0233 Size = 3,487,957,200 x 2 Size/4 = 871,989,300 x 2
    Computing 1340 and it's mirror 3104 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 1241 and it's mirror 2114 Size = 5,030,707,500 x 2 Size/4 = 1,257,676,875 x 2
    Computing 1142 and it's mirror 1124 Size = 5,030,707,500 x 2 Size/4 = 1,257,676,875 x 2
    Computing 1043 and it's mirror 0134 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 0350 and it's mirror 3005 Size = 287,469,000 x 2 Size/4 = 71,867,250 x 2
    Computing 0251 and it's mirror 2015 Size = 894,348,000 x 2 Size/4 = 223,587,000 x 2
    Computing 0152 and it's mirror 1025 Size = 928,746,000 x 2 Size/4 = 232,186,500 x 2
    Computing 0053 and it's mirror 0035 Size = 321,965,280 x 2 Size/4 = 80,491,320 x 2
    Computing 4400 Size = 736,281,000 Size/4 = 184,070,250
    Computing 4301 and it's mirror 3410 Size = 2,576,983,500 x 2 Size/4 = 644,245,875 x 2
    Computing 4202 and it's mirror 2420 Size = 3,366,704,250 x 2 Size/4 = 841,676,062 x 2
    Computing 4103 and it's mirror 1430 Size = 1,945,206,900 x 2 Size/4 = 486,301,725 x 2
    Computing 4004 and it's mirror 0440 Size = 419,225,625 x 2 Size/4 = 104,806,406 x 2
    Computing 3311 Size = 9,310,392,000 Size/4 = 2,327,598,000
    Computing 3212 and it's mirror 2321 Size = 12,569,029,200 x 2 Size/4 = 3,142,257,300 x 2
    Computing 3113 and it's mirror 1331 Size = 7,512,523,200 x 2 Size/4 = 1,878,130,800 x 2
    Computing 3014 and it's mirror 0341 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 2222 Size = 17,553,299,400 Size/4 = 4,388,324,850
    Computing 2123 and it's mirror 1232 Size = 10,866,328,200 x 2 Size/4 = 2,716,582,050 x 2
    Computing 2024 and it's mirror 0242 Size = 2,515,353,750 x 2 Size/4 = 628,838,437 x 2
    Computing 1133 Size = 6,975,914,400 Size/4 = 1,743,978,600
    Computing 1034 and it's mirror 0143 Size = 1,676,902,500 x 2 Size/4 = 419,225,625 x 2
    Computing 0044 Size = 419,225,625 Size/4 = 104,806,406
    number_files=406
    total_database_size/4=121,733,463,284 (113GB)

    These are the subdatabases created during a week of computation; at the end I compress each of these 406 files and concatenate them into a single big file, plus 2 small index files.
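
    As a cross-check of this listing, the 406 files and the byte total can be reproduced by looping over all piece splits and summing getdatabasesize_p() from the post above (this sketch assumes bicoef[][] is already filled and that dbint is a 64-bit unsigned integer):

    Code:
    #include <stdio.h>
    #include <stdint.h>

    uint64_t getdatabasesize_p(int Nbk, int Nwk, int Nbm, int Nwm);  /* from the post above */

    int main(void) {
        uint64_t total = 0;
        int files = 0;
        for (int nbk = 0; nbk <= 7; nbk++)
            for (int nwk = 0; nwk <= 7; nwk++)
                for (int nbm = 0; nbm <= 7; nbm++)
                    for (int nwm = 0; nwm <= 7; nwm++) {
                        int black = nbk + nbm, white = nwk + nwm;
                        if (black < 1 || white < 1 || black + white > 8) continue;
                        total += getdatabasesize_p(nbk, nwk, nbm, nwm);
                        files++;
                    }
        /* should report 406 files and total/4 = 121,733,463,284 bytes, as listed above */
        printf("files=%d positions=%llu bytes=%llu\n", files,
               (unsigned long long)total, (unsigned long long)(total / 4));
        return 0;
    }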

    Anyway, this thread is drifting into a slightly different discussion.
    My main goal is to find a better compression scheme than RLE.

    best regards,
    Alvaro
    Last edited by Alvaro; 10th November 2014 at 07:26.

  19. #19
    Matt Mahoney (Expert) · Join Date: May 2008 · Location: Melbourne, Florida, USA · Posts: 3,268
    Yeah, I forgot you can't have men on the far row because they become kings there. Anyway, you could experiment with reordering the mapping of bits to board positions to see if that gives you better groupings of w/d/l. I imagine front-to-back ordering would be best, but maybe a zigzag order instead of a raster scan, or grouping center or side positions first. I guess only experiments could answer the question. Also, I would suggest LZ77, which reduces to RLE when the offset is 1.

  20. #20
    biject.bwts (Member) · Join Date: Jun 2008 · Location: Texas · Posts: 448
    I like the concept of what you are doing.
    I read a long time ago how the original checkers was solved. I could be wrong, but I think they first did something like what you are doing, building all the possible endings for when only a few pieces are left. Then they did a deep alpha-beta search with cutoffs from the start of the game down to where an ending is known; this essentially allowed a full search without actually having to waste time on every possible move.

    I still think you could do something similar by first assuming the first player wins. Build the endings with only WIN/LOSS for the player that moves first, calling a DRAW a LOSS for the first player (which is a win for the second), so only 1 bit is needed instead of 2. This would make the alpha-beta search and the endings much faster to code and run.

    At the end of this phase, if the first player WINS in the alpha-beta search, it's over. If the method ends in a LOSS, then you have to go to a second phase to see if the second player has the forced WIN or if it's a DRAW. Do the same as in the first phase, except now a DRAW counts as a LOSS for the second player. If at the end of the alpha-beta search the second player gets a WIN, it's over: the second player has the WIN. If it leads to a LOSS, then after the two phases you know that best play leads only to a DRAW. Just a thought.

  21. #21
    Skymmer (Member) · Join Date: Mar 2009 · Location: Russia · Posts: 707
    Found a scheme which compresses 2222.db to 61,931,010 bytes (4,388,324,850 -> 61,931,010).
    Both compression and decompression are multi-threaded and the rates are:
    Comp: 231 MB/s
    Decomp: 524 MB/s
    Peak memory usage is about 1.1 GB. Measured on an i7-2700 @ 4800MHz.
    No details at the moment, because I'm not sure it will work nicely for all chunks and there are some additional ideas I need to check. It would be very nice if you could provide some more large samples for testing.

  22. #22
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22
    Matt:
    Thanks. Could you please point me to some source code for LZ77?
    I need to identify and jump directly to a block in the compressed subdb, skipping all the data before it. Decompressing from the start of the subdb would turn my engine into a woodpusher. Is LZ77 good for that?
    Also, I'm not too inclined to change the mapping of the entries, since that would imply a new adaptable position2index() function, which is already too complex and costly to compute.

    biject.bwts:
    Thanks, your ideas are revolutionary, but I don't see how they would work in an alpha-beta framework.
    Although it is true that some have experimented with single-bit databases, which only tell Win/NotWin, this scheme is not a good one. Suppose the engine is in a surely losing line and desperately needs to find a draw, which such a database won't tell it. If the opponent makes a drawing move by mistake, the engine won't find the draw, because NotWin is either a loss or a draw, and it could head into a losing line again. Also, the node count of each search using single-bit databases is always higher, since they provide fewer cutoff/pruning opportunities, which translates to more search time and less playing strength.

    Skymmer:
    Thanks for the input. My RLE encodes 2222.db to 117MB and 4400.db to 1.04MB using a block size of 2048 bytes (uncompressed), and you are right, I must provide other samples.
    I'll try to find a less compressible subdatabase to give an idea of the worst case.
    Also, since probing speed is highly important, I split a subdatabase into blocks of 2048 bytes (uncompressed) which I compress independently; originally I even used a block size of 512 bytes to spend the least possible time decompressing.
    I have 2 index files that tell me where each compressed block is within a given compressed subdatabase, so I can avoid decompressing from the start of the subdb, which would kill the performance of my checkers engine.
    So I'm not sure how well dictionary compressors will do compressing such small blocks independently.
    Maybe it's possible to make a first pass over the entire subdb to 'train' the compressor and then, on a second pass, compress each block one at a time. I think these are called 'static dictionary' compressors, and they fit because the compressed data doesn't change and is very large.
    About multi-threading: I don't want a multi-threaded decompressor, because my engine is already multi-threaded, so each thread is already busy not only probing the database but doing other stuff as well. Of course the decompressor must be thread safe.

    BTW, my engine is already doing around 10 million nodes per second on a Core i7 930 using 4 threads. In positions near the 8-piece database the RLE decompression starts hurting a bit and the NPS drops to around 7-8 million nodes per second with 512-byte (uncompressed) blocks. I really wouldn't like to go much lower than that, because that would hurt playing strength.
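
    As an illustration of the block-index probing described above (not Alvaro's actual format: the block size constant, index layout and decompress_block() callback are assumptions):

    Code:
    #include <stdint.h>

    #define BLOCK_BYTES   2048
    #define POS_PER_BLOCK (BLOCK_BYTES * 4)          /* 4 two-bit values per byte */

    /* Decompress only the one block that holds pos_index, then pull out its 2-bit value. */
    int probe_wdl(const uint8_t *compressed_db,      /* whole compressed subdatabase in RAM   */
                  const uint64_t *block_offset,      /* per-block byte offsets from the index */
                  uint64_t pos_index,
                  void (*decompress_block)(const uint8_t *src, uint8_t *dst, int dst_bytes))
    {
        uint8_t block[BLOCK_BYTES];
        uint64_t blk  = pos_index / POS_PER_BLOCK;
        uint64_t rest = pos_index % POS_PER_BLOCK;

        decompress_block(compressed_db + block_offset[blk], block, BLOCK_BYTES);
        return (block[rest / 4] >> ((rest % 4) * 2)) & 3;
    }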

    many thanks to you all,
    Alvaro
    Last edited by Alvaro; 12th November 2014 at 13:59.

  23. #23
    Matt Mahoney (Expert) · Join Date: May 2008 · Location: Melbourne, Florida, USA · Posts: 3,268
    There are lots of open source LZ77 programs. zlib (deflate, zip, gzip) is probably the most commonly used. But I meant you should probably write your own because your data is w/d/l symbols instead of bytes. It's more important to understand how the algorithm works.
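
    To make that concrete, a minimal LZ77-style decoder sketch over w/d/l symbols. The token format here is invented for illustration (it is not zlib's), and one symbol per byte is used for clarity; a real implementation would work on the packed 2-bit stream:

    Code:
    #include <stdint.h>
    #include <stddef.h>

    /* Token: [u16 literal_count][literal symbols][u16 match_len][u16 match_dist].
       With match_dist == 1 a match just repeats the previous symbol, so this
       format subsumes RLE. */
    static uint16_t read_u16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

    size_t lz77_decode(const uint8_t *src, size_t src_len, uint8_t *dst, size_t dst_cap)
    {
        size_t in = 0, out = 0;
        while (in + 2 <= src_len) {
            uint16_t nlit = read_u16(src + in); in += 2;       /* literal run    */
            while (nlit-- && in < src_len && out < dst_cap)
                dst[out++] = src[in++];
            if (in + 4 > src_len)
                break;
            uint16_t mlen  = read_u16(src + in); in += 2;      /* match length   */
            uint16_t mdist = read_u16(src + in); in += 2;      /* match distance */
            while (mlen-- && out < dst_cap && mdist != 0 && mdist <= out) {
                dst[out] = dst[out - mdist];                   /* copy from history */
                out++;
            }
        }
        return out;                                            /* symbols produced */
    }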

  24. #24
    Member · Join Date: Jan 2014 · Location: Bothell, Washington, USA · Posts: 696
    Alvaro, I think it may not be possible to do better than RLE given the speed constraints that prevent more complicated indexing or slower access to data from the compressed file. I am curious about the code you posted though:

    Code:
        if(Nbm) dbsize *= bicoef[28][Nbm];
     
        if(Nwm) dbsize *= bicoef[28][Nwm];
    
        // number of bk configurations
        if(Nbk) dbsize *= bicoef[32-Nbm-Nwm][Nbk];
    
        // number of wk configurations
        if(Nwk) dbsize *= bicoef[32-Nbm-Nwm-Nbk][Nwk];
    When you multiply in the database size for the white men, wouldn't it be possible to exclude a good portion of the squares that the black men already occupy (like you do for the kings)? Maybe that would make it harder to produce the database, but it seems like the database would be quite a bit smaller if you can typically eliminate even a few squares from consideration.
    Last edited by Kennon Conrad; 12th November 2014 at 19:43.

  25. #25
    biject.bwts (Member) · Join Date: Jun 2008 · Location: Texas · Posts: 448
    Quote Originally Posted by Alvaro View Post
    biject.bwts
    Thanks, your ideas are revolutionary, but I don't see how they would work in an alpha-beta framework.
    Although it is true that some have experimented with single-bit databases, which only tell Win/NotWin, this scheme is not a good one. Suppose the engine is in a surely losing line and desperately needs to find a draw, which such a database won't tell it. If the opponent makes a drawing move by mistake, the engine won't find the draw, because NotWin is either a loss or a draw, and it could head into a losing line again. Also, the node count of each search using single-bit databases is always higher, since they provide fewer cutoff/pruning opportunities, which translates to more search time and less playing strength.

    Alvaro
    I am sorry, my thoughts were not about building a machine to play the game, but about designing and running a program to see whether the ultimate checkers game of your type is a WIN, LOSS, or DRAW for the first or second player.

    The idea was to concentrate on 2 different games: one game where a DRAW counts as a LOSS for the first player but as a WIN for the second player. That is surely an easier game to analyze than the 3-state outcome, and if the first player always WINS with perfect play, there is no need to go to the second version of the game. Cutoff/pruning opportunities would be more frequent, and the search to the depth of the final outcome faster to run.
    Of course you could have two versions of your endgame data, one for this game and one for the next. You could still use your current database for positions that reach it, but at that point you interpret a DRAW correctly for the game variant being played.

    Again, this is only to see what this form of checkers ultimately is. If the first game reduces to the FIRST player not winning, then you do a full search of the next modified version of the game, where a DRAW is a WIN for the first player.

    Here is a table from the first player's point of view.

    Your GAME: outcome is WIN, DRAW, or LOSS.

    GAME A: outcome is WIN or LOSS, no DRAW, since a draw counts as a win for the second player.
    GAME B: outcome is WIN or LOSS, no DRAW, since a draw counts as a win for the first player.

    If YOUR GAME with perfect play is a WIN:
    GAME A with perfect play is a WIN, GAME B with perfect play is a WIN.

    If YOUR GAME with perfect play is a DRAW:
    GAME A with perfect play is a LOSS, GAME B with perfect play is a WIN.

    If YOUR GAME with perfect play is a LOSS:
    GAME A with perfect play is a LOSS, GAME B with perfect play is a LOSS.

    No other outcome is possible.

    I do see that if your goal is not to be the first to find out the outcome of perfect play in this form of checkers (WIN, LOSS, or DRAW for the first player), then my ideas may be of little use.

    But here is another idea you might like. In the game of chess there are some very complicated endings where one side should have a win, but people have programmed it so that if the machine is losing, it always makes the move that leaves the opponent the most moves to make. Often even the experts can't see the correct way to force the win, so the machine gets a draw when a certain number of moves has been made without a capture or a pawn move. So you might be able to pull out a draw in this game from a position where the machine should have lost.

  26. #26
    Member · Join Date: Jun 2010 · Location: Portugal · Posts: 22
    Skymmer

    here are 2 new samples:
    https://drive.google.com/file/d/0Bya...ew?usp=sharing

    The RLE compressor gives:
    1133.db: 107MB (1.62GB uncompressed)
    1034.db: 30.07MB (399MB uncompressed; amazingly, WinRAR only compresses this one to 28MB)

    Kennon Conrad
    You are right, RLE is very fast, but someone (the Chinook team) already used Huffman fixed-length codes on top of RLE and more than halved their database size; a pity I don't understand standard Huffman in the first place.
    You are also right about the subdatabase size calculation having more entries than necessary. But the difference isn't big, and since those illegal positions are set to the current run (RLE) value, they almost vanish after compression. The problem isn't the sizes of the original subdatabases but the simple (though efficient) RLE algorithm I use; it just isn't enough.

    biject.bwts
    Thanks for the ideas; I will have to study them properly in the future, since they are quite different from what I'm accustomed to.
    About that last idea of making the move that delays the loss as long as possible: I already have that. It's what the DTM (8-bit) databases I mentioned above provide; they give the exact number of moves to the final win/loss position. For a win they give the shortest possible number of moves, for a loss the longest. But they are 4 times bigger and much less compressible. So I use the 2-bit ones to look for a winning line first, and once the engine is on a winning line I can access the much slower and bigger DTM bases on disk.
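
    Roughly, the probing order looks like this (a simplified sketch; the types and probe functions are just stand-ins for the engine's real ones):

    #include <limits.h>

    enum { WDL_LOSS = 0, WDL_DRAW = 1, WDL_WIN = 2 };

    /* Engine-specific hooks, assumed to exist in some form: */
    typedef struct Position Position;
    typedef struct Move Move;
    extern int probe_wdl_after(const Position *p, const Move *m); /* 2-bit base, in RAM */
    extern int probe_dtm_after(const Position *p, const Move *m); /* 8-bit DTM, on disk */

    /* Return the index of the winning move with the shortest distance to the
       final win, or -1 if no move wins.  The slow DTM probe only runs on
       moves the fast WDL base has already proved to be winning. */
    int pick_winning_move(const Position *p, const Move moves[], int nmoves)
    {
        int best = -1, best_dtm = INT_MAX;
        for (int i = 0; i < nmoves; i++) {
            /* WDL is stored from the side to move's point of view, so a
               position that is lost for the opponent is won for the mover. */
            if (probe_wdl_after(p, &moves[i]) != WDL_LOSS)
                continue;
            int dtm = probe_dtm_after(p, &moves[i]);
            if (dtm < best_dtm) { best_dtm = dtm; best = i; }
        }
        return best;
    }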

    thanks to all,
    Alvaro
    Last edited by Alvaro; 17th November 2014 at 00:43.

  27. #27
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    696
    Thanks
    154
    Thanked 186 Times in 109 Posts
    Quote Originally Posted by Alvaro View Post
    You are right that RLE is very fast, but someone already put fixed Huffman codes on top of RLE and more than halved their database size (the Chinook team); a pity I don't understand standard Huffman in the first place.
    How do you write the run lengths? Huffman coding is a method for creating optimal integer-length codes. For instance, if half of your runs are length 1, one quarter are length 2, and the last quarter are length 3, then Huffman coding would give a set of codes with unique prefixes that are one bit long for runs of length 1 and two bits long for runs of length 2 and 3 (0, 10 and 11, for example). Obviously you have more run lengths than just 1, 2 and 3, but the idea is to create optimal codes based on the probabilities of the various run lengths. Do you know the distribution of the run lengths that get coded?

    Something like this may work for you. The exact codes would depend on the distribution of run lengths.

    00: run length 1
    010: run length 2
    011: run length 3
    1000: run length 4
    1001: run length 5
    1010: run length 6
    10110: run length 7
    10111: run length 8
    110000: run length 9
    110001: run length 10
    110010: run length 11
    110011: run length 12
    110100: run length 13
    110101: run length 14
    110110: run length 15
    110111: run length 16
    1110000: run length 17
    etc.
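
    If decoding speed is the worry, a prefix code like this can still be decoded with one table lookup per code, since no codeword shown here is longer than 8 bits. A rough sketch (the codes and run lengths just follow the illustrative table above, not a measured distribution):

    #include <stdint.h>

    /* Decode table indexed by the next 8 bits of the input; works for any
       prefix code whose longest codeword is at most 8 bits. */
    typedef struct { uint8_t nbits; uint16_t run_len; } DecodeEntry;
    static DecodeEntry table[256];

    static void add_code(unsigned code, int nbits, int run_len)
    {
        /* A codeword of nbits bits owns every slot whose top nbits match it,
           e.g. 010 (run length 2) fills all slots of the form 010xxxxx. */
        int shift = 8 - nbits;
        for (int low = 0; low < (1 << shift); low++) {
            table[(code << shift) | low].nbits   = (uint8_t)nbits;
            table[(code << shift) | low].run_len = (uint16_t)run_len;
        }
    }

    static void build_table(void)
    {
        add_code(0x00, 2, 1);   /* 00   -> run length 1 */
        add_code(0x02, 3, 2);   /* 010  -> run length 2 */
        add_code(0x03, 3, 3);   /* 011  -> run length 3 */
        add_code(0x08, 4, 4);   /* 1000 -> run length 4 */
        /* ... one call per codeword in the table above ... */
    }

    Decoding is then: peek the next 8 bits, look up table[bits], emit run_len positions and consume nbits bits. Longer codewords just need a bigger table or a two-level lookup.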

  28. #28
    Member
    Join Date
    Jun 2010
    Location
    Portugal
    Posts
    22
    Thanks
    1
    Thanked 1 Time in 1 Post
    Thanks. The RLE algorithm is Martin Fierz's enhancement of the one by Jonathan Schaeffer.
    Each byte in the database is either the w/l/d values of four consecutive positions or a run-length token.
    A run-length token doesn't store the actual length, which would be very limited, but an index into a vector that holds the actual lengths:
    int lengths[] = { 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 36, 40, 44, 48, 52, 56, 60, 70, 80, 90, 100, 150, 200, 256, 300, 400, 512, 650, 800, 1024, 1200, 1400, 1600, 2048, 2400, 3200, 4096, 6000, 8192 }; you get the picture.
    So a single byte can encode a very long run plus its win/draw/loss value; this is why it is so efficient.
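
    In sketch form, the decode loop is basically this (simplified; is_run_token(), run_value() and run_index() stand in for however the real byte codes mark a run token and pack the 2-bit value and the index into it):

    static const int lengths[] = { 5, 6, 7, 8, /* ... full table as above ... */ 4096, 6000, 8192 };

    extern int is_run_token(unsigned char b);   /* placeholder */
    extern int run_value(unsigned char b);      /* placeholder: 2-bit value of the run */
    extern int run_index(unsigned char b);      /* placeholder: index into lengths[]   */

    /* out[] receives one 2-bit w/l/d value per position; returns how many. */
    static long decode_block(const unsigned char *in, long in_len, unsigned char *out)
    {
        long n = 0;
        for (long i = 0; i < in_len; i++) {
            unsigned char b = in[i];
            if (is_run_token(b)) {
                int len = lengths[run_index(b)];
                int v   = run_value(b);
                for (int k = 0; k < len; k++) out[n++] = (unsigned char)v;
            } else {
                /* literal byte: four positions, two bits each
                   (the bit order here is assumed, low bits first) */
                out[n++] = b & 3;
                out[n++] = (b >> 2) & 3;
                out[n++] = (b >> 4) & 3;
                out[n++] = (b >> 6) & 3;
            }
        }
        return n;
    }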

    Alvaro

  29. #29
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    696
    Thanks
    154
    Thanked 186 Times in 109 Posts
    Thanks for the details. What you show looks like it would be very fast, but it would not provide optimal compression.

    One concern I would have is the spread of the indexed run lengths. I don't see much advantage to having "90", for instance. It only helps for run lengths 90 - 93. A run length of 94 - 99 could be encoded equally effectively using a run length of 90 followed by a run length of 4 - 9, a run length of 80 followed by a run length of 14 - 19, or a run length of 70 followed by a run length of 24 - 29. Whenever there is more than one way to write the same information in the same number of bits, that is a sign the scheme is not optimal, since all but one of those ways of writing the data could be repurposed to represent something that cannot currently be written with that number of bits.

    If you are writing a run that takes more than one byte and you start with the largest run length possible in the first byte, all the codes for longer runs of the same result type are useless when writing the rest of the run. These could be repurposed to mean a run of length N plus a (short) run of a different result. Whenever one of those repurposed codes was useful, you may well be able to write another long run on the very next byte.

    If you just wrote a run of length 5 - 31, the next result is not the same, assuming the code wrote the longest run length possible. This makes 84 of the 252 possible codes for the next byte useless (27 four-result codes plus the 57 codes you listed that correspond to the same result as the previous run). You could repurpose these to represent run lengths or combinations of run lengths you currently don't have available.

    If you are willing to give up the byte alignment, then you could definitely produce smaller compressed files with Huffman coding. For instance, it currently takes 3 bytes to write a d followed by 99 w's (a four-position literal byte dwww, a run of 90 w's, a run of 6 w's). If you used Huffman coding and worked on individual w/l/d values, only two of the three w/l/d results would be possible after each run (except for the first one and wherever you want to start decoding in the middle of the file). Using the (probably not optimal) scheme I showed, you could write this as d, the code 00, w, and a 12-bit code for run length 99, giving 16 bits total instead of the 24 bits of your current scheme. Since you are achieving better than 25:1 compression with your current scheme, the average run length is well over 25, so it is important to have a way to write long runs as efficiently as possible. I have a feeling that it is also important to be able to write a single exception within a run efficiently. Your current scheme consumes a byte for each exception, although you do include a bit more information within that byte. This could be as little as two or three bits.

    In summary, I think your scheme is pretty efficient, but also think the efficiency could be even better.
    Last edited by Kennon Conrad; 19th November 2014 at 07:26.

  30. #30
    Member
    Join Date
    Jun 2010
    Location
    Portugal
    Posts
    22
    Thanks
    1
    Thanked 1 Time in 1 Post
    Thanks for the input.
    I exchanged some thoughts with another checkers engine programmer and he advised me to take a more statistical approach and adapt the RLE algorithm to each subdb. So instead of general coder/decoder tables, I'll gather some statistical info for each subdb and create compression tables adapted to it.
    So I decided to change the RLE algorithm a bit according to his ideas.
    I'll reserve some byte codes for the patterns that occur most frequently and fewer codes for the simple run-length scheme.
    So the algorithm will be RLE plus some kind of dictionary (or pattern substitution) compression.
    Since during decoding I want to read a byte at a time for speedy decompression, I have 256 codes available for compression/decompression.
    So I need a pattern-finding algorithm to list every possible pattern in the subdb (I'll have to adapt it to read 2-bit tokens instead of whole chars).
    But I want to limit the pattern length to between 4 and 7 2-bit tokens and ignore the rest of the patterns.
    Patterns of four 2-bit entries won't give any compression, but they are needed; patterns of 5, 6 and 7 will give compression on dense data, and the rest is pure RLE.
    Next I need to find the patterns that occur most frequently in the subdb.
    Could you please suggest an algorithm for making a list of the patterns in a file?
    And another algorithm for ranking those patterns from most frequent to least frequent?
    Or is there one algorithm that does all of this?
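
    For what it's worth, the brute-force version I can imagine looks like this (only a sketch; since a pattern of 4 to 7 two-bit tokens is at most 14 bits, a flat count array per length is enough, no fancy pattern-mining algorithm needed, and the names are made up):

    #include <stdlib.h>

    typedef struct { unsigned pattern; unsigned long count; } PatCount;

    static int by_count_desc(const void *a, const void *b)
    {
        unsigned long ca = ((const PatCount *)a)->count;
        unsigned long cb = ((const PatCount *)b)->count;
        return (ca < cb) - (ca > cb);
    }

    /* tokens[] holds one w/l/d value (0..3) per position.  Returns a
       malloc'ed array of all patterns of the given length, sorted from
       most frequent to least frequent; *out_n receives its size. */
    PatCount *rank_patterns(const unsigned char *tokens, long ntokens,
                            int len, long *out_n)
    {
        long npat = 1L << (2 * len);             /* 4^len possible patterns */
        unsigned long *counts = calloc(npat, sizeof *counts);
        unsigned pat = 0, mask = (unsigned)(npat - 1);

        for (long i = 0; i < ntokens; i++) {
            pat = ((pat << 2) | (tokens[i] & 3)) & mask;   /* slide the window */
            if (i >= len - 1) counts[pat]++;               /* count every occurrence */
        }

        PatCount *ranked = malloc(npat * sizeof *ranked);
        for (long p = 0; p < npat; p++) {
            ranked[p].pattern = (unsigned)p;
            ranked[p].count   = counts[p];
        }
        qsort(ranked, npat, sizeof *ranked, by_count_desc);

        free(counts);
        *out_n = npat;
        return ranked;
    }

    Overlapping occurrences get counted more than once, and after picking a pattern for a byte code it would be better to re-count with that pattern removed, but as a first ranking this should be enough.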

    best regards,
    Alvaro
    Last edited by Alvaro; 5th December 2014 at 20:26.
