I am currently writing a game that is a real-time 2d sandbox action/rpg with rogue-like elements (See '
Euthenia' on YouTube). The intent of this post is to document a successful compression technique I was able to implement to store a 10,000 x 2,500 block world (or any other size) into a file that then stores the world for later play, or prepares to send it across the network (for multiplayer servers).
each block consists of 4 physical bytes. the first 2 bytes are signed and the last two are normal bytes.
before any compression techniques, writing this world to disk resulted in a 93 Megabyte file. (And took approximately 20 seconds to write) Sending that much data across a network would take at least a few minutes, if not, longer. Applying a zipping algorithm through 7-zip brought the file size down to around 300K. Using this as a benchmark, I knew that creating any algorithm to store this would have to give me similar results.
the nature of such a 2d world is that many horizontally-adjacent blocks will be of the same type. This is where the compression algorithm could come in.
the 4 bytes per block I was writing was not fully utilized:
the 3rd byte was only using the first 2 bits (leaving 6 unused bits)
the 4th byte was using the first 7 bits. It has a max in-game value of 100 (leaving another unused bit)
these 7 unused bits could be used to store a series of blocks in a row. (127 blocks)
but because of the possibility of thousands of blocks in a row being the same (e.g. empty sky tiles)
I opted for a different method of bit utilization.
byte 4: the 8th bit is simply used as a boolean flag. When on, this says there are identical blocks after me.
byte 3: bits 3 to 7 are used as block counters, and can store a value up to 31 (meaning 31 blocks in a row)
byte 3: bit 8 is also used as a boolean flag. When on, there is a byte written after this block that also details a number of blocks in a row, except * 32 (because 31 blocks can be held in bits 3 to 7 of byte 3)
This allows me to store 31 blocks together without any additional space requirements, and I can store up to 8191 blocks (255 * 32 + 31) utilizing an additional byte.
This method of compression resulted in the same world being written to a 272KB file. So I am overall quite pleased with it (0.3% of the original filesize!) And a nice side-effect is saving the world back to a file (and reading it in) now takes less than a second.
Of course a world file will tend to creep up in size as people build objects and change the world through playing, but it is not very sizable. Someone creating a "checkerboard pattern" (this algorithm's worst enemy) of 32 x 32 blocks would result in a file size increase of merely 4KB.
Reading the world into memory is just a matter of running the algorithm the other way. How have you dealt with reading a large world in[to] a file?