Friday, December 18, 2020

Loading RLE-compressed CSV tile maps into RAM on Genesis: SGDK, Tiled, and Python

 If you haven't used Tiled, I strongly recommend it. It's a platform agnostic tile based mapping tool with a number of nice features. A particularly nice one is CSV export - a format that can be very quickly converted into byte code or something else.

I won't cover how to make or organize a tile map here - that is something covered in detail in a number of places. Skipping ahead, let's assume you've made a simple map something like this:



In our case, we want it as an .h file we can include in our SGDK project. After exporting the map as a CSV, we get the following data, or something like it:


The entirety of the python script I wrote to convert the CSV to a .h file is here (explanation below):
# csv2c.py
import sys
if(len(sys.argv) != 2): # 1 arg only
    print('error')
    sys.exit()
f = open(sys.argv[1], 'r')
csvtxt = f.read()
f.close()
lines = csvtxt.split('\n')
i = 0
csvtxt = '' # split at line break bytes
while i < len(lines)-1:
    csvtxt = csvtxt + lines[i] + ',\n' 
    i += 1
i = 0
chars = [] # make output array
while i < len(lines):
    b = lines[i].split(',')
    j = 0
    while j < len(b):
        if (b[j]) != '':
            chars.append(b[j])
        j += 1
    i += 1
wid = int(len(chars)**0.5)
i = 0
outchars=[] 
outchars.append(wid) # compress w RLE
while i < len(chars):
    if (i < len(chars)-2):
        if (chars[i] == chars[i+1]) and (chars[i] == chars[i+2]):
            mc = chars[i]
            outchars.append('254') #0xfe
            count = 0
            while (mc == chars[i]) and (i < len(chars)-2):
                count += 1
                i += 1
            i = i - 1
            outchars.append(str(mc)) #char
            outchars.append(str(count)) #times
        else:
            outchars.append(chars[i])
    else:
        outchars.append(chars[i])
    i += 1
outchars.append(len(outchars)-1) # end compression scheme
convt = '\t'
fn = sys.argv[1].split('/')
i = 0
while i < len(outchars): # tabulate str
    convt = convt + str(outchars[i]) + ','
    if ((i+1)%16==0):
        convt = convt + '\n\t'
    i += 1
outstr = "#ifndef HEADERSET\n#include <genesis.h>\n#endif\n//map[0] = mapWidth, map[..n] = size of array -2\nconst u8 " + fn[len(fn)-1][:-4] + "["+str(len(outchars))+"] = {\n" + convt + "\n};"
f = open(sys.argv[1][:-4]+'.h', 'w')
f.write(outstr)
f.close()
print(str(len(chars)) + ' written')

The string manipulation is fairly standard, so I'll explain the compression scheme:

wid = int(len(chars)**0.5)
i = 0
outchars=[] 
outchars.append(wid) # compress w RLE

This first bit takes the square root of the number of tiles, 256 for a 16 by 16 map, and adds that number (i.e. sqrt(256) = 16) as the first digit in the array. This is optional, it's so my game can know how big the map is for other code.

while i < len(chars):
    if (i < len(chars)-2):
        if (chars[i] == chars[i+1]) and (chars[i] == chars[i+2]):

This iteration looks confusing. My RLE compression looks like this: 
1. If the byte is NOT 0xFE, copy through
2. If the byte IS 0xFE:
    - The next byte is the value to copy 
    - The following byte is how many times to copy it
    - Move to next byte, back to 1.

That means the following are true:
"1 1 1 1 2" = 0xFE 0x01 0x04 0x02
"2 2 2 1 1" = 0xFE 0x02 0x03 0x01 0x01

So when compressing our raw map, we need to look ahead a minimum of two bytes to see if a 0xFE scheme is needed.

            mc = chars[i]
            outchars.append('254') #0xfe
            count = 0
            while (mc == chars[i]) and (i < len(chars)-2):
                count += 1
                i += 1
            i = i - 1

Next we search ahead, increasing our 'i' iterator as we do, and count the number of times we see the same number. The tricky part is the 'i = i - 1' at the end due to the post-increment. Python will still fall through to the final 'i = i + 1', and if we don't decrement i by one, we'll accidentally skip the next byte in the map. 

            outchars.append(str(mc)) #char
            outchars.append(str(count)) #times
        else:
            outchars.append(chars[i])
    else:
        outchars.append(chars[i])
    i += 1
outchars.append(len(outchars)-1) # end compression scheme

We need else's in both nests of the for loop to ensure all characters are written. It looks awkward but it is correct - though there might be a prettier way of doing it! 
Finally, we append the number of COMPRESSED characters to the end of the array - this is so my game has the length of the decompression routine ready to go. 

The script outputs a .h file that looks like this:

HEADERSET is simply what I call my project's shared header set. Without <genesis.h> it will complain about u8, but it's not a big deal. 

Note that the array is a const. SGDK/GCC should put this in ROM without prodding. Finally, the C:

u8 map[256];
u8 mapWidth;

void LoadDungeonMap(const u8* dungeon){
    u16 mapctr = 0;
    mapWidth = *(dungeon + 0);
    u16 dlen = *(dungeon + sizeof(dungeon));
    for(u16 i = 1; i < dlen; i++)
    {
        u8 nb = *(dungeon + i);
        if(nb == 0xfe){
            u8 ct = *(dungeon + i + 2);
            nb = *(dungeon + i + 1);
            for(u8 j = 0; j < ct; j++){
                map[mapctr++] = nb;
            }
            i = i + 2;
        }
        else {
            map[mapctr++] = nb;   
        }
    }
}

To call it:

int main(u16 hard){
    (...)
    LoadDungeonMap(&maptest2);
}

Note that when calling, we use & to pass the address of the first entry of the const array. When defining the function, we use * to indicate that the variable we pass is a pointer, which contains an address

Then to access the variable contained within the pointer's address, we use the * operator on the address of dungeon, offset by + i to get the byte value. Then we perform our RLE decompression, writing bytes to map[] (declared global to not be optimized away).

There you have it. No memcpy or extra allocation, just nice, straightforward RLE decompression for loading maps!