Friday, May 15, 2020

Love2D - Simple event stacks with Lua

Love2D and its 3D/VR companion LOVR are great. I won't blab about how awesome they are - though having an entirely open framework means certain things must be built from scratch. One such thing is an event handling system.

Engines like Unity use class inheritance to handle this. Every object in a scene is a GameObject, which has an inherited update method that processes it every frame.

It's possible to do this without much trouble by architecting all of your game entities in a similarly OOP way, but this isn't always intuitive, and it can cause unnecessary headache and overhead if your game isn't very complex or you want more manual control over your event stacks.

Here's a super simple event stack example using anonymous functions and an event stack table (named 'queue'):

table.insert(queue, function() <code> end)

The most frequent use would likely be to add a global wait in between code blocks:

table.insert(queue, function() wait = 1 end)

This also makes calling functions with parameters and so forth very simple:

table.insert(queue, function()
        ComplexFunction(a, 'b', { c = 0 })
    end)

Then in update:

    if wait > 0 then
        wait = wait - TimeDelta
        return -- Continue to draw, but don't process stack
    end
    if #queue > 0 then
        if type(queue[1]) == 'function' then
            local f = queue[1]
            table.remove(queue, 1)
            f() -- Run the queued event
        end
    end

This code is the basis for most of the animation in my game, and for anything that needs a timed wait - e.g. suspending input (tracked by a variable named inputEnabled) for one second:

function q(o) table.insert(queue, o) end
function setinput(tf) inputEnabled = tf end
q(function() setinput(false) end)
q(function() wait = 1 end)
q(function() setinput(true) end)

Lua allows lots of room for freedom in styling your code however you wish.

Sunday, March 8, 2020

Multi-cart data storage on Pico-8

If you've played with Lexaloffle's Pico-8 for a little while, the limitations of the cart storage - not for graphics or sound, but for code and raw data (esp. tokens) - become a bottleneck very quickly.

Multiple cart support has been added to emulate a form of bank-switching, but it is implemented in a way that purposefully blocks your ability to write more code. The memory locations 0x4300 to around 0x6000 cannot be READ or WRITTEN - this is fairly illogical, because memory locations that cannot be either read or written can't really exist. 

You can, however, repurpose cartridge data to store byte data you create - you just have to know how to store it. The data in the cartridge is effectively hex strings in a specific order. Knowing this, we can write a quick tool to convert data we want to store into Pico-8's cartridge text format. 

We can then read it into the fairly large "user data" area of RAM at 0x4300 (in cartridge, this contains our code) and use it as we will. Loading takes a second, so you probably want to load in as much data as you can at once (i.e. entire towns, etc).

You can programmatically store all sorts of data, and use your original cart as a sort of kernel. It will certainly be tricky, and games still won't be EXTREMELY complicated (as is the point of the engine), but having more storage is KEY to making complete games!

As a test, I wrote a text file (i.e. ascii-encoded string bytes) and, using a quick Python script, I converted it to a Pico-8 cart.

Pico-8 Cartridge Text Format:

pico-8 cartridge // http://www.pico-8.com
version 18
__lua__
--Data stored here is inaccessible from the main cart.
--Use this area to describe the stored data instead.
__gfx__
--Data stored here begins at 0x0000 and goes to 0x1fff.
--It is stored in .p8 as a BACKWARDS hex string, 128 chars by 128 rows.
--e.g. HELLO = 8454c4c4f4
__gff__
--Data stored here is from 0x3000 to 0x30ff.
--Its format is the same as the gfx section.
__map__
--Data here is 0x2000 to 0x2fff.
--It is stored as a normal hex string, 256 chars by 32 rows.
--e.g. HELLO = 48454c4c4f

The three sections above will give you 12,544 bytes of storage per cart, less if you use them for actual graphics and maps. Multiply that by 15 possible storage banks and you get roughly 188 kilobytes of non-standard storage, and that doesn't include sfx and music!

As a note:
The __sfx__ and __music__ blocks are less easy to make use of. A typical sfx test string looks like this within a .p8 file:
But when you peek the first 10 bytes of SFX ROM @ 0x3200, the values returned are:
63 10 63 10 63 10 63 10 63 10
3f corresponds to 63, then there are 3 characters in between (050) that equal 10 in decimal. Storing and retrieving data from a format like this may be too inefficient or impractical.

In Python, converting byte data to a hex string is fairly easy:

file = open("input.bin", 'rb')  # Data to convert
by = file.read()                # Read all at once
file.close()                    # Close i/o stream
bstr = '%02x' % by[0]           # First byte as two hex chars
byh = bstr[0]                   # High nibble
byl = bstr[1]                   # Low nibble
outbyte = byl + byh             # Rearrange the characters
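Iterated over a whole buffer, the conversion might look like this (to_gfx_hex is a name I made up; it reproduces the HELLO example from the gfx section comments):

```python
def to_gfx_hex(data):
    """Convert bytes to the gfx section's nibble-swapped hex string."""
    out = []
    for b in data:
        hi, lo = '%02x' % b    # two hex chars per byte
        out.append(lo + hi)    # gfx stores the low nibble first
    return ''.join(out)

print(to_gfx_hex(b'HELLO'))    # -> 8454c4c4f4
```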

Iterate the above and paste it into a cart file - then by reading location 0x0000 of the new file (if located under __gfx__), you can convert to string data and print it:

The base cart just does this:

s=''
for i=0,250 do
 local c=chr(peek(i))
 if c=='\\' then s=s..'\n' elseif c~=nil then s=s..c end
end
print(s)

(chr() function is defined in the link above). The if block converts any backslash found in the data to a newline character. 

The peek and poke in the screenshot show that the string is actually living in user RAM.

My python tool is very messy (as mine always are!) but it will generate a full cartridge file, warn you if your input data is too large, and fill out all rows to the proper length. You can check out the source here.

Tuesday, February 11, 2020

ZX Spectrum: Detecting in assembly 48k or 128k model

Detecting machine capabilities is just a matter of course in the MSX world. In Spectrum land, however, there wasn't much crossover between 48k and 128k games. Many just came on separate tapes (or separate sides of the tape) and did not share code.

Some cleverly programmed ones, like Avenger, could detect and run the proper loader.

I tried to disassemble Avenger, but either the dump was bad (it wouldn't load the 128k version) or it uses some trickery I couldn't read. Either way I gave up and searched for my own way.

I couldn't find any discussion on this topic on the net, so I was left to my own devices. I came to realize one clear benchmark for 128 machines is the AY chip. As far as I can tell, no 48k machines had one, and every 128k machine did. Perfect!

Well, I tried a routine that polled the AY I/O port, but it didn't work reliably. What I did not know is that unattached I/O ports return floating values - about half the time the port returns the very value you're checking it against. This makes for very unreliable testing.

The other option is memory paging. I THINK this is what Avenger does - it definitely changes the ROM page to the 48k ROM. I did the following instead:

1. Switch the ROM to page 0 - this is never the 48K ROM on any system, and this code will do nothing on a 48K.

2. Read a byte from the ROM I know is only in 48K - The letter "1" from the string "(C) 1982 ..." should work. There is only one version of the 48K ROM, so unless there's something wrong with the system or emulator, this location in ROM (0x153b) should ONLY return '1' on a 48K system.

3. Compare against 0x31 ("1"), and if it differs, we must be on a NON-48K system. In other words, a 128K system (or a 16K, but hopefully nobody will try to run a 48/128 game on a 16K system).

The code looks something like this:

    ld bc, $7ffd    ; 128K memory paging port
    xor a
    out (c), a      ; select ROM 0 (does nothing on a 48K)
    ld a, ($153b)   ; the "1" of "(C) 1982" in the 48K ROM
    cp $31          ; ASCII "1"
    jr nz, Is128k   ; byte differs -> not a 48K

As a side note, a secondary check if you REALLY want to make sure you're not on a 16K should be fairly trivial - just find a string byte that is only in that ROM.

Since I can't find any info on this subject, anyone more knowledgeable is welcome to provide alternate solutions - but for now I like this one.

Side note, the gorgeous color scheme is Cobalt in gedit plus the z80 highlight scheme I found on (install it to a -3.0 folder, not 2.0 like the Readme says).

Saturday, February 8, 2020

The super annoying Speccy VRAM map and pattern printing

The common way to explain the layout of the ZX Spectrum's pixel orientation on its bitmapped VRAM is often quite convoluted and is oriented towards the values of each bit of the VRAM address - useful for plotting single pixels, but not for batch operations.

The Speccy VRAM can be visualized in a few ways to help understand how it's laid out:

1) Similar to an MSX, the ZX has 3 sets of 256 8x8 blocks, together arranged in a 32x24 grid. From $4000-$47ff is the first set, $4800-$4fff is the second, and $5000-$57ff is the third.

2) Pixel data is oriented in VRAM as if it were a 2048x24 bitmap (with each byte representing 8 pixels for 256x24 bytes), then the 8x8 tiles were scrunched into 256x192.

To move down one pixel row: add 1 to H; every 8 rows, reset H and add 32 to L instead (and if L rolls over, add 8 to H). To move right one byte: add 1 to L.
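These traversal rules are equivalent to the usual bit-layout formula. Here's a quick Python sketch of the address calculation for checking intuitions (the helper name is mine):

```python
def vram_addr(x_byte, y):
    """VRAM address of byte column x_byte (0-31) on scanline y (0-191)."""
    return (0x4000
            | ((y & 0xC0) << 5)    # which third of the screen
            | ((y & 0x07) << 8)    # scanline within the character row
            | ((y & 0x38) << 2)    # character row within the third
            | x_byte)

print(hex(vram_addr(0, 1)))    # down one pixel row: 0x4100 (inc h)
print(hex(vram_addr(0, 8)))    # next character row: 0x4020 (l + 32)
print(hex(vram_addr(0, 64)))   # second third: 0x4800
```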

This layout can do a couple things with the target VRAM address:

1. inc l will increase the pixel X position across 8 rows (256 bytes per page / 32 columns = 8 rows)
2. inc h will increase the pixel Y position within the first 8 rows, plus the row offset from the l register.
3. Flooding VRAM with patterns is really easy and fast:

    ld hl, $4000    ; VRAM base
    ld b, 12        ; 2 rows per loop * 12 = 24 rows

.printloop:
    ld a, %01010101  ; pixel pattern row 1
.loop_a:
    ld [hl], a
    inc l
    jr nz, .loop_a   ; fill one 256-byte page

    inc h
    ld a, %10101010  ; pixel pattern row 2
.loop_b:
    ld [hl], a
    inc l
    jr nz, .loop_b

    inc h

    dec b
    jr nz, .printloop

ZX Spectrum: 1942 loader detokening and .TAP format assembly

.TAP format is much easier to work with than .TZX, which seems to mainly be for duplication.

.TAP structure is simply a series of headers and file data to create a file listing. Header-data, header-data, header-data. A header is always 19 bytes long, but the length of the data block can be up to 64k.

Visualized, it looks like this:
|   TAP block header   |
|                      |
|    Header data       |
|       (19 bytes)     |
|                      |
|---<Checksum byte>----|
|   TAP block header   |
|                      |
~     Data bytes       ~
|                      |
|                      |
|---<Checksum byte>----|
for each file on a tape.

Each header and data block has its own 3-byte mini-header as specified by the .TAP format. It's very simple:

; 3 byte block header:
DW BlockSize
DB BlockType
; then data
;  (...) followed by 
DB ChecksumByte

If the BlockType is 0x00 (indicating a header block), then BlockSize will always be 19 (in sequence 13h 00h). Headers are 17 bytes long, and the BlockType and ChecksumByte are added to the BlockSize length to get 19. 
If BlockType is 0xFF (255 or -1, indicating a data block), then BlockSize is the size of the data block (plus two bytes for BlockType and ChecksumByte).

Header blocks look like this:

DB FileType
DW DataSize
DW Parameter1
DW Parameter2

FileType can be 0, 1, 2 or 3. BASIC data can be stored as types 1 or 2, but we are concerned with type 0 -- BASIC program -- and type 3 -- CODE (aka assembly).

The filename must be padded with 0x20 to 10 bytes.

DataSize is the size of the data to load. All I know is that this value is generally 2 less than the BlockSize in the following data header.

When FileType is BASIC program (0):
Parameter1 is the LINE parameter when SAVEing the program. I actually have not gotten this to work, and since 1942 keeps it at 0, I do as well.
Parameter2 is the location of the start of the working area of BASIC variables. A bootloader generally does not have variables, so in these cases, this value is actually the same as BlockSize (or, DataSize+2).

When FileType is CODE (3):
Parameter1 is the target memory address (e.g. the first parameter after CODE), and
Parameter2 is ALWAYS 8000h. Not explained why.

And finally, the checksum byte. This isn't a checksum per se, so much as it is a bit toggling of all the bytes in the block, minus the header (including FileType). Start with the FileType flag byte and xor it with each successive byte, then store the final result in ChecksumByte.
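As a sketch in Python (function and field names are my own; the layout and checksum rule are as described above), building a TAP block and a 19-byte header block looks like this:

```python
import struct

def tap_block(flag, payload):
    """Wrap payload as a TAP block: DW size, DB flag, data, DB checksum."""
    chk = flag
    for b in payload:
        chk ^= b                          # xor of flag byte + all data
    body = bytes([flag]) + payload + bytes([chk])
    return struct.pack('<H', len(body)) + body

def tap_header(filetype, name, datasize, param1, param2):
    """17-byte header payload: type, 10-char name, size, two parameters."""
    payload = struct.pack('<B10sHHH', filetype,
                          name.ljust(10).encode('ascii'),
                          datasize, param1, param2)
    return tap_block(0x00, payload)       # header blocks use flag 0x00
```

A header built this way always serializes as 13h 00h followed by 19 bytes, and xoring the flag, data and checksum together gives zero.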

For the data block, this is calculated for me using a Python script post-assembly with the following code:

chk = 0xff      # start with the flag byte
for b in inbytes:
    chk ^= b    # xor in each data byte

And of course data blocks are simply raw data.

The trick was getting a BASIC stub to auto-run when you play the tape (harder than it seems when you're doing all the bytes by hand) and have that stub clear RAM and load/run the assembly program we want.

I couldn't figure out how to save a .TAP from the "speccy" emulator, so I had no choice but to open up a .TAP of 1942 and see what was up.

The first TAP header block in 1942 looks like this:
DW $0013          ; size in bytes
DB 0              ; type 0 = header
; then the header data:
DB 0              ; 0 = BASIC program
DS "1942      "   ; filename
DW 185            ; file size
DW 0              ; autostart line
DW 185            ; basic vars loc
DB $0f            ; checksum byte

Then, the data block. This is where I had to detokenize the program by hand, and figure some stuff out for myself.

The listing of the 1942 loader ended up looking like this:
10 BORDER 0:POKE 23624,0:POKE 23693,0:CLEAR 25592:POKE 23739,111 
40 POKE 23739,244:RANDOMIZE USR 25593
50 REM etc

1942 loads itself into $63f9 - contended memory, but a good starting point all the same.
As a point of interest, 23624 ($5c48) is BRDCLR, 23693 ($5c8d) is ATTR_P, and 23739 ($5cbb) is CURCHL. These correspond to a border color mirror, an attribute byte I need to investigate, and the currently selected IO channel.
This, along with the .TAP disassembly, was enough to get me started -- the basics are use CLEAR n-1, LOAD "" CODE, and RANDOMIZE USR n.

(Note that the best way to check the value of a token in any native BASIC version is to use PRINT CHR$(n). BASIC tokens don't overlap the standard ASCII byte space, so n is almost always > 127.)

First, explaining the Spectrum BASIC line format:
DW LineNo        ; Big-endian!
DW LineOffset    ; Bytes until next LineNo
( ... )          ; (listing)
DB $0d           ; endline

The important thing here is that the single 0x0d byte represents endline in Spectrum BASIC. ZX80/81 use a different endline (0x76, maybe?). Don't look for 00 00 as endline or 00 00 00 for EOF like on other systems - afaict there is no concept of EOF in ZX BASIC.

A large difference between Sinclair and other BASICs is that Sinclair wastes a ton of space on storing numbers as strings, but condenses all spaces automatically. Here is the hex listing for my very short loader program:

13 00 00 00 4C 4F 41 44 45 52 20 20 20 20 2C 00
00 00 2C 00 11 2E 00 FF 00 0A 0D 00 FD 32 35 35
39 32 0E 00 00 F8 63 00 0D 00 14 05 00 EF 22 22
AF 0D 00 1E 0E 00 F9 C0 32 35 35 39 33 0E 00 00
F9 63 00 0D 70

And the corresponding BASIC:

10 CLEAR 32767
20 LOAD "" CODE
30 RANDOMIZE USR 32768

CLEAR 32767 This sets BASIC's HIMEM to 7fffh. Doing this tells the ZX that the next time we load bytes from tape, they should go to the byte after this address (8000h).
LOAD "" CODE This is equivalent to "Load the next file available from tape as an assembly program (to the lowest point in memory I've allotted)". This will load the next chunk of data pointed to by a .TAP header in the .TAP file as a binary to 8000h.
RANDOMIZE USR 32768 This is, for some reason, the common way to start machine language routines on the speccy. This is equivalent to "JP $8000".

The tricky part here is that numeric constants are stored as ASCII digits, immediately followed by the byte 0x0e (the number marker byte) and a 5-byte binary form:
DB 0
DB PolarityByte
DW IntValue
DB 0
Such that 25592 becomes 11 bytes(!!):
32 35 35 39 32 0e 00 00 f8 63 00
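This encoding is easy to reproduce. Here's a Python sketch (the helper name is mine) that serializes a small non-negative integer the same way:

```python
import struct

def zx_number(n):
    """Serialize an integer constant the way Spectrum BASIC stores it:
    ASCII digits, the 0x0e marker, then 00, sign byte, 16-bit value, 00."""
    ascii_part = str(n).encode('ascii')
    binary_part = struct.pack('<BBBHB', 0x0e, 0x00, 0x00, n, 0x00)
    return ascii_part + binary_part

print(zx_number(25592).hex())   # -> 32353539320e0000f86300
```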

Also, the final command, RANDOMIZE USR n, contains the byte sequence f9 c0. Whenever you see this in Spectrum BASIC, it's effectively a CALL/JP command.

As mentioned above, I wrote a Python script to calculate the checksum bytes for me. Running it on the BASIC stub binary and a compiled asm binary gave me two .TAP files: one for the BASIC loader, and one for the hello world program.

.TAP is brilliant because you can cat a.tap b.tap > c.tap and suddenly have a complete tape file. I tested b.tap with this binary code, assembled as-is with nothing else:

%org $8000

    xor a
    ld [WorkRAM_a], a   ; counter = 0
Loop:
    ld a, [WorkRAM_a]
    inc a
    cp 8
    jr nz, .ok          ; wrap the counter at 8
    xor a
.ok:
    ld [WorkRAM_a], a
    out [ZX_IOPORT], a  ; write the counter out to the port
    jp Loop

WorkRAM_a: rb 1

And it worked! With a little bit more work and bash nonsense, I have a one-click script that will assemble a ZX Spectrum .TAP image (with loader!) to an address I specify from a single assembly listing.

Comment if you are interested in learning more.

Friday, January 31, 2020

Wait for VSYNC in DOS (inline assembly)

#define VID_STATUS 03dah      /* VGA input status register 1 */
#define VSYNC_MASK 00001000b  /* bit 3 = vertical retrace in progress */

int main()
{
    while(1)         // forever loop
    {
    DrawWait:
        asm {
            mov dx, VID_STATUS
            in al, dx
            test al, VSYNC_MASK  // wait for the retrace bit
            jz DrawWait
        }
        /* ... draw the frame here ... */
    }
    return 0;
}

This method is mostly elegant, but depending on your game and environment, you may need a second wait (for the z flag to be set, i.e. for the retrace bit to clear) to ensure the draw routine only runs once per retrace.

There IS another method (thanks: michaelangel007), perhaps not as great for games, but could be used for a variety of other purposes, and that is to reprogram the timer for a value other than 18.2/s. 
Reference link

Thursday, January 30, 2020

Reading keyboard input in DOS w/ Turbo C++ and assembly (Jill of the Jungle, Wolf 3D)

It's fairly easy to find a working copy of Turbo C++ 3.0, and it compiles and links .EXEs without issue in DOSBOX. However a couple problems are immediately apparent.

The CLK_TCK constant is baked in at install time, and is set to 18.2 on my copy. clock() needs an accurate CLK_TCK to return a meaningful reading -- which it can't. Since it's impossible to determine a safe value, there needs to be a better way, across all CPU speeds, to measure time at a higher resolution.

Obviously there has to be one at the machine level, but there are no higher-resolution timing facilities in TC++3, so that leaves a raw assembly block. HOORAY!! This *IS* actually good news. DOS machines have a standardized BIOS, and DOS runs on the x86 instruction set. Any DOS machine anywhere will be able to run our machine language routines.

This is way easier than writing assembly routines that work across all Windows and UNIX systems. Blegh.


Cycle rate in DOSBOX is how many instructions to attempt to perform *every millisecond*. This is not exactly analogous to cycles/second. If you are targeting a certain CPU, like I am (a 16MHz 386 sounds appropriate), then you should look up how many MIPS that CPU can perform and divide it by 1000. For my purposes, 2000 cycles (approximately 2 MIPS) is roughly the 16MHz 386DX target -- the top CPU of 1985. In my DOSBOX config, I set CPU type to 386 and cycles to fixed 2000.

Now that my CPU speed is set properly, let's spit out a quick program that will hook into the keyboard interrupt -- note this is THE ONLY ACCEPTABLE WAY to read keyboard input!!

#include <iostream.h>
#include <dos.h>

unsigned char KEYSCAN;

void interrupt ( *oldkb)(...);   // saved pointer to the old handler

void interrupt newKb(...)
{
    asm { /* keyboard-reading code goes here */ }
}

int main()
{
    while(1)
    {
        cout << KEYSCAN;
        if (KEYSCAN == 0x01) { return 0; }   // ESC scancode quits
    }
    return 0;
}

Now, what we want to do is detect whenever a key is pressed or released. The keyboard controller will send an interrupt (in DOS this IRQ is vectored at 09h), and we are expecting a value of 0-255 (the keyboard scan code sent by the peripheral control chip), which we will output using cout as a test. 

DOS standard uses (apparently) either the 8255 PPI, or in the case of PS/2, the Intel 8042. 
I don't know when or what software would actually address the 8042 natively - all I can find regarding DOS keyboard input doesn't seem to ever use 64h (the PS/2 port), only 60/61h (for the 8255). 

Desiring more information, I began to disassemble Jill of the Jungle to figure out how Epic did keyboard input back in the day. They do indeed use an IRQ hook - they check whether the scancode is between 01h and e0h and whether the high bit is set (i.e. a key-up scan code), then convert it to a word and store it in memory.

The routine looks really nasty, and it's almost certainly generated assembly (here's a third of it, by hand so it isn't perfect):
    mov bx, bp      ; 8e dd
    sti             ; fb
    xor ax, ax      ; 33 c0=xor Gv Ev ax ax

;5050h:  e4 60 3c e0 75 09 c7 06 60 3a 00 01 eb 24 90 a8
    in al, 60h
    cmp al, e0h     ; 3c=cmp al,[*]
    jnz +09h        ; 75=jnz [r*]
                    ; c7=mov Ev,Iv
    mov ax, [3a60h] ;  6=00|000|110 ax imm
    jmp +24h        ; eb=jmp [r*]
    cbw             ; 90=cbw
    test al, 80h    ; a8=test al,[*]

The bytes E460 are the only possible way to read in from port 60h, that's how I tracked this in the exe. E464 (in the case of PS/2) doesn't exist, nor does E661 or E664 - meaning nothing is ever written to the PPI, only read. Indeed, at 5091h there's
    mov al, 20h
    out 20h, al 
which is the IRQ acknowledgement, followed by 8 POPs (haha, generated code).

While this is still confusing, it's clear that the preferred way to read keyboard input for gaming (care of Epic Megagames, 1992) is to write your own interrupt handler and track the keyboard state on your own.

So how do we do this? Well, C++'s DOS.H includes exactly what we need. The small program above is the minimum to hook into the keyboard, but we need to be careful with our new assembly routine. With a little inspiration from the Wolfenstein 3D source, we can now write what we need. This goes in the asm {} block:

    in al, 60h
    mov KEYSCAN, al    // Read the KB and store it in KEYSCAN

    in al, 61h
    or al, 80h
    out 61h, al        // flip the top bit of port 61h
    xor al, 80h        //  on and off. This is the keyboard
    out 61h, al        //  "acknowledge" signal

    mov al, 20h
    out 20h, al        // Write 20h to port 20h (IRQ acknowledge)

(asm { } is all that's needed for inline assembly. It just works.) 

Now, in main(), we need to re-orient the BIOS's read key function to our new one. These two lines at the beginning will do what we need:

    oldkb = getvect(0x09);
    setvect(0x09, newKb);

And that's it! 09h points to the address of the DOS BIOS routine we want to overwrite, so we can save that vector for later if we need it with getvect(). setvect() will change that same vector to the interrupt-type method we pass to it, which we've named newKb.

That's it!!! Perfect frame-independent keyboard input obtained.

Now all you have to do is check whether the KEYSCAN is a PRESS event (0-7f) or RELEASE (80-ff) and store it in your input methods. 
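The bookkeeping in your input code then amounts to something like this (a Python sketch of the idea; names are mine):

```python
keys_down = set()

def on_scancode(code):
    """Update the key state table from one raw scancode byte."""
    if code & 0x80:
        keys_down.discard(code & 0x7f)   # 80-ff: RELEASE of key (code & 7f)
    else:
        keys_down.add(code)              # 00-7f: PRESS event
```

Call it with every KEYSCAN value the interrupt captures, and the rest of the game can simply test membership in keys_down.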

Full .cpp file here.

Other sources:
MS-DOS kb scan codes
8086 Opcode chart
MS-DOS EXE file format
8086 MOD-R/M byte information / [2] / byte prefix info

Wednesday, January 15, 2020

MSX Dev, a primer

1. Do I want to code in assembly, BASIC, or something mid-level (C++/Fortran etc)?

This is the most important question. If you're not fluent in Z80, I suggest you learn, because even working with a mid-level language is going to require some knowledge of the hardware. If you are just learning and want to code in BASIC, that's perfectly acceptable, but you need to understand that you won't be able to make anything of any real depth or speed.

When dealing with mid-level compilers you have to deal with their respective quirks and bugs. I am unfamiliar with them, and I find it more fun and satisfying to code in assembly.

2. What kind of environment do I need?

You don't REALLY need anything other than an assembler (tniASM is mine), but VS Code is fairly invaluable for its syntax highlighting and easy build config. I wrote a custom build script in shell that checks for file changes, compiles the new ones, generates a ROM, and launches that ROM in openMSX. I don't need VSC to do that, but when editing in VSC I can run the script with a few keystrokes.

3. What about tools?

This is where I was stuck for a long time, and ended up writing my own. I wrote tools for creating all modes of sprites, bitmaps, and patterns; I disassembled MuSICA, fixed some bugs, repaired it and now I use that as my music driver; and I just threw together a quick tool in BASIC for helping me make sound effects. I can push how great my tools are all day long but, to be honest, most people are going to be so experienced they will want to make their own, too. What you really require depends on the next question, as well...

4. What MSX hardware should I target?

This is the second most important question! There are quite a few standard variants:

RAM: 8/16/64/128k
VRAM: 16/64/128k
VDP: 9918, 9938, 9990
CPU: Z80, R800
Sound: PSG, FM, SCC/+
Format: Cartridge, floppy disk
Joystick: None, 1-button, 2-button
Video: PAL, NTSC

-A good ways into development of Iridescent I realized I could get away with 8kB of RAM, so that's what I'm targeting. 16kB is more 'normal' for MSX-1 games.
-The 9938 is the standard MSX-2 video chip - if you target the 9990 then very few people will get to enjoy your game on hardware.
-The R800 is only available on the Turbo.
-FM and SCC can be on any system, but require expansions. FM is standard on MSX-2+ and many MSX-2 systems, but SCC is cartridge only.
-Carts load much faster than floppy disks, but hold less data overall. Distribution is a bitch and more expensive than floppies. Not everyone has a working floppy drive, either, and drives were not standard on MSX-1.
-As a general rule, MSX-1 games used 1-button joysticks and MSX-2 games used either 1 or 2.
-Making your game support both 50 and 60hz is tricky. IF YOU GENERALLY RUN A PAL MACHINE, CONSIDER DEVELOPING ON NTSC. PAL machines have more cycles per scanline, so if you're using tricky code at 50hz, it's possible your game will not run at all at 60hz! Changing the game and music speed from 60 to 50hz is easier than the other way around!

5. Okay, all set!

Next time, I'll talk about configuring the hardware for your needs... Video modes, expansion slots, binary headers, programmable sound chips ahoy!

Saturday, January 11, 2020

MSX Dev 2020 part 4: 8kB, PSG, 9918

Here is a slowed-down gif of my "smooth scroll" routine:
Difficult to tell from the gif, but it runs at 60fps and only shifts 1 pattern column at a time.

Behind the scenes is much cooler. In instantaneous fashion, the data for the next room (700-some bytes) is loaded from a swapped memory page into RAM and cycled into VRAM as shown above.

9918 limitations:
4 sprites per line
1 color per sprite
29 T-states between VRAM writes (!!!)

Monsters are 2x2 patterns. While there's space for 100 in RAM, there will likely never be that many on-screen at once. Regardless, this (plus projectile flicker) allows 2 players to have 2 colors each - and pattern-based monsters are actually more colorful than sprites would be. The downside is they move in 8-pixel chunks.

The VRAM writes are a big problem. This VDP tutorial is very good, but can be misleading. In my experience, the 9918 VDP ALWAYS needs the 29 T-state wait. On the MSX2, you still need a small wait in between reads and writes. You can use ldir "unlimited" during vblank, but be careful. Relatedly, if you compare carefully timed VDP writes DURING vblank on openMSX and real hardware, you will see discrepancies. This is because emulation down to the microsecond is physically impossible. At any rate, when coding for the MSX1, the suggested method of:

outloop:
    outi            ; write [hl] to the VDP data port, decrement b
    jp nz, outloop  ; outi + jp = exactly 29 T-states on MSX

Works very well (due to the exactness of the timing = 29 T-states).

When working with the 9938, the thing to keep in mind is that there is a MINIMUM time of 5-8 T-states between reads and writes. If you are polling VRAM:

    ld hl, (vram_addr_to_read)
    ld a, l 
    out (VDP_STATUS), a 
    ld a, h 
    out (VDP_STATUS), a 
    nop                 ; give the VDP a moment before the read
    in a, (VDP_DATA)

Otherwise, you will get erroneous graphics.

Using a garden variety of shit (RLE encoding, 3.7kb music player, cartridge paging) I managed to squeeze the requirements for the game down to 8kB of RAM. It looks pretty good so far, I think, and runs on the most bare minimum of hardware.


Thursday, January 2, 2020

PC-8801 d88 disk format for Cosmic Soldier

I worked through the disk for Cosmic Soldier (1985, Kogado) for PC-88 and managed to decipher a lot of it. I wrote a tool to extract the files but it doesn't work on other disks right now.


Cylinders: 40
Sectors per track: 16 in cylinders 1-8; 17 in cylinders 9-40 (the 17th is a dummy)
Total no. sectors: 1280 (1344 including dummy sectors)

First 688 bytes: Disk header
 first 16: all zeroes
 [remainder is unknown atm, likely CP/M code?]

Every sector has a 16-byte header for every 256 bytes of data.
Header looks like this:
[C] [T] [S] 01h [B] 10x00h 01h

C: cylinder, 0-27h (0-39)
T: track, 0-1
S: sector, 00-10h (0-16)
(01h byte)
B: side: 10h if cylinder 0-7, 11h if cylinder is 8-27h
   (8 cylinders = 65kb; likely legacy 16bit address limit)
(ten 00h bytes)
(01h byte)

The 4th, 6th and 16th bytes will change values on other disks. I am assuming the 01s somehow indicate BASIC programs with a file directory listing.

The extra 17th sector in cylinders 8+ (header no. 10h) is always 256*FF bytes.

Directory listing for the disk is at address 27-01-01 (cylinder 39, track 2, sector 1):
16 bytes per file, extends until 27-01-0d - This means there is room for 192 file indexes.

9 bytes: File name
1 byte: File type
  00: Data A (writeable?)
  01: Data B ?
  80: BASIC program
1 byte: Sector address / 8
5 bytes: FFh

"init     ",80h,4Ch
means a BASIC program named "init     " starting at sector 608 (4Ch * 8 in decimal).
Sector 608 / 32 sectors per cylinder = 19 (13h); therefore "init     " starts at sector address 13-00-00. Tracking through the disk shows this to be true. This does not include dummy sectors - 608 is the sector count not including dummies.
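That address arithmetic can be sketched in Python (the helper name and parameters are mine; dummy sectors are excluded from the count, as above):

```python
def dir_entry_chs(addr_byte, sectors_per_track=16, tracks_per_cylinder=2):
    """Directory address byte (= sector / 8) -> (cylinder, track, sector),
    counting only real sectors."""
    sector = addr_byte * 8                       # absolute sector number
    per_cyl = sectors_per_track * tracks_per_cylinder
    return (sector // per_cyl,                   # cylinder
            (sector % per_cyl) // sectors_per_track,  # track (side)
            sector % sectors_per_track)          # sector within track

print(dir_entry_chs(0x4C))   # "init" -> (19, 0, 0), i.e. 13-00-00
```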

The listing at sector address 27-01-0D is the disk's auto-exec code.

Final 3 sectors are a mystery for now.

It took quite a bit of searching, but luckily the Z88dk has a listing of BASIC tokens for N88-BASIC, here:

Shouldn't take much work to convert the code back to its original source.