Friday, January 31, 2020

Wait for VSYNC in DOS (inline assembly)

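#define VID_STATUS 03dah      // VGA input status register
#define VSYNC_MASK 00001000b  // bit 3: vertical retrace

int main()
{
    while(1)         // forever loop
    {
    DrawWait:
        asm {
            mov dx, VID_STATUS
            in al, dx
            test al, VSYNC_MASK
            jz DrawWait     // keep polling until the retrace bit is set
        }
        // draw routine goes here
    }
    return 0;
}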

This method is mostly elegant, but depending on your game and environment, you may need a second wait loop, this time until the retrace bit clears again (the z flag is set after the test), to ensure the draw routine only runs once per retrace.

There IS another method (thanks: michaelangel007), perhaps not as great for games but useful for a variety of other purposes: reprogram the timer to tick at a rate other than the default 18.2 times per second.
Reference link
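
As a rough sketch of the arithmetic behind reprogramming the timer: the PC's 8253/8254 timer counts down a 16-bit divisor at about 1,193,182 Hz, and the default divisor of 65536 is what gives 18.2 ticks per second. The helper name pitDivisorFor is my own and not from any library. Actually loading the divisor (write 36h to port 43h, then the low and high bytes to port 40h) is left out here.

```cpp
#include <cstdint>

// Input clock of the PC's 8253/8254 interval timer, in Hz.
const unsigned long PIT_HZ = 1193182UL;

// Hypothetical helper: divisor to load into timer channel 0 for a target tick rate.
// A return value of 0 means 65536, the BIOS default (about 18.2 ticks per second).
std::uint16_t pitDivisorFor(unsigned long ticksPerSecond)
{
    if (ticksPerSecond == 0) return 0;           // treat as "slowest possible"
    // Round to the nearest whole divisor.
    unsigned long divisor = (PIT_HZ + ticksPerSecond / 2) / ticksPerSecond;
    if (divisor > 65535UL) return 0;             // too slow to represent: use 65536
    if (divisor == 0) divisor = 1;               // faster than the input clock: clamp
    return static_cast<std::uint16_t>(divisor);
}
```

For example, pitDivisorFor(60) comes out to 19886, which is very close to 60 ticks per second. If you speed the timer up like this, the original timer interrupt handler presumably still needs to be called about 18.2 times per second so the DOS clock keeps the right time.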

Thursday, January 30, 2020

Reading keyboard input in DOS w/ Turbo C++ and assembly (Jill of the Jungle, Wolf 3D)

It's fairly easy to find a working copy of Turbo C++ 3.0, and it compiles and links .EXEs without issue in DOSBOX. However, a couple of problems are immediately apparent.

The CLK_TCK constant is set at install time, and is set to 18.2 on my copy. clock() needs an accurate CLK_TCK to give a meaningful reading, and it can't rely on one. Since it's impossible to determine a safe value, there needs to be a better way, across all CPU speeds, to measure time at a higher resolution.

Obviously there has to be one at the machine level, but there are no higher resolution time-monitor-thingies in TC++3, so that leaves a raw assembly block. HOORAY!! This *IS* actually good news. DOS machines have a standardized BIOS, and DOS runs the x86 instruction set. Any DOS machine anywhere will be able to run our machine language routines.

This is way easier than writing assembly routines that work across all Windows and UNIX systems. Blegh.


The cycle rate in DOSBOX is how many instructions it attempts to perform *every millisecond*. This is not exactly analogous to cycles/second. If you are targeting a certain CPU, like I am (a 16MHz 386 sounds appropriate), then you should look up how many MIPS that CPU can perform and convert that to instructions per millisecond (instructions per second divided by 1000). For my purposes, 2000 cycles (or approximately 2 MIPS) is roughly the 16MHz 386DX target -- the top CPU of 1985. In my DOSBOX config, I set CPU type to 386 and cycles to fixed 2000.
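
The conversion is simple enough to do in your head, but here it is as a tiny sketch. The function name is made up for illustration, and the MIPS rating itself is something you have to look up for your target CPU.

```cpp
// Hypothetical helper: DOSBOX "cycles" are instructions per millisecond,
// so N MIPS (N million instructions per second) is roughly N * 1000 cycles.
unsigned long dosboxCyclesForMips(double mips)
{
    if (mips <= 0.0) return 0;
    return static_cast<unsigned long>(mips * 1000.0 + 0.5); // round to nearest
}
```

With the roughly 2 MIPS figure used above, dosboxCyclesForMips(2.0) gives the 2000 that goes into the config.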

Now that my CPU speed is set properly, let's spit out a quick program that will hook into the keyboard interrupt -- note this is THE ONLY ACCEPTABLE WAY to read keyboard input!!

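#include <iostream.h>
#include <dos.h>

unsigned char KEYSCAN;           // last scan code read from the keyboard

void interrupt ( *oldkb)(...);   // saved pointer to the original handler

void interrupt newKb(...)
{
    // asm { } block goes here (described further down)
}

int main()
{
    while(1)
    {
        cout << KEYSCAN;
        if (KEYSCAN == 0x01) { return 0; }   // 01h = Esc pressed
    }
    return 0;
}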

Now, what we want to do is detect whenever a key is pressed or released. The keyboard controller will raise an interrupt (IRQ 1, which on DOS machines is vectored to INT 09h), and we are expecting a value of 0-255 (the keyboard scan code sent by the peripheral control chip), which we will output using cout as a test.

DOS standard uses (apparently) either the 8255 PPI, or in the case of PS/2, the Intel 8042. 
I don't know when or what software would actually address the 8042 natively - all I can find regarding DOS keyboard input doesn't seem to ever use 64h (the PS/2 port), only 60/61h (for the 8255). 

Desiring more information, I began to disassemble Jill of the Jungle to figure out how EPIC did keyboard input back in the day. They do indeed use an IRQ hook - they check if the scancode is between 01h and e0h, if the high bit is set (ie a keyup scan code), convert it to a word and store it in memory.

The routine looks really nasty, and it's almost certainly generated assembly (here's a third of it, by hand so it isn't perfect):
    mov ds, bp      ; 8e dd=mov Sw,Ew ds bp
    sti             ; fb
    xor ax, ax      ; 33 c0=xor Gv Ev ax ax

;5050h:  e4 60 3c e0 75 09 c7 06 60 3a 00 01 eb 24 90 a8
    in al, 60h
    cmp al, 0e0h    ; 3c=cmp al,[*]
    jnz +09h        ; 75=jnz [r*]
                    ; c7=mov Ev,Iv
    mov word ptr [3a60h], 0100h ;  6=00|000|110 [disp16], then imm 00 01
    jmp +24h        ; eb=jmp [r*]
    nop             ; 90=nop (cbw would be 98)
    test al, 80h    ; a8=test al,[*]

The bytes E4 60 are the direct encoding of in al, 60h (a read through dx would look different, but I found none), and that's how I tracked this in the exe. E464 (in the case of PS/2) doesn't exist, nor does E661 or E664 - meaning nothing is ever written to the PPI, only read. Indeed, at 5091h there's
    mov al, 20h
    out 20h, al 
which is the IRQ acknowledgement, followed by 8 POPs (haha, generated code).

While this is still confusing, it's clear that the preferred way to read keyboard input for gaming (care of Epic Megagames, 1992) is to write your own interrupt handler and track the keyboard state on your own.

So how do we do this? Well, C++'s DOS.H includes exactly what we need. The small program above is the minimum to hook into the keyboard, but we need to be careful with our new assembly routine. With a little inspiration from the Wolfenstein 3D source, we can now write what we need. This goes in the asm {} block:

    in al, 60h
    mov KEYSCAN, al    // Read the KB and store it in KEYSCAN

    in al, 61h
    or al, 80h
    out 61h, al        // flip the top bit of port 61h
    xor al, 80h        //  on and off. This is the keyboard
    out 61h, al        //  "acknowledge" signal

    mov al, 20h
    out 20h, al        // Write 20h to port 20h (IRQ acknowledge)

(asm { } is all that's needed for inline assembly. It just works.) 

Now, in main(), we need to redirect the BIOS's keyboard handler to our new one. These two lines at the beginning will do what we need:

    oldkb = getvect(0x09);
    setvect(0x09, newKb);

And that's it! Interrupt vector 09h holds the address of the BIOS keyboard routine we want to replace, so we save that vector with getvect() in case we need it later (for example, to put it back with setvect(0x09, oldkb) before the program exits). setvect() then points the same vector at the interrupt-type function we pass to it, which we've named newKb.

That's it!!! Perfect frame-independent keyboard input obtained.

Now all you have to do is check whether the KEYSCAN is a PRESS event (0-7f) or RELEASE (80-ff) and store it in your input methods. 
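
Here is one way that press/release check could look, written as plain portable C++ so it can be tried outside DOS. The KeyState struct and feed() are names I made up for this sketch. Skipping E0h (the extended-key prefix that the Jill routine also tests for) is my assumption about what you'd want, not something the code above does.

```cpp
#include <cstdint>

// Sketch of a key-state table fed by raw scan codes (names are illustrative).
// Bit 7 clear = press (00h-7Fh), bit 7 set = release (80h-FFh);
// the low 7 bits identify the key.
struct KeyState {
    bool down[128];

    KeyState() { for (int i = 0; i < 128; ++i) down[i] = false; }

    void feed(std::uint8_t scan)
    {
        if (scan == 0xE0) return;                 // assumed: ignore the extended prefix
        down[scan & 0x7F] = (scan & 0x80) == 0;   // press sets, release clears
    }
};
```

Feeding 01h marks Esc as held and 81h marks it released again. Since KEYSCAN only holds the most recent scan code, a slow main loop could miss events, so it may be safer to update a table like this from inside the interrupt handler itself.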

Full .cpp file here.

Other sources:
MS-DOS kb scan codes
8086 Opcode chart
MS-DOS EXE file format
8086 MOD-R/M byte information / [2] / byte prefix info

Wednesday, January 15, 2020

MSX Dev, a primer

1. Do I want to code in assembly, BASIC, or something mid-level (C++/Fortran etc)?

This is the most important question. If you're not fluent in Z80, I suggest you learn, because even working with a mid-level language is going to require some knowledge of the hardware. If you are just learning and want to code in BASIC, that's perfectly acceptable, but you need to understand that you won't be able to make anything of any real depth or speed.

When dealing with mid-level compilers you have to deal with their respective quirks and bugs. I am unfamiliar with them, and I find it more fun and satisfying to code in assembly.

2. What kind of environment do I need?

You don't REALLY need anything other than an assembler (tniASM is mine), but VS Code is fairly invaluable for its syntax highlighting and easy build config. I wrote a custom build script in shell that checks for file changes, compiles the new ones, generates a ROM, and launches that ROM in openMSX. I don't need VSC to do that, but when editing in VSC I can run the script with a few keystrokes.

3. What about tools?

This is where I was stuck for a long time, and ended up writing my own. I wrote tools for creating all modes of sprites, bitmaps, and patterns; I disassembled MuSICA, fixed some bugs, repaired it and now I use that as my music driver; and I just threw together a quick tool in BASIC for helping me make sound effects. I can push how great my tools are all day long but, to be honest, most people are going to be so experienced they will want to make their own, too. What you really require depends on the next question, as well...

4. What MSX hardware should I target?

This is the second most important question! There are quite a few standard variants:

RAM: 8/16/64/128k
VRAM: 16/64/128k
VDP: 9918, 9938, 9990
CPU: Z80, R800
Sound: PSG, FM, SCC/+
Format: Cartridge, floppy disk
Joystick: None, 1-button, 2-button
Video: PAL, NTSC

-A good ways into development of Iridescent I realized I could get away with 8kB of RAM, so that's what I'm targeting. 16kB is more 'normal' for MSX-1 games.
-The 9938 is the standard MSX-2 video chip - if you target the 9990 then very few people will get to enjoy your game on hardware.
-The R800 is only available on the MSX turbo R.
-FM and SCC can be on any system, but require expansions. FM is standard on MSX-2+ and many MSX-2 systems, but SCC is cartridge only.
-Carts load much faster than floppy disk, but contain overall less data. Distribution is a bitch and more expensive than floppies. Then again, not everyone has a working floppy drive, and drives were not standard on MSX-1.
-As a general rule, MSX-1 games used 1 button joysticks and MSX-2 games used either 1 or 2.
-Making your game support both 50 and 60hz is tricky.
-IF YOU GENERALLY RUN A PAL MACHINE, CONSIDER DEVELOPING ON NTSC. PAL machines have more CPU cycles per frame, so if you're using tricky code at 50hz, it's possible your game will not run at all at 60hz!
-Changing the game and music speed from 60 to 50hz is easier than the other way around!
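
To put a rough number on the PAL/NTSC difference, here is a small sketch. It assumes the standard 3.579545 MHz Z80 clock and nominal 50/60hz refresh rates (real machines are slightly off those), and the function name is my own.

```cpp
// Z80 clock of a standard MSX, in Hz.
const unsigned long MSX_CPU_HZ = 3579545UL;

// Approximate CPU cycles (T-states) available per video frame
// at a nominal refresh rate.
unsigned long cyclesPerFrame(unsigned refreshHz)
{
    return refreshHz ? MSX_CPU_HZ / refreshHz : 0;
}
```

cyclesPerFrame(60) is 59659 and cyclesPerFrame(50) is 71590, so code tuned to just fit in a 50hz frame has about 20% less time to work with at 60hz.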

5. Okay, all set!

Next time, I'll talk about configuring the hardware for your needs... Video modes, expansion slots, binary headers, programmable sound chips ahoy!

Saturday, January 11, 2020

MSX Dev 2020 part 4: 8kB, PSG, 9918

Here is a slowed-down gif of my "smooth scroll" routine:
Difficult to tell from the gif, but it runs at 60fps and only shifts 1 pattern column at a time.

Behind the scenes is much cooler. In instantaneous fashion, the data for the next room (700 some bytes) is loaded from a swapped memory page into RAM and cycled into VRAM as shown above.

9918 limitations:
4 sprites per line
1 color per sprite
29 T-states between VRAM writes (!!!)

Monsters are 2x2 patterns, and while there's space for 100 in RAM, there will likely never be that many on-screen at once. Regardless, this and projectile flicker allow 2 players to have 2 colors, and monsters are actually more colorful than sprites would be. The downside is that they move in 8 pixel chunks.

The VRAM writes are a big problem. This VDP tutorial is very good, but can be misleading. In my experience, the 9918 VDP ALWAYS needs a 29 T-state wait between writes. On the MSX2, you still need a small wait in between reads and writes. You can use ldir "unlimited" during vblank, but be careful. Related, if you compare carefully timed VDP writes DURING vblank on openMSX and real hardware, you will see discrepancies. This is because emulation timing is never exact down to the microsecond. At any rate, when coding for the MSX1, the suggested method of:

outloop:
    outi                ; 18 T-states on MSX
    jp nz, outloop      ; 11 T-states on MSX

Works very well (due to the exactness of the timing = 29 T-states).
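
For a feel of what 29 T-states per byte costs, here is a back-of-the-envelope sketch. The helper names are hypothetical, and the per-frame budget you pass in is an assumption (about 59659 T-states for a 3.579545 MHz Z80 at a nominal 60hz).

```cpp
// T-states per byte for the write loop on MSX, as quoted above.
const unsigned long TSTATES_PER_BYTE = 29;

// Hypothetical helper: T-states needed to copy `bytes` bytes to VRAM
// with that loop.
unsigned long vramCopyTStates(unsigned long bytes)
{
    return bytes * TSTATES_PER_BYTE;
}

// Share of one frame (whole percent, rounded down) that such a copy takes,
// given the number of CPU T-states available per frame.
unsigned long percentOfFrame(unsigned long bytes, unsigned long tstatesPerFrame)
{
    if (tstatesPerFrame == 0) return 0;
    return vramCopyTStates(bytes) * 100UL / tstatesPerFrame;
}
```

Taking 700 bytes as a stand-in for the room data mentioned earlier, that is 20300 T-states, or roughly a third of a 60hz frame.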

When working with the 9938, the thing to keep in mind is that there is a MINIMUM time of 5-8 T-states between reads and writes. If you are polling VRAM:

    ld hl, (vram_addr_to_read)
    ld a, l 
    out (VDP_STATUS), a 
    ld a, h 
    out (VDP_STATUS), a 
    nop                 ; short delay before the read (5 T-states on MSX)
    in a, (VDP_DATA)

Otherwise, you will get erroneous graphics.

Using a garden variety of shit (RLE encoding, 3.7kb music player, cartridge paging) I managed to squeeze the requirements for the game down to 8kB of RAM. It looks pretty good so far, I think, and runs on the most bare minimum of hardware.


Thursday, January 2, 2020

PC-8801 d88 disk format for Cosmic Soldier

I worked through the disk for Cosmic Soldier (1985, Kogado) for PC-88 and managed to decipher a lot of it. I wrote a tool to extract the files but it doesn't work on other disks right now.


Cylinders: 40
Sectors per track: 17 (16 in cylinders 1-8, 1 dummy sector in 9-40)
Total no. sectors: 1280 (1344 including dummy sectors)

First 688 bytes: Disk header
 first 16: all zeroes
 [remainder is unknown atm, likely CP/M code?]

Every sector has a 16-byte header for every 256 bytes of data.
Header looks like this:
[C] [T] [S] 01h [B] 10x00h 01h

C: cylinder, 0-27h (0-39)
T: track, 0-1
S: sector, 00-10h (0-16)
(01h byte)
B: side: 10h if cylinder 0-7, 11h if cylinder is 8-27h
   (8 cylinders x 2 tracks x 16 sectors x 256 bytes = 65,536 bytes = 64kB; likely legacy 16bit address limit)
(ten 00h bytes)
(01h byte)

The 4th, 6th and 16th bytes will change values on other disks. I am assuming the 01s somehow indicate BASIC programs with a file directory listing.

The extra 17th sector in cylinders 8+ (header no. 10h) is always 256*FF bytes.
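
The header layout described above could be read with something like the following. This is only a sketch of my notes so far: the struct and field names are invented here, and the meaning of the 01h bytes and of B is still a guess.

```cpp
#include <cstddef>
#include <cstdint>

const std::size_t HEADER_SIZE = 16;        // bytes of header before each sector
const std::size_t SECTOR_DATA_SIZE = 256;  // bytes of data after each header

// Fields of the 16-byte sector header as laid out in this post.
struct SectorHeader {
    std::uint8_t cylinder;  // C: 00h-27h
    std::uint8_t track;     // T: 0-1
    std::uint8_t sector;    // S: 00h-10h
    std::uint8_t byte3;     // 01h on this disk, meaning unknown
    std::uint8_t side;      // B: 10h or 11h on this disk
    std::uint8_t byte15;    // 01h on this disk, meaning unknown
};

// Reads one header from a buffer of at least HEADER_SIZE bytes.
// Bytes 5-14 are the ten 00h bytes and are skipped.
SectorHeader parseSectorHeader(const std::uint8_t* p)
{
    SectorHeader h;
    h.cylinder = p[0];
    h.track    = p[1];
    h.sector   = p[2];
    h.byte3    = p[3];
    h.side     = p[4];
    h.byte15   = p[15];
    return h;
}
```

Stepping through the image is then a matter of reading a header, skipping SECTOR_DATA_SIZE bytes of data, and repeating, starting after the 688-byte disk header.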

Directory listing for the disk is at address 27-01-01 (cylinder 39, track 1, sector 1):
16 bytes per file, so 16 entries per sector. It extends up to (but not including) 27-01-0d, which is 12 sectors - This means there is room for 192 file indexes.

9 bytes: File name
1 byte: File type
  00: Data A (writeable?)
  01: Data B ?
  80: BASIC program
1 byte: Sector address / 8
5 bytes: FFh

"init     ",80h,4Ch
Means a BASIC program named "init     " starts in sector 608 (4Ch * 8 = 76 * 8 in decimal)
With 32 sectors per cylinder, 608 / 32 = 19 decimal = 13h; therefore "init     " starts at sector address 13-00-00. Tracking through the disk shows this to be true. Note that 608 is the sector count not including dummy sectors.
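
The same calculation as a small sketch, assuming 2 tracks per cylinder and 16 real sectors per track as on this disk. DiskAddress and startOfFile are names made up for the example.

```cpp
#include <cstdint>

struct DiskAddress {
    unsigned cylinder;
    unsigned track;
    unsigned sector;
};

// Converts the directory's "sector address / 8" byte into
// cylinder-track-sector. Dummy sectors are not counted.
DiskAddress startOfFile(std::uint8_t dirByte)
{
    const unsigned SECTORS_PER_TRACK = 16;
    const unsigned TRACKS_PER_CYLINDER = 2;

    unsigned linear = dirByte * 8u;   // linear sector number
    DiskAddress a;
    a.cylinder = linear / (SECTORS_PER_TRACK * TRACKS_PER_CYLINDER);
    a.track    = (linear / SECTORS_PER_TRACK) % TRACKS_PER_CYLINDER;
    a.sector   = linear % SECTORS_PER_TRACK;
    return a;
}
```

startOfFile(0x4C) gives cylinder 19 (13h), track 0, sector 0, matching the 13-00-00 worked out by hand.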

The listing at sector address 27-01-0D is the disk's auto-exec code.

Final 3 sectors are a mystery for now.

It took quite a bit of searching, but luckily the Z88dk has a listing of BASIC tokens for N88-BASIC, here:

Shouldn't take much work to convert the code back to its original source.