Friday, August 14, 2020

Super fast Raycast 3D engine in LOVE 2D

 (Update: I mistakenly thought during my experimentation that clearing the buffer is necessary each frame. It's not. The article has been updated below)

LOVE is great. Underneath the hood, it's just a Lua interface to render OpenGL textures to a window - which is simple and perfect for any 2D sort of game. 

Well, almost. Older game hardware let you access VRAM (and thus individual pixels) directly and more quickly than modern hardware. We have shaders, but what this allows us to do is GPU-side manipulation on graphics data we've already pushed to the video buffer - it doesn't let us construct data from a tile map, for example. The amount of memory available to the pixel shader is also fairly limited, so you can't use it to extrapolate entire images, for instance.

This means that making retro-style games in LOVE, which often requires manipulating an image as if it were VRAM or raw pixel data, can be unintuitive, because you're not just drawing layers of static images on top of each other. Every time you change the image, you have to recreate it before you redraw it. LOVE warns against this explicitly, because you can very quickly run out of memory and crash.

A perfect example of pixel-based rendering is a 'raycasting' engine, like Wolfenstein 3D, or other simulated 3D games like the original Elite which plotted pixels directly to video to draw wireframe models. These games were developed before 3D accelerator GPUs were a thing, so all rendering is done CPU side. 

Writing directly to memory is super fast, so Wolfenstein 3D could run pretty well on a 386. Unfortunately, in frameworks like LOVE which are built on top of Lua, on top of C, it's not so easy to write directly to memory. Lua tables are extrapolated, inferred, blown apart etc., and the performance hit can be obvious when you're polling several values from tables of tables a couple thousand times per frame, sixty times per second!

There is an absolutely amazing raycasting engine tutorial by lodev for C++ available on his website. If you know C++, you can likely recreate Wolfenstein 3D from scratch in just a couple days! I personally have never written a raycasting engine, and wanted to see if it was possible in LOVE. I went through it, and guess what? It works really well! Unfortunately, I hated the performance (60% or higher CPU usage), thanks to using so many tables.

To draw the image, LOVE has ImageData objects, where you can use the :setPixel() method, but this is slow. About as slow as using Lua tables, in fact, so we want to avoid using this. What we can do instead is treat the ImageData as a vram buffer - we 'draw' everything internally, then push it all at once using an appropriate data structure. In this case, we can use the :replacePixels() method once per frame instead - but we still suffer from the data format limitation.

So how do we work around that? Turns out it's simpler than you might think.

LuaJIT, the ultra speedy single-pass, "just-in-time" compiler for Lua that LOVE uses, offers a library called 'ffi'. The ffi library lets you declare C data types and call C functions directly from Lua. I won't talk about how great this is, but instead I offer this tiny piece of code that fixed all my problems:

local ffi = require 'ffi'
ffi.cdef[[
typedef struct { uint8_t r, g, b, a; } pixel;
]]

This defines for us a new usable C struct named pixel that has four components,
r, g, b, a which are each 8-bit integers. What may not be obvious is that once we initialize a variable
of type 'pixel' then it will be a byte-perfect representation (i.e. 4 sequential bytes) within our Lua program.

So this is how it's used:

screenBuffer = ffi.new("pixel[?]", screenWidth*screenHeight)
bufSize = ffi.sizeof(screenBuffer)
drawData = love.image.newImageData(screenWidth, screenHeight, "rgba8",
ffi.string(screenBuffer, bufSize))
drawBuffer = love.graphics.newImage(drawData)
This code initializes a struct array of type "pixel" of ? elements, where ? is the number you pass as the second
argument. ffi.sizeof() grabs the size in bytes of the object. We need this for the next line, which creates a new
LOVE ImageData object in the correct format (rgba8, or 8 bits each of red, green, blue, and alpha for each
pixel). ffi.string() copies bufSize bytes, starting at screenBuffer, into a Lua string that LOVE accepts as raw
pixel data. Finally, the drawBuffer image (what is actually put on the screen) is initialized from this ImageData.
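To make the "byte-perfect" idea concrete, here is a rough analogue using Python's ctypes module. This is not LOVE or LuaJIT code; the names Pixel, screen_buffer and raw are made up for illustration. It only shows that an array of four-byte structs is laid out as one flat run of RGBA bytes.

```python
import ctypes

# Analogue of: typedef struct { uint8_t r, g, b, a; } pixel;
class Pixel(ctypes.Structure):
    _fields_ = [
        ("r", ctypes.c_uint8),
        ("g", ctypes.c_uint8),
        ("b", ctypes.c_uint8),
        ("a", ctypes.c_uint8),
    ]

width, height = 4, 2                          # tiny "screen" for the demo
screen_buffer = (Pixel * (width * height))()  # like ffi.new("pixel[?]", n)
buf_size = ctypes.sizeof(screen_buffer)       # like ffi.sizeof(screenBuffer)

# Make the second pixel opaque red.
screen_buffer[1].r = 255
screen_buffer[1].a = 255

# Like ffi.string(screenBuffer, bufSize): copy the raw bytes out.
raw = bytes(screen_buffer)
print(buf_size, raw[4:8])
```

Each pixel occupies exactly four consecutive bytes, which is why the buffer can be handed over as "rgba8" data without any conversion step.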

Don't actually do this (an earlier version of this article said not to forget to clear the screenBuffer every frame):

for i=0,(screenWidth*screenHeight)-1 do
screenBuffer[i].r = 0
screenBuffer[i].g = 0
screenBuffer[i].b = 0
screenBuffer[i].a = 0
end
You don't need to clear the screen buffer if you're tracing the entire ceiling and floor every frame. Omitting
this will cut out a lot of cycles!

Then, when writing pixels to the screen, you do this instead:

local c = textureData[texNum][(ty*textureSize)+tx]
local r, g, b = c.r, c.g, c.b
local px = math.floor((y*screenWidth)+x)
screenBuffer[px].r = r
screenBuffer[px].g = g
screenBuffer[px].b = b
screenBuffer[px].a = 255

Where 'c' is 'color' in the original source linked above, and you write to the screenBuffer indexed linearly
instead of a two-dimensional array. Setting the color and alpha to 0 at the beginning of each frame is the
equivalent of clearing out the graphics buffer.
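As a small illustration of linear indexing, here is a Python sketch that writes one RGBA pixel into a flat byte buffer. The helper names pixel_index and put_pixel are hypothetical, and flooring each coordinate separately is a choice made for this sketch, not something taken from the original source.

```python
import math

def pixel_index(x, y, screen_width):
    """Map 2D screen coordinates to an index into a flat pixel array."""
    return math.floor(y) * screen_width + math.floor(x)

def put_pixel(buf, x, y, rgb, screen_width):
    """Write an opaque RGBA pixel into a flat bytearray (4 bytes per pixel)."""
    i = pixel_index(x, y, screen_width) * 4
    buf[i:i + 4] = bytes((rgb[0], rgb[1], rgb[2], 255))

screen_width, screen_height = 320, 200
buf = bytearray(screen_width * screen_height * 4)

# Fractional coordinates are truncated before they are used as an index.
put_pixel(buf, 3.7, 2.2, (10, 20, 30), screen_width)
idx = pixel_index(3.7, 2.2, screen_width)
print(idx, list(buf[idx * 4:idx * 4 + 4]))
```

Row y starts at y * screen_width, so moving one pixel down is just adding the screen width to the index.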

When converting the math from C++, be veeeery careful of order of operations (the example source is
very lazy) and of data types. Also be aware that LuaJIT, as mentioned above, is one-pass: what this means
in this case is that inline expressions are evaluated when they are encountered, and not before. So, for
expressions that are repeated, factor them out of loops as much as possible.

The other big thing to watch out for is expressions inside of array indexes (these MUST be cast to integers
with math.floor() or the math won't work) and variables that are declared as integers. As long as these are
truncated with math.floor() or some similar operation, then you'll be fine. But if you don't, you'll see results
like this and pull your hair out trying to figure out why:


Once you've done your draw code, you gotta create a new ImageData from the screen buffer byte string. This
is where it gets a little tricky:
drawData = love.image.newImageData(screenWidth, screenHeight, "rgba8",
ffi.string(screenBuffer, bufSize))
if drawData then
drawBuffer:replacePixels(drawData) end
drawData = nil

Remember drawBuffer is the LOVE type image. We only need drawBuffer at the end of our processing,
meaning drawData can be nil'ed out and garbage collected.

The purpose of doing this is unclear, so I'll try to explain. LuaJIT's garbage collection doesn't run super often,
and doesn't go super deep. This is to prevent impacting performance. Unfortunately, the use case of LOVE
where you're generating possibly upwards of 100mb of graphic data per frame WILL cause your game to
overflow and crash (at approx. 2GB).

Nil pointers should get cleaned up during garbage collection and free up that RAM, but we need it to run
faster than it is. Eventually this small app will take several hundreds of megabytes of memory, and we can
certainly trim that down. The solution is remarkably simple:

function love.update(dT)
secondCtr = secondCtr + dT
if secondCtr > 1.0 then
secondCtr = secondCtr - 1.0
collectgarbage()
end
...

collectgarbage() is a built-in Lua function that, when called with no arguments, will perform a full garbage
collection cycle. If you call it as collectgarbage('count') you'll get the memory consumption in kilobytes of Lua
(minus the 50-60MB or so of RAM that LOVE takes up). If you like you can fine-tune this for minimal impact.

At the end of the day you should end up with a result like this that stays around 10-15% CPU and under
100 MB of RAM, depending on your graphics (I even added a sprite):



Tuesday, July 28, 2020

Non-VR (desktop 3D) game template for LOVR

main.lua:

camera = nil
cameraPosition = { x = 0.0, y = 0.0, z = 0.0 }
cameraTarget = { x = 0.0, y = 0.0, z = -1.0 }

function lovr.update()
    camera = lovr.math.newMat4():lookAt(
        vec3(cameraPosition.x, cameraPosition.y, cameraPosition.z), 
        vec3(cameraTarget.x, cameraTarget.y, cameraTarget.z))
    view = lovr.math.newMat4(camera):invert()
end

function lovr.mirror()
    lovr.graphics.clear()
    lovr.graphics.origin()
    lovr.graphics.transform(view)
    renderScene()
end

function lovr.draw()
    -- Do nothing
end

renderScene = function()
    lovr.graphics.print('hello world', 0, 0, -3)
end

conf.lua:

function lovr.conf(t)
    t.identity = 'LOVR Non-VR Boilerplate'
    t.modules.headset = false
end

Thursday, July 23, 2020

Simple lighting for LÖVR (Phong model in GLSL)

[ Complete source linked at end ]

LÖVR is amazing. You should be using it. (This assumes basic knowledge of it, Lua, or at least Love2D/Pico-8 and related project structure).

However, lighting is tricky for the uninitiated. There are no lighting prefabs or constructors -- you must do it all by hand. Luckily, it's not that hard! I shall attempt to explain what I've learned in the last few days.

We've been spoiled by applications that create "lights" for us, so we think of them as objects that cast light within the rendering space. This is not how lighting is done for most video games - casting light itself in a realistic way is extremely GPU-consuming.

What many 3D games do is they will process the color of each pixel on the screen (called a 'fragment' in shader language) based on the angle, distance, and color of the rays of light hitting it, and what color the texture is (if any).

This is done in three phases, in a very common lighting model called the Phong model.

(This tutorial was adapted from the very well-written LearnOpenGL tutorial in C++, found here).

Assuming you already have a project set up and are loading and displaying a model, let's try initializing a custom shader first. To do that, we write a slightly modified OpenGL .vs (vertex shader) which we store as a multi-line string in Lua:

customVertex = [[
    vec4 position(mat4 projection, mat4 transform, vec4 vertex)
    {
        return projection * transform * vertex;
    }
]]

Note for now this is just the default LÖVR vertex shader as listed in the 0.13 documentation.

Now, we define a new shader with customVertex:

shader = lovr.graphics.newShader(customVertex, nil, {})

Note that for the newShader method, passing nil will use the default. Now, to enable the shader, we add to lovr.draw():

lovr.graphics.setShader(shader)

(You may have to setShader() to reset the shader at the end of draw() if you have any issues).
If you run this as-is, it should perform exactly as if you had the default shader. Let's do the same thing for the fragment shader:

customFragment = [[
    vec4 color(vec4 graphicsColor, sampler2D image, vec2 uv) 
    {
        return graphicsColor * lovrDiffuseColor * vertexColor * texture(image, uv);
    }
]]

Changing nil in the newShader line to customFragment should again run with no issues.

Now let's get to ambient lighting!

Phase One

Step one of the Phong model is ambient lighting. Light bounces around everywhere, especially in the daytime, and even rooms without lights can be well-lit. You will likely change your ambient level frequently during the game, so being familiar with its effect on your scene is important.

The default LÖVR shader is "unlit", which means effectively your ambient lighting is at 100% all the time - all angles of all polygons are always fully bright. This is fine for certain things, but for rendering a 3d model in a virtual space, shading is pretty important. For our purposes, we are implementing ambient lighting by "turning down" this unlit effect to about 20% - a good value for rooms in the daytime, but you may find 10% or 30% more to your liking.

Here's the new fragment shader:

customFragment = [[
    uniform vec4 ambience;

    vec4 color(vec4 graphicsColor, sampler2D image, vec2 uv) 
    {
        //object color
        vec4 baseColor = graphicsColor * texture(image, uv);

        return baseColor * ambience;
    }
]]
shader:send('ambience', { 0.2, 0.2, 0.2, 1.0 })

We changed a bit here. First, we added a new 'uniform' variable to represent the ambient light color. Uniform is a keyword that allows us to expose values through the LÖVR / Lua interface, so we can change them freely. We do this with the shader's :send method. Assigning a value to the uniform variable in this way is 'safe' programming - if you try to assign a value to a uniform variable on Android within its declaration in the shader, the game will crash and complain. I set this value to a dark grey. The values correspond to R, G, B, A - though for this case you generally want the alpha value to be 1.0, otherwise anything drawn with this shader will be rendered as transparent.

Second, we are changing a lot about the value being returned.

The original code has graphicsColor (the value of lovr.graphics.setColor()) being multiplied by lovrDiffuseColor - this is a value of { 1.0, 1.0, 1.0, 1.0 }, but for simplicity's sake, I figured let's just not use this value (it's stored in a hidden shader header) and use our own.

Third, we don't need the vertexColor. This is another value which defaults to 1 that is separate from our draw color, and the texture color, and our new ambience color.

This should be a wee bit faster than it was, one would hope, by omitting a few unneeded variables. If you run your game, everything should look -considerably darker- than before. This is good! Now we layer on the diffuse lighting!

Phase Two

A group of vertices is, of course, a polygon. A ray emitting perpendicular from this polygon is the 'normal'. Depending on the angle of the position of the light versus the normals of your in-game models, the polygons are applied a percentage of the light cast. This makes sense and can be easily proven in the real world - the side of a box facing a light is brighter than the sides, which are brighter than the side facing away, etc.

Diffuse lighting simulates some of the bounce effect that ambient lighting does, with added bias on polygons perpendicular to the light source. 

To do this properly, we need to get the position of and normal of the vertex from within the vertex shader -- this means taking a 3d vector that comes "out of" the polygon -- and passing it to the fragment (pixel or color) shader so we know how "bright" to render that spot on the screen. 

The math for all of this is much better explained and proofed elsewhere, including the LearnOpenGL link above, but rest assured it has been done and triple checked a million times by a million people. What we need to know is how to do it in LÖVR!

Luckily, LÖVR loves you, and makes this very easy. Here's the new vertex shader:

customVertex = [[
    out vec3 FragmentPos;
    out vec3 Normal;

    vec4 position(mat4 projection, mat4 transform, vec4 vertex) 
    { 
        Normal = lovrNormalMatrix * lovrNormal;
        FragmentPos = vec3(lovrModel * vertex);
        
        return projection * transform * vertex; 
    }
]]

out is a keyword that simply passes the variable along to the fragment shader when the vertex shader is done. Doing this allows us to use the fragment position in world space and the vertex's normal to calculate our lighting changes. 

[ Special note: Casting and converting vec3 and vec4 can be annoying. Luckily, GLSL makes this easy with the .xyz swizzle on vec4 variables, e.g. we could have done: FragmentPos = (lovrModel * vertex).xyz instead and it would perform the same. ]

In LÖVR, lovrNormal is defined as the vertex's normal, if one exists. Easy - already calculated for us! The reason why we multiply it by lovrNormalMatrix is so that we can get the normals applied to the model's transform - i.e. the position and rotation of the model as well. 

FragmentPos is less self-explanatory, but what we need to know is that this represents the xyz position, in world space, of the current vertex of the model being rendered (lovrModel is the model matrix, which moves the vertex from the model's local space into world space). In other words, a single visible point on the model. 

Now the important part, using that data on our fragment shader:
customFragment = [[
    uniform vec4 ambience;
    
    uniform vec4 liteColor;
    uniform vec3 lightPos;

    in vec3 Normal;
    in vec3 FragmentPos;
    
    vec4 color(vec4 graphicsColor, sampler2D image, vec2 uv) 
    {    
        //diffuse
        vec3 norm = normalize(Normal);
        vec3 lightDir = normalize(lightPos - FragmentPos);
        float diff = max(dot(norm, lightDir), 0.0);
        vec4 diffuse = diff * liteColor;
                        
        vec4 baseColor = graphicsColor * texture(image, uv);            
        return baseColor * (ambience + diffuse);
    }
]]
shader:send('liteColor', {1.0, 1.0, 1.0, 1.0})
shader:send('lightPos', {2.0, 5.0, 0.0})

The math and reasoning for this is explained in the LearnOpenGL tutorial, so here's the important bits for LÖVR:

- liteColor is a new uniform vec4, of values RGBA, that represents the individual light's emissive color
- lightPos is the position in world space the individual light emits light from 
- in is used here to indicate the variables we want from the vertex shader
- normalize() is a built-in GLSL function that scales a vector to unit length, so the dot product depends only on the angle between the two vectors
- we are now returning the baseColor of the fragment times ambience PLUS diffuse - be sure these are added, not multiplied together

If you compile and run now, you should notice a bright light illuminating your scene. Experiment with variables and using the 'send' method (shader:send('liteColor', <new color table>) or shader:send('lightPos', <new position>)) in your draw() loops.
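To see what the diffuse term does with actual numbers, here is a plain Python sketch of the same calculation the fragment shader performs. The helper names (normalize, dot, diffuse_factor) are stand-ins for the GLSL built-ins, and the sample fragment positions and normals are made up.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length, like GLSL normalize()."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    """Dot product of two 3D vectors, like GLSL dot()."""
    return sum(x * y for x, y in zip(a, b))

def diffuse_factor(normal, light_pos, fragment_pos):
    """diff = max(dot(norm, lightDir), 0.0), as in the shader."""
    norm = normalize(normal)
    light_dir = normalize(tuple(l - f for l, f in zip(light_pos, fragment_pos)))
    return max(dot(norm, light_dir), 0.0)

light_pos = (2.0, 5.0, 0.0)  # same lightPos as sent to the shader

# A point on the floor facing straight up, and one facing straight down.
up = diffuse_factor((0.0, 1.0, 0.0), light_pos, (0.0, 0.0, 0.0))
down = diffuse_factor((0.0, -1.0, 0.0), light_pos, (0.0, 0.0, 0.0))
print(round(up, 3), down)
```

The face pointing toward the light gets most of the light color, while the face pointing away is clamped to zero and is lit by ambience alone.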

Almost there!!

Phase Three

Specular lighting does the least changes to individual pixels, but amounts to the most detail. For this implementation, we will be using view space, i.e. x y z of 0, 0, 0, for ease of calculation. If you read the accompanying tutorial, you know that performing these calculations in world space is more realistic. I'm sure you can think of games that use view space calculations -- ones in which the specular light reflections sort of followed your eyes as you moved. Now you know why!

We don't need to make any changes to the vertex shader, so here's the final fragment shader:

customFragment = [[
    uniform vec4 ambience;

    uniform vec4 liteColor;
    uniform vec3 lightPos;
    in vec3 Normal;
    in vec3 FragmentPos;

    uniform vec3 viewPos;
    uniform float specularStrength;
    uniform int metallic;
        
    vec4 color(vec4 graphicsColor, sampler2D image, vec2 uv) 
    {    
        //diffuse
        vec3 norm = normalize(Normal);
        vec3 lightDir = normalize(lightPos - FragmentPos);
        float diff = max(dot(norm, lightDir), 0.0);
        vec4 diffuse = diff * liteColor;
            
        //specular
        vec3 viewDir = normalize(viewPos - FragmentPos);
        vec3 reflectDir = reflect(-lightDir, norm);
        float spec = pow(max(dot(viewDir, reflectDir), 0.0), metallic);
        vec4 specular = specularStrength * spec * liteColor;
            
        vec4 baseColor = graphicsColor * texture(image, uv);            
        return baseColor * (ambience + diffuse + specular);
    }
]]
shader:send('liteColor', {1.0, 1.0, 1.0, 1.0})
shader:send('lightPos', {2.0, 5.0, 0.0})
shader:send('ambience', {0.1, 0.1, 0.1, 1.0})
shader:send('specularStrength', 0.5)
shader:send('metallic', 32.0)
shader:send('viewPos', {0.0, 0.0, 0.0})

viewPos at (0, 0, 0) is fine for a static camera, but we're doing VR, after all! If you have a headset connected, feel free to add this in lovr.update:

function lovr.update(dT)
    if lovr.headset then 
        hx, hy, hz = lovr.headset.getPosition()
        shader:send('viewPos', { hx, hy, hz } )
    end
end

[ Special Note 2: The viewing position (not as much angle) is very important for the effectiveness of specular light. If you move the camera with the WASD keys in the desktop version of lovr (as in, you are running without a headset) then the lighting effect won't look very good. For testing without a headset, in this example, it's best to keep the camera in one position, and rotate it. ]

specularStrength is the 'harshness' of the light. This generally amounts to how sharp or bright the light's reflection can look.

metallic is the metallic exponent as shown in the LearnOpenGL tutorial. This value should probably range from 4-256, but 32 is fine for most things. 

The rest of the math hasn't changed - we're just adding the specular value to the final fragment color. 
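The specular term can be checked the same way outside the shader. This Python sketch mirrors the shader's reflect() and pow() steps. The helper names are stand-ins for the GLSL built-ins, and the test positions are invented for the example.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def reflect(incident, normal):
    """GLSL reflect(I, N) = I - 2 * dot(N, I) * N."""
    d = dot(normal, incident)
    return tuple(i - 2.0 * d * n for i, n in zip(incident, normal))

def specular_factor(normal, light_pos, view_pos, fragment_pos,
                    specular_strength=0.5, metallic=32):
    norm = normalize(normal)
    light_dir = normalize(sub(light_pos, fragment_pos))
    view_dir = normalize(sub(view_pos, fragment_pos))
    reflect_dir = reflect(tuple(-c for c in light_dir), norm)
    spec = max(dot(view_dir, reflect_dir), 0.0) ** metallic
    return specular_strength * spec

# Light directly above an upward-facing point; viewer first in the
# reflection path, then off to the side.
head_on = specular_factor((0, 1, 0), (0, 5, 0), (0, 3, 0), (0, 0, 0))
side = specular_factor((0, 1, 0), (0, 5, 0), (3, 0, 0), (0, 0, 0))
print(head_on, side)
```

Raising the exponent narrows the range of view angles that produce a visible highlight, which is why larger metallic values look shinier.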

And that's it! With any luck, you'll have a properly-lit model like so (lightPos at 2.0, 5.0, 0.0):


There's lots of playing around you can do - experiment with multiple lights, new shaders that are variants on the theme, and explore GLSL. 

[ Special Note 3: For factorization purposes, you can keep the vertex and fragment shader code in separate files (default extension for them is .vs and .fs). You can use the lovr.filesystem.read() command to load them in as strings just like above. The advantage of this is using syntax highlighting or linting when coding your shaders i.e. in VS Code. ]

COMPLETE SOURCE HERE

This will work on your Quest or Go as well if you follow the instructions on the LÖVR website for deploying to Android. I added a moving, unlit sphere in the example to represent the light source to better visualize it.

[ Final Note: If you are having issues with some faces on your models not being lit properly, there are a few things to check on your model.
- First, make sure it is built with a uniform scale. This can easily be done in Blender by selecting a properly scaled piece, then A to select the entire model, then Cmd+A (Apply) -> Scale. There is also the uniformScale shader flag, which gives a small speed boost - you should be developing everything in uniform scale in VR anyway!
- Second, all model faces need to be facing the correct way to generate their normal properly for lighting. If you notice some parts of your model are shading in the opposite direction, you can flip the face direction in Blender by selecting it all in edit mode, then Opt+N > Recalculate Normals or Flip Normals.
These two tips should fix 90% of any issues! ]

Have fun with LÖVR!

Friday, May 15, 2020

Love2D - Simple event stacks with Lua

Love2D and its 3D/VR companion LOVR are great. I won't blab about how awesome they are - though having an entirely open framework means certain things must be built from scratch. One such thing is an event handling system.

Engines like Unity use class inheritance to handle this. Every object in a scene is a GameObject, which has an inherited update method to process itself every frame.

It's possible to do this without much trouble by architecting all of your game entities in a similarly OOP way, but this isn't always intuitive, and can cause unnecessary headache and overhead if your game isn't overly complex, or you want more manual control over your event stacks.

Here's a super simple event stack example using anonymous functions and an event stack table (named 'queue'):

table.insert(queue, function() <code> end)

The most frequent use would likely be to add a global wait in between code blocks:

table.insert(queue, function() wait = 1 end)

This also makes calling functions with parameters and so forth very simple:

table.insert(queue, function() 
        sfx:play()
        ComplexFunction(a, 'b', { c = 0 }) 
    end)

Then in update:

function love.update(TimeDelta)
    if wait > 0 then
        wait = wait - TimeDelta
        love.draw() -- Continue to draw, but don't process stack
        return
    end
    if #queue > 0 then
        if type(queue[1])=='function' then
            local f = queue[1]
            table.remove(queue, 1)
            f()
        end
    end
end

This code is the basis for most of the animation in my game, or when there needs to be a timed wait e.g. to suspend input tracked by a variable named inputEnabled for one second:

function q(o) table.insert(queue, o) end
function setinput(tf) inputEnabled = tf end
q(function() setinput(false) end)
q(function() wait = 1 end)
q(function() setinput(true) end)
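To trace how the stack and the wait timer interact frame by frame, here is the same idea sketched in Python. The EventQueue class, the state dictionary and the fixed 0.5 second time step are all made up for illustration; they are not part of LOVE.

```python
from collections import deque

class EventQueue:
    """One queued function runs per update, unless a wait is pending."""

    def __init__(self):
        self.queue = deque()
        self.wait = 0.0

    def push(self, fn):
        self.queue.append(fn)

    def update(self, dt):
        if self.wait > 0:
            self.wait -= dt       # still waiting: don't process the stack
            return
        if self.queue:
            fn = self.queue.popleft()
            fn(self)              # events receive the queue so they can set wait

state = {"input_enabled": True}
history = []

q = EventQueue()
q.push(lambda eq: state.update(input_enabled=False))
q.push(lambda eq: setattr(eq, "wait", 1.0))
q.push(lambda eq: state.update(input_enabled=True))

# Simulate five frames of 0.5 seconds each.
for _ in range(5):
    q.update(0.5)
    history.append(state["input_enabled"])

print(history)
```

Input is switched off on the first frame, stays off while the one-second wait drains over the next frames, and comes back on once the last queued function runs.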

Lua allows lots of room for freedom in styling your code however you wish.

Sunday, March 8, 2020

Multi-cart data storage on Pico-8

If you've played with Lexaloffle's Pico-8 for a little while, the limitations of the cart storage - not for graphics or sound, but for code and raw data (esp. tokens) - become a bottleneck very quickly.

Multiple cart support has been added to emulate a form of bank-switching, but it is implemented in a way that purposefully blocks your ability to write more code. The memory locations 0x4300 to around 0x6000 cannot be READ or WRITTEN - this is fairly illogical, because memory locations that cannot be either read or written can't really exist. 

You can, however, repurpose cartridge data to store byte data you create - you just have to know how to store it. The data in the cartridge is effectively hex strings in a specific order. Knowing this, we can write a quick tool to convert data we want to store into Pico-8's cartridge text format. 

We can then read it into the fairly large "user data" area of RAM at 0x4300 (in cartridge, this contains our code) and use it as we will. Loading takes a second, so you probably want to load in as much data as you can at once (i.e. entire towns, etc).

You can programmatically store all sorts of data, and use your original cart as a sort of kernel. It will certainly be tricky, and games still won't be EXTREMELY complicated (as is the point of the engine), but having more storage is KEY to making complete games!

As a test, I wrote a text file (i.e. ascii-encoded string bytes) and, using a quick Python script, I converted it to a Pico-8 cart.

Pico-8 Cartridge Text Format:

pico-8 cartridge // http://www.pico-8.com
version 18
__lua__
--Data stored here is inaccessible from the main cart.
--Use this area to describe the stored data instead.
__gfx__
--Data stored here begins at 0x0000 and goes to 0x1fff. 
--It is stored in .p8 as a BACKWARDS hex string, 128 chars by 128 rows.
--e.g. HELLO = 8454c4c4f4 
__gff__
--Data stored here is from 0x3000 to 0x30ff.
--Its format is the same as the gfx section.
__map__
--Data here is 0x2000 to 0x2fff
--It is stored as a normal hex string, 256 chars by 32 rows.
--e.g. HELLO = 48454c4c4f

The three sections above will give you 12,544 bytes of storage per cart (8,192 for gfx, 256 for gff and 4,096 for map), less if you use them for actual graphics and maps. Multiplying that by 15 possible storage banks gives you about 188 KB of non-standard storage, and that doesn't include sfx and music!
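A quick way to check the storage arithmetic, assuming the address ranges listed in the template above are inclusive and that 15 data carts are available (the SECTIONS table and variable names are just for this sketch):

```python
# Inclusive address ranges taken from the cartridge template comments.
SECTIONS = {
    "gfx": (0x0000, 0x1FFF),
    "map": (0x2000, 0x2FFF),
    "gff": (0x3000, 0x30FF),
}

sizes = {name: end - start + 1 for name, (start, end) in SECTIONS.items()}
bytes_per_cart = sum(sizes.values())
total_bytes = bytes_per_cart * 15  # 15 extra carts used purely as data banks

print(sizes, bytes_per_cart, total_bytes)
```

That works out to 12,544 bytes per cart and 188,160 bytes across 15 carts.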

As a note:
The __sfx__ and music blocks are less easy to make use of. A typical sfx test string looks like this within a .p8 file:
000201003f0503f0503f0503f0503f050...
But when you peek the first 10 bytes of SFX ROM @ 0x3200, the values returned are:
63 10 63 10 63 10 63 10 63 10
3f corresponds to 63, then there are 3 characters in between (050) that equal 10 in decimal. Storing and retrieving data from a format like this may be too inefficient or impractical.

In Python, converting byte data to a hex string is fairly easy:

with open("input.bin", 'rb') as file:  # Data to convert
    by = file.read()                   # Read all at once
bstr = format(by[0], '02x')            # First byte as exactly two hex digits
byh = bstr[0]                          # High nibble
byl = bstr[1]                          # Low nibble
outbyte = byl + byh                    # Swap the nibbles for __gfx__
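Iterating that over a whole input might look like the sketch below. The function name to_gfx_rows and its defaults (64 bytes per row, 128 rows, zero padding) are assumptions based on the 128 chars by 128 rows layout described in the template. This is not the tool linked at the end of the post.

```python
def to_gfx_rows(data, row_bytes=64, rows=128):
    """Nibble-swap every byte and split the result into fixed-width hex rows."""
    capacity = row_bytes * rows
    if len(data) > capacity:
        raise ValueError("data does not fit in the __gfx__ section")
    padded = data.ljust(capacity, b"\x00")  # fill unused space with zeros
    lines = []
    for start in range(0, capacity, row_bytes):
        chunk = padded[start:start + row_bytes]
        # Two hex digits per byte, reversed so the low nibble comes first.
        lines.append("".join(format(b, "02x")[::-1] for b in chunk))
    return lines

gfx_rows = to_gfx_rows(b"HELLO")
print(gfx_rows[0][:10])
```

The first ten characters of the first row come out as 8454c4c4f4, matching the HELLO example in the template.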

Iterate the above and paste it into a cart file - then by reading location 0x0000 of the new file (if located under __gfx__), you can convert to string data and print it:


The base cart just does this:

reload(0x4300,0,250,"test.p8")
ts=""
for i=0,249 do
 c=chr(peek(0x4300+i))
 if c=='\\' then
  ts=ts..'\n'
 elseif c~=nil then
  ts=ts..c
 end
end
cls()
print(ts)

(chr() function is defined in the link above). The if block converts any backslash found in the data to a newline character. 

The peek and poke in the screenshot show that the string is actually living in user RAM.

My python tool is very messy (as mine always are!) but it will generate a full cartridge file, warn you if your input data is too large, and fill out all rows to the proper length. You can check out the source here.

Tuesday, February 11, 2020

ZX Spectrum: Detecting in assembly 48k or 128k model

Detecting machine capabilities is just a matter of course in the MSX world. However, in Spectrum land, there wasn't much crossover with 48 and 128k games. Many just came on separate tapes (or separate sides of the tapes) and did not share code.

Some cleverly programmed ones, like Avenger, could detect and run the proper loader.

I tried to disassemble Avenger, but either the dump was bad (it wouldn't load the 128k version) or it uses some trickery I couldn't read. Either way I gave up and searched for my own way.

I couldn't find any discussion on this topic on the net, so I was left to my own devices. I came to realize one clear benchmark for 128 machines is the AY chip. As far as I can tell, no 48k machines had one, and every 128k machine did. Perfect!

Well, I tried a routine that polled the AY I/O port, but it doesn't seem to work. What I did not know is that unbound I/O ports will return floating values - about half the time it returns the value you're checking it against. This makes for very unreliable testing.

The other option is memory paging. I THINK this is what Avenger does - it definitely changes the ROM page to the 48k ROM. I did the following instead:


1. Switch the ROM to page 0 - this is never the 48K ROM on any system, and this code will do nothing on a 48K.

2. Read a byte from the ROM I know is only in 48K - The character "1" from the string "(C) 1982 ..." should work. There is only one version of the 48K ROM, so unless there's something wrong with the system or emulator, this location in ROM (0x153b) should ONLY return '1' on a 48K system.

3. Compare against 0x31 ("1"), and if it differs, we must be on a NON-48K system. In other words, a 128K system (or a 16K, but hopefully nobody will try to run a 48/128 game on a 16K system).

The code looks like this:


As a side note, a secondary check if you REALLY want to make sure you're not on a 16K should be fairly trivial - just find a string byte that is only in that ROM.

Since I can't find any info on this subject, anyone more knowledgeable is welcome to provide alternate solutions - but for now I like this one.

Side note, the gorgeous color scheme is Cobalt in gedit plus the z80 highlight scheme I found on ticalc.org. (install it to a -3.0 folder, not 2.0 like the Readme says).

Saturday, February 8, 2020

The super annoying Speccy VRAM map and pattern printing

The common way to explain the layout of the ZX Spectrum's pixel orientation on its bitmapped VRAM is often quite convoluted and is oriented towards the values of each bit of the VRAM address - useful for plotting single pixels, but not for batch operations.

The Speccy VRAM can be visualized in a few ways to help understand how it's laid out:

1) Similar to an MSX, the ZX has 3 sets of 256x8x8 blocks arranged in a 32x24 grid. From $4000-$47ff is the first set, $4800-$4fff is the second, and $5000-$57ff is the third.

2) Pixel data is oriented in VRAM as if it were a 2048x24 bitmap (with each byte representing 8 pixels for 256x24 bytes), then the 8x8 tiles were scrunched into 256x192.

ONE PIXEL DOWN:
Add 1 to H, every 8 add 32 to L and reset H.
  (if L rolls over, add 8 to H.)
EIGHT PIXELS RIGHT:
Add 1 to L.

This layout can do a couple things with the target VRAM address:

1. inc l will increase the pixel X position across 8 rows (256 bytes per page / 32 columns = 8 rows)
2. inc h will increase the pixel Y position within the first 8 rows, plus the row offset from the l register.
3. Flooding VRAM with patterns is really easy and fast:

    ld hl, $4000    ; VRAM base
    ld b, 12        ; 2 rows per loop * 12 = 24 rows

.printloop:

    ld a, %01010101  ; pixel pattern row 1
  .loop_a:    
    ld [hl], a       
    inc l             
    jr nz, .loop_a   

    inc h            
    
    ld a, %10101010  ; pixel pattern row 2
  .loop_b:    
    ld [hl], a
    inc l 
    jr nz, .loop_b

    inc h           

    dec b
    jr nz, .printloop
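The "one pixel down" rule described earlier can be checked against the usual bit-level address formula with a short Python sketch. The function names are made up for this example. screen_address uses the standard Spectrum bitmap layout, and one_pixel_down follows the H/L steps as written above.

```python
def screen_address(x, y):
    """Address of the byte holding pixel (x, y) in the Spectrum bitmap."""
    return (0x4000
            | ((y & 0xC0) << 5)   # which third of the screen
            | ((y & 0x07) << 8)   # pixel row within the character cell
            | ((y & 0x38) << 2)   # character row within the third
            | (x >> 3))           # byte column (8 pixels per byte)

def one_pixel_down(addr):
    """Add 1 to H; every 8, add 32 to L and reset H; if L rolls over, add 8 to H."""
    h, l = addr >> 8, addr & 0xFF
    h += 1
    if h & 0x07 == 0:        # crossed into the next character row
        h -= 8               # reset H to the top of the cell
        l += 32
        if l > 0xFF:         # L rolled over: move to the next third
            l &= 0xFF
            h += 8
    return (h << 8) | l

print(hex(screen_address(0, 0)), hex(screen_address(0, 1)),
      hex(screen_address(0, 8)), hex(screen_address(0, 64)))
```

Stepping down from any address with one_pixel_down gives the same result as recomputing screen_address for the next line, for every line from 0 to 190.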