Running Chip-8 Programs

Welcome back to this Chip-8 Emulation series. Last time we looked at some basic details of the Chip-8, and how programs are represented.

Now let’s look at what it takes to build a very basic emulator. We’ll start with the pieces needed to store a program, load it, and read its instructions.

Storing a program

We saw that a Chip-8 program is essentially a list of numbers. These numbers are called “opcodes”: each one encodes an operation we want to perform as a number that a computer can store easily.

But how does an emulator find which opcode to run?

The details vary from system to system, but for the Chip-8 the story is simple. To run a program, it is first loaded into system memory, starting at address 0x200. The system starts by running the instruction at 0x200. It then proceeds mostly sequentially, although some instructions jump to other parts of the program.

In order for our emulator to mimic the behavior of the system, we need to implement these pieces in software. Let’s start with memory.

Chip-8 Memory

The first piece of the Chip-8 system we’ll implement is memory. Memory gives us access to many locations for storing values. Each location is referred to by its position, or “address”. The first address is zero, or 0x00, and we count up from there.

While there is no single standard for the Chip-8, most systems provide 4096 (0x1000) bytes of memory, so valid addresses run from 0x000 to 0xFFF.

We can implement this easily in our emulator by creating an array of bytes 4096 elements long.
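
As a minimal sketch, assuming Python as the implementation language, the memory can be a `bytearray`. The names `MEMORY_SIZE` and `memory` are only illustrative choices.

```python
MEMORY_SIZE = 0x1000  # 4096 bytes, the common Chip-8 memory size

# A bytearray is a mutable sequence of bytes, all initialised to zero.
memory = bytearray(MEMORY_SIZE)

memory[0x200] = 0x12   # write one byte at address 0x200
value = memory[0x200]  # read it back
```

A `bytearray` only accepts values from 0 to 255, which matches what a single memory location can hold.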

Loading programs

Now that we have reserved some space for our emulated system’s memory, we should load a program there. A Chip-8 ROM file contains the representation of a program suitable for computer consumption. This means the file contains the instructions represented by their opcodes, rather than a more human-friendly representation.

You can grab a test program from this nice collection on GitHub. I will be looking at the Sierpinski program below. Go ahead and download a .ch8 file.

We can peek inside a .ch8 file with a hex viewer. On Linux and Mac systems, xxd is usually already installed. On Windows you can use something like HxD. Open the file and you’ll see something like this:

$ xxd Sierpinski.ch8
00000000: 1205 4338 5060 0085 0060 0181 50a3 e6f1  ..C8P`...`..P...
00000010: 1ef0 5560 1f8a 0060 008b 00a3 c2f0 65a3  ..U`...`......e.
00000020: c2da b160 01a3 c3f0 5560 1fa4 06f0 5560  ...`....U`....U`
00000030: 01a3 c4f0 55a3 c3f0 6585 0060 0181 0080  ....U...e..`....
00000040: 5080 14a4 07f0 55a3 c4f0 6585 0060 0181  P.....U...e..`..
00000050: 0080 5080 15a3 c5f0 55a3 c4f0 6585 00a3  ..P.....U...e...
00000060: c5f0 65a3 e6f0 1ef0 6586 00a3 c4f0 6587  ..e.....e.....e.
00000070: 0060 0181 0080 7080 14a3 e6f0 1ef0 6581  .`....p.......e.
00000080: 0080 6080 1381 50a3 c6f1 1ef0 55a3 c5f0  ..`...P.....U...
00000090: 6585 00a3 c5f0 65a3 c6f0 1ef0 6581 50a3  e.....e.....e.P.

In case you haven’t looked at a hex dump before: On the far left is the offset of the first byte on that line, counted from the start of the file. In the middle are the actual byte values, written in hexadecimal. On the far right is a “best effort” view of the same bytes as text. Not all values represent printable ASCII characters, so many of the values you see on the right are shown as dots, meaning “not printable”.
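
To make the three columns concrete, here is a rough sketch that rebuilds the first line of the dump above from its 16 bytes. The function name `hexdump_line` is made up for this example, and it only handles full 16-byte lines.

```python
def hexdump_line(offset: int, chunk: bytes) -> str:
    """Format one full 16-byte line roughly the way xxd lays it out."""
    # Middle column: hex digits grouped two bytes (four digits) at a time.
    digits = chunk.hex()
    groups = [digits[i:i + 4] for i in range(0, len(digits), 4)]
    # Right column: printable ASCII as-is, everything else as a dot.
    text = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    return f"{offset:08x}: {' '.join(groups)}  {text}"

# The first 16 bytes of the Sierpinski dump shown above.
first_line = bytes.fromhex("12054338506000850060018150a3e6f1")
print(hexdump_line(0, first_line))
```

Only 0x43, 0x38, 0x50 and 0x60 fall in the printable range here, which is why the text column is mostly dots.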

Nothing magic here. A program is essentially a carefully selected series of numbers or opcodes. In order to load a program into our emulator, we need to read a file from storage, then copy its bytes into the right place in memory. Since programs should be loaded at 0x200, make sure that the first byte of your program goes at 0x200, the second at 0x201, and so forth.
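
A minimal sketch of that copy, assuming the memory is a Python `bytearray`. The names `load_program` and `load_rom_file` are hypothetical helpers, and rejecting oversized programs is one possible choice rather than required behavior.

```python
MEMORY_SIZE = 0x1000
PROGRAM_START = 0x200  # Chip-8 programs are loaded at this address

def load_program(memory: bytearray, program: bytes) -> None:
    """Copy the program bytes into memory, starting at PROGRAM_START."""
    end = PROGRAM_START + len(program)
    if end > len(memory):
        raise ValueError("program does not fit in memory")
    # Same-length slice assignment: byte 0 lands at 0x200, byte 1 at 0x201...
    memory[PROGRAM_START:end] = program

def load_rom_file(memory: bytearray, path: str) -> None:
    """Read a .ch8 file from disk and load its contents."""
    with open(path, "rb") as rom:
        load_program(memory, rom.read())

memory = bytearray(MEMORY_SIZE)
# The first four bytes of the Sierpinski dump, standing in for a whole file.
load_program(memory, bytes([0x12, 0x05, 0x43, 0x38]))
```

With a downloaded file you would call something like `load_rom_file(memory, "Sierpinski.ch8")` instead.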

And with that our program is loaded and ready for reading and executing.

Executing Chip-8 Programs

Now that we have a program loaded into our emulated memory, we can start running it. At a high level, the system just does the following in a loop:

while (true) {
    instruction = read_instruction(PC)
    PC += 2
    switch (instruction) {
        case Instruction_1:
            // execute instruction
            break;
        case Instruction_2:
            // ...
            break;
    }
}

The system keeps track of the current program position with a register called PC, short for program counter. A register is a piece of storage that exists within the system. On real hardware, registers are much faster to access than memory, and so most instructions typically do their main work on registers.

PC is usually stored as a 16 bit value. The largest address in our 4096 byte memory, 0xFFF, only needs 12 bits, and 16 bits is the next convenient size that can hold it. When you update PC, you will likely also want to check that it still points to a valid position in your emulated memory.
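
One way to do that check is sketched below, assuming a 4096 byte memory. The helper `advance_pc` is a made-up name, and raising an error is just one option; an emulator could also halt or wrap around.

```python
MEMORY_SIZE = 0x1000

def advance_pc(pc: int, step: int = 2) -> int:
    """Return the next PC value, refusing to leave emulated memory."""
    new_pc = pc + step
    # An opcode is two bytes long, so new_pc + 1 must also be a valid address.
    if not 0 <= new_pc <= MEMORY_SIZE - 2:
        raise IndexError(f"PC out of range: {new_pc:#06x}")
    return new_pc

pc = 0x200           # programs start here
pc = advance_pc(pc)  # move past the first two-byte instruction
```

Failing loudly like this tends to make bugs in the emulator easier to spot than silently reading past the end of memory.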

Opcode fetching

Chip-8 has byte-addressable memory. This means each memory address refers to a single byte. Opcodes are 16 bit values, so reading an instruction involves reading two bytes.

To reconstruct a 16 bit opcode from two 8 bit values, there are technically two ways to do it. To take an analogy from decimal (base 10) numbers, we could write 12 as 1,2 or 2,1. Do we put the larger value positions first, or the smaller value positions?

The choice is mostly arbitrary, but you need to make sure you follow the right convention, otherwise you will interpret the numbers incorrectly. Chip-8 is what we call a “big-endian” system, which means the most significant byte is stored first, at the lower address.
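
To see the difference, here is a small sketch that reads the same two bytes both ways using Python’s built-in `int.from_bytes`. The bytes are the first two from the Sierpinski dump.

```python
pair = bytes([0x12, 0x05])

# Big-endian: the first byte is the most significant one.
big = int.from_bytes(pair, "big")

# Little-endian: the first byte is the least significant one.
little = int.from_bytes(pair, "little")

print(hex(big), hex(little))
```

The big-endian reading gives 0x1205, while the little-endian reading gives 0x0512 (printed as 0x512), a completely different number.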

To read a 16 bit big-endian value from our byte addressed memory, the logic is essentially:

a = memory[pc]
b = memory[pc+1]
opcode = (a << 8) | b
# or assuming 8 bit a and b, equivalent to
# opcode = a*256 + b
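
The same logic can be wrapped in a small function. This is only a sketch, assuming the memory is a Python `bytearray`; `fetch_opcode` is an illustrative name.

```python
def fetch_opcode(memory: bytearray, pc: int) -> int:
    """Read the 16 bit big-endian opcode stored at address pc."""
    high = memory[pc]      # most significant byte comes first
    low = memory[pc + 1]   # least significant byte follows
    return (high << 8) | low

memory = bytearray(0x1000)
# The first four bytes of the Sierpinski dump, placed at 0x200.
memory[0x200:0x204] = bytes([0x12, 0x05, 0x43, 0x38])

opcode = fetch_opcode(memory, 0x200)
print(f"{opcode:#06x}")
```

Reading at 0x200 combines the bytes 0x12 and 0x05 into the single value 0x1205.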

Conclusion

And with that we have a basic outline for our Chip-8 emulator. Next we’ll start implementing some instructions, and adding other features to our emulator as we need them.

See you on the next one.
