
NES System Timing (CPU + PPU + APU)

We’ve now covered the high level details of the graphical system on the NES. Now we’ll see how the graphics chip, the PPU, is synchronized with the CPU. This is key to avoiding graphical artifacts, and also gives us a better idea of how the system works.

The system clock

Back when we looked at the CPU, we saw that the CPU splits its work up into many little steps, or instructions. The CPU knows when to run the next step using an external signal called a “clock”.

A clock is a simple on-off-on signal, which sets the rhythm used in the system. Typically, when the clock goes from off to on, all the connected devices take this as a signal to do one step of work.

Each combined “on-off” pattern, or one period, is also called a cycle. It is common to talk about durations of events in terms of the number of cycles required to complete them.

Both the CPU and the PPU (and also the sound chip, the APU) are connected to the same clock signal. This is how they can be expected to run in step with one another.
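As a rough sketch of what sharing a clock means in practice: on NTSC consoles each chip divides one master clock down to its own rate. The figures below (a 21.477272 MHz master clock, divided by 12 for the CPU and by 4 for the PPU) are the commonly documented NTSC values rather than something derived in this article, and the constant and function names are made up for illustration.

```rust
/// NTSC master clock in Hz (commonly documented value; an assumption here).
const MASTER_HZ: u64 = 21_477_272;
/// Assumed NTSC dividers: the CPU runs at master/12, the PPU at master/4.
const CPU_DIVIDER: u64 = 12;
const PPU_DIVIDER: u64 = 4;

/// Number of PPU cycles that elapse during `cpu_cycles` CPU cycles.
fn ppu_cycles_for(cpu_cycles: u64) -> u64 {
    cpu_cycles * CPU_DIVIDER / PPU_DIVIDER
}

/// CPU clock rate in Hz. Integer division drops the fractional part,
/// so this is one less than the rounded 1789773 figure used later on.
fn cpu_hz() -> u64 {
    MASTER_HZ / CPU_DIVIDER
}
```

With these dividers, `ppu_cycles_for(1)` is 3: the PPU takes three steps for every CPU step, which is the ratio a cycle-based emulator has to preserve.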

NTSC PPU Timing

In the previous articles on sprites and background, we discussed the functionality provided by the PPU, but not the details of how it accomplishes it.

The PPU can be thought of as running a fixed program. It is responsible for reading the required values from memory so that it can figure out what color to put on screen. It does this in sync with the television, outputting each color at the moment the beam of a CRT would pass over the corresponding pixel of the screen.

The PPU program can then be understood in terms of the television signal timing. In this article I’ll look at the NTSC signal used in North America. Other regions use PAL, which is similar but varies slightly in the timing of various phases.

NTSC Overview

The NES screen is 256 x 240 pixels. Since NTSC is designed to accommodate televisions with a moving beam, time is left at the end of each row, during which the beam is off while it repositions to the start of the next row. There is also a gap after the last row, which allows the beam to get back to the upper left corner of the screen.

Figure: NTSC timing diagram (Source: nesdev). The red region corresponds to when pixels are actually being drawn.

For timing purposes, the PPU can be thought of as drawing a screen of 341 x 262 pixels, although it only outputs a color value for our screen within a window of 256 x 240 pixels. The rows at the bottom, when the NES is not drawing, are when the CPU can safely modify graphical data. The PPU isn’t reading anything here, so there is no risk of graphical corruption.
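To make the vertical layout concrete, here is a minimal sketch that classifies a scanline by what the PPU is doing on it. The split into a post-render line, a vblank block, and a pre-render line follows the layout described on the NESdev wiki, with lines counted from 0. The `Region` enum and `region_of` function are hypothetical names used only for this example.

```rust
/// Scanlines per NTSC frame, counted from 0.
const SCANLINES_PER_FRAME: u16 = 262;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Region {
    /// Lines 0..=239: pixels are being output.
    Visible,
    /// Line 240: idle line right after the picture.
    PostRender,
    /// Lines 241..=260: the vblank flag is set and the PPU is not reading graphics data.
    VBlank,
    /// Line 261: the PPU gets ready for the next frame.
    PreRender,
}

/// Classify a scanline number; values past the end of the frame wrap around.
fn region_of(scanline: u16) -> Region {
    match scanline % SCANLINES_PER_FRAME {
        0..=239 => Region::Visible,
        240 => Region::PostRender,
        241..=260 => Region::VBlank,
        _ => Region::PreRender,
    }
}
```

A simple emulator can lump everything after line 239 together as "not drawing", which is the simplification used later in this article.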

How does the CPU know when the PPU is within this non-drawing or “blanking” region? Fortunately, the PPU can inform the CPU of this through an interrupt.

PPU Timing interrupts

One way the CPU can detect the blanking period is through the vblank NMI. As the name implies, this is an interrupt that will be triggered when the PPU enters the blanking period after drawing one screen.

The CPU can enable this by setting bit 7 of the PPUCTRL MMIO register at 0x2000. If enabled, the CPU will start executing code at the address stored in the NMI vector whenever vblank begins. This allows programmers to run some code at the end of each frame. This code often sets up graphics for the next frame, and writes a notification somewhere so that the main program knows it is free to start work on simulating the next frame afterwards.

While the NES is in the blanking period, the CPU can also observe that the vblank bit within the PPUSTATUS (0x2002) register will be set. NES software can depend on either or both of these, so emulation must provide this functionality.
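One possible way to model these two registers in an emulator is sketched below. The `PpuRegs` struct and its method names are hypothetical and not taken from any particular emulator. The sketch also assumes one hardware detail not covered above: reading PPUSTATUS clears the vblank bit, so a program polling the register sees it set only once per frame.

```rust
/// Bit 7 of PPUCTRL (0x2000): request an NMI when vblank starts.
const CTRL_NMI_ENABLE: u8 = 0b1000_0000;
/// Bit 7 of PPUSTATUS (0x2002): set while the PPU is in vblank.
const STATUS_VBLANK: u8 = 0b1000_0000;

#[derive(Default)]
struct PpuRegs {
    ctrl: u8,
    status: u8,
}

impl PpuRegs {
    /// CPU write to 0x2000.
    fn write_ctrl(&mut self, value: u8) {
        self.ctrl = value;
    }

    /// CPU read from 0x2002. Returns the current status, then clears
    /// the vblank bit (assumed hardware behavior, see the text above).
    fn read_status(&mut self) -> u8 {
        let value = self.status;
        self.status &= !STATUS_VBLANK;
        value
    }

    /// Called by the emulator when vblank begins.
    /// Returns true if an NMI should be delivered to the CPU.
    fn enter_vblank(&mut self) -> bool {
        self.status |= STATUS_VBLANK;
        self.ctrl & CTRL_NMI_ENABLE != 0
    }

    /// Called by the emulator when vblank ends.
    fn leave_vblank(&mut self) {
        self.status &= !STATUS_VBLANK;
    }
}
```

Returning a flag from `enter_vblank` keeps the PPU model independent of the CPU: the caller decides how to deliver the interrupt.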

Emulating the Frame Timing

Now that we know how the CPU and PPU handle timing, we can discuss how this might be implemented in an emulator. There are two primary approaches: a fast but less accurate high-level emulation, or a slower yet more accurate cycle-accurate approach.

High level video timing.

High level video emulation is simpler, so that is what will be addressed first. In this approach, we can let the CPU run until it would be time for the PPU to draw the frame. Then the PPU can output the entire frame at once, and set vblank and fire the interrupt if enabled. Then the CPU runs for a bit while the vblank flag is set. Finally, vblank is set back to 0 and it repeats.

Or roughly:

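// Timing math
const CPU_FREQ: f32 = 1789773.0; // CPU clock in Hz (about 1.79 MHz)
const FRAME_RATE: f32 = 60.0; // target frames per second
const CYCLES_PER_FRAME: f32 = CPU_FREQ / FRAME_RATE;
const CPU_AVG_CYCLES: f32 = 3.0; // rough average cycles per instruction
const CPU_INSTR_PER_FRAME: usize = (CYCLES_PER_FRAME / CPU_AVG_CYCLES) as usize;
// 240 of the 262 lines are visible; the rest is treated as vblank.
const CPU_INSTR_PER_FRAME_ACTIVE: usize = CPU_INSTR_PER_FRAME * 240 / 262;
const CPU_INSTR_PER_FRAME_VBLANK: usize = CPU_INSTR_PER_FRAME - CPU_INSTR_PER_FRAME_ACTIVE;

// cpu_run_instruction and the ppu_* functions are provided by the rest of the emulator.
fn nes_frame() {
    ppu_set_vblank(false);
    // Active period: the CPU runs while the picture would be drawn.
    for _ in 0..CPU_INSTR_PER_FRAME_ACTIVE {
        cpu_run_instruction();
    }
    ppu_draw_frame();
    ppu_set_vblank(true);
    ppu_maybe_nmi(); // fires the NMI only if it is enabled in PPUCTRL
    // Vblank period: the CPU runs with the vblank flag set.
    for _ in 0..CPU_INSTR_PER_FRAME_VBLANK {
        cpu_run_instruction();
    }
}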

For the emulated CPU to think that a PPU is running, it needs to have the NMI triggered at the correct frequency. Real NES hardware draws a frame 60.1 times per second. Since typical computer displays update 60 times per second, many emulators instead run at 60 fps at the cost of running slightly slower than real hardware.

The calculations here are for a 60 fps NES emulator, which leaves about 16.67 ms per frame. A frame is effectively 262 lines, and in this approximation vblank is treated as set for lines 241 through 262.

Remember that the NES CPU clock runs at CPU_FREQ = 1789773 Hz, so one cycle lasts 1/CPU_FREQ seconds. CPU instructions take a varying number of cycles, but for a simple approximation, all instructions can be assumed to take an average of 3 cycles to complete. We can then calculate how many instructions would be completed in a frame, as well as how many instructions occur during the active vs vblank periods.
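To see what numbers this arithmetic produces, the same calculation can be written as a small function. This is only a worked version of the constants shown earlier; `instruction_budget` is a hypothetical helper, and the 240/262 split is the same rough approximation.

```rust
/// Split one frame's worth of instructions into (active, vblank) budgets,
/// using the same approximations as the constants shown earlier.
fn instruction_budget(cpu_hz: f64, frame_rate: f64, avg_cycles: f64) -> (usize, usize) {
    // CPU cycles available in one frame: 1789773 / 60 = 29829.55
    let cycles_per_frame = cpu_hz / frame_rate;
    // Instructions per frame, truncated: 29829.55 / 3 -> 9943
    let instr_per_frame = (cycles_per_frame / avg_cycles) as usize;
    // Share of instructions run while the 240 visible lines are drawn.
    let active = instr_per_frame * 240 / 262;
    // Everything else runs with the vblank flag set.
    let vblank = instr_per_frame - active;
    (active, vblank)
}
```

Calling `instruction_budget(1_789_773.0, 60.0, 3.0)` gives `(9108, 835)`: roughly 9100 instructions while the frame is being drawn and about 800 during vblank.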

Though the timing here is rough, it is actually sufficient for emulating quite a bit of NES software. There are, however, a few other sources of timing that this approximation doesn’t cover.

For software that relies on more exact timing, there is the option of cycle-accurate emulation.

Cycle Accurate Emulation

Some software for the NES is carefully crafted based on very precise timing. Rather than waiting for the interrupt at the end of a frame, it is possible to write code that executes in lockstep with each horizontal line (also known as a scanline). This allows for some effects that cannot be achieved ordinarily.

For this, all observable operations need to take the same amount of time as they do on real hardware. In practice, this means modeling the behavior of each key component, including the CPU, PPU, and APU (audio), down to each cycle. This ensures that any timing effects they might have relative to each other are preserved.
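The core idea can be sketched as a loop that advances the PPU three steps for every CPU cycle. This is a heavily simplified illustration, not a complete model: the `Ppu` struct, `run_lockstep`, and the `cpu_step` callback are hypothetical, and the figures used (341 PPU cycles per scanline, the vblank flag raised on scanline 241 and lowered on scanline 261, both at the second cycle of the line) are the commonly documented NTSC values, assumed here rather than derived.

```rust
const DOTS_PER_SCANLINE: u32 = 341; // assumed NTSC value
const LINES_PER_FRAME: u32 = 262;

#[derive(Default)]
struct Ppu {
    dot: u32,      // position within the current scanline
    scanline: u32, // current scanline, counted from 0
    vblank: bool,  // mirrors the vblank bit in PPUSTATUS
}

impl Ppu {
    /// Advance the PPU by one PPU cycle (one "dot").
    fn step(&mut self) {
        // Handle events tied to the current position before moving on.
        match (self.scanline, self.dot) {
            (241, 1) => self.vblank = true,
            (261, 1) => self.vblank = false,
            _ => {}
        }
        self.dot += 1;
        if self.dot == DOTS_PER_SCANLINE {
            self.dot = 0;
            self.scanline = (self.scanline + 1) % LINES_PER_FRAME;
        }
    }
}

/// Run for `cpu_cycles` CPU cycles, keeping the PPU in lockstep.
/// `cpu_step` stands in for a real CPU core that can advance one cycle at a time.
fn run_lockstep(ppu: &mut Ppu, cpu_cycles: u32, mut cpu_step: impl FnMut()) {
    for _ in 0..cpu_cycles {
        cpu_step();
        // The PPU runs three cycles for every CPU cycle (NTSC).
        for _ in 0..3 {
            ppu.step();
        }
    }
}
```

A real implementation would also fetch tiles and sprites, emit pixels, and raise the NMI inside `step`, and would step the APU alongside the CPU. The loop structure stays the same.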

Accurately capturing the behavior of each of these chips is quite difficult. However, communities like NESdev have collected the results of a variety of experiments over time on the wiki.

With the chiplab, I plan to make it easier to document the behavior and build executable models of these chips. If this sounds interesting to you, consider joining the Discord. If the NES chips are not part of the lab at the time of reading, they will be soon.

Conclusion

And that concludes our investigation of the NES graphics for now. Next up we’ll add some interactivity to our emulator, and learn how the controller works.
