⭅ Previous (6502 Available) | Next (Adding 65816 Hardware) ⭆ |
Last time I got a 6502 CPU fully supported in the Chiplab. You can use an in-browser assembler to write your code, then upload it to run against a real 6502.
Now that the chiplab is fully working for a simple chip, it’s time to start adding some more interesting systems. Next up is the 2A03 CPU from the NES, though it presents some new challenges which need to be addressed.
The first chip added is a CMOS version of the 6502. This is produced with a newer process than the original 6502, which used NMOS. The specifics of these two processes are not too important for now. CMOS will refer to this newer 6502, and NMOS will refer to the older original design.
One nice simplification with the CMOS version of this chip is that it supports a “DC” clock frequency. Though the name is a bit of a misnomer, this means that the chip can be run with an arbitrarily slow clock signal, and the behavior will still be correct.
This is not true of digital chips in general, and was not the case for the NMOS 6502 as we’ll see in a bit.
With a CMOS 6502 already working in the lab, you may be wondering why we should care about the older NMOS version of the same chip. The 6502 was a very popular design and instruction set at the time. Since programmers were writing primarily in assembly language, using this chip meant programmers already familiar with 6502 assembly could produce software for your system. This led to the 6502 being used in many systems, sometimes even embedded in other chips.
The NES is one such system that used the 6502. It embedded a minor variation of the 6502 with some custom sound hardware. The resulting chip was the 2A03, produced by Ricoh. Because of when it was designed, it behaves like the NMOS 6502 and doesn’t support a DC clock frequency.
So while the CMOS 6502 gives us a way to explore the 6502 chip in detail, adding an NMOS 2A03 would let us explore the behavior of the custom sound hardware.
Since the NES 2A03 embeds what is very nearly a 6502, the chip exposes nearly all the same pins as the 6502 chip, though many of them are in different positions. So in theory, we could simply change the wiring of our breadboard 6502 prototype board to account for the new positions, and run programs as we did for the 6502. Let’s see what happens.
I have a “looptest” program which I had written to test all the pieces for the 6502 chiplab. This program writes values 0x10 to 0x00 to address 0xCAFE. If the chip is functioning properly, this behavior should be easy to observe on the output pins of the chip.
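As a rough sketch of what “easy to observe” means here, the following Python snippet lists the bus writes one would expect to see from such a program. It is not the actual looptest source; the function name `expected_writes` is hypothetical, and it assumes the countdown from 0x10 to 0x00 is inclusive at both ends.

```python
# Hypothetical model of the bus activity the looptest should produce.
# Assumption: values count down from 0x10 to 0x00, inclusive.
TARGET_ADDRESS = 0xCAFE  # address the looptest writes to, per the text


def expected_writes(start=0x10, stop=0x00):
    """Return the (address, value) pairs expected on the bus, in order."""
    return [(TARGET_ADDRESS, value) for value in range(start, stop - 1, -1)]


writes = expected_writes()
for address, value in writes[:3]:
    # Show the first few writes as they might appear in a bus trace.
    print(f"write 0x{value:02X} -> 0x{address:04X}")
```

Comparing a captured trace against a list like this makes it obvious whether the chip ever reached the program at all.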
However, when attempting to run this program against the 2A03, the chip doesn’t even get as far as reading the reset vector to jump to its starting location.
So why doesn’t this chip work at lower speeds? In researching this chip online, I’ve found that many chips use “dynamic logic”, where internal latches store temporary values as charge on a capacitance. These act like leaky buckets, holding enough charge to maintain their required voltage, and thus their value, only for a short amount of time. They are then recharged on a subsequent clock cycle. This is similar to how DRAM refresh works.
When operated under their designed clock frequency, these latches hold their value just as required. But at much lower frequencies, these latches can’t hold their charge long enough for the next clock cycle.
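The leaky bucket idea can be illustrated with a toy model. This is only a sketch: it assumes the stored charge decays exponentially, and the time constant, supply voltage, and threshold below are made-up illustrative values, not measurements of the 2A03 or any real chip.

```python
import math

# All constants are illustrative assumptions, not real 2A03 parameters.
TAU_S = 10e-6        # hypothetical leak time constant (10 microseconds)
V_FULL = 5.0         # hypothetical voltage right after a refresh
V_THRESHOLD = 2.5    # hypothetical level below which the value is lost


def latch_voltage(elapsed_s, v0=V_FULL, tau_s=TAU_S):
    """Voltage left on the latch after elapsed_s, assuming exponential decay."""
    return v0 * math.exp(-elapsed_s / tau_s)


def holds_value(clock_hz):
    """True if the latch is still above threshold one clock period later."""
    period_s = 1.0 / clock_hz
    return latch_voltage(period_s) >= V_THRESHOLD


for hz in (1_000_000, 100_000, 1_000):
    print(f"{hz:>9} Hz: holds value = {holds_value(hz)}")
```

With these made-up numbers the latch survives a 1 MHz clock but loses its value long before a 1 kHz clock comes around again, which is the qualitative behavior described above.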
How fast would we need to run the 2A03 for it to operate as desired? Though 2A03 datasheets are unavailable, I’ve found an older datasheet for the NMOS 6502, which reports a minimum frequency of 1 MHz. The 2A03 supposedly runs its internal 6502 at 1/12 of the provided clock, so we would need to provide a 12 MHz clock to the 2A03. The other inputs, however, would only need to be updated at the slower 1 MHz cadence, since that is when the internal 6502 clock actually advances.
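The arithmetic is small enough to write down directly. The sketch below uses the two figures from the text (a 1 MHz minimum for the NMOS 6502 and a divide-by-12 inside the 2A03); the constant and function names are hypothetical.

```python
# Figures taken from the text; names are made up for this sketch.
NMOS_6502_MIN_HZ = 1_000_000  # minimum core clock reported by the datasheet
CLOCK_DIVIDER = 12            # 2A03 supposedly divides its input clock by 12


def required_input_clock_hz(core_min_hz=NMOS_6502_MIN_HZ, divider=CLOCK_DIVIDER):
    """Input clock needed so the internal core still sees core_min_hz."""
    return core_min_hz * divider


input_hz = required_input_clock_hz()
# Time available to update the other inputs between core clock edges.
core_period_ns = 1e9 / NMOS_6502_MIN_HZ

print(f"input clock: {input_hz / 1e6:.0f} MHz")
print(f"core period: {core_period_ns:.0f} ns")
```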
While the current chiplab setup isn’t fast enough, perhaps it would be fast enough after some software changes. Let’s look at some quick estimates for the current ATmega-based system, then for a microcontroller-based system in general.
The microcontroller runs at 16 MHz, and nearly all instructions complete in 1 cycle (set-bit and clear-bit take 2, since each acts like a read plus a write). This is nice and simple, since we can estimate speed by counting instructions.
Manipulating a GPIO pin, once it is in output mode rather than input mode, is a single write. Toggling the clock is then two writes (on, off). So we should be able to toggle the clock at 8 MHz. So far so good.
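That estimate can be sketched as a one-line calculation. The names here are hypothetical, and the model is deliberately optimistic: it assumes each GPIO write costs exactly one cycle and ignores any loop or branch overhead.

```python
# Best-case estimate only; ignores loop overhead and any other work.
MCU_CLOCK_HZ = 16_000_000      # microcontroller clock, from the text
CYCLES_PER_GPIO_WRITE = 1      # assumption: a full-port write is 1 cycle
WRITES_PER_CLOCK_PERIOD = 2    # one write for "on", one for "off"


def max_toggle_hz(mcu_hz=MCU_CLOCK_HZ,
                  cycles_per_write=CYCLES_PER_GPIO_WRITE,
                  writes_per_period=WRITES_PER_CLOCK_PERIOD):
    """Upper bound on the clock frequency generated by bit-banging a pin."""
    return mcu_hz // (cycles_per_write * writes_per_period)


print(f"best-case toggle rate: {max_toggle_hz() / 1e6:.0f} MHz")
```

Swapping in 2 cycles per write, as for the set-bit and clear-bit instructions, halves the result.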
Now let’s consider the other logic. Since the microcontroller doesn’t have memory for the entire target program (16K for the 6502 in this example), these bytes are streamed from the host over serial. At 115200 baud (symbols, or roughly bits, per second), that is about 14400 bytes/sec. Each cycle, the host needs to read at least two bytes (the address bus), possibly more (data bus and status pins), and then write back the corresponding data bus value (1 byte).
At 3 bytes per step and 14400 bytes/sec, that gives 4800 round-trip steps/sec, far below the 1 MHz target. Some serial interfaces can go faster than 115200 baud, but the highest speeds are only about 8x this standard limit. Still not fast enough.
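The same estimate as a small sketch, with the function name and parameters being hypothetical. It follows the rough 8 bits per byte figure used above; passing `bits_per_byte=10` models the start and stop bits of common 8N1 framing, which makes the result somewhat worse.

```python
TARGET_HZ = 1_000_000  # step rate needed by the NMOS core, from the text


def steps_per_second(baud, bytes_per_step=3, bits_per_byte=8):
    """Round-trip steps per second the serial link can sustain."""
    bytes_per_second = baud / bits_per_byte
    return bytes_per_second / bytes_per_step


standard = steps_per_second(115200)          # the current setup
fastest = steps_per_second(115200 * 8)       # roughly the fastest serial rate
framed = steps_per_second(115200, bits_per_byte=10)  # with 8N1 framing

for label, rate in (("115200 baud", standard),
                    ("8x baud", fastest),
                    ("115200 baud, 8N1", framed)):
    print(f"{label}: {rate:.0f} steps/sec, "
          f"{TARGET_HZ / rate:.0f}x too slow")
```

Even the optimistic 8x case leaves the link more than an order of magnitude short of the target.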
This current design relies on streaming the target program via serial. Another approach could be to actually attach a ROM chip to the system. If the ROM were programmed once at the start of each program, the host and thus the serial connection could be removed from the critical path. The microcontroller would still need to read the value of each of the pins off the target chip, which would become the new bottleneck.
For debugging digital protocols, there is a tool called a “logic analyzer”, specialized for observing and recording digital signals. Since logic analyzers typically only record, and do not stimulate the target, this would need to be combined with the preprogrammed ROM mentioned previously.
Though logic analyzers can easily record at a high rate, they tend to be pretty expensive. And while they can typically record at frequencies between 20 MHz and 1 GHz, they typically only support around 10 pins or “channels”. Supporting even a single chip from the 80s would thus require multiple logic analyzers and cost hundreds of dollars.
So I’ve started building something custom. This article is already longer than planned, so I’ll talk more about the design in a future post.