Starting with this article, you will see tons of C# code. Here, I treat the main concepts and high-level implementation details of the Z80 CPU emulation.

Note: You may ask why I chose the C# programming language and not another one, such as C++. I have a short and a long answer. The short answer is this: I’ve been working with .NET since 2000, and I’m a rabid fan of the framework and the C# programming language. I will share the longer answer in a separate blog post in the future.

In the previous post, I already treated the fundamentals of the Z80 CPU, those that were essential when I designed the emulation.

Design and Implementation Principles Used

I graduated as a software engineer back in 1992 and have participated in over 50 software development projects in almost every role, excluding sales and marketing positions. In recent years I’ve been working as an agile coach and architect, still very close to software construction.

When I started SpectNetIde, I decided to use my favorite software design principles, namely S.O.L.I.D. and K.I.S.S. I could say a lot about them, but I’m sure you know them too, and if not, you can find the information on the internet in a minute. To summarize the value of these principles, I’d say they help to design and implement software with automatic testing in mind.

Devices

Although this post is about Z80 CPU emulation, the long-term objective is a ZX Spectrum IDE, which is a combination of a ZX Spectrum Emulator and a set of Development Tools.

Keeping this thought in mind, a ZX Spectrum emulator is a cohesive set of devices working together. Such devices are the Z80 CPU, the memory, the keyboard, the video display, the tape, and so on. So, one of the most important abstractions is IDevice:
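A minimal sketch of such an abstraction could look like the following. Only the IDevice name and its Reset() method come from the text; CounterDevice is a hypothetical implementation added purely for illustration.

```csharp
/// <summary>Abstraction of a device within the emulated machine.</summary>
public interface IDevice
{
    /// <summary>Puts the device back into its initial state.</summary>
    void Reset();
}

// Hypothetical device, not part of SpectNetIde: it only illustrates the contract.
public class CounterDevice : IDevice
{
    public int Value { get; private set; }

    public void Increment() => Value++;

    // Resetting restores the initial state of the device.
    public void Reset() => Value = 0;
}
```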

As you can see, IDevice is a simple concept: you can Reset() it.

Note: In the SpectNetIde source code, you will find a lot of comments. In the blog post, I omit most of the comments for the sake of brevity. Whenever it has value, I include the namespaces of types, as they help you to look up the corresponding source code file.

The Z80 CPU as a Device

The Z80 is a more versatile device, as the definition of IZ80Cpu shows:

The CPU is a state machine. Several properties (such as Registers, StateFlags, and the others) define its current state. When the CPU executes an instruction (this is the responsibility of ExecuteCpuCycle()), the current state changes accordingly.

IZ80Cpu derives from an interface, IClockBoundDevice:

IClockBoundDevice represents a general device that works with a clock signal. Its Tacts property shows the number of clock cycles spent since the system started.
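The interface hierarchy described above might be sketched as follows. The member names are the ones mentioned in this post; their exact types, the omission of the Registers and StateFlags properties, and the NopOnlyCpu stub are assumptions made to keep the example short and self-contained.

```csharp
public interface IDevice
{
    void Reset();
}

public interface IClockBoundDevice : IDevice
{
    // Number of clock cycles spent since the system started
    long Tacts { get; }
}

public interface IZ80Cpu : IClockBoundDevice
{
    bool IFF1 { get; }
    bool IFF2 { get; }
    byte InterruptMode { get; }
    bool IsInterruptBlocked { get; }

    // Executes one cycle and changes the CPU state accordingly
    void ExecuteCpuCycle();
}

// Hypothetical stub: every cycle behaves like a 4-tact NOP.
public class NopOnlyCpu : IZ80Cpu
{
    public long Tacts { get; private set; }
    public bool IFF1 { get; private set; }
    public bool IFF2 { get; private set; }
    public byte InterruptMode { get; private set; }
    public bool IsInterruptBlocked { get; private set; }

    public void ExecuteCpuCycle() => Tacts += 4;

    // A reset disables interrupts and selects IM 0; in this sketch
    // the clock keeps counting (an assumption).
    public void Reset()
    {
        IFF1 = IFF2 = false;
        InterruptMode = 0;
        IsInterruptBlocked = false;
    }
}
```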

The State of the CPU

As a state machine, the Z80 needs to store its current state vector, which is composed of registers, internal state flags of the CPU, and a few other attributes. Some instructions read and write the memory, or transfer data between the CPU and I/O devices. You can take the memory and these devices into account as a part of the CPU’s state, too.

Registers and Flags

As you already learned, the 8-bit registers of the Z80 can be paired into 16-bit registers; for example, B and C together give BC. For performance reasons, I use the StructLayout and FieldOffset attributes to define the data structure for the Z80 registers:

The [StructLayout(LayoutKind.Explicit)] annotation of the Registers class takes care that we can explicitly control the precise position of each member of the class in unmanaged memory. As you see from the listing, I decorated all fields with the FieldOffset attribute to indicate the position of that field within Registers.

This is how 16-bit registers and their constituting 8-bit pairs are mapped together:

The B and C fields take the locations at offset 3 and 2, respectively, so they precisely overlay with BC. When I assign a value to BC, it affects the memory area of B and C, and thus immediately changes the value of these 8-bit registers, and vice-versa.
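Here is a minimal sketch of the overlay technique, assuming a little-endian host (which is what puts the high-order byte B at the higher offset). The offsets of BC, B, and C follow the text; the AF, A, and F offsets are assumptions, and the class is trimmed to two register pairs.

```csharp
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public class Registers
{
    // 16-bit register pairs
    [FieldOffset(0)] public ushort AF;   // offset assumed for this sketch
    [FieldOffset(2)] public ushort BC;

    // 8-bit registers overlaying the pairs (little-endian host assumed)
    [FieldOffset(0)] public byte F;      // low byte of AF
    [FieldOffset(1)] public byte A;      // high byte of AF
    [FieldOffset(2)] public byte C;      // low byte of BC
    [FieldOffset(3)] public byte B;      // high byte of BC
}
```

Assigning 0x1234 to BC makes B read 0x12 and C read 0x34; writing B afterwards changes the upper byte of BC without any extra code.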

Note: StructLayout and FieldOffset together can help to implement the union construct of C/C++.

You can see a register you have probably not heard about yet: WZ. This is an internal register of the Z80 CPU that helps to put a 16-bit register’s value onto the address bus. The only way to load the contents of these 16-bit registers is via the data bus. Two transfers along the 8-bit data bus are necessary to transfer 16 bits, and this is where WZ helps. You cannot reach the contents of this internal register programmatically.

The Registers class also has read-only properties to obtain flag values. These accessors utilize the FlagSetMask type to get the bits to mask out the individual flags:

Note: Initially I used an enum type, but later refactored it to byte constants. This approach made my flag-related operations shorter as I could avoid unnecessary type casts.

Similarly to FlagSetMask, I have a collection of byte constants that are more useful when setting or resetting individual flags:
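The two sets of byte constants could be sketched like this. The bit positions are those of the Z80 F register; FlagSetMask is the name used in the text, while FlagResetMask, the R5/R3 names for the two unused bits, and the FlagOps helper are hypothetical names chosen for this illustration.

```csharp
// Masks that select a single flag bit of the F register
public static class FlagSetMask
{
    public const byte S = 0x80;   // sign
    public const byte Z = 0x40;   // zero
    public const byte R5 = 0x20;  // unused bit 5
    public const byte H = 0x10;   // half carry
    public const byte R3 = 0x08;  // unused bit 3
    public const byte PV = 0x04;  // parity/overflow
    public const byte N = 0x02;   // add/subtract
    public const byte C = 0x01;   // carry
}

// Complementary masks that clear a single flag bit (hypothetical type name)
public static class FlagResetMask
{
    public const byte S = 0x7F;
    public const byte Z = 0xBF;
    public const byte R5 = 0xDF;
    public const byte H = 0xEF;
    public const byte R3 = 0xF7;
    public const byte PV = 0xFB;
    public const byte N = 0xFD;
    public const byte C = 0xFE;
}

public static class FlagOps
{
    public static bool IsSet(byte f, byte setMask) => (f & setMask) != 0;
    public static byte Set(byte f, byte setMask) => (byte)(f | setMask);
    public static byte Reset(byte f, byte resetMask) => (byte)(f & resetMask);
}
```

Because the masks are plain bytes, the only cast left is the one C# requires after the bitwise operation itself.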

Signal and Interrupt Status

In each execution cycle, the Z80 checks signals. I created an enum type, Z80StateFlags, to represent them:

As you see, Z80StateFlags contains members (with the Inv prefix) that can mask out the individual flag values. The benefit of this approach is that I can keep the states of all signals in a single Z80StateFlags variable and use a simple condition (state == 0, where state is a Z80StateFlags) to check that none of the signals is set; any other value means at least one signal needs attention.

Note: Int, Nmi, and Reset represent the CPU signals with the same names. Halted is an output signal that the CPU uses to tell the external devices it is in the HALTed state.
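A sketch of such an enum follows. The member names come from the text, but the concrete bit values and the SignalOps helper are assumptions made for illustration.

```csharp
using System;

[Flags]
public enum Z80StateFlags : byte
{
    None = 0,

    // Signal bits (values assumed for this sketch)
    Int = 0x01,
    Nmi = 0x02,
    Reset = 0x04,
    Halted = 0x08,

    // Inverted masks that clear a single signal
    InvInt = 0xFE,
    InvNmi = 0xFD,
    InvReset = 0xFB,
    InvHalted = 0xF7
}

public static class SignalOps
{
    public static Z80StateFlags Raise(Z80StateFlags state, Z80StateFlags signal)
        => state | signal;

    public static Z80StateFlags Clear(Z80StateFlags state, Z80StateFlags invMask)
        => state & invMask;

    // A single comparison tells whether every signal is inactive
    public static bool NoSignal(Z80StateFlags state) => state == 0;
}
```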

Earlier, you saw that the IZ80Cpu interface defines a few members related to interrupt state:

IFF1 and IFF2 are two flip-flops (flags) within the Z80. The CPU uses IFF1 to check if a maskable interrupt is enabled. When the CPU starts, this flag is set to zero (disabled). The EI instruction sets it to 1. The purpose of IFF2 is to save the status of IFF1 when a non-maskable interrupt occurs. When a non-maskable interrupt is accepted, IFF1 resets to prevent further interrupts until re-enabled by the program. Therefore, after a non-maskable interrupt is accepted, maskable interrupts are disabled, but the previous state of IFF1 is saved so that the complete state of the CPU just before the non-maskable interrupt can be restored when the interrupt routine completes.

The InterruptMode property retrieves the current mode set by any of the IM 0, IM 1, or IM 2 instructions.

The CPU samples the INT signal with the rising edge of the final clock at the end of any instruction. Even if the maskable interrupt is enabled (IFF1 is true), the normal flow of execution cannot be immediately interrupted. The implementation of the CPU uses the IsInterruptBlocked property to handle this situation.

Clock Cycles

Timing is everything. The Spectrum emulator could not work without it—precisely timed graphics effects would fail, so your favorite games would not run the way you expect.

The IZ80Cpu interface uses its ancestor’s (IClockBoundDevice) member, Tacts, to manage the time spent since the system started:

Tacts counts the clock cycles. Its 64 bits are long enough to count the clock beats until the end of time. You can quickly check this statement: divide the $7fff_ffff_ffff_ffff value by 3,500,000, the clock frequency, then by 86,400, the number of seconds in a day. The result shows the number of days Tacts can be used without overflow. It is a surprisingly high number!

There’s no reason to measure the CPU time in absolute units (let’s say, in nanoseconds). It would just make things more complicated. Of course, when emulating real-time behavior, the time indicated by Tacts should be converted to absolute time. As you will learn from a future article, you need this conversion about 50 or 60 times in a single second. These numbers are almost nothing compared to the 3,500,000 clock cycles per second.
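The check can be written down in a few lines. The class and constant names below are illustrative only; the numbers are the ones from the paragraph above.

```csharp
public static class TactsRange
{
    public const long ClockFrequency = 3_500_000; // clock cycles per second
    public const long SecondsPerDay = 86_400;

    // Whole days a signed 64-bit cycle counter can run before overflowing
    public static long DaysUntilOverflow()
        => long.MaxValue / ClockFrequency / SecondsPerDay;
}
```

DaysUntilOverflow() returns 30,500,568, which is more than 83,000 years of continuous emulation.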

Opcode Processing State

In the previous post, you learned that Z80 has instructions with one, two, or three-byte opcodes.

In a single CPU cycle (M1 machine cycle) the CPU reads only one byte from the program. I use the values of two enum types (OpPrefixMode and OpIndexMode) to keep track of the current opcode processing state:

The Memory and I/O Devices

As I mentioned, the memory and the I/O devices are part of the CPU’s state. The result of operations may depend on the values read from the memory or a device. Similarly, calculated values are persisted in the memory or sent to devices.

Just like the Z80 CPU, the memory and I/O port are devices, and thus implement the IDevice interface. I represent these components with the IMemoryDevice and IPortDevice interfaces, respectively.

Here, I show you only those interface methods that the CPU uses. This is IMemoryDevice:

I guess this definition is straightforward. The only thing you may not understand at this moment is the noContention argument of Read(). Right now, just take it as if it were not there. In a future article, not long from now, I will explain it together with all other aspects of memory and I/O contention.

The definition of IPortDevice is very similar to IMemoryDevice:
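The two interfaces might be sketched side by side as follows. Read() and its noContention argument are named in the text; Write(), ReadPort(), WritePort(), and the two sample implementations are assumptions for this sketch.

```csharp
using System;

public interface IDevice
{
    void Reset();
}

public interface IMemoryDevice : IDevice
{
    byte Read(ushort addr, bool noContention = false);
    void Write(ushort addr, byte value);
}

public interface IPortDevice : IDevice
{
    byte ReadPort(ushort addr);
    void WritePort(ushort addr, byte data);
}

// Hypothetical flat 64K RAM that ignores contention entirely
public class FlatMemory : IMemoryDevice
{
    private readonly byte[] _memory = new byte[0x10000];

    public byte Read(ushort addr, bool noContention = false) => _memory[addr];
    public void Write(ushort addr, byte value) => _memory[addr] = value;
    public void Reset() => Array.Clear(_memory, 0, _memory.Length);
}

// Hypothetical port device with nothing attached: every read sees $FF
public class EmptyPortDevice : IPortDevice
{
    public byte ReadPort(ushort addr) => 0xFF;
    public void WritePort(ushort addr, byte data) { /* nothing listens */ }
    public void Reset() { }
}
```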

Now, you know how the state of the Z80 is stored. Let’s see how the emulation works!

CPU Implementation

I encapsulated all the functionality of the Z80 CPU in the Z80Cpu class, which has this definition:

You already know that IZ80Cpu is an abstraction of the CPU. However, you can see that the Z80Cpu class implements another interface, IZ80CpuTestSupport. But why?

The S in S.O.L.I.D stands for Single Responsibility. IZ80CpuTestSupport defines those methods that are not part of the CPU’s abstraction, but implementing them helps in testing if the implementation works correctly:

With the methods of IZ80CpuTestSupport, I can easily disturb the normal operation of the CPU; for example, I can modify its clock or externally block interrupts. If I added these operations to IZ80Cpu, I could make a programming error, since I could change the clock through any IZ80Cpu instance. By putting them into a separate interface, I can avoid these issues. Of course, in the concrete implementation of the Spectrum emulator, I must use a reference to an IZ80Cpu object and not to a Z80Cpu instance to prevent such a mistake.
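The separation can be illustrated with a heavily reduced sketch. The interface and class names come from the text; SetTacts() and BlockInterrupt() are hypothetical member names, and ExecuteCpuCycle() is a placeholder that only advances the clock.

```csharp
public interface IZ80Cpu
{
    long Tacts { get; }
    bool IsInterruptBlocked { get; }
    void ExecuteCpuCycle();
}

// Operations that only tests should use (member names are assumptions)
public interface IZ80CpuTestSupport
{
    void SetTacts(long tacts);
    void BlockInterrupt();
}

public class Z80Cpu : IZ80Cpu, IZ80CpuTestSupport
{
    public long Tacts { get; private set; }
    public bool IsInterruptBlocked { get; private set; }

    // Placeholder: a real cycle would fetch and execute an opcode
    public void ExecuteCpuCycle() => Tacts += 4;

    public void SetTacts(long tacts) => Tacts = tacts;
    public void BlockInterrupt() => IsInterruptBlocked = true;
}
```

Code holding an IZ80Cpu reference cannot touch the clock unless it deliberately casts to IZ80CpuTestSupport, which is easy to spot in a review.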

Multiple Files

I defined Z80Cpu as a partial class, because I implemented it in multiple files:

File                         Role
Z80Cpu.cs                    Core routines
Z80AluHelpers.cs             Helper methods I use in ALU operations
Z80Operations.cs             The implementation of standard Z80 instructions (with no opcode prefix)
Z80ExtendedOperations.cs     The implementation of extended Z80 instructions (with $ED opcode prefix)
Z80BitOperations.cs          The implementation of Z80 bit manipulation instructions (with $CB opcode prefix)
Z80IndexedOperations.cs      The implementation of indexed Z80 instructions (with $DD or $FD opcode prefix)
Z80IndexedBitOperations.cs   The implementation of indexed Z80 bit manipulation instructions (with $DD, $CB, or $FD, $CB opcode prefixes)
Z80Debug.cs                  The part of the CPU implementation used by the debugger tooling in SpectNetIde

Rock Around the Clock

Earlier you saw that the Tacts property of the CPU is crucial in measuring the number of clock cycles. The Z80Cpu class contains several helpers to make clocking fast and smooth in the code:

As you see, I added methods that emulate the passage of time (by increasing the clock cycle count). Because delays of 1, 2, … 7 clock cycles are typical in the implementation of concrete Z80 instructions, I created named methods for them. To make them as fast as possible, I decorated them with the [MethodImpl(MethodImplOptions.AggressiveInlining)] attribute to let the JIT compiler create inline code when invoking them.

Note: Code inlining means that the compiler inserts the entire function body into the code wherever you invoke the particular function, instead of merely creating the invocation code. In C++, creating inline code is easy. In .NET, it is the task of the JIT compiler. With the MethodImpl attribute, you can give a hint to the JIT compiler to inline the code, but you cannot force it.
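Such helpers could be sketched as shown below. ClockP1() and ClockP3() are named in this post; the CpuClock class, the Delay() method, and the remaining ClockPn() names are assumptions that follow the same pattern.

```csharp
using System.Runtime.CompilerServices;

public class CpuClock
{
    public long Tacts { get; private set; }

    // General-purpose delay (hypothetical name)
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Delay(int ticks) => Tacts += ticks;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP1() => Tacts += 1;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP2() => Tacts += 2;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP3() => Tacts += 3;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP4() => Tacts += 4;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP5() => Tacts += 5;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP6() => Tacts += 6;

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void ClockP7() => Tacts += 7;
}
```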

The Main Execution Cycle

The CPU (as a state machine) works continuously executing a loop, its main execution cycle. Here is how I implement it:

The execution cycle starts with checking whether the CPU receives any new active signals (INT, NMI, or RESET). If it does (ProcessCpuSignals() returns true), the CPU has processed a signal, and this execution cycle completes.

Otherwise, the M1 machine cycle starts:

I use the MaskableInterruptModeEntered flag in the integrated debugger so that I can step over Z80 statements that are part of the currently running maskable interrupt routine. It does not play any role in the Z80 emulation.

The first real task is reading the subsequent opcode from the memory and incrementing the Program Counter. These operations consume the first three clock cycles of M1. Then, as you learned in the previous article, at the end of M1, the CPU refreshes the subsequent memory page (according to R). This is how it happens:

Altogether, the M1 cycle consumes four clock cycles (ClockP3() + ClockP1()).
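The M1 steps can be sketched in isolation like this. The M1Demo class and its flat memory array are assumptions for the example; the refresh logic increments only the lower seven bits of R and preserves bit 7, as the Z80 does.

```csharp
public class M1Demo
{
    public ushort PC;
    public byte R;
    public long Tacts;
    public readonly byte[] Memory = new byte[0x10000];

    public byte FetchOpcode()
    {
        // Read the opcode and move to the next address: three clock cycles
        byte opCode = Memory[PC];
        PC++;
        Tacts += 3;                 // corresponds to ClockP3()

        // Memory refresh: bump the low 7 bits of R, keep bit 7 untouched
        R = (byte)(((R + 1) & 0x7F) | (R & 0x80));
        Tacts += 1;                 // corresponds to ClockP1()
        return opCode;
    }
}
```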

The other parts of the ExecuteCpuCycle() method manage the opcodes and prefixes. When the prefixes and opcodes form a full operation to carry out, one of these three methods is called according to the prefix: ProcessStandardOrIndexedOperations(), ProcessCBPrefixedOperations(), or ProcessEDOperations().

The CPU takes care that an interrupt cannot suspend the normal operation of the CPU until the opcode bytes are entirely collected:
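A simplified sketch of prefix collection, using the OpPrefixMode and OpIndexMode enum names from earlier in this post, may help here. The enum members, the OpcodeCollector class, and its methods are assumptions; the sketch also ignores that in $DD/$FD + $CB sequences the byte following $CB is a displacement.

```csharp
public enum OpPrefixMode : byte { None = 0, Extended, Bit }
public enum OpIndexMode : byte { None = 0, IX, IY }

public class OpcodeCollector
{
    public OpPrefixMode PrefixMode { get; private set; }
    public OpIndexMode IndexMode { get; private set; }
    public bool IsInterruptBlocked { get; private set; }

    // Returns true when the byte completes an operation,
    // false when it was only a prefix and more bytes are needed.
    public bool Feed(byte opCode)
    {
        bool isPrefix = PrefixMode == OpPrefixMode.None && ApplyPrefix(opCode);
        // While prefixes are being collected, interrupts must wait
        IsInterruptBlocked = isPrefix;
        return !isPrefix;
    }

    // Call after the completed operation has been dispatched
    public void Clear()
    {
        PrefixMode = OpPrefixMode.None;
        IndexMode = OpIndexMode.None;
    }

    private bool ApplyPrefix(byte opCode)
    {
        switch (opCode)
        {
            case 0xDD: IndexMode = OpIndexMode.IX; return true;
            case 0xFD: IndexMode = OpIndexMode.IY; return true;
            case 0xCB: PrefixMode = OpPrefixMode.Bit; return true;
            case 0xED:
                // An $ED prefix cancels a preceding $DD or $FD
                PrefixMode = OpPrefixMode.Extended;
                IndexMode = OpIndexMode.None;
                return true;
            default:
                return false;
        }
    }
}
```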

Processing CPU Signals

Every execution cycle starts with examining whether there is a signal the CPU can process. As its name suggests, the ProcessCpuSignals() method carries out this procedure. Its logic is straightforward:

However, there is one thing I have not mentioned yet. The CPU can be halted with the HALT instruction. In this state, the CPU executes NOP (no operation) instructions silently until an interrupt is accepted (a non-maskable one, or a maskable one while enabled) or the CPU is reset. During that time (this is what the M1 machine cycle does), the CPU still takes care of refreshing the memory. When the CPU receives a non-maskable interrupt, this is what it does:

Provided the processor is in halted mode, it steps to the instruction that follows HALT and leaves this mode. It saves the IFF1 flag to IFF2 to preserve the IFF1 value until the interrupt routine completes, and sets IFF1 to false to disable maskable interrupts during that time.

Then, it saves the current value of PC to the stack and jumps to the NMI routine address at $0066.
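The steps above can be sketched as follows. The NmiDemo class, its public fields, and the flat memory array are assumptions for this example, and the clock cycle accounting is omitted.

```csharp
public class NmiDemo
{
    public ushort PC, SP;
    public bool IFF1, IFF2, Halted;
    public readonly byte[] Memory = new byte[0x10000];

    public void ExecuteNmi()
    {
        if (Halted)
        {
            // Step over the HALT instruction and leave halted mode
            PC++;
            Halted = false;
        }

        IFF2 = IFF1;    // preserve the interrupt enable state
        IFF1 = false;   // no maskable interrupts during the NMI routine

        Push(PC);       // return address goes to the stack
        PC = 0x0066;    // fixed NMI routine address
    }

    // The Z80 stack grows downward; the high byte is pushed first
    private void Push(ushort value)
    {
        Memory[--SP] = (byte)(value >> 8);
        Memory[--SP] = (byte)value;
    }
}
```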

When a maskable interrupt is accepted, the logic is similar with some additional tasks:

Here, we handle the halted state and saving the current PC address exactly as earlier. The switch statement handles the three interrupt modes (IM 0, IM 1, and IM 2), respectively. IM 1 (case 1) is the simplest; it merely sets the execution address to $0038. The IM 0 and IM 2 cases are a bit trickier. Both read data from the peripheral device that has raised the interrupt signal. If there is no such device, or it does not put any value to the data bus, the CPU sees a $FF value. ZX Spectrum with no special devices attached works precisely this way.

IM 0 reads one byte from the device and executes the corresponding instruction. The $FF code is the RST $38 instruction, and it calls the routine at the $0038 address. Thus, our code handles IM 0 and IM 1 the same way.

As you remember, IM 2 uses I as the higher-order byte and the value read from the device as the lower-order byte to create a 16-bit address and then uses this vector to read the interrupt handler’s routine address. In the switch statement, the default case handles IM 2. It assumes that the device did not respond with any data (and so the CPU sees $FF), and calculates the routine address accordingly.
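A sketch of the maskable interrupt logic described above, under the same assumption that no device drives the data bus (so the CPU sees $FF), could look like this. The IntDemo class and its fields are hypothetical, and clock cycle accounting is omitted.

```csharp
public class IntDemo
{
    public ushort PC, SP;
    public byte I, InterruptMode;
    public bool IFF1, IFF2, Halted;
    public readonly byte[] Memory = new byte[0x10000];

    public void ExecuteInterrupt()
    {
        if (Halted)
        {
            // Step over the HALT instruction and leave halted mode
            PC++;
            Halted = false;
        }

        // Accepting a maskable interrupt resets both flip-flops
        IFF1 = false;
        IFF2 = false;
        Push(PC);

        switch (InterruptMode)
        {
            case 0:   // bus value $FF is RST $38, same target as IM 1
            case 1:
                PC = 0x0038;
                break;
            default:  // IM 2: I is the high byte, $FF from the bus the low byte
                ushort vector = (ushort)((I << 8) | 0xFF);
                byte low = Memory[vector];
                byte high = Memory[(ushort)(vector + 1)];
                PC = (ushort)((high << 8) | low);
                break;
        }
    }

    // The Z80 stack grows downward; the high byte is pushed first
    private void Push(ushort value)
    {
        Memory[--SP] = (byte)(value >> 8);
        Memory[--SP] = (byte)value;
    }
}
```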

Note: This method sets the MaskableInterruptModeEntered flag to true to tell the debugging tool that execution has entered the maskable interrupt routine. This setting has nothing to do with Z80 emulation.

Executing Instructions

When the CPU has the opcode for an entire operation, it calls one of these methods according to the operation prefix:

  • No prefix, $DD or $FD: ProcessStandardOrIndexedOperations()
  • $ED prefix: ProcessEDOperations()
  • $CB prefix; $DD or $FD followed by $CB: ProcessCBPrefixedOperations()

Each method uses a jump table with addresses of methods that process the operation with the opcode that matches the entry’s index. This is how ProcessEDOperations() works:

The constructor of Z80Cpu invokes the methods that initialize the jump tables (in the listing, it is the InitializeExtendedOpsExecutionTable() method).
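The jump table idea can be sketched with an array of delegates. ProcessEDOperations() and InitializeExtendedOpsExecutionTable() are the names used in the text; the ExtendedOpsDemo class, the _extendedOperations field, and the two sample operations (LD I,A and LD R,A, chosen because they do not affect flags) are assumptions for this example.

```csharp
using System;

public class ExtendedOpsDemo
{
    public byte A, I, R;

    // One slot per opcode; unassigned (null) slots do nothing
    private readonly Action[] _extendedOperations = new Action[0x100];

    public ExtendedOpsDemo()
    {
        InitializeExtendedOpsExecutionTable();
    }

    private void InitializeExtendedOpsExecutionTable()
    {
        _extendedOperations[0x47] = LdIA;   // $ED $47: LD I,A
        _extendedOperations[0x4F] = LdRA;   // $ED $4F: LD R,A
    }

    public void ProcessEDOperations(byte opCode)
    {
        // The opcode is the index; a missing entry is simply skipped
        _extendedOperations[opCode]?.Invoke();
    }

    private void LdIA() => I = A;
    private void LdRA() => R = A;
}
```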

A null entry in the jump table is equivalent to the NOP instruction: the CPU does not change its state. Wherever there is an operation method, that method executes the action that represents the associated instruction. For example, this is the action method for the IM 0, IM 1, and IM 2 operations:

The binary opcodes (after the $ED prefix) for the IM 0, IM 1, and IM 2 operations are these:

  • IM 0: x1x0_0110
  • IM 1: x1x1_0110
  • IM 2: x1x1_1110

Bit 4 and Bit 3 of the opcode define the value for the interrupt mode: 00: IM 0; 01: undefined (we set it to 0); 10: IM 1; 11: IM 2.
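The decoding rule can be sketched as a small pure function. The InterruptModeDecoder class and its Decode() method are hypothetical names; the mapping follows the bit assignment given above.

```csharp
public static class InterruptModeDecoder
{
    // Maps bits 4 and 3 of the opcode that follows $ED to the interrupt mode:
    // 00 -> IM 0, 01 -> undefined (treated as IM 0), 10 -> IM 1, 11 -> IM 2
    public static byte Decode(byte opCode)
    {
        int bits = (opCode & 0x18) >> 3;
        return (byte)(bits < 2 ? 0 : bits - 1);
    }
}
```

For the documented opcodes, $46 decodes to mode 0, $56 to mode 1, and $5E to mode 2.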

Indexed Instructions

Processing an indexed instruction adds a twist to the story. Indexed instructions use separate jump tables, as this code snippet shows:

With the $DD or $FD prefix, the ProcessStandardOrIndexedOperations() method uses the _indexedOperations table.

One entry in that table is INC_IX(). Although the method name suggests it works with IX, it is responsible for processing the index register determined by the current prefix:

The helper methods, SetIndexReg() and GetIndexReg(), take care of using the appropriate register:
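A sketch of how such helpers might select the register follows. INC_IX(), GetIndexReg(), and SetIndexReg() are named in the text; the IndexDemo class, the enum members, and the public fields are assumptions, and the clock handling of the real instruction is left out.

```csharp
public enum OpIndexMode : byte { None = 0, IX, IY }

public class IndexDemo
{
    public ushort IX, IY;
    public OpIndexMode IndexMode;

    // Reads the register selected by the current prefix ($FD means IY)
    public ushort GetIndexReg()
        => IndexMode == OpIndexMode.IY ? IY : IX;

    // Writes the register selected by the current prefix
    public void SetIndexReg(ushort value)
    {
        if (IndexMode == OpIndexMode.IY) IY = value;
        else IX = value;
    }

    // Despite its name, this serves both INC IX and INC IY
    public void INC_IX() => SetIndexReg((ushort)(GetIndexReg() + 1));
}
```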

Next: Instruction Details

By now, you know the fundamentals of the Z80 CPU emulation. You understand state management with registers and other state vector elements. You also have an overview of the execution cycle and of how interrupt requests are processed.

Some details can be best covered by examining individual Z80 instructions. In the next post, you will read about such nitty-gritty things.
