Assemblers, Memory Maps, and Madness

My VIC-20 test program kept corrupting itself. The assembler output was correct—I’d verified every opcode against the reference table—but the program would run for a few frames, then scribble over its own instructions. I spent three days adding debug logging to the CPU emulator before I realized the bug wasn’t in the emulator at all. It was in the address I’d chosen for the program.

The KERNAL loader uses certain zero-page locations as scratch space during the load process. My program also used those locations. The loader ran first, my initialization ran second, and by the time the main loop started, my “initialized” variables had already been overwritten. The assembler had done exactly what I asked. I just hadn’t asked for the right thing.

That debugging session changed how I think about assemblers. They’re not translators; they’re address calculators that happen to emit bytes. Every decision the assembler makes—which opcode, how many bytes, where to place data—depends on addresses it has already committed to. Get the address model wrong, and the bytes are perfect but the program is broken.

The Ambiguity at the Heart of 6502

The core problem: the 6502 lets the same textual form produce different machine code.

LDA $42 loads from zero-page address $42. It assembles to two bytes: A5 42. But LDA $0042 loads from the same address using absolute addressing, which takes three bytes: AD 42 00. The zero-page version is faster (3 cycles instead of 4), smaller, and uses a different opcode. The assembler has to choose between them at parse time, before it can even look up the opcode.

This is the smallest behaviour that shapes the parser. The parser must commit to an addressing mode before anything else can happen.
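
A minimal sketch of that commitment, assuming the mode is picked purely from how the operand is written (one or two hex digits for zero page, three or four for absolute). The function name encode_lda and its error strings are hypothetical; only the opcodes A5 and AD come from the text above.

```rust
/// Encode `LDA <operand>` for a plain hex operand such as `$42` or `$0042`.
/// Hypothetical helper: the addressing mode is decided by the operand's
/// written width, before any opcode lookup happens.
fn encode_lda(operand: &str) -> Result<Vec<u8>, String> {
    let digits = operand
        .strip_prefix('$')
        .ok_or_else(|| format!("expected '$' prefix in '{}'", operand))?;
    if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("bad hex operand '{}'", operand));
    }
    let value = u16::from_str_radix(digits, 16)
        .map_err(|e| format!("bad hex operand '{}': {}", operand, e))?;
    match digits.len() {
        // One or two digits: zero-page form, opcode A5, one operand byte.
        1 | 2 => Ok(vec![0xA5, value as u8]),
        // Three or four digits: absolute form, opcode AD, little-endian address.
        3 | 4 => {
            let [lo, hi] = value.to_le_bytes();
            Ok(vec![0xAD, lo, hi])
        }
        _ => Err(format!("operand '{}' is too wide", operand)),
    }
}
```

With this rule, encode_lda("$42") yields A5 42 and encode_lda("$0042") yields AD 42 00. Some assemblers instead shrink $0042 to the zero-page form automatically; that is a separate design choice this sketch does not make.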

My first parser was a hand-written recursive descent mess. Every edge case spawned another if branch:

($40,X) is indexed indirect, ($40),Y is indirect indexed—completely different operations
ASL with no operand is accumulator mode, and so is ASL A
Branch targets can be labels or numeric offsets
Comments can appear anywhere

After 500 lines of tangled conditions, I still couldn’t handle all the edge cases. So I rewrote the parser using Pest, a PEG grammar library for Rust.

// Addressing modes (order matters - most specific first)
addressing_mode = {
    indexed_indirect |    // ($xx,X)
    indirect_indexed |    // ($xx),Y
    indirect |            // ($xxxx)
    immediate |           // #$xx or #value
    number_with_x |       // $xx,X or $xxxx,X
    number_with_y |       // $xx,Y or $xxxx,Y
    number_only |         // $xx or $xxxx
    accumulator |         // A
    label_ref             // label (branches and jumps)
}

The grammar is 73 lines. The key insight is that PEG alternatives are tried top-to-bottom, so the most specific forms must come first. indexed_indirect has to precede indirect, or ($40,X) gets misrecognized as a malformed indirect. The ordering makes the edge cases explicit in a way that nested if statements never could.
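
The effect of ordering can be modelled without Pest at all. The sketch below is a hand-rolled stand-in for PEG ordered choice, not the grammar from this project: each matcher consumes a prefix, the first success wins, and later alternatives are never revisited. All names except number_with_x and number_only are hypothetical.

```rust
/// A matcher returns how many bytes of the input it consumed, or None.
type Matcher = fn(&str) -> Option<usize>;

/// Matches `$` followed by one or more hex digits.
fn hex_number(s: &str) -> Option<usize> {
    let rest = s.strip_prefix('$')?;
    let digits = rest.chars().take_while(|c| c.is_ascii_hexdigit()).count();
    if digits == 0 { None } else { Some(1 + digits) }
}

/// `$xx` or `$xxxx`
fn number_only(s: &str) -> Option<usize> {
    hex_number(s)
}

/// `$xx,X` or `$xxxx,X`
fn number_with_x(s: &str) -> Option<usize> {
    let n = hex_number(s)?;
    if s[n..].starts_with(",X") { Some(n + 2) } else { None }
}

/// Try alternatives top to bottom; the first one that matches is final.
fn ordered_choice(
    input: &str,
    alternatives: &[(&'static str, Matcher)],
) -> Result<&'static str, String> {
    for &(name, matcher) in alternatives {
        if let Some(used) = matcher(input) {
            // The choice is committed here, even if input is left over.
            return if used == input.len() {
                Ok(name)
            } else {
                Err(format!("unexpected '{}' after {}", &input[used..], name))
            };
        }
    }
    Err("no alternative matched".to_string())
}

/// Most specific alternative first, as in the grammar above.
fn specific_first() -> [(&'static str, Matcher); 2] {
    [("number_with_x", number_with_x as Matcher), ("number_only", number_only as Matcher)]
}

/// The same alternatives in the wrong order.
fn general_first() -> [(&'static str, Matcher); 2] {
    [("number_only", number_only as Matcher), ("number_with_x", number_with_x as Matcher)]
}
```

Given the input $40,X, the specific_first ordering classifies it as number_with_x. The general_first ordering matches number_only on $40, commits, and then reports the leftover ,X as an error.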

Relative Branches: Where the Calculator Shows Through

Branch instructions reveal the assembler’s true nature as an address calculator.

6502 branches use relative addressing: the operand is a signed offset from the next instruction, not from the branch itself. That means the assembler has to know how many bytes it has emitted so far, add the size of the branch instruction (always 2 bytes), and calculate the offset to the target.

// Calculate relative offset: target - (current_address + 2)
// The +2 accounts for the two-byte branch instruction
let offset = (target_addr as i32) - ((address as i32) + 2);

if offset < -128 || offset > 127 {
    return Err(format!(
        "Line {}: Branch target '{}' is out of range ({} bytes, must be -128 to +127)",
        line, label, offset
    ));
}

That + 2 is the hard constraint. Get it wrong and every branch lands two bytes past its target, an error that looks “almost right” until you trace through the actual bytes.
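
The remaining step is turning the signed offset into the single operand byte. A small sketch, assuming a hypothetical helper named branch_operand that takes the branch's own address and the target address:

```rust
/// Compute the operand byte for a 6502 relative branch.
/// `address` is where the branch opcode itself sits; `target` is the label.
/// Hypothetical helper, not the function from the assembler described here.
fn branch_operand(address: u16, target: u16) -> Result<u8, String> {
    // Offsets are measured from the instruction after the two-byte branch.
    let offset = i32::from(target) - (i32::from(address) + 2);
    if !(-128..=127).contains(&offset) {
        return Err(format!(
            "branch out of range ({} bytes, must be -128 to +127)",
            offset
        ));
    }
    // Two's-complement encoding: -3 becomes 0xFD, +1 stays 0x01.
    Ok(offset as i8 as u8)
}
```

A branch at $0201 back to $0200 gives branch_operand(0x0201, 0x0200) = 0xFD. The farthest forward target from $0200 is $0281, which encodes as 0x7F; one byte further is rejected.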

The test suite protects this invariant explicitly:

// At $0200: INX      -> E8
// At $0201: BNE loop -> D0 FD
// Target $0200, so offset = $0200 - $0203 = -3 = $FD
assemble_and_check(
    source,
    &[
        0xE8,       // INX
        0xD0, 0xFD, // BNE -3
    ],
);

The test isn’t checking “does the assembler emit a BNE instruction.” It’s checking “given this address model, are the bytes inevitable?” That’s the assembler’s core promise.

Z80: Same Model, Different Surface

The Z80 assembler has simpler addressing modes but the same underlying structure: normalize input, calculate addresses, emit bytes.

One quirk is directive syntax. Some Z80 source files use .ORG, others use ORG without the dot. Both are common in the wild. The parser normalizes early and lets the rest of the assembler stay simple:

// Normalize: ensure directive name has dot prefix for consistent handling
if !name.starts_with('.') {
    name = format!(".{}", name);
}

This is another example of the smallest behaviour pattern. The truth is “directives are the same regardless of prefix,” so I normalized at the boundary and eliminated the ambiguity from everything downstream.

Memory Maps: The Model You Forgot to Declare

The assembler can only reason about addresses it’s told to assume. That’s what .ORG really means: not “place these bytes here,” but “assume these bytes will execute here.” If the loader doesn’t match that assumption, the machine code is perfect but the program is wrong.
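
To make that concrete, here is a sketch that assembles the one-line program "loop: JMP loop" under two assumed origins. The function name assemble_jmp_self is hypothetical; 4C is the opcode for JMP absolute.

```rust
/// Assemble `loop: JMP loop` assuming the code will execute at `origin`.
/// The origin never appears as a separate byte; it is baked into the operand.
fn assemble_jmp_self(origin: u16) -> [u8; 3] {
    let [lo, hi] = origin.to_le_bytes();
    [0x4C, lo, hi] // JMP absolute, address stored low byte first
}
```

With an origin of $0200 the bytes are 4C 00 02; with $0300 they are 4C 00 03. Load the first version at $0300 and it jumps to $0200, into whatever happens to be there.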

The VIC-20 bug from the opening was a quiet version of this mismatch. My program lived at $0200, the KERNAL loader ran first, and my zero-page variables were already mutated before the first instruction executed. Moving the program to $0300 fixed everything:

* = $0300

The assembler was never broken. My model of the memory map was.

This is why targeting multiple machines complicates the assembler. The VIC-20 and C64 share a 6502, but their memory maps differ:

C64 screen RAM: $0400-$07FF
C64 colour RAM: $D800-$DBFF
VIC-20 screen RAM: $1000 with 8K or more of expansion, $1E00 unexpanded
VIC-20 colour RAM: $9400 with 8K or more of expansion, $9600 unexpanded

The assembler doesn’t enforce those addresses, but the programs you assemble absolutely depend on them. A single “6502 assembler” can emit correct bytes for any of these machines, but it can’t tell you when your memory model is wrong. That responsibility stays with the programmer—or with a machine-specific configuration layer that the assembler consults.
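
One possible shape for such a configuration layer, sketched with the two C64 ranges listed above. The Region type, the C64_REGIONS table, and the overlaps function are all hypothetical; a real layer would carry one table per machine and many more regions.

```rust
/// A named address range, inclusive at both ends.
struct Region {
    name: &'static str,
    start: u16,
    end: u16,
}

/// Only the C64 ranges mentioned in the text; not a complete memory map.
const C64_REGIONS: [Region; 2] = [
    Region { name: "screen RAM", start: 0x0400, end: 0x07FF },
    Region { name: "colour RAM", start: 0xD800, end: 0xDBFF },
];

/// Names of the regions touched by `len` bytes assembled at `origin`.
fn overlaps(regions: &[Region], origin: u16, len: u16) -> Vec<&'static str> {
    if len == 0 {
        return Vec::new();
    }
    // Widen to u32 so origin + len cannot overflow near $FFFF.
    let first = u32::from(origin);
    let last = first + u32::from(len) - 1;
    regions
        .iter()
        .filter(|r| first <= u32::from(r.end) && last >= u32::from(r.start))
        .map(|r| r.name)
        .collect()
}
```

A 32-byte program assembled at $03F0 runs through $040F, so overlaps(&C64_REGIONS, 0x03F0, 0x20) reports screen RAM. The check is only a warning source: it knows where the bytes land, not whether landing there was intended.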

Testing: The Bytes, Not the Intent

The only test that matters is whether the bytes match the output of a known-good assembler.

I verified my opcode table against ca65, dasm, and the 6502.org reference. The test suite assembles snippets and compares raw bytes, with comments pinning the address math:

// At $0200: LDX #$05 -> A2 05
// At $0202: DEX      -> CA
// At $0203: BNE loop -> D0 FD (back to $0202, offset = -3)
// At $0205: BEQ done -> F0 01 (forward to $0208, offset = +1)
// At $0207: NOP      -> EA
// At $0208: RTS      -> 60
assemble_and_check(
    source,
    &[
        0xA2, 0x05, // LDX #$05
        0xCA,       // DEX
        0xD0, 0xFD, // BNE -3 (back to loop)
        0xF0, 0x01, // BEQ +1 (forward to done)
        0xEA,       // NOP
        0x60,       // RTS
    ],
);

That test does more than verify correctness. It documents the address model and makes it hard to silently break the relative offset calculation.
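
The comparison half of a helper like assemble_and_check could look roughly like the sketch below. This is an assumption about its shape, not the code from the test suite: check_bytes is a hypothetical name, and it takes already-assembled bytes so that it stays self-contained.

```rust
/// Compare assembled bytes against expected bytes and report the first
/// difference by address, so a failure points at a line in the comment table.
fn check_bytes(origin: u16, actual: &[u8], expected: &[u8]) -> Result<(), String> {
    for (i, (a, e)) in actual.iter().zip(expected.iter()).enumerate() {
        if a != e {
            let addr = origin.wrapping_add(i as u16);
            return Err(format!(
                "mismatch at ${:04X}: got {:02X}, expected {:02X}",
                addr, a, e
            ));
        }
    }
    if actual.len() != expected.len() {
        return Err(format!(
            "length mismatch: got {} bytes, expected {}",
            actual.len(),
            expected.len()
        ));
    }
    Ok(())
}
```

If the BNE operand in the test above came out wrong, the failure would name address $0204, which is exactly the byte the offset calculation produces.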

The Assembler as First Emulator

A CPU emulator interprets bytes. An assembler decides what bytes exist.

That makes the assembler the first emulator in the chain. It has to model the same constraints the CPU will enforce: instruction sizes, addressing mode encodings, the distinction between zero-page and absolute. If the assembler’s model diverges from the CPU’s reality, the program will fail in ways that look like CPU bugs but aren’t.
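
The size half of that model fits in a few lines. The sketch below covers only a handful of addressing modes, and the Mode enum and layout function are hypothetical names; the byte counts are the standard 6502 ones.

```rust
/// A subset of 6502 addressing modes, enough to lay out a small program.
#[derive(Clone, Copy)]
enum Mode {
    Implied,
    Accumulator,
    Immediate,
    ZeroPage,
    Relative,
    Absolute,
}

/// Total instruction size in bytes, opcode included.
fn size(mode: Mode) -> u16 {
    match mode {
        Mode::Implied | Mode::Accumulator => 1,
        Mode::Immediate | Mode::ZeroPage | Mode::Relative => 2,
        Mode::Absolute => 3,
    }
}

/// Address of each instruction when the sequence is assembled at `origin`.
fn layout(origin: u16, modes: &[Mode]) -> Vec<u16> {
    let mut address = origin;
    modes
        .iter()
        .map(|&mode| {
            let here = address;
            address = address.wrapping_add(size(mode));
            here
        })
        .collect()
}
```

Feeding it the modes of the LDX / DEX / BNE / BEQ / NOP / RTS test at origin $0200 reproduces the addresses in that test's comments: $0200, $0202, $0203, $0205, $0207, $0208. Every label and branch offset is derived from this running total, which is why a wrong size for one mode shifts everything after it.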

Building these assemblers taught me more about 6502 addressing than any amount of reading could have. The bugs I found weren’t in opcode tables—those are well-documented. The bugs were in my assumptions about where code lives, how loaders interact with programs, and what “correct” means when the address model is wrong.

The assembler doesn’t just translate syntax to bytes. It encodes a contract between the programmer and the machine. Break that contract, and the bytes are perfect but the program is broken.