Deep Dive: PETSCII and the VIC-II Text Adapter

The first time the terminal printed **** COMMODORE 64 BASIC V2 ****, I sat there staring at it for probably a full minute. I’m not sure why it hit me so hard—those characters came from screen RAM at $0400, passed through a PETSCII lookup table, and emerged as Unicode glyphs wrapped in ANSI escape codes. No pixels, no framebuffer, no canvas. Just bytes becoming text. And in that moment I realized what I was actually building wasn’t a display emulator at all. It was a translation layer between two text systems separated by forty years, and I honestly hadn’t set out to build that.

If you’re here hoping for a faithful VIC-II emulation, I should warn you upfront: this isn’t that. What follows is how I made something much narrower actually work—the character mapping, the colour approximation, and all the deliberate constraints that keep the adapter honest about the things it cannot do.

Why Not Just Emulate the VIC-II Properly?

I tried that first, actually. (Of course I did.) My initial approach was going to intercept every VIC-II register write, maintain a proper raster counter, and render actual bitmap frames. That lasted about two days before I realized I was building a graphics emulator for a terminal that has no concept of pixels.

A terminal doesn’t know what a raster interrupt is. It can’t do sprite collision. It has no idea what $D011 means. So I backed up and asked a simpler question: what’s the minimum I need to preserve for 6502 programs to be visible in a text terminal?

The answer, it turns out, is embarrassingly small. The adapter just needs to read video memory, translate bytes to glyphs, translate colours to ANSI codes, and emit the result. No scanlines. No character ROMs. No raster timing. A pure function from memory state to terminal output. This constraint felt like giving up at first, but it’s actually what makes the whole thing testable—the 6502 program sees familiar memory addresses, the terminal sees familiar escape sequences, and the adapter just translates between them without inventing state that neither side can verify.

PETSCII: 256 Codes In, Mostly Question Marks Out

PETSCII is Commodore’s character encoding, and it’s… strange. It’s sort of a cousin of ASCII, but where ASCII has control codes, PETSCII has graphics characters—hearts, diamonds, diagonal lines, that sort of thing. I spent a while looking at PETSCII reference charts trying to figure out how to map everything to Unicode, and the honest answer is: you can’t. Not completely.

Here’s what I ended up with for the box-drawing characters that actually matter:

// web/src/video/c64-text-adapter.ts
map[0x60] = '━'; // Horizontal line
map[0x7b] = '┼'; // Cross
map[0x7c] = '│'; // Vertical line
map[0xa0] = '█'; // PETSCII solid block (inverse space)
map[0xc0] = '─'; // Horizontal bar
map[0xe0] = '┌'; // Top-left corner
map[0xee] = '└'; // Bottom-left corner
map[0xf2] = '┐'; // Top-right corner
map[0xfd] = '┘'; // Bottom-right corner

The mapping is intentionally sparse: I’m only handling the characters I actually need. Printable ASCII (0x20–0x7E) maps directly, except where one of the box-drawing entries above (0x60, 0x7B, 0x7C) overrides it. Lowercase letters live at 0x61–0x7A. Box-drawing characters get their Unicode equivalents. Everything else? Question marks.

That ? placeholder is a conscious choice, not a bug, though I’ll admit it felt like failure the first few times I saw a screen full of them. But here’s the thing: when a C64 program draws a heart or a diagonal line that has no Unicode equivalent, the question mark says “this character exists but I cannot show it” rather than silently dropping it or guessing wrong. I’d rather have honest garbage than misleading output.

Control codes (0x00–0x1F) map to spaces, which is also a bit of a cop-out. On real hardware, these would move the cursor, change colours, toggle reverse video. But the adapter can’t execute them because a terminal render has no cursor state between frames—each render is a complete redraw from memory. I’m not sure if there’s a better approach here. Probably not, given the constraints I set for myself.
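The rules above (placeholder by default, spaces for control codes, ASCII passed through, box-drawing overrides) can be sketched as a table built once at startup. This is a minimal illustration rather than the adapter's actual code: buildPetsciiTable and PETSCII_TABLE are hypothetical names, and applying the overrides last is an assumption about ordering.

```typescript
// Hypothetical helper: builds a 256-entry lookup table from the rules in the text.
function buildPetsciiTable(): string[] {
  // Every code starts as the '?' placeholder.
  const map: string[] = new Array<string>(256).fill('?');

  // Control codes (0x00-0x1F) render as spaces; nothing is executed.
  for (let code = 0x00; code <= 0x1f; code++) map[code] = ' ';

  // The printable ASCII range passes straight through.
  for (let code = 0x20; code <= 0x7e; code++) {
    map[code] = String.fromCharCode(code);
  }

  // Box-drawing overrides from the article, applied last so they win
  // over the ASCII pass (assumed ordering).
  const overrides: Array<[number, string]> = [
    [0x60, '━'], [0x7b, '┼'], [0x7c, '│'], [0xa0, '█'], [0xc0, '─'],
    [0xe0, '┌'], [0xee, '└'], [0xf2, '┐'], [0xfd, '┘'],
  ];
  for (const [code, glyph] of overrides) map[code] = glyph;

  return map;
}

const PETSCII_TABLE: string[] = buildPetsciiTable();

// Lookup is a single array index; the mask keeps the index inside the table.
function petsciiToChar(code: number): string {
  return PETSCII_TABLE[code & 0xff];
}
```

Building the table once keeps the per-cell cost in the render loop to one array read, and the default fill means any code nobody thought about still shows up as a visible question mark.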

The Video Memory Structure (and the Dirty Flag That Saved Everything)

The 6502 core exposes C64 video state through a simple Rust struct. I went through a few iterations here before landing on something that worked:

// cores/mos6502/src/video.rs
pub struct VideoState6502 {
    pub screen_ram: [u8; 1024],
    pub color_ram: [u8; 1024],
    pub border_color: u8,
    pub background_color: u8,
    pub display_enabled: bool,
    pub dirty: bool,
}

The memory map is fixed—screen RAM at $0400–$07FF (1024 bytes for a 40×25 grid, even though you only need 1000 cells), colour RAM at $D800–$DBFF, border and background colours at $D020 and $D021.
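To make the fixed memory map concrete, here is a small sketch of how a write address could be classified. The addresses come from the paragraph above; the function and type names are hypothetical, and the real core (which is written in Rust) may decode addresses quite differently.

```typescript
// Addresses from the text; names are illustrative only.
const SCREEN_RAM_BASE = 0x0400;
const COLOR_RAM_BASE = 0xd800;
const REGION_SIZE = 1024; // bytes reserved per region (only 1000 are visible)
const BORDER_REG = 0xd020;
const BACKGROUND_REG = 0xd021;
const COLS = 40;

type VideoTarget =
  | { kind: 'screen'; offset: number }
  | { kind: 'color'; offset: number }
  | { kind: 'border' }
  | { kind: 'background' }
  | { kind: 'none' };

// Classify a 16-bit address as one of the video-related locations.
function decodeVideoAddress(addr: number): VideoTarget {
  if (addr >= SCREEN_RAM_BASE && addr < SCREEN_RAM_BASE + REGION_SIZE) {
    return { kind: 'screen', offset: addr - SCREEN_RAM_BASE };
  }
  if (addr >= COLOR_RAM_BASE && addr < COLOR_RAM_BASE + REGION_SIZE) {
    return { kind: 'color', offset: addr - COLOR_RAM_BASE };
  }
  if (addr === BORDER_REG) return { kind: 'border' };
  if (addr === BACKGROUND_REG) return { kind: 'background' };
  return { kind: 'none' };
}

// Address of the screen RAM byte for a given cell (row-major, 40 columns).
function cellAddress(x: number, y: number): number {
  return SCREEN_RAM_BASE + y * COLS + x;
}
```

The last visible cell (column 39, row 24) sits at offset 999, which is why 24 bytes at the end of each 1024-byte region are never drawn.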

But the real hero here is that dirty flag. I added it almost as an afterthought, but it turned out to be critical. Writing to screen RAM, colour RAM, or any VIC-II register sets the flag. The adapter checks it before each render and skips unchanged frames entirely. For a BASIC prompt that just blinks a cursor once per second, this means 59 out of 60 frames are free. (I was initially rendering every frame and wondering why performance was so bad. Sometimes the obvious optimization is the one you forget to try first.)

I later went back and made the dirty flag smarter—the commit message says “10-30% fewer render triggers”—by only setting it when the value actually changes, not just when a write happens. Whether that optimization was worth the added complexity, I’m honestly still not sure.
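The "only when the value actually changes" idea is small enough to sketch. The real struct lives in the Rust core; this TypeScript version is an assumption-laden illustration, and VideoStateSketch, writeScreen, and takeDirty are hypothetical names.

```typescript
// Illustrative only: change-detecting writes with a dirty flag.
class VideoStateSketch {
  readonly screenRam = new Uint8Array(1024);
  dirty = false;

  // Store a byte and mark the frame dirty only if the stored value changed.
  writeScreen(offset: number, value: number): void {
    if (offset < 0 || offset >= this.screenRam.length) return;
    const next = value & 0xff;
    if (this.screenRam[offset] === next) return; // same value: no redraw needed
    this.screenRam[offset] = next;
    this.dirty = true;
  }

  // Report whether a render is needed and clear the flag in one step.
  takeDirty(): boolean {
    const wasDirty = this.dirty;
    this.dirty = false;
    return wasDirty;
  }
}
```

The extra comparison costs one read per write, which is the complexity trade-off mentioned above: programs that repeatedly rewrite the same screen contents stop triggering renders.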

Rendering: A Nested Loop with Some Fiddly Bits

The render loop itself is boring, which is probably a good sign:

renderToANSI(vterm: VirtualTerminal): void {
  if (!this.dirty) return;

  const bgHex = C64_PALETTE[this.backgroundColor & 0x0f];
  const bgAnsi = hexToAnsiBg(bgHex);

  for (let y = 0; y < this.mode.height; y++) {
    for (let x = 0; x < this.mode.width; x++) {
      const offset = y * this.mode.width + x;
      const petscii = this.screenData[offset];
      const colorIndex = this.colorData[offset] & 0x0f;

      const char = this.petsciiToChar(petscii);
      const fgHex = C64_PALETTE[colorIndex];
      const fgAnsi = hexToAnsi(fgHex);

      vterm.putChar(x, y, char, { fg: fgAnsi, bg: bgAnsi });
    }
  }

  this.dirty = false;
}

For each of 1000 cells: read PETSCII code, look up Unicode glyph, read colour index, look up ANSI code, emit. The background colour applies uniformly because it comes from the VIC-II register, not from per-cell colour RAM.

That & 0x0f mask on the colour index is easy to miss, and I actually did miss it at first. The emulated colour RAM is a byte array, so each cell can hold 8 bits, but only the low nibble is meaningful. On real hardware, colour RAM is only 4 bits wide and the upper nibble is “open bus”: programs that read colour RAM might see garbage in the high bits. The mask ensures consistent behaviour regardless of what the 6502 program wrote. I wasted about an hour on weird colour glitches before I figured this out.
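One way to see both the masking and the "pure function" idea is a variant of the loop that returns plain data instead of calling into a terminal. This is a sketch under assumptions, not the adapter's code: FrameSnapshot, RenderedCell, and renderCells are hypothetical names.

```typescript
// Hypothetical input: an immutable snapshot of video memory.
interface FrameSnapshot {
  screen: Uint8Array;
  color: Uint8Array;
  background: number;
  width: number;
  height: number;
}

// Hypothetical output: one record per cell, with palette indices already masked.
interface RenderedCell {
  x: number;
  y: number;
  code: number;
  fgIndex: number;
  bgIndex: number;
}

// Same snapshot in, same cells out; nothing is mutated or remembered.
function renderCells(frame: FrameSnapshot): RenderedCell[] {
  const cells: RenderedCell[] = [];
  const bgIndex = frame.background & 0x0f; // register value, masked to 4 bits
  for (let y = 0; y < frame.height; y++) {
    for (let x = 0; x < frame.width; x++) {
      const offset = y * frame.width + x;
      cells.push({
        x,
        y,
        code: frame.screen[offset],
        fgIndex: frame.color[offset] & 0x0f, // drop the unreliable high nibble
        bgIndex,
      });
    }
  }
  return cells;
}
```

A colour RAM byte of 0xF1 and one of 0x01 both come out as index 1 here, which is exactly the consistency the mask is meant to buy.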

Colour Translation: Embracing the Loss

Here’s where things get interesting, and by “interesting” I mean “I spent way too long trying to make this perfect before accepting that perfection was impossible.”

The C64 palette has 16 colours. ANSI terminals also have 16 colours. They are not the same 16. The adapter approximates by analysing each hex colour for brightness and dominant channels:

function hexToAnsi(hex: string): number {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);

  const brightness = (r + g + b) / 3;
  const bright = brightness > 128;

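  const max = Math.max(r, g, b);
  // Pure black: max is 0, so the threshold below would also be 0 and every
  // channel would pass, classifying black as white. Handle it up front.
  if (max === 0) return 30;

  // A channel counts as "present" if it reaches 60% of the strongest channel.
  const threshold = max * 0.6;
  const isRed = r >= threshold;
  const isGreen = g >= threshold;
  const isBlue = b >= threshold;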

  if (isRed && isGreen && isBlue) return bright ? 97 : 37;
  if (isRed && isGreen) return bright ? 93 : 33;
  // ... and so on
}

This mapping is deterministic but absolutely lossy. C64 cyan (#AAFFEE) and light blue (#0088FF) both have strong blue components, but they land on different ANSI codes: light blue passes only the blue test, while cyan’s green and red channels (0xFF and 0xAA) both clear the 60% threshold of 153, so with the code as shown it hits the first branch and comes out bright white (97) rather than cyan. Orange (#DD8855) has red and green, so it becomes yellow. Brown (#664400) is dark red-plus-green, so it becomes dark yellow, which most terminals render as brown anyway, so that one actually works out.

The goal is not colour accuracy. I tried for colour accuracy and it was a mess. The goal is consistent tone—a C64 program that uses contrasting colours should still show contrast in the terminal, even if the specific hues shift. Whether I’ve achieved that, I’m not entirely sure. It looks okay to me, but I’ve been staring at it so long I might have lost perspective.
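For readers who want to experiment with the classifier, here is one way the elided branches could be filled in. Only the first two branches (white and yellow) come from the code above; the remaining branches, the guard for pure black, and the ansiBgFromFg helper are assumptions made for this sketch, and hexToAnsiSketch is a hypothetical name. The sketch relies on two standard ANSI facts: bright foreground codes are the normal ones plus 60, and background codes are the foreground ones plus 10.

```typescript
// Sketch of a complete brightness/dominant-channel classifier.
function hexToAnsiSketch(hex: string): number {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);

  const max = Math.max(r, g, b);
  // Assumption: treat pure black as ANSI black. Without this, the
  // threshold is 0 and all three channel tests pass.
  if (max === 0) return 30;

  const bright = (r + g + b) / 3 > 128;
  const threshold = max * 0.6;
  const isRed = r >= threshold;
  const isGreen = g >= threshold;
  const isBlue = b >= threshold;

  // Normal-intensity ANSI foreground codes (30-37).
  let base: number;
  if (isRed && isGreen && isBlue) base = 37; // white (from the article)
  else if (isRed && isGreen) base = 33;      // yellow (from the article)
  else if (isRed && isBlue) base = 35;       // magenta (assumed)
  else if (isGreen && isBlue) base = 36;     // cyan (assumed)
  else if (isRed) base = 31;                 // red (assumed)
  else if (isGreen) base = 32;               // green (assumed)
  else base = 34;                            // blue (assumed)

  // Bright variants live 60 codes higher (90-97).
  return bright ? base + 60 : base;
}

// ANSI background codes are the foreground codes shifted by 10.
function ansiBgFromFg(fg: number): number {
  return fg + 10;
}
```

With the hex values quoted above, this gives 93 for orange, 33 for brown, 94 for light blue, and 97 for cyan, which makes the lossiness easy to inspect colour by colour.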

The VIC-20: Same Problem, Smaller Canvas

When I went to add VIC-20 support, I was bracing for a complete rewrite. Turns out it’s almost identical—same PETSCII encoding, same basic memory-mapped approach, just a smaller display (22×23 instead of 40×25) and a more restricted colour palette.

The entire difference in the colour handling comes down to one mask:

// web/src/video/vic20-text-adapter.ts
const colorIndex = this.colorData[offset] & 0x07;

That & 0x07 instead of & 0x0f encodes the VIC-20’s entire foreground colour constraint in one line. The VIC chip can only display 8 foreground colours; indices 8–15 are for background and auxiliary use only. I was expecting this to be complicated, and then it just… wasn’t. (Sometimes the old hardware designers knew what they were doing, even if it doesn’t feel like it when you’re reading the specs.)
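If the two adapters were ever merged, the differences described above could be captured as data. This is a hypothetical sketch (TextModeSketch and its fields are invented for illustration); the actual code keeps two separate adapter files.

```typescript
// Hypothetical per-machine description of a text mode.
interface TextModeSketch {
  name: string;
  width: number;
  height: number;
  colorMask: number; // which bits of a colour RAM byte select the foreground
}

const C64_MODE: TextModeSketch = { name: 'c64', width: 40, height: 25, colorMask: 0x0f };
const VIC20_MODE: TextModeSketch = { name: 'vic20', width: 22, height: 23, colorMask: 0x07 };

// Number of visible cells for a mode.
function cellCount(mode: TextModeSketch): number {
  return mode.width * mode.height;
}

// Foreground palette index for a raw colour RAM byte.
function foregroundIndex(mode: TextModeSketch, raw: number): number {
  return raw & mode.colorMask;
}
```

On the VIC-20 settings a raw value of 0x0F collapses to index 7, while the C64 settings keep all sixteen indices.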

The Things I Decided Not to Do

Some of the missing features are deliberate choices, and I should probably document them better than I have:

Reverse video. On real hardware, setting bit 7 of a screen RAM byte swaps foreground and background for that cell. I’m ignoring this because I treat the full byte as a lookup key. Implementing reverse video properly would require tracking per-cell state that the terminal can’t persist between renders. Could I work around this? Probably. But it would make the adapter stateful, and I really wanted to keep it as a pure function.

Character set switching. The C64 can toggle between uppercase/graphics and lowercase/uppercase character sets via a VIC-II register. I’m assuming the default set. Supporting both would double the lookup table size for a feature that BASIC programs rarely use. This might bite me later if someone tries to run something that expects the alternate set.

Control code execution. PETSCII control codes embedded in screen RAM (cursor movement, colour changes) just render as spaces. The adapter has no cursor to move and no colour state to change—each frame is a stateless translation. This is a limitation I’m genuinely unsure about. It might be wrong.

The point of listing these isn’t to apologize—it’s to make the boundaries visible. When something looks wrong, at least you’ll know whether it’s a bug or a known limitation.

Testing: How Do You Validate a Translation?

I built ten test patterns that stress different aspects of the adapter—horizontal stripes for colour RAM, vertical stripes for column rendering, checkerboard for spatial accuracy, that kind of thing. Each pattern draws to video memory, renders to ANSI, and validates the output numerically.

The checkerboard expects exactly 500 blocks. The full palette pattern expects at least 8 unique ANSI colour codes (C64’s 16 colours collapse to about 10 ANSI codes due to the lossy mapping). The border frame expects exactly 126 blocks (40×2 + 25×2 − 4 corners, if you want to check my math).

These tests aren’t about pixel-perfect fidelity—they can’t be, given what I’m building. They verify that the translation is stable: same input always produces same output, and the output preserves enough structure for a human to recognize what the C64 program drew. Whether “enough structure” is actually enough, I guess I’ll find out when people try to use this.
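The numeric expectations for two of the patterns are easy to reproduce. The sketch below only generates the patterns and counts cells; it does not go through the adapter, and the function names and the choice of 0xA0 as the block code are assumptions for illustration.

```typescript
const COLS = 40;
const ROWS = 25;
const BLOCK = 0xa0; // solid block code used by the mapping table
const SPACE = 0x20;

// Alternate block and space so that neighbouring cells always differ.
function checkerboard(): Uint8Array {
  const ram = new Uint8Array(COLS * ROWS).fill(SPACE);
  for (let y = 0; y < ROWS; y++) {
    for (let x = 0; x < COLS; x++) {
      if ((x + y) % 2 === 0) ram[y * COLS + x] = BLOCK;
    }
  }
  return ram;
}

// Fill only the outermost row and column on each side.
function borderFrame(): Uint8Array {
  const ram = new Uint8Array(COLS * ROWS).fill(SPACE);
  for (let y = 0; y < ROWS; y++) {
    for (let x = 0; x < COLS; x++) {
      const onEdge = x === 0 || x === COLS - 1 || y === 0 || y === ROWS - 1;
      if (onEdge) ram[y * COLS + x] = BLOCK;
    }
  }
  return ram;
}

// Count how many cells hold the block code.
function countBlocks(ram: Uint8Array): number {
  let count = 0;
  for (const code of ram) {
    if (code === BLOCK) count++;
  }
  return count;
}
```

Counting the generated patterns gives 500 for the checkerboard and 126 for the border frame, matching the figures above.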

The Full Path (For Reference)

The journey from 6502 instruction to terminal glyph crosses several boundaries, and I’ve found it helpful to keep this mental map around:

The 6502 core (Rust, compiled to WASM) executes instructions and writes to memory-mapped video addresses. The video state struct holds screen RAM, colour RAM, registers, and that crucial dirty flag. The text adapter (TypeScript) reads the state and translates PETSCII to Unicode, C64 colours to ANSI codes. The virtual terminal buffers the output and emits ANSI escape sequences. Finally, the host terminal interprets those sequences and draws actual glyphs.

Each layer has a narrow contract. The core doesn’t know about ANSI. The adapter chooses ANSI colour codes but never assembles an escape sequence. The virtual terminal doesn’t know about PETSCII. This separation is what makes each piece testable in isolation, and I have to admit, it took me a while to appreciate why that matters. When something breaks, I know exactly which layer to blame.
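To illustrate the last hop, here is a sketch of how a virtual terminal might turn one row of cells into an escape-sequence string. The VirtualTerminal's real internals are not shown in this post, so CellSketch and emitRow are hypothetical; the only fixed part is the standard SGR form, ESC [ fg ; bg m.

```typescript
// Hypothetical cell as the virtual terminal might buffer it.
interface CellSketch {
  char: string;
  fg: number; // ANSI foreground code, e.g. 97
  bg: number; // ANSI background code, e.g. 44
}

// Emit one row, writing a colour sequence only when the colours change.
function emitRow(cells: CellSketch[]): string {
  let out = '';
  let lastFg = -1;
  let lastBg = -1;
  for (const cell of cells) {
    if (cell.fg !== lastFg || cell.bg !== lastBg) {
      out += `\x1b[${cell.fg};${cell.bg}m`;
      lastFg = cell.fg;
      lastBg = cell.bg;
    }
    out += cell.char;
  }
  // Reset attributes so colours do not leak past the end of the row.
  return out + '\x1b[0m';
}
```

Keeping this step in its own layer means the adapter's tests can compare colour codes and glyphs directly, without parsing escape sequences.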

What Building This Actually Clarified

I started this thinking I was building a video emulator, and what I ended up with is something more like a protocol translator. It doesn’t simulate a VIC-II chip—it just preserves enough of the original semantics for 6502 programs to be visible without a canvas.

The interesting constraint wasn’t technical but conceptual: deciding what to keep and what to throw away. Reverse video would be nice, but tracking per-cell state would violate the stateless render model I’d committed to. Control codes would be more authentic, but executing them requires a cursor that persists between frames, and I didn’t want that complexity. Each omission is a choice about what the adapter’s contract includes—and I found that naming those choices explicitly made the code much easier to reason about.

The result is not a faithful C64 reproduction. It’s something narrower and, for my purposes, more useful: a minimal bridge between 8-bit video memory and a modern terminal. Knowing which layer owns which problem is worth more to me than any missing feature. (At least, that’s what I tell myself when someone asks why the graphics characters show up as question marks.)