Deep Dive: Bell 103 Audio Modem and FSK Implementation

If you’re here hoping to find a clean explanation of how Bell 103 works, I should warn you: I didn’t start with a clean understanding. I started with a Rust prototype in one window, a spectrogram in another, and a lot of confusion about why my modems couldn’t hear each other.

The goal was to make the emulator’s modem warble—not play a sample, but actually generate the tones that made 300 baud dialup sound like dialup. I wanted the time machine experience, which meant real FSK modulation, real demodulation, and hours of listening to tones trying to figure out why some worked and others didn’t. (My neighbour came home during one of these sessions and asked if my computer was possessed. Fair question.)

Before I got anything working in the actual emulator, I built a standalone test harness in experimental/1200fsk/—just two modem objects talking to each other through arrays of audio samples. That was the right call. Trying to debug FSK while also debugging the serial layer and the audio routing would have been miserable.

How Bell 103 Keeps Two Conversations Apart

Bell 103 is full duplex at 300 baud, which sounds simple until you realize both sides are transmitting and receiving on the same phone line simultaneously. The trick is that they speak in different frequency bands—the originate modem (the one that dialed) transmits at 1270 Hz for a binary 1 (mark) and 1070 Hz for a binary 0 (space), while the answer modem transmits higher at 2225 Hz (mark) and 2025 Hz (space). Each side listens in the other’s band, so the originate modem transmits low and listens high, while the answer modem transmits high and listens low.

It’s elegant once you see it, but I didn’t see it at first. I wired both modems to the originate frequencies—both transmitting at 1270/1070 and listening at 1270/1070. They could hear themselves beautifully, but not each other. I stared at spectrogram plots for longer than I’d like to admit before realizing the cross‑wiring was the entire duplex trick.

In code, the modulator and demodulator split by role. The modulator uses its transmit pair; the demodulator listens on the other pair:

// web/src/serial/modems/fsk/fsk-modulator.ts
static createBell103Originate(sampleRate: number = 8000): FSKModulator {
  return new FSKModulator({
    sampleRate,
    markFrequency: 1270,
    spaceFrequency: 1070,
    baudRate: 300,
  });
}

static createBell103Answer(sampleRate: number = 8000): FSKModulator {
  return new FSKModulator({
    sampleRate,
    markFrequency: 2225,
    spaceFrequency: 2025,
    baudRate: 300,
  });
}

// web/src/serial/modems/fsk/fsk-demodulator.ts
static createBell103Originate(sampleRate: number = 8000, debug: boolean = false): FSKDemodulator {
  return new FSKDemodulator({
    sampleRate,
    markFrequency: 2225,
    spaceFrequency: 2025,
    baudRate: 300,
  }, debug);
}

static createBell103Answer(sampleRate: number = 8000, debug: boolean = false): FSKDemodulator {
  return new FSKDemodulator({
    sampleRate,
    markFrequency: 1270,
    spaceFrequency: 1070,
    baudRate: 300,
  }, debug);
}

That asymmetry—originate transmits low, listens high; answer transmits high, listens low—is the whole thing. I keep reminding myself of it because I’ve wired it wrong more than once.
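The pairing rule can also be written down as data and checked mechanically. This is a minimal sketch, not the emulator's code: `Role`, `TonePair`, `bell103Bands`, and `canHearEachOther` are hypothetical names, and only the four frequencies come from the text above.

```typescript
// Hypothetical names throughout; only the four frequencies come from the article.
type Role = "originate" | "answer";

interface TonePair {
  markHz: number;  // tone sent for a binary 1
  spaceHz: number; // tone sent for a binary 0
}

const BELL103_LOW_BAND: TonePair = { markHz: 1270, spaceHz: 1070 };
const BELL103_HIGH_BAND: TonePair = { markHz: 2225, spaceHz: 2025 };

// Originate transmits low and listens high; answer is the mirror image.
function bell103Bands(role: Role): { tx: TonePair; rx: TonePair } {
  return role === "originate"
    ? { tx: BELL103_LOW_BAND, rx: BELL103_HIGH_BAND }
    : { tx: BELL103_HIGH_BAND, rx: BELL103_LOW_BAND };
}

// A link works only if each side's transmit band is the other side's receive band.
function canHearEachOther(a: Role, b: Role): boolean {
  const bandsA = bell103Bands(a);
  const bandsB = bell103Bands(b);
  return bandsA.tx === bandsB.rx && bandsB.tx === bandsA.rx;
}
```

With this shape, two originate modems fail the check, which is the miswiring described above.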

FSK Modulation: Bits Into Sine Waves

The modulator takes serial frames, not raw bytes, which means each byte becomes a start bit (0), eight data bits (LSB first), and a stop bit (1). At 300 baud with an 8000 Hz sample rate, that works out to about 26.7 samples per bit—enough for clean tone transitions without making the CPU sweat, though I honestly wasn’t sure if that would be sufficient when I started.
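The timing budget behind those numbers is short enough to spell out. The constant names below are illustrative only; the values are the 8000 Hz sample rate, 300 baud, and 10-bit frame described above.

```typescript
// Illustrative constants; values taken from the paragraph above.
const SAMPLE_RATE_HZ = 8000;
const BAUD_RATE = 300;
const BITS_PER_FRAME = 10; // 1 start + 8 data + 1 stop

const samplesPerBitExact = SAMPLE_RATE_HZ / BAUD_RATE;            // 26.666...
const bitDurationMs = 1000 / BAUD_RATE;                           // 3.333... ms
const samplesPerFrameExact = samplesPerBitExact * BITS_PER_FRAME; // 266.666...
const bytesPerSecond = BAUD_RATE / BITS_PER_FRAME;                // 30
```

The samples-per-bit figure is not an integer, so any code that walks the buffer bit by bit has to decide how to round.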

The detail that took me a while to get right was phase continuity. My first version reset the phase to zero on every frequency change, which seemed tidy, but it created audible clicks at every bit boundary—sharp transients that confused the Goertzel detector on the other end. I could hear the problem before I could diagnose it: the audio had this faint crackling that shouldn’t have been there. The fix was to keep phase continuous across frequency changes—the oscillator keeps spinning, only the frequency changes:

// web/src/serial/modems/fsk/fsk-modulator.ts
public modulate(data: Uint8Array): Float32Array {
  const bits: boolean[] = [];

  for (let i = 0; i < data.length; i++) {
    const byte = data[i];
    bits.push(false); // start bit
    for (let bit = 0; bit < 8; bit++) {
      bits.push((byte & (1 << bit)) !== 0);
    }
    bits.push(true); // stop bit
  }

  const totalSamples = Math.ceil(bits.length * this.samplesPerBit);
  const samples = new Float32Array(totalSamples);

  let sampleIndex = 0;
  for (let bitIndex = 0; bitIndex < bits.length; bitIndex++) {
    const bit = bits[bitIndex];
    const frequency = bit ? this.config.markFrequency : this.config.spaceFrequency;
    const samplesInBit = Math.ceil(this.samplesPerBit);

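    for (let i = 0; i < samplesInBit && sampleIndex < totalSamples; i++) {
      samples[sampleIndex] = Math.sin(this.phase);
      // Advance the running phase; it is never reset when the frequency changes.
      this.phase += (2 * Math.PI * frequency) / this.config.sampleRate;
      // Wrap to keep the value small; removing a full turn leaves sin() unchanged.
      if (this.phase > 2 * Math.PI) {
        this.phase -= 2 * Math.PI;
      }
      sampleIndex++;
    }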
  }

  return samples;
}

This is not a hi‑fi synthesizer; it’s just a deterministic oscillator that never drops a bit boundary. The phase variable persists across calls, so consecutive modulate() invocations produce a continuous waveform. I’m honestly not sure if that’s strictly necessary—maybe the demodulator would handle the discontinuities fine—but it made the spectrogram look right, and that was enough to stop me from questioning it.
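One detail in the listing is worth a second look: the buffer is sized from `bits.length * this.samplesPerBit`, but each bit is written with `Math.ceil(this.samplesPerBit)` samples. How `samplesPerBit` is computed isn't shown. If it holds the fractional 8000 / 300, each bit takes 27 samples while the buffer only budgets 26.67, so the loop guard would cut the tail of a long call short. If it is already rounded to an integer, the two agree and nothing is lost. The sketch below uses hypothetical helper names to compare that behaviour with boundaries rounded from the exact fractional position.

```typescript
// Hypothetical helper: exclusive end index of each bit when every boundary is
// rounded from its exact fractional position, so rounding error never accumulates.
function bitEndIndices(bitCount: number, samplesPerBit: number): number[] {
  const ends: number[] = [];
  for (let bit = 0; bit < bitCount; bit++) {
    ends.push(Math.round((bit + 1) * samplesPerBit));
  }
  return ends;
}

// Hypothetical helper: how many complete bits fit when the buffer is sized from
// the fractional value but every bit is written with ceil(samplesPerBit) samples.
function wholeBitsWithCeil(bitCount: number, samplesPerBit: number): number {
  const totalSamples = Math.ceil(bitCount * samplesPerBit);
  const samplesInBit = Math.ceil(samplesPerBit);
  return Math.min(bitCount, Math.floor(totalSamples / samplesInBit));
}
```

With rounded boundaries, every bit is 26 or 27 samples long and the last boundary lands on the end of the buffer. Whether the demodulator would need the same boundary rule depends on how it fills its bit buffer, which isn't shown here.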

Goertzel Demodulation: One DFT Bin, Twice

On the receive side, I needed to look at one bit’s worth of samples and answer two questions: how much mark energy is here, and how much space energy? A full FFT would work, but it’s wasteful when you only care about two specific frequencies. The Goertzel algorithm computes a single DFT bin without paying for the full transform, which seemed like a good fit.

I’ll admit I didn’t fully understand Goertzel when I started using it. I copied the standard formula, precomputed the coefficients at construction time—markOmega, spaceOmega, markCoeff, spaceCoeff—and it mostly worked. The “mostly” took another day to track down.

// web/src/serial/modems/fsk/fsk-demodulator.ts
private decodeBit(samples: Float32Array): boolean {
  const N = this.bufferIndex;
  const markPower = this.goertzel(samples, N, this.markOmega, this.markCoeff);
  const spacePower = this.goertzel(samples, N, this.spaceOmega, this.spaceCoeff);
  return markPower > spacePower;
}
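The `goertzel()` helper itself isn't shown, so here is a minimal standalone sketch of the standard Goertzel recurrence. It is not the emulator's implementation: `goertzelPower`, `makeTone`, and `looksLikeMark` are hypothetical names, and this version derives the coefficient from omega on each call instead of taking a precomputed one.

```typescript
// Power of one frequency in the first n samples, via the standard Goertzel
// recurrence. omega is in radians per sample: 2 * PI * frequencyHz / sampleRate.
function goertzelPower(samples: Float32Array, n: number, omega: number): number {
  const coeff = 2 * Math.cos(omega);
  let prev1 = 0; // s[k-1]
  let prev2 = 0; // s[k-2]
  for (let i = 0; i < n; i++) {
    const current = samples[i] + coeff * prev1 - prev2;
    prev2 = prev1;
    prev1 = current;
  }
  // Squared magnitude of the DFT-style bin at omega.
  return prev1 * prev1 + prev2 * prev2 - coeff * prev1 * prev2;
}

// Test helper: a pure sine tone starting at phase zero.
function makeTone(frequencyHz: number, sampleRate: number, count: number): Float32Array {
  const out = new Float32Array(count);
  for (let i = 0; i < count; i++) {
    out[i] = Math.sin((2 * Math.PI * frequencyHz * i) / sampleRate);
  }
  return out;
}

// Same decision rule as decodeBit: whichever tone carries more energy wins.
function looksLikeMark(
  samples: Float32Array,
  sampleRate: number,
  markHz: number,
  spaceHz: number,
): boolean {
  const markPower = goertzelPower(samples, samples.length, (2 * Math.PI * markHz) / sampleRate);
  const spacePower = goertzelPower(samples, samples.length, (2 * Math.PI * spaceHz) / sampleRate);
  return markPower > spacePower;
}
```

A 27-sample window at 8000 Hz has a frequency resolution of roughly 296 Hz, wider than the 200 Hz gap between mark and space. The two bins therefore overlap, but for a clean tone the correct bin still carries clearly more energy.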

One deliberate simplification I should mention: this demodulator doesn’t do clock recovery. It assumes the transmitter and receiver share one sample clock and agree on where each bit starts. That’s fine here because both ends are inside the emulator and the audio pipeline is sample‑accurate: no jitter, no drift. A real phone line would need adaptive timing, but I decided that was complexity I didn’t want to import. Whether that was the right call, I’m still not sure. It works, but it makes me slightly nervous.

Framing: Finding Bytes in Bits

Once I have a bit stream, I still need to find bytes in it. The demodulator uses a minimal frame detector: wait for a start bit (a 0), clock eight data bits LSB first, then verify a stop bit (a 1). If the stop bit is missing, the frame gets thrown away and we start over.

// web/src/serial/modems/fsk/fsk-demodulator.ts
private processBit(bit: boolean): void {
  if (!this.inFrame) {
    if (!bit) {
      this.inFrame = true;
      this.currentByte = 0;
      this.bitIndex = 0;
    }
  } else {
    if (this.bitIndex < 8) {
      if (bit) {
        this.currentByte |= 1 << this.bitIndex;
      }
      this.bitIndex++;
    } else {
      if (bit) {
        this.byteBuffer.push(this.currentByte);
        if (this.byteBuffer.length >= 1) {
          this.emitData();
        }
      }
      this.inFrame = false;
      this.currentByte = 0;
      this.bitIndex = 0;
    }
  }
}

This is probably the smallest true frame model I could get away with. It works because the audio path is deterministic—no jitter, no dropped samples, no reason to hunt for sync. On a real phone line with noise and timing drift, you’d want something more robust. But I didn’t have a real phone line, so I didn’t build that.
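To see the frame detector's behaviour on concrete input, here is a standalone sketch of the same state machine that runs over a whole array of bits. `deframeBits` and `bitsFromString` are hypothetical helpers written for this example, not part of the demodulator.

```typescript
// Same start / 8 data (LSB first) / stop state machine as processBit, but
// applied to a whole array of bits and returning the decoded bytes.
function deframeBits(bits: boolean[]): number[] {
  const decoded: number[] = [];
  let inFrame = false;
  let currentByte = 0;
  let bitIndex = 0;
  for (const bit of bits) {
    if (!inFrame) {
      if (!bit) {
        // A 0 on an idle line is a start bit.
        inFrame = true;
        currentByte = 0;
        bitIndex = 0;
      }
    } else if (bitIndex < 8) {
      if (bit) {
        currentByte |= 1 << bitIndex;
      }
      bitIndex++;
    } else {
      // Stop-bit position: keep the byte only if the stop bit is a 1.
      if (bit) {
        decoded.push(currentByte);
      }
      inFrame = false;
    }
  }
  return decoded;
}

// Convenience for writing bit streams as strings of "0" and "1".
function bitsFromString(text: string): boolean[] {
  return text.split("").map((ch) => ch === "1");
}
```

For example, `deframeBits(bitsFromString("11" + "0101010101" + "11"))` yields `[0x55]`, while a frame whose tenth bit is 0 yields nothing and the detector goes back to waiting for the next start bit.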

The Handshake: 0x55, 0xAA, and a 5‑Second Wait

The audio modem uses a simple alternating pattern to lock timing and confirm that both sides speak the same standard:

// web/src/serial/audio-modem.ts
private readonly HANDSHAKE_PATTERN = new Uint8Array([0x55, 0xaa, 0x55, 0xaa]);
private readonly HANDSHAKE_ACK = new Uint8Array([0xaa, 0x55, 0xaa, 0x55]);

Why 0x55 and 0xAA? In binary, they’re 01010101 and 10101010: maximally alternating bit patterns that force frequent mark/space transitions. That exercises both frequencies equally and makes the pattern easy to detect, even if you’re slightly out of sync. I didn’t invent this; alternating bytes like these are a long-standing choice for sync and training patterns on serial links.
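It helps to look at what actually goes on the wire once the start and stop bits are added. A small sketch, with hypothetical helper names, that frames a byte the same way the modulator does (start bit 0, eight data bits LSB first, stop bit 1) and counts tone changes:

```typescript
// Frame one byte as it appears on the line: start bit, data LSB first, stop bit.
function framedBitString(byte: number): string {
  let out = "0"; // start bit (space)
  for (let bit = 0; bit < 8; bit++) {
    out += ((byte >> bit) & 1) === 1 ? "1" : "0";
  }
  return out + "1"; // stop bit (mark)
}

// Number of mark/space changes between neighbouring bits.
function countTransitions(bitString: string): number {
  let transitions = 0;
  for (let i = 1; i < bitString.length; i++) {
    if (bitString[i] !== bitString[i - 1]) {
      transitions++;
    }
  }
  return transitions;
}
```

Framed, 0x55 becomes `0101010101` (a tone change at every bit boundary) and 0xAA becomes `0010101011` (seven changes). Both frames contain five marks and five spaces, which is the "both frequencies equally" property.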

The handshake timeout is 5000 ms, which I increased from 3000 ms after the handshake audio itself turned out to be about 4200 ms. The number matters because the modem can’t be instant and still feel like a modem—you need the warble to actually play—but it also can’t be so slow that failed connections feel broken. I’m not entirely happy with using a fixed timeout instead of signaling from the audio system, but it works.

One bug that took me longer to find than I’d like: during handshake, I accumulated incoming bytes in a buffer looking for the pattern. But if the pattern didn’t arrive (wrong role, noise, whatever), the buffer just kept growing. Failed connections would leak memory. The fix was simple once I saw it: when the buffer exceeds 100 bytes, keep the last 50 and discard the rest.
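A bounded handshake buffer along those lines might look like the sketch below. The function and constant names are hypothetical and the real AudioModem code isn't shown. The trim rule used here (over 100 bytes, keep the newest 50) is the one described in the bug list later in this post.

```typescript
// Hypothetical constants; the 100/50 numbers follow the fix described in this post.
const MAX_HANDSHAKE_BUFFER = 100;
const KEEP_AFTER_TRIM = 50;

// Append newly received bytes, then trim so the buffer cannot grow without bound.
function appendAndTrim(buffer: number[], incoming: Uint8Array): number[] {
  const combined = buffer.slice();
  for (let i = 0; i < incoming.length; i++) {
    combined.push(incoming[i]);
  }
  return combined.length > MAX_HANDSHAKE_BUFFER
    ? combined.slice(-KEEP_AFTER_TRIM) // keep only the newest bytes
    : combined;
}

// Naive subsequence search; fine for a buffer that never exceeds ~100 bytes.
function containsPattern(buffer: number[], pattern: number[]): boolean {
  for (let start = 0; start + pattern.length <= buffer.length; start++) {
    let matched = true;
    for (let j = 0; j < pattern.length; j++) {
      if (buffer[start + j] !== pattern[j]) {
        matched = false;
        break;
      }
    }
    if (matched) {
      return true;
    }
  }
  return false;
}
```

Keeping the newest 50 bytes leaves plenty of room for a 4-byte pattern. A pattern that is only partly received when a trim happens still survives, because the tail of the buffer is never discarded.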

The Adapter: Where Serial Meets Audio

The adapter is the impedance matcher between serial bytes and audio samples. It owns the AudioModem, feeds it bytes from the serial port, pulls audio for the line, and pushes incoming audio back through the demodulator. It’s the only piece that has to understand both clocks—the byte clock from the serial port and the sample clock from the audio path.

// web/src/serial/audio-modem-adapter.ts
receiveAudioFromLine(samples: Float32Array): void {
  if (this.audioModem) {
    this.audioModem.processAudio(samples);
  }
  if (this.audioPlayback) {
    this.audioPlayback.playRxAudio(samples);
  }
}

getAudioForLine(): Float32Array | null {
  if (!this.audioModem) {
    return null;
  }
  const samples = this.audioModem.getAudioOutput();
  if (samples && this.audioPlayback) {
    this.audioPlayback.playTxAudio(samples);
  }
  return samples;
}

I spent a while trying to figure out whether to put more logic in the adapter or push it down into the modem itself. I ended up keeping the adapter thin—it just wires things together—and letting the modem handle its own state. That made each side easier to test in isolation, though I’m not sure it was the only reasonable choice.

Audio Playback: The Debugging Tool That Stayed

The audio playback layer started as a debugging tool. When I couldn’t figure out why bits were getting mangled, I piped the audio through speakers and listened. TX pans left, RX pans right, volume defaults to 0.3 so the tones are audible but not punishing. I could literally hear when the handshake succeeded versus failed—the rhythm changes, the pitches align or don’t.

It turned out to be useful enough to keep as a feature. When the modem is connected, you hear the warble, and that sound carries information. There’s something deeply satisfying about debugging a 1962 protocol by ear.

What Went Wrong Along the Way

I mentioned the frequency band wiring and the phase reset problem earlier, but let me list out the bugs more explicitly, because I think they’re instructive:

Phase resets killed demodulation. I was resetting phase on every frequency change, thinking it would keep things mathematically clean. Instead, every bit boundary had a click—a sharp transient that the Goertzel detector interpreted as energy at all frequencies. The fix was trivial once I understood it: keep phase continuous, only change the frequency. But I didn’t understand it for a while.

Both modems were shouting into the same band. This one was embarrassing. I had the originate modem transmitting at 1270/1070 and listening at… 1270/1070. Same for the answer modem. Each modem could hear itself perfectly, but they couldn’t hear each other at all. I kept checking the modulator, the demodulator, the sample rate, the Goertzel coefficients—everything except the obvious fact that I’d wired them to the same frequencies. The spectrogram eventually made it clear, but not before I’d wasted most of an afternoon.

The handshake buffer grew unbounded. During handshake, I accumulated incoming bytes looking for the 0x55/0xAA pattern. If the pattern never arrived, the buffer just grew. And grew. Failed connections would accumulate garbage until the next successful connection cleared it. The fix was a simple length check: when the buffer exceeds 100 bytes, keep the last 50 and discard the rest.

Handshake timeout was too short. I originally set it to 3000 ms, but the handshake audio takes about 4200 ms to play through. So the connection would time out before the audio finished, which looked like a mysterious failure. I bumped the timeout to 5000 ms and added a comment noting that this should really be signaled from the audio system instead of being a magic number.

Each of these bugs was fundamentally a misunderstanding of the system—the frequencies, the phase, the timing. Once I got the mental model right, the fixes were all straightforward.

How It All Fits Together

Serial Port → AudioModemAdapter → AudioModem → FSK Mod/Demod → Line (Audio)

Bell 103 is one layer in that chain, but it’s the layer that forces the system to be honest about timing, framing, and frequency separation. The warble isn’t a sound effect added for atmosphere—it’s the artefact of those constraints actually being enforced.

Bell 103 is slow enough that I can listen to my bits—26.7 samples per bit, 300 bits per second, and every one of them audible if I pipe the output through speakers. There’s something grounding about that. When something goes wrong, I can hear it before I see it in the logs. And when it works, the warble is weirdly satisfying—proof that the protocol is actually running, not just being simulated.

I’m not sure I’d recommend this approach for anything practical. But for making an emulator feel like a time machine? It was worth the debugging.