2026 / 02 · 8 min read

Deep Dive: Audio Bus Architecture

Centralising modem, line, backend, and peripheral audio in a single bus system, and why fewer hops beat fancy decoupling.

emulator web-audio tone-js architecture audio

The master volume slider only worked for some of the sounds. I noticed it during a demo—the dial tone respected the knob, but the floppy drive’s seek clatter played at full volume regardless. That single discrepancy bothered me more than it probably should have, because it meant the audio system wasn’t really a system at all, just a pile of components that happened to make noise.

If you ended up here hoping I have a clean architecture to copy, I sort of do now, but I want to be honest about how messy it was before. Because the “before” is actually more instructive than the “after.”

The Mess I Started With

Before the bus architecture, every component that wanted to make sound did its own thing. The modem panel created its own AudioContext. The floppy drive peripheral had a PeripheralAudioHelper that made another AudioContext. The line tones went through Tone.js. Some generators connected to the Tone.js destination, some went straight to the Web Audio destination, and the Jenny backend just played WAV files wherever it felt like. (I’m honestly not sure how this even worked for as long as it did.)

The result was that each component had its own volume, its own path to the speakers, its own ideas about what “loud” meant. When I added the volume knob to the modem panel, it controlled… some of the sounds. The polite ones. The dial tone would get quieter, but the floppy drive would keep clattering at full blast, and if you were wearing headphones during a demo—well, I learned not to wear headphones during demos.

I also had this elaborate pub/sub system where audio events bounced through what turned out to be eight hops before reaching a generator:

1. Line emits an event
2. Line notifies callbacks
3. Modem receives it
4. Modem re-emits it unchanged (why?)
5. Audio coordinator forwards it to the EventBus
6. EventBus iterates through subscribers (of which there was exactly one)
7. The subscriber runs a switch statement
8. Finally a generator plays the sound

I drew this out once and realised the Modem layer was just passing events through without doing anything. Pure overhead.

What I Actually Wanted

Here’s the thing: I didn’t set out to redesign the audio architecture. I just wanted the volume knob to work. But once I started tracing why the floppy drive ignored the master volume, I kept finding more components that had their own audio paths, and eventually I had to admit that the system needed a proper spine.

The goal was simple, even if getting there wasn’t: every sound in the emulator should pass through a single master gain node before reaching the speakers. One knob, everything responds. That’s it.

The Architecture at a Glance

The shape I ended up with is straightforward—generators connect to buses, buses pass through filters, filters converge on a master chain:

Generators → Buses → Bus Filters → Master Reverb → Master Gain → Destination

I have four buses now, and honestly I’m not certain four is the right number, but it’s been working. There’s the LineBus for central-office tones—dial tone, ringback, busy signal—basically anything that would come over the phone line from the telephone company’s equipment. The ModemBus handles DTMF digits and handshake audio, the sounds you’d hear from the modem’s speaker. The BackendBus carries backend-generated audio like Jenny’s voice, and the PeripheralBus handles mechanical sounds—motors spinning up, disk seeks, the clatter of a line printer.

Each bus has its own filters to constrain the character of its sounds. The line bus gets a 4kHz lowpass because that’s roughly the bandwidth of a phone line. The modem bus gets a 3.5kHz filter plus some EQ because modem speakers were tiny and terrible. (I spent way too long tweaking the EQ settings trying to get that authentic “sound coming from a small speaker in a plastic box” quality. Whether I succeeded is debatable.) The peripheral bus has an 8kHz filter, and the backend bus runs clean because pre-recorded speech doesn’t need me colouring it further.

The Master Chain

The AudioStateManager builds the entire chain in one place, which in retrospect seems obvious but took me longer than I’d like to admit to realise was necessary. Nothing calls Tone.start() until user interaction allows it—the graph is constructed immediately, but the context starts lazily. (Browser audio autoplay policies are one of those things I keep having to re-learn.)

```typescript
// web/src/audio/audio-state-manager.ts
this.masterGain = new Tone.Gain(gainCoef).toDestination();
this.masterReverb = new Tone.Reverb({ decay: 1.5, preDelay: 0.01, wet: 0.15 })
  .connect(this.masterGain);

// Line bus: 4 kHz lowpass, roughly the bandwidth of a phone line
this.lineFilter = new Tone.Filter({ frequency: 4000, type: 'lowpass', rolloff: -12 })
  .connect(this.masterReverb);
this.lineBus = new Tone.Gain(1).connect(this.lineFilter);

// Modem bus: 3.5 kHz lowpass plus EQ for the "tiny speaker" character
this.modemEQ = new Tone.EQ3({ low: -3, mid: 0, high: -6, lowFrequency: 400, highFrequency: 2500 })
  .connect(this.masterReverb);
this.modemFilter = new Tone.Filter({ frequency: 3500, type: 'lowpass', rolloff: -24 })
  .connect(this.modemEQ);
this.modemBus = new Tone.Gain(1).connect(this.modemFilter);

// Backend bus: no filtering, pre-recorded speech stays clean
this.backendBus = new Tone.Gain(1).connect(this.masterReverb);

// Peripheral bus: 8 kHz lowpass for mechanical sounds
this.peripheralFilter = new Tone.Filter({ frequency: 8000, type: 'lowpass', rolloff: -12 })
  .connect(this.masterReverb);
this.peripheralBus = new Tone.Gain(1).connect(this.peripheralFilter);
```
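That lazy start is easy to get subtly wrong, so here's a minimal sketch of the guard I mean. The helper name `ensureAudioStarted` is hypothetical, not the actual code; the point is that the flag is set before awaiting, so a double-click can't start the context twice:

```typescript
// Hypothetical guard: the graph is built eagerly, but the context
// only starts in response to a user gesture, and only once.
let audioStartRequested = false;

async function ensureAudioStarted(start: () => Promise<void>): Promise<void> {
  if (audioStartRequested) return;
  audioStartRequested = true; // set before awaiting so concurrent calls bail out
  await start();              // e.g. () => Tone.start() inside a click handler
}
```

Wired up, that's something like `button.addEventListener('click', () => ensureAudioStarted(() => Tone.start()))`.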

The master gain defaults to -30dB in the constructor, which is roughly 3% amplitude. That might seem absurdly quiet, but I learned the hard way that web audio bugs are loud when they fail—a stuck oscillator at full volume will damage your relationship with your headphones and possibly your ears. Generators also run at conservative levels (-18dB is the common baseline in the config) to keep headroom for mixing. When four or five sounds play simultaneously, you want room to breathe, and you really want room for the inevitable moment when something goes wrong.
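For reference, the conversion behind those numbers is just the standard decibel-to-amplitude formula (Tone.js ships a `Tone.dbToGain` utility for this; the standalone function below is only the math):

```typescript
// Decibels to linear amplitude: gain = 10^(dB / 20).
function dbToGain(db: number): number {
  return Math.pow(10, db / 20);
}

dbToGain(-30); // ≈ 0.0316, i.e. roughly 3% amplitude
dbToGain(-18); // ≈ 0.126, the generator baseline mentioned above
```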

Killing the Pub/Sub Indirection

The pub/sub layer was the part that annoyed me most. I’d set it up thinking it was good architecture—decoupling! events! clean separation!—but in practice it was just a place for bugs to hide.

Events would arrive out of order. Handlers would miss teardown and keep playing sounds after the call ended. The indirection made tracing the audio flow genuinely difficult; I’d be looking at a dial tone that wouldn’t stop and have to trace through Line to Modem to AudioEventBus to AudioCoordinator to the generator, and somewhere in that chain was the bug, but good luck finding it quickly.

I replaced the whole thing with a direct subscription. Three hops total:

```typescript
// web/src/main/modules/audio-coordinator.ts
const line: Line = connectionManager.getLine();
const handleAudioEvent = createAudioEventHandler({ /* handlers */ });
line.onAudio((event: LineAudioEvent) => {
  handleAudioEvent(event);
});
```

Line emits an event, coordinator receives it, generator plays the sound. That’s it. The indirection is gone. When a dial tone doesn’t stop now, I know exactly where to look—there are only three places it could be. Whether this will scale if I need more complex audio behaviour later, I honestly don’t know. But for now it’s much easier to debug, and debuggability matters more to me than theoretical purity.
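For illustration, a dispatch-table factory like `createAudioEventHandler` might look something like this. The event names and shapes here are made up, since the real handlers are elided above; the shape is just a lookup that replaces the old switch statement:

```typescript
// Hypothetical sketch of a dispatch-table event handler factory.
type LineAudioEvent = { type: string; payload?: unknown };
type Handler = (event: LineAudioEvent) => void;

function createAudioEventHandler(handlers: Record<string, Handler>): Handler {
  return (event) => {
    const handler = handlers[event.type];
    if (handler) handler(event); // unknown event types are ignored, not errors
  };
}
```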

PeripheralAudioHelper: Making Peripherals Portable

One problem I ran into is that peripherals need to work both inside the full emulator and in standalone demo pages. The floppy drive can’t assume the bus infrastructure exists—it might be running on a test page where there’s no AudioStateManager, no master chain, nothing.

The PeripheralAudioHelper handles this by trying the bus first and falling back to direct output:

```typescript
// web/src/peripheral/devices/shared/audio-helper.ts
const bus = peripheralBus || globalPeripheralBus.getBus();

if (bus) {
  this.outputNode.connect(bus);
} else {
  this.outputNode.toDestination();
}
```

The helper also subscribes to global volume changes, so a floppy drive respects the same master volume as the modem speaker whether or not the full bus infrastructure is present. This fallback pattern means peripherals stay portable—useful when I want to test a cassette deck in isolation without spinning up the entire emulator. I’m not totally happy with the “try bus, fall back to destination” pattern (it feels a bit magical), but it works and I haven’t thought of anything cleaner.
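The global volume subscription mentioned above could be sketched as a tiny observable. This is a hypothetical reconstruction, not the actual helper, but it shows the two properties that matter: subscribers get the current value immediately, and they get an unsubscribe handle for teardown:

```typescript
// Hypothetical global volume observable.
type VolumeListener = (gain: number) => void;

class GlobalVolume {
  private listeners = new Set<VolumeListener>();
  private gain = 1;

  subscribe(listener: VolumeListener): () => void {
    this.listeners.add(listener);
    listener(this.gain);                          // push the current value immediately
    return () => this.listeners.delete(listener); // unsubscribe handle for teardown
  }

  set(gain: number): void {
    this.gain = gain;
    for (const listener of this.listeners) listener(gain);
  }
}
```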

Backend Audio

The Jenny backend now routes through the backend bus using the same pattern. If the bus is missing, it falls back to destination. The backend doesn’t own routing; it asks for a bus and trusts the global system to provide one.

I like this consistency—every audio source asks the same question (“do you have a bus for me?”) and gets the same shape of answer. Whether it’s actually good design or just consistent design, I’m not sure, but at least it’s predictable.

The Constraint That Forced All This

Looking back, the constraint was never “better architecture.” The constraint was consistent volume. A modem handshake and a floppy seek should be the same volume relative to the master, regardless of which component initialised first, regardless of which AudioContext they originally targeted. Without buses, that’s just not possible to guarantee.

Once everything goes through a bus, the master gain actually means something. That’s the smallest truth I needed, and the rest of the audio system now hangs off it. The filters, the reverb, the state machine—those are conveniences. The bus is the constraint.

What This Enables (and What I Didn’t Expect)

The audio now feels like a system instead of a pile of generators. You can still hear a floppy seek and a dial tone at the same time, but they no longer fight for ownership of the speakers. When something is too loud, I know where to look—the generator level, the bus level, or the master. When I want to add a new peripheral, I don’t need to study the volume implementation; I just connect to the bus and the volume follows.

The architecture also made room for something I didn’t anticipate: live tuning. Because all the filter frequencies and EQ gains live in one place, I can expose them in a debug panel and adjust them while the emulator runs. I can make the dial tone sound more telephone-like or less, hear the difference immediately, tweak it again. That kind of iteration would have been painful when each component owned its own audio chain. (I spent an embarrassing amount of time adjusting the modem EQ to get that “tiny speaker in a plastic case” sound. Whether anyone notices or cares, I don’t know, but I notice.)
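As a sketch of what live tuning implies in code: a debug slider feeds a filter parameter, with the value clamped into a safe range first so a tweak can't push a filter somewhere unreasonable. `applyCutoff` and the setter are hypothetical names, not the real panel code:

```typescript
// Hypothetical debug-panel helper: clamp a requested cutoff into a safe
// range before applying it to a live filter parameter.
function applyCutoff(
  setFrequency: (hz: number) => void,
  requestedHz: number,
  minHz = 100,
  maxHz = 12000,
): number {
  const hz = Math.min(maxHz, Math.max(minHz, requestedHz));
  setFrequency(hz); // e.g. (hz) => audioState.lineFilter.frequency.rampTo(hz, 0.05)
  return hz;
}
```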

The trade-off is coupling—every audio source now depends on the bus infrastructure. But that coupling is intentional. The alternative was freedom that nobody wanted: each component free to choose its own volume, its own context, its own path to the speakers. That freedom produced chaos. The buses produce a mix.

Maybe I’ll regret this design when I need to do something the buses can’t accommodate. But for now, the volume knob works on everything, and that’s all I really wanted.