2026 / 02 | 5 min read

Deep Dive: Audio Sonification for Code Visualization

Turning code evolution into sound—Web Audio API oscillators, harmonic frequency mapping, and the pursuit of a THX-like evolving chord.

code-evolution-analyzer web-audio visualization javascript


Each programming language becomes a voice in an evolving chord


The Code Evolution Analyzer already had a compelling visual: animated charts showing language composition changing over time. But I wanted to add another dimension. What if you could hear the code evolving?

The idea: each programming language gets an oscillator. The more code in that language, the louder its voice. As the visualization plays through commits, you hear the codebase’s composition as a shifting chord. JavaScript growing? Its voice rises. Python shrinking? It fades.

This is called sonification—representing data through sound. It’s completely unnecessary for a code analyzer. I did it anyway.

The Basic Setup

Web Audio API provides the building blocks: oscillators, gain nodes, and filters. The initial implementation was straightforward:

// Initial audio setup
const FUNDAMENTAL_FREQ = 110;  // A2 - bass foundation
const MAX_VOICES = 16;         // One per language (matches color palette)

let audioCtx = null;
let voices = [];

function initAudio() {
  audioCtx = new (window.AudioContext || window.webkitAudioContext)();
  
  // Master gain for overall volume control
  const masterGain = audioCtx.createGain();
  masterGain.connect(audioCtx.destination);
  
  // Shared lowpass filter for warmth
  const filter = audioCtx.createBiquadFilter();
  filter.type = 'lowpass';
  filter.frequency.value = 2500;  // Cap the brightness
  filter.connect(masterGain);
  
  // Create oscillator pool
  for (let i = 0; i < MAX_VOICES; i++) {
    const osc = audioCtx.createOscillator();
    const gain = audioCtx.createGain();
    
    osc.type = 'sawtooth';
    osc.frequency.value = FUNDAMENTAL_FREQ * (i + 1);  // Harmonic series
    gain.gain.value = 0;  // Start silent
    
    osc.connect(gain);
    gain.connect(filter);
    osc.start();
    
    voices.push({ osc, gain });
  }
}

Sawtooth waves have rich harmonics, giving the sound presence. The lowpass filter at 2500Hz prevents harshness while keeping the character. Each language maps to a harmonic of the fundamental frequency—classic organ registration stuff.

The Mud Problem

This worked, but it sounded… bad. Really bad.

The issue: in most codebases, one or two languages dominate. Express.js repositories are 70-90% JavaScript. When JavaScript’s oscillator is that much louder than everything else, all you hear is a big bass drone with faint harmonics.

Worse, the harmonic series (1×, 2×, 3× the fundamental, and so on) puts all the major languages close together in pitch. JavaScript at 110Hz, Python at 220Hz, TypeScript at 330Hz—they all blur into one muddy low-frequency blob.

(I spent an embarrassing amount of time wondering why my speakers sounded broken. They weren’t.)

Multi-Octave Frequency Mapping

The fix was to spread languages across multiple octaves with carefully chosen intervals:

// Frequency mapping: Wide spacing for harmonic clarity
// Primary languages get very different registers
const VOICE_FREQUENCIES = [
  130.81,  // C3  - 1st language: Deep bass foundation
  329.63,  // E4  - 2nd: Bright, major third harmony
  392.00,  // G4  - 3rd: High, perfect fifth above C4 (completes the triad)
  261.63,  // C4  - 4th: Octave above primary
  196.00,  // G3  - 5th: Perfect fifth above C3
  440.00,  // A4  - 6th: Concert pitch, very bright
  293.66,  // D4  - 7th: Middle voice
  246.94,  // B3  - 8th: Leading tone feel
  174.61,  // F3  - 9th: Subdominant
  220.00,  // A3  - 10th: Low-mid range
  349.23,  // F4  - 11th: Upper-mid
  493.88,  // B4  - 12th: Highest voice
  155.56,  // Eb3 - 13th: Minor color
  311.13,  // Eb4 - 14th: Minor color octave
  277.18,  // C#4 - 15th: Sharp edge
  415.30   // G#4 - 16th: Sharp edge high
];

The top three languages (by code volume) now span an octave and a fifth: C3 at 131Hz, E4 at 330Hz, G4 at 392Hz. They’re musically pleasant (thirds and fifths) but clearly distinct. When JavaScript grows, you hear bass rising. When TypeScript grows, you hear brightness emerging.

The intervals aren’t arbitrary—the main voices are drawn from an extended C major chord, with a few chromatic colors (Eb, C#, G#) reserved for the lowest-ranked voices. Even when multiple languages are loud simultaneously, they form something approximating a chord rather than noise.
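
The assignment itself is simple: rank languages by how much code they contribute and hand out frequencies in that order, biggest first. Here is a minimal sketch of that mapping, assuming each language’s total line count is available; the helper name assignVoices is mine, not the analyzer’s:

// Rank languages by code volume and pair each with a voice frequency.
// Index 0 (the largest language) gets the C3 bass foundation.
function assignVoices(languageTotals) {
  return Object.entries(languageTotals)
    .sort(([, a], [, b]) => b - a)              // largest share first
    .slice(0, VOICE_FREQUENCIES.length)         // at most 16 voices
    .map(([lang], i) => ({ lang, freq: VOICE_FREQUENCIES[i] }));
}

// e.g. { JavaScript: 120000, TypeScript: 40000, Python: 9000 }
// → JavaScript sits on C3 (bass), TypeScript on E4, Python on G4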

The Thrumming Problem

Even with better frequency spacing, something was wrong. The sound had a rhythmic “thrumming”—a beating pattern that wasn’t in the data.

The culprit: phase interference. All oscillators started at the same moment, so their waveforms were phase-aligned. When two phase-locked oscillators sit at closely related frequencies, their peaks and troughs periodically reinforce and cancel, and you hear that interference as a slow pulsing.

// Before: All oscillators start at 0 phase
osc.frequency.value = frequency;
osc.detune.value = 0;

// After: Decorrelate phases with slight random detuning
osc.frequency.value = frequency;
// Add ±3 cents of random detuning (imperceptible pitch change)
const baseDetune = (Math.random() - 0.5) * 6;
osc.detune.value = baseDetune;

A few cents of random detuning is too small to hear as pitch variation, but it’s enough to decorrelate the phase relationships. The thrumming disappeared.
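
Since the growth-based detune added later gets layered on top of this offset, the per-voice value is worth keeping around. A sketch of how it might slot into the oscillator pool from the setup code; storing it as voice.baseDetune is my convention, not necessarily what the analyzer does:

for (const voice of voices) {
  // ±3 cents: inaudible as a pitch change, enough to keep phases from locking
  const baseDetune = (Math.random() - 0.5) * 6;
  voice.osc.detune.value = baseDetune;
  voice.baseDetune = baseDetune;  // later detune ramps add on top of this
}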

Perceptual Gain Scaling

Human hearing isn’t linear. A sound at 10% gain doesn’t sound half as loud as one at 20%—it sounds much quieter. To make smaller languages actually audible, I needed perceptual scaling:

// Linear gain: 10% codebase = 10% volume (nearly inaudible)
// With power curve (x^0.4): 10% codebase ≈ 40% volume (clearly audible)
const audioSettings = {
  gainCurvePower: 0.4  // Lower = more boost for quiet voices
};

function updateAudio() {
  // Raw gain is proportion of codebase (0-1)
  const rawGain = languageLines / totalLines;
  
  // Power curve boosts quiet languages
  // x^0.4 means 0.1 → 0.40, 0.2 → 0.53, 0.5 → 0.76
  const perceivedGain = Math.pow(rawGain, audioSettings.gainCurvePower);
  
  voice.gain.gain.linearRampToValueAtTime(perceivedGain, rampEnd);
}

This makes minor languages contribute meaningfully to the sound. You can hear when that 5% of Markdown documentation appears or disappears.
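
A quick sanity check of the curve, evaluating the same formula for a few proportions (values rounded to two decimals):

const gainCurvePower = 0.4;
for (const share of [0.05, 0.1, 0.3, 0.7]) {
  console.log(share, '→', Math.pow(share, gainCurvePower).toFixed(2));
}
// 0.05 → 0.30, 0.1 → 0.40, 0.3 → 0.62, 0.7 → 0.87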

Dynamic Pitch Variation

The sound was still too static. A language at 30% stays at constant volume—boring. I added dynamic pitch variation based on growth trend:

// Detune based on growth rate between commits
const prevValue = prevCommit?.languages[lang]?.code || 0;
const value = commit.languages[lang]?.code || 0;

let detune = 0;
if (prevValue > 0) {
  // Growth rate: -1 to +∞, but we cap at ±50 cents
  const growthRate = (value - prevValue) / prevValue;
  detune = Math.max(-50, Math.min(50, growthRate * 50));
} else if (value > 0) {
  // New language appeared - slight pitch up
  detune = 25;
}

voice.osc.detune.linearRampToValueAtTime(baseDetune + detune, rampEnd);

Growing languages pitch up slightly. Shrinking languages pitch down. New languages arrive with a subtle rising tone. It adds life to the sound without being obvious.

Stereo Field

Sixteen voices in mono get crowded. Adding stereo spread separates them:

// Create stereo panner for each voice
const panner = audioCtx.createStereoPanner();

// Spread across stereo field: first voice far left, last far right
const basePan = (i / (MAX_VOICES - 1)) * 2 - 1;  // -1 to +1
panner.pan.value = basePan * audioSettings.stereoWidth;  // stereoWidth defaults to 0.7

osc.connect(gain);
gain.connect(panner);
panner.connect(filter);

At 70% stereo width (the default), dominant languages spread across the soundstage. JavaScript on the left, Python right, TypeScript center-right. Your ears can now distinguish them even when they’re similarly loud.

The Pulsing Bug

Late in development, users reported “pulsing”—the sound would briefly dip every frame even when nothing changed.

The bug was in my update logic:

// BUGGY: Silence all voices, then set active ones
function updateAudio() {
  // First, silence everyone
  for (const voice of voices) {
    voice.gain.gain.linearRampToValueAtTime(0, rampEnd);
  }
  
  // Then set active voices
  for (const [lang, data] of activeLanguages) {
    const voice = voices[langIndex];
    voice.gain.gain.linearRampToValueAtTime(data.gain, rampEnd);
  }
}

See the problem? Every voice receives two ramp commands per frame: first to zero, then to its target. When frames update faster than the 50ms ramp time, voices briefly dip toward zero before recovering. You hear this as pulsing.

The fix: only send ramp commands to voices that are actually changing:

// FIXED: Send exactly one ramp command per voice per frame
function updateAudio() {
  const activeVoices = new Set();
  const activeGain = [];  // target gain per voice index
  
  // Collect active voice indices and their target gains
  for (const [lang, data] of activeLanguages) {
    activeVoices.add(langIndex);
    activeGain[langIndex] = data.gain;
  }
  
  // Update all voices: active get target, inactive get silence
  for (let i = 0; i < voices.length; i++) {
    const voice = voices[i];
    
    if (activeVoices.has(i)) {
      // Active: set gain directly (no conflicting silence command)
      voice.gain.gain.linearRampToValueAtTime(activeGain[i], rampEnd);
    } else {
      // Inactive: fade to silence
      voice.gain.gain.linearRampToValueAtTime(0, rampEnd);
    }
  }
}

Active voices receive one command. Inactive voices receive one command. No conflicts, no pulsing.

Pre-Computing Audio Data

All this processing—sorting languages, computing gains, calculating detunes—was happening in the browser at 30fps. That’s wasteful. The data doesn’t change; why recalculate it every frame?

I moved the audio calculations to build time:

// At build time: compute all audio state
function computeAudioData(results, allLanguages) {
  const audioData = [];
  
  for (const commit of results) {
    const frameData = {};
    
    // For each metric (lines, files, bytes)
    for (const metric of ['code', 'files', 'bytes']) {
      const total = /* sum of metric across languages */;
      const activeVoices = [];
      
      for (let v = 0; v < MAX_VOICES; v++) {
        const lang = allLanguages[v];
        const value = commit.languages[lang]?.[metric] || 0;
        const gain = total > 0 ? value / total : 0;
        
        if (gain === 0) continue;  // Sparse: skip silent voices
        
        const detune = /* calculated from growth rate */;
        activeVoices.push([v, 
          Math.round(gain * 100) / 100,      // 2 decimal precision
          Math.round(detune * 10) / 10       // 1 decimal precision
        ]);
      }
      
      frameData[metric] = [masterIntensity, ...activeVoices];
    }
    
    audioData.push(frameData);
  }
  
  return audioData;
}

The sparse format (only storing non-zero voices) reduces data size by 22%. The browser just reads pre-computed arrays at playback—no sorting, no division, no conditionals per frame.
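
For reference, here is roughly all the playback side has to do with one of those frames. This is a sketch under the frame layout above, not the analyzer’s actual player: applyFrame is my name, rampEnd and the voices pool come from the earlier snippets, voice.baseDetune is the per-voice offset from the detuning sketch, and I’m assuming masterIntensity simply scales overall loudness:

// Apply one pre-computed frame: each sparse entry is [voiceIndex, gain, detune]
function applyFrame(frameData, metric, rampEnd) {
  const [masterIntensity, ...activeEntries] = frameData[metric];
  const active = new Set();

  for (const [v, gain, detune] of activeEntries) {
    active.add(v);
    const voice = voices[v];
    voice.gain.gain.linearRampToValueAtTime(gain * masterIntensity, rampEnd);
    voice.osc.detune.linearRampToValueAtTime(voice.baseDetune + detune, rampEnd);
  }

  // One command per voice: anything absent from the sparse frame fades to silence
  for (let i = 0; i < voices.length; i++) {
    if (!active.has(i)) voices[i].gain.gain.linearRampToValueAtTime(0, rampEnd);
  }
}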

Was It Worth It?

From a practical standpoint? Absolutely not. Nobody needs audio to understand code history.

From a learning standpoint? Completely worth it. I now understand:

  • Web Audio API timing and parameter automation
  • Perceptual audio scaling
  • Phase interference and detuning
  • Sparse data formats for audio streams
  • Why musicians care about frequency ratios

And the result is legitimately cool. Watching a codebase grow while hearing it transform from a single bass note into a rich chord—that’s satisfying in a way charts alone can’t match.

Turn on the sound at analyze.devd.ca. You’ll see.


See also: Building a Code Evolution Analyzer in a Weekend — the full project story