The JavaScript Web Audio API: Audio Processing and Synthesis
Overview
The JavaScript Web Audio API is a high-level browser API for processing and synthesising audio. Where the <audio> element plays back recorded files, the Web Audio API lets you build audio from scratch: synthesisers, drum machines, real-time effects chains, and visualisations.
The architecture is a directed graph of audio nodes. You connect a source node to processing nodes (gain, filters, effects) and finally to a destination (usually your speakers). Every node does one job, and chaining them together is what makes the API powerful.
The AudioContext
The AudioContext is the foundation. It represents the audio processing graph and owns all the nodes. Create one when you need audio:
const audioCtx = new AudioContext();
Modern browsers suspend the audio context automatically to comply with autoplay policies. Until the user clicks, taps, or presses a key, the context stays in a "suspended" state and produces no sound. Always check the state and resume explicitly inside a user-gesture handler:
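// Resume on the first click; { once: true } removes the listener afterwards
document.addEventListener("click", async () => {
  if (audioCtx.state === "suspended") {
    await audioCtx.resume();
  }
}, { once: true });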
Once the context is running, you can load audio from a URL, decode it into a buffer, and play it through a source node. This is the most direct way to get audio into the graph and the starting point for most playback workflows.
Playing a sound file
Load and play an audio file through a buffer source. The fetch call grabs the raw bytes, decodeAudioData converts them into a playable AudioBuffer, and createBufferSource wraps that buffer into a node you can connect:
async function playSound(url) {
  const response = await fetch(url);
  const arrayBuffer = await response.arrayBuffer();
  const audioBuffer = await audioCtx.decodeAudioData(arrayBuffer);
  const source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);
  source.start(0);
}
The AudioBufferSourceNode plays back the decoded audio. It is single-use: once started, it cannot be restarted, so create a new source node for each playback. This one-shot design keeps the API predictable, because each source node represents a single play-through of its buffer. The decoded AudioBuffer itself can be shared by as many source nodes as you like, and the nodes are inexpensive to create.
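One way to work with single-use source nodes is to decode once and wrap node creation in a small function. The sketch below assumes a hypothetical helper named playBuffer (it is not part of the Web Audio API) and an AudioBuffer that has already been decoded:

```javascript
// Hypothetical helper (not part of the Web Audio API): plays an already
// decoded AudioBuffer once by creating a fresh single-use source node.
function playBuffer(ctx, buffer, when = 0) {
  const source = ctx.createBufferSource();
  source.buffer = buffer;
  source.connect(ctx.destination);
  source.start(when); // 0 means "as soon as possible"
  return source; // keep the reference if you need to call stop() early
}
```

Calling playBuffer(audioCtx, audioBuffer) twice yields two independent playbacks of the same buffer, and passing audioCtx.currentTime + 0.5 as the third argument delays the start by half a second.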
Creating sound with oscillators
The OscillatorNode generates a tone at a specific frequency. It is the building block of synthesis, useful for everything from simple beeps to layered synth patches. The type property selects the waveform shape, and the frequency AudioParam sets the pitch in hertz:
const osc = audioCtx.createOscillator();
osc.type = "sine"; // sine, square, sawtooth, triangle
osc.frequency.value = 440; // A4 in Hz
osc.connect(audioCtx.destination);
osc.start();
osc.stop(audioCtx.currentTime + 1); // stop after 1 second
The oscillator runs until you call stop(). Without stopping it, it plays forever, so always schedule a stop time. AudioParam methods let you schedule precise frequency changes ahead of time rather than setting values synchronously. The browser honours the schedule even if your main thread is busy, which prevents timing jitter during melodies and sweeps:
osc.frequency.setValueAtTime(440, audioCtx.currentTime); // at now
osc.frequency.setValueAtTime(880, audioCtx.currentTime + 0.5); // at +0.5s
osc.frequency.linearRampToValueAtTime(1760, audioCtx.currentTime + 1); // glide
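The same scheduling calls extend naturally to a short melody. The following is a minimal sketch, assuming a hypothetical helper named scheduleNotes and a fixed note length; neither comes from the API itself:

```javascript
// Hypothetical helper: schedules one pitch per step on a frequency AudioParam.
// `frequencies` is an array of values in hertz; `stepSeconds` is the note length.
function scheduleNotes(frequencyParam, startTime, frequencies, stepSeconds) {
  frequencies.forEach((hz, i) => {
    frequencyParam.setValueAtTime(hz, startTime + i * stepSeconds);
  });
  // Return the time at which the last note ends, handy for osc.stop().
  return startTime + frequencies.length * stepSeconds;
}
```

A typical call would be scheduleNotes(osc.frequency, audioCtx.currentTime, [440, 494, 523], 0.25), with the returned time passed to osc.stop() so the oscillator ends with the last note.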
Controlling volume with GainNode
GainNode controls volume. Its gain parameter is an AudioParam, so besides assigning gain.gain.value directly you can schedule values on it. This lets you create smooth fades, envelopes, and ducking effects without writing your own ramp logic. The scheduling API is the same AudioParam interface you used with oscillator frequency:
const gain = audioCtx.createGain();
gain.gain.setValueAtTime(0, audioCtx.currentTime);
gain.gain.linearRampToValueAtTime(1, audioCtx.currentTime + 0.1); // fade in
gain.gain.setValueAtTime(1, audioCtx.currentTime + 2);
gain.gain.linearRampToValueAtTime(0, audioCtx.currentTime + 3); // fade out
With the gain envelope defined, the next step is placing the GainNode in the signal chain. Connect the oscillator into the gain node, then the gain node into the destination. The order matters: the oscillator produces the raw signal, the gain scales it, and the destination sends it to the speakers:
osc.connect(gain);
gain.connect(audioCtx.destination);
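The fade-in, hold, and fade-out pattern above can be wrapped in a reusable function. This is a sketch under the assumption of a hypothetical helper named scheduleEnvelope with linear ramps only; the option names are illustrative:

```javascript
// Hypothetical helper: fade in, hold, then fade out on any AudioParam.
// All durations are in seconds; returns the time at which the envelope ends.
function scheduleEnvelope(param, startTime, { attack, hold, release, peak = 1 }) {
  const attackEnd = startTime + attack;
  const holdEnd = attackEnd + hold;
  const releaseEnd = holdEnd + release;
  param.setValueAtTime(0, startTime);
  param.linearRampToValueAtTime(peak, attackEnd);
  param.setValueAtTime(peak, holdEnd); // anchor so the fade-out starts here
  param.linearRampToValueAtTime(0, releaseEnd);
  return releaseEnd;
}
```

Passing gain.gain with an attack of 0.1, a hold of 1.9, and a release of 1 reproduces the envelope shown earlier, and the return value can be handed to osc.stop().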
The audio graph
Audio flows through a graph of connected nodes. A practical setup combines an oscillator, a gain node for volume control, and an analyser for visual feedback. The connect() calls define the signal path, and data flows from left to right through each node in the chain:
const osc = audioCtx.createOscillator();
const gain = audioCtx.createGain();
const analyser = audioCtx.createAnalyser();
osc.connect(gain);
gain.connect(analyser);
analyser.connect(audioCtx.destination);
osc.start();
Multiple oscillators can feed into a single gain. A mixer is just several sources connected to one gain node. This is how you layer synth voices or blend a backing track with live input before the signal reaches the speakers.
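As an illustration of the mixer idea, the sketch below sums several oscillators into one gain node. The helper name createChord and the choice to divide the level equally between voices are assumptions made for this example:

```javascript
// Hypothetical helper: several oscillators summed into one shared GainNode.
function createChord(ctx, frequencies) {
  const mix = ctx.createGain();
  // Inputs connected to the same node are summed, so scale the level down
  // to leave headroom (assumption: equal level for every voice).
  mix.gain.value = 1 / frequencies.length;
  const oscillators = frequencies.map((hz) => {
    const osc = ctx.createOscillator();
    osc.frequency.value = hz;
    osc.connect(mix);
    return osc;
  });
  mix.connect(ctx.destination);
  return { oscillators, mix };
}
```

The caller still has to start each oscillator, for example with oscillators.forEach((osc) => osc.start()), and the shared mix node then acts as a master volume for the whole chord.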
AnalyserNode for visualisations
The AnalyserNode captures frequency and time-domain data for visualisations. It does not modify audio; it just exposes it for analysis. Place it anywhere in the graph to inspect the signal at that point. The fftSize sets the FFT window size, and frequencyBinCount is always half of it, so an fftSize of 2048 gives 1024 frequency bins. getByteFrequencyData fills a Uint8Array with one amplitude value from 0 to 255 per bin:
const analyser = audioCtx.createAnalyser();
analyser.fftSize = 2048;
const bufferLength = analyser.frequencyBinCount;
const dataArray = new Uint8Array(bufferLength);
function draw() {
  analyser.getByteFrequencyData(dataArray);
  // draw the frequency data array on a canvas
  requestAnimationFrame(draw);
}
osc.connect(analyser);
osc.start();
draw();
getByteFrequencyData() fills the array with values from 0 to 255 for each frequency bin. getByteTimeDomainData() gives you waveform data instead, useful for drawing oscilloscope-style visualisations of the raw audio signal.
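To label a frequency display, it helps to know which frequency each bin represents. Bins are spaced evenly from 0 Hz up to half the sample rate, so the mapping is simple arithmetic. The two function names below are hypothetical and the result is an approximation of each bin's frequency:

```javascript
// Approximate frequency in hertz of a given bin. Bins are spaced
// sampleRate / fftSize apart, and there are fftSize / 2 of them.
function binToFrequency(binIndex, sampleRate, fftSize) {
  return (binIndex * sampleRate) / fftSize;
}

// Index of the loudest bin in an array filled by getByteFrequencyData().
function findPeakBin(dataArray) {
  let peak = 0;
  for (let i = 1; i < dataArray.length; i++) {
    if (dataArray[i] > dataArray[peak]) peak = i;
  }
  return peak;
}
```

Inside draw(), binToFrequency(findPeakBin(dataArray), audioCtx.sampleRate, analyser.fftSize) gives a rough estimate of the dominant frequency in the current frame.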
AudioWorklet for custom processing
The AudioWorklet replaces the deprecated ScriptProcessorNode. It runs your processing code on the audio rendering thread instead of the main thread, so a busy UI cannot starve the audio and cause glitches. The worklet code lives in its own file and defines a processor class with a process() method that the audio thread calls for each block of samples:
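// worklet-processor.js
class NoiseGateProcessor extends AudioWorkletProcessor {
  // Declare the custom AudioParam; without this, parameters.threshold is undefined
  static get parameterDescriptors() {
    return [{ name: "threshold", defaultValue: 0, minValue: 0, automationRate: "k-rate" }];
  }

  process(inputs, outputs, parameters) {
    const input = inputs[0];
    const output = outputs[0];
    // k-rate parameter: a single value per render block
    const threshold = parameters.threshold[0];
    for (let channel = 0; channel < input.length; channel++) {
      const inputChannel = input[channel];
      const outputChannel = output[channel];
      for (let i = 0; i < inputChannel.length; i++) {
        // Pass samples louder than the threshold, silence the rest
        outputChannel[i] = Math.abs(inputChannel[i]) > threshold
          ? inputChannel[i]
          : 0;
      }
    }
    return true; // keep the processor alive
  }
}
registerProcessor("noise-gate", NoiseGateProcessor);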
After writing the processor, load it into the audio context with addModule(). Then create an AudioWorkletNode referencing the registered processor name and connect it into the graph just like any other node. The processor runs on the audio rendering thread, so the main thread stays free for UI updates while the worklet handles real-time audio manipulation:
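await audioCtx.audioWorklet.addModule("worklet-processor.js");
const noiseGate = new AudioWorkletNode(audioCtx, "noise-gate");
// mic is a live-input source; getUserMedia prompts the user for microphone access
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const mic = audioCtx.createMediaStreamSource(stream);
mic.connect(noiseGate);
noiseGate.connect(audioCtx.destination);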
The process() method is called by the audio thread, so keep it fast and avoid allocating memory in it. Every allocation inside process() adds pressure to the garbage collector, which can cause audible dropouts on the real-time thread.
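One allocation-free approach is to keep the per-sample logic in a plain function that writes into the output array it is given. The sketch below assumes a hypothetical helper named applyGate that could sit at the top of the worklet file; it mirrors the gate logic of the processor rather than replacing it:

```javascript
// Hypothetical helper: writes the gated signal into `output` in place.
// Both arguments are Float32Array channels of the same length, as process()
// receives them, so no new arrays are created per block.
function applyGate(input, output, threshold) {
  for (let i = 0; i < input.length; i++) {
    const sample = input[i];
    output[i] = Math.abs(sample) > threshold ? sample : 0;
  }
}
```

Inside process(), the inner loop then becomes a single call such as applyGate(input[channel], output[channel], threshold), and the function can be exercised outside the audio thread with ordinary typed arrays.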
Loading audio files
For samples, sound effects, and short clips, download the whole file with fetch and decode it with decodeAudioData. The function below resolves to a decoded AudioBuffer you can assign to a buffer source node. Because decodeAudioData needs the complete file and keeps the decoded audio uncompressed in memory, this approach suits short assets:
async function loadAudio(url) {
  const response = await fetch(url);
  const arrayBuffer = await response.arrayBuffer();
  return audioCtx.decodeAudioData(arrayBuffer);
}
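When the same sample is triggered many times, it can be worth decoding it only once. The following sketch caches the decoding promise per URL. The helper name createBufferCache and the injectable fetchFn parameter are assumptions introduced for this example:

```javascript
// Hypothetical helper: caches decoded buffers by URL so each file is fetched
// and decoded once. `fetchFn` defaults to the global fetch and is injectable
// only to make the sketch easy to test.
// Note: a failed request stays cached here; real code may want to evict it.
function createBufferCache(ctx, fetchFn = fetch) {
  const cache = new Map(); // url -> Promise<AudioBuffer>
  return function load(url) {
    if (!cache.has(url)) {
      const pending = fetchFn(url)
        .then((response) => response.arrayBuffer())
        .then((bytes) => ctx.decodeAudioData(bytes));
      cache.set(url, pending);
    }
    return cache.get(url);
  };
}
```

After const load = createBufferCache(audioCtx), every await load(url) for the same URL resolves to the same AudioBuffer, even if several calls arrive before the first download has finished.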
For very long files like music tracks or podcasts, loading the entire file into memory is wasteful. Instead, use a MediaElementAudioSourceNode to route an existing <audio> element through the Web Audio graph. This streams the audio progressively and applies effects in real time without buffering the whole file:
const audioElement = document.querySelector("audio");
const source = audioCtx.createMediaElementSource(audioElement);
source.connect(audioCtx.destination);
This routes the <audio> element through the Web Audio API graph so you can process it with effects. The <audio> element handles buffering and playback control while the API graph applies filters, gain, and analysis on top.
BiquadFilterNode for EQ and effects
BiquadFilterNode provides common filter types for shaping the frequency content of a signal. Choose the filter type that matches your goal: lowpass rolls off high frequencies for a warmer sound, highpass removes rumble, and bandpass isolates a narrow range. The frequency and Q parameters control the cutoff point and resonance:
const filter = audioCtx.createBiquadFilter();
filter.type = "lowpass"; // lowpass, highpass, bandpass, peaking, etc.
filter.frequency.value = 1000; // cutoff frequency in Hz
filter.Q.value = 1; // resonance
osc.connect(filter);
filter.connect(audioCtx.destination);
For a simple low-pass filter smoothing out harsh frequencies, this is the node to reach for. Higher Q values create a resonant peak near the cutoff, which can add character but also risks harsh ringing if pushed too far.
Stereo panning
StereoPannerNode pans a signal left or right in the stereo field. The pan value ranges from -1 (full left) through 0 (centre) to 1 (full right). Connect the panner between your source and destination, and it handles the stereo positioning automatically:
const panner = audioCtx.createStereoPanner();
panner.pan.value = -1; // full left; 0 is centre, 1 is full right
osc.connect(panner);
panner.connect(audioCtx.destination);
Combine a panner with a gain node to create spatialised audio where sounds move across the stereo field while fading in and out. This pattern is common in game audio engines where each sound source needs independent position and volume control.
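A pass across the stereo field with a fade in and out might be scheduled as follows. This is a sketch assuming a hypothetical helper named schedulePass and a chain of source, gain node, panner, and destination; the linear ramps are an arbitrary choice:

```javascript
// Hypothetical helper: moves a sound from left to right while it fades in
// and back out. `panner` is a StereoPannerNode and `gain` a GainNode.
function schedulePass(panner, gain, startTime, duration) {
  const middle = startTime + duration / 2;
  const end = startTime + duration;
  panner.pan.setValueAtTime(-1, startTime);
  panner.pan.linearRampToValueAtTime(1, end);
  gain.gain.setValueAtTime(0, startTime);
  gain.gain.linearRampToValueAtTime(1, middle); // loudest at the centre
  gain.gain.linearRampToValueAtTime(0, end);
  return end;
}
```

In a game engine, each sound source would own its own panner and gain pair, so one call per source keeps position and volume independent.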
ConvolverNode for reverb
ConvolverNode applies convolution reverb, which makes it a good fit for room simulation. It takes an impulse response buffer, a recording of how a space responds to a very short sound, and convolves it with your input signal:
const convolver = audioCtx.createConvolver();
convolver.buffer = impulseResponseBuffer; // load an IR file
source.connect(convolver);
convolver.connect(audioCtx.destination);
The length of the impulse response shapes the result. Shorter buffers give tight room reverb; longer ones give large hall sounds. You can find free impulse response libraries online for cathedrals, studios, and outdoor spaces, or record your own with a hand clap in the target room.
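When no recorded impulse response is at hand, a rough stand-in can be generated in code. The sketch below fills a buffer with decaying white noise; this is a swapped-in approximation rather than a model of any real room, and the helper name createImpulseResponse and its default values are assumptions:

```javascript
// Builds a synthetic stereo impulse response: white noise with a decaying
// envelope. This is a stand-in for a recorded IR, not a model of a real room.
function createImpulseResponse(ctx, seconds = 2, decay = 3) {
  const length = Math.floor(ctx.sampleRate * seconds);
  const buffer = ctx.createBuffer(2, length, ctx.sampleRate);
  for (let channel = 0; channel < buffer.numberOfChannels; channel++) {
    const data = buffer.getChannelData(channel);
    for (let i = 0; i < length; i++) {
      const envelope = Math.pow(1 - i / length, decay); // 1 at the start, near 0 at the end
      data[i] = (Math.random() * 2 - 1) * envelope;
    }
  }
  return buffer;
}
```

Assigning the result to convolver.buffer gives a usable reverb tail: a larger seconds value lengthens the tail, and a larger decay value makes it die away faster.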
Gotchas
Audio context state. Browsers suspend the audio context automatically when no interaction has happened. If your audio does not play, check audioCtx.state and call resume().
Garbage collecting nodes. When you disconnect a node, it may be garbage collected if nothing else references it. Keep references to active nodes.
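Sample rate. audioCtx.sampleRate gives you the context's sample rate, which defaults to the output device's preferred rate unless you pass a sampleRate option to the constructor. All nodes in a context operate at this rate, and nodes that belong to different contexts cannot be connected to each other.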
Mobile autoplay. On iOS Safari, audio playback requires a user gesture. Call audioCtx.resume() inside a click or touch handler before any sound plays.
Build the signal chain first
Start by sketching the flow of audio before you write code. A source, a gain stage, a filter, and a destination are often enough to map the whole effect. When the chain is clear, it is much easier to decide where to insert analysis or modulation nodes later.
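Once the chain is sketched, wiring it up can be reduced to one call. The helper below is hypothetical (the name connectChain is not part of the API) and relies on connect() returning the node that was passed to it:

```javascript
// Hypothetical helper: connects nodes in the order given and returns the
// last one. Relies on connect() returning the node it was connected to.
function connectChain(...nodes) {
  return nodes.reduce((previous, next) => previous.connect(next));
}
```

A call such as connectChain(osc, filter, gain, audioCtx.destination) reads in the same order as the sketch, which makes it easy to see where an analyser or another effect would slot in. The function expects at least one node.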
Schedule changes ahead of time
Level and pitch changes sound cleaner when they are scheduled on an AudioParam rather than assigned at the instant they should happen. An abrupt jump in gain creates a discontinuity in the waveform, which the ear hears as a click; a short ramp removes it and gives the browser time to apply the change cleanly. This is especially important for instruments, envelopes, and fade transitions where the ear notices sudden jumps.
Mix live input with playback
A microphone, a file, and a synthesiser can all live in the same graph. That makes the API useful for more than playback: you can process live speech, blend it with a backing track, or feed it through effects before it reaches the speakers. The graph model keeps those combinations understandable.
Pick the simplest node that works
Not every effect needs a custom processor. Many tasks are already covered by built-in nodes like gain, filter, panner, and analyser. Start with the smallest node that matches the sound you want. That keeps the graph easier to tune, easier to debug, and easier to explain to the next person who reads the code.
Plan the mix first
Before you connect nodes, decide where the signal should go and what each stage should change. A clear plan helps you avoid extra nodes and makes it easier to reason about gain, filter, and analyser roles. The graph can grow later, but the first pass should stay simple.
Keep real-time work small
The audio thread needs predictable code. Prefer short processing paths, avoid allocations inside hot loops, and move expensive setup outside the callback. That keeps playback steady and reduces the chance of glitches when the browser is under load.
Keep audio graphs predictable
A predictable graph is easier to tune and easier to extend. When each node has one role, it becomes clear where to adjust gain, where to shape tone, and where to inspect the result. That helps you debug sound issues without having to untangle a huge chain first.
Treat changes as scheduled events
Audio changes sound better when they are planned ahead. Scheduling a fade or filter move gives the browser time to apply the change without a click or pop. That same habit also makes effects easier to repeat, which is useful for instruments, triggers, and timed transitions.
Leave room for tuning
A useful audio graph is one you can still adjust. If every node is packed tightly into one hard-coded chain, later changes become awkward. Leave some space in the design for another filter, a different source, or a cleaner fade path. That flexibility pays off quickly when the sound direction changes.
See Also
- /guides/javascript-webgl-basics/: WebGL for 3D graphics; combine with Web Audio for game audio
- /guides/javascript-streams-api/: the Streams API works alongside Web Audio for chunked audio processing
- /guides/javascript-canvas-api/: visualise audio data by drawing analyser output on canvas
- /guides/javascript-web-workers/: Web Workers run code off the main thread, similar to how AudioWorklet keeps audio processing on its own thread