Making a boombap beat with Dilla and the Web Audio API

An introduction to composing music with code, using the browser as the sequencer runtime and JavaScript to schedule notes with precise rhythmic timing.

The Web Audio API (WAA) provides low-level building blocks for wiring up simple or advanced audio graphs, but offers no help when it comes to rhythm-based playback of notes or automation.

Enter Dilla, a small WAA scheduling library based on 96 ticks per beat, where note positions are represented as bar.beat.tick. If you have ever used an MPC or similar sequencer, the rhythm and position signatures will feel very familiar.
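To make the bar.beat.tick notation concrete, here is a minimal sketch that converts a position string into a tick count from the start of the loop. It is not part of Dilla: the helper name positionToTicks is made up for illustration, and it assumes 1-based bars, beats and ticks (matching positions like 1.1.01 used later in this guide) and 4 beats per bar unless told otherwise.

```javascript
var TICKS_PER_BEAT = 96;

// Convert "bar.beat.tick" into the number of ticks elapsed since "1.1.01".
// Hypothetical helper, not part of Dilla. Assumes 1-based bar, beat and tick.
function positionToTicks (position, beatsPerBar) {
  beatsPerBar = beatsPerBar || 4;
  var parts = position.split('.').map(Number);
  var bar = parts[0];
  var beat = parts[1];
  var tick = parts[2];
  var beatsElapsed = (bar - 1) * beatsPerBar + (beat - 1);
  return beatsElapsed * TICKS_PER_BEAT + (tick - 1);
}

console.log(positionToTicks('1.1.01')); // 0
console.log(positionToTicks('1.2.01')); // 96
console.log(positionToTicks('2.1.49')); // 432
```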

In this guide we'll use WAA and Dilla together to program a minimalist boombap beat, covering the basics of using an HTML document as a beatmaking environment: loading and playing sounds, precise rhythmic scheduling, simple mixing with reverb and compressor, and parameter automation.

In case you want to skip ahead, you can find the complete code and working demo on GitHub.

Setting up the environment

For brevity, this tutorial assumes that you are familiar with Node.js and npm. Before we start programming the beat, we'll need a build step to bundle our JavaScript. Dilla uses the CommonJS module format, and you should use browserify to concatenate your script and its dependencies into one file.

Next up, you'll need to run a simple static server. Just opening the HTML file with file:// won't work because of issues with CORS when fetching the sound files via XHR. I'd go with 3w or asimov-static.

If any of this is new or confusing to you, check out this example.

Digging up some sounds

Next, you'll need to dig up or download some sounds. These are lifted from my crates, provided here for educational purposes.

Then we need to install Dilla, by typing npm install --save dilla in the terminal.

Okay, with some samples selected and Dilla installed, we're finally ready to start coding and composing.

Initializing Dilla

The Dilla module exports a constructor function, to which we can pass tempo, beats per bar and bars per loop as options. For a boombap beat, the default 4 beats per bar and 2 bars per loop works fine, so we'll just change the tempo to 88 bpm.

var Dilla = require('dilla');
var Context = window.AudioContext || window.webkitAudioContext;
var audioContext = new Context();
var dilla = new Dilla(audioContext, {
  'tempo': 88
});
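As a rough sanity check on these settings, the sketch below works out how long one tick and one full loop last at a given tempo. The function names are illustrative and not part of Dilla's API; the figure of 96 ticks per beat comes from the introduction.

```javascript
var TICKS_PER_BEAT = 96;

// Seconds per tick at a given tempo in beats per minute.
function secondsPerTick (tempo) {
  return 60 / tempo / TICKS_PER_BEAT;
}

// Length of the whole loop in seconds.
function loopSeconds (tempo, beatsPerBar, barsPerLoop) {
  return (60 / tempo) * beatsPerBar * barsPerLoop;
}

console.log(secondsPerTick(88).toFixed(4));    // "0.0071"
console.log(loopSeconds(88, 4, 2).toFixed(2)); // "5.45"
```

At 88 bpm a tick is roughly 7 milliseconds, which gives a sense of how fine-grained the note positions in the rest of this guide are.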

Loading sounds

We'll need an asynchronous function that can load a sound file over XHR, decode it and save a reference for later use.

// Will store our decoded sound buffers
var sounds = {};

// Load a sound file and decode its data
function loadSound (name, done) {
  var request = new XMLHttpRequest();
  request.open('GET', 'sounds/' + name + '.wav', true);
  // Make sure to set the response type to "arraybuffer" (all lowercase)
  request.responseType = 'arraybuffer';
  request.onload = function soundWasLoaded () {
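    audioContext.decodeAudioData(request.response, function (buffer) {
      sounds[name] = buffer;
      done();
    }, function (error) {
      // Log decoding failures instead of failing silently
      console.error('Could not decode sound "' + name + '"', error);
    });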
  };
  request.send();
}

Create an array with the names of the sounds we want to load. For now, we'll start with the drum sounds and add the rest later.

var soundNames = [
  'kick', 'snare', 'hihat'
];

Then add a recursive function that gets the name of the next sound to load, and when all sounds are decoded and ready, starts the playback.

function loadNextSound () {
  var soundName = soundNames.shift();
  if (!soundName) return dilla.start();
  loadSound(soundName, loadNextSound);
}

Now, to get things going, we call loadNextSound once. It loads the sounds one by one, and starts the clock that runs Dilla when the array of names is empty.

loadNextSound();
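Loading one file at a time is simple, but serial. As an alternative sketch, the helper below starts all loads at once and calls back when the last one finishes. The name loadAllSounds is made up; it takes the loader as an argument, so it should work with the loadSound function above, and is demonstrated here with a stand-in loader so the snippet runs on its own.

```javascript
// Hypothetical helper: start every load at once, call done() after the last one.
// loadOne must have the same shape as loadSound(name, callback).
function loadAllSounds (names, loadOne, done) {
  var remaining = names.length;
  if (remaining === 0) return done();
  names.forEach(function (name) {
    loadOne(name, function () {
      remaining -= 1;
      if (remaining === 0) done();
    });
  });
}

// Stand-in loader for demonstration; in the page you would pass loadSound.
var loaded = [];
function fakeLoad (name, callback) {
  loaded.push(name);
  callback();
}

var finished = 0;
loadAllSounds(['kick', 'snare', 'hihat'], fakeLoad, function () {
  finished += 1;
});
console.log(loaded.length, finished); // 3 1
```

In the page, this would be called as loadAllSounds(soundNames, loadSound, function () { dilla.start(); }). Since the helper counts the names at call time, it would have to run after every sound name has been added to the array.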

Programming drums

To compose the beat, we'll use dilla.set(id, notes) to define an array of notes for the channel with id "kick".

A note is an array with two values: the start position, which is required, and an object with any other parameters, all optional. Below we also define gain, but only for some notes, because the function that plays the sound buffer will default it to 1.

If we define a duration attribute, the length of the note in ticks, we will later get step events both for starting and stopping the sound. But if undefined, the note is considered to be a oneshot, and we'll always play the whole buffer, which is perfect for our short drum samples.

dilla.set('kick', [
  ['1.1.01'],
  ['1.1.51', { 'gain': 0.8 }],
  ['1.2.88'],
  ['1.3.75'],
  ['1.4.72', { 'gain': 0.8 }],
  ['2.1.51', { 'gain': 0.7 }],
  ['2.3.51', { 'gain': 0.8 }],
  ['2.3.88']
]);

Don't worry too much about the exact timing of the notes above - this is my personal preference for drums, with somewhat lazy kicks, and as you'll see below, mostly eager snares. But as with all things creative, anything goes, and everyone has their own taste.

dilla.set('snare', [
  ['1.1.91'],
  ['1.3.91'],
  ['2.1.91'],
  ['2.4.03']
]);
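To put numbers on "lazy" and "eager", the sketch below measures how far a position sits from the nearest grid line, in ticks. Positive means late, negative means early. The helper is hypothetical, and using the eighth-note grid (48 ticks) as the reference is an assumption made here for illustration, not something Dilla defines.

```javascript
var TICKS_PER_BEAT = 96;

// Offset in ticks from the nearest grid line. Positive = late, negative = early.
// Hypothetical helper; assumes 1-based ticks, so tick 01 is exactly on the beat.
function offsetFromGrid (position, gridTicks) {
  gridTicks = gridTicks || TICKS_PER_BEAT / 2; // eighth notes by default
  var tick = Number(position.split('.')[2]) - 1;
  var nearest = Math.round(tick / gridTicks) * gridTicks;
  return tick - nearest;
}

console.log(offsetFromGrid('1.1.51')); // 2, slightly late
console.log(offsetFromGrid('1.1.91')); // -6, slightly early
console.log(offsetFromGrid('2.4.03')); // 2
```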

Next up is the hihat, and here we'll do something different. The loop should be the same for both bars, and to avoid re-typing we can use expressions instead of flat positions.

dilla.set('hihat', [
  ['*.1.01', { 'gain': 0.7 }],
  ['*.2.01', { 'gain': 0.8 }],
  ['*.3.01', { 'gain': 0.7 }],
  ['*.4.01', { 'gain': 0.8 }],
  ['*.4.53', { 'gain': 0.6 }]
]);
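Conceptually, the * in the bar slot stands for "every bar in the loop". The sketch below shows how such an expression could be expanded by hand. It only illustrates the idea and is not how Dilla resolves expressions internally; expandBars is a made-up name.

```javascript
// Expand a "*" bar wildcard into one concrete position per bar.
// Illustration only; Dilla resolves expressions on its own.
function expandBars (position, barsPerLoop) {
  var parts = position.split('.');
  if (parts[0] !== '*') return [position];
  var positions = [];
  for (var bar = 1; bar <= barsPerLoop; bar++) {
    positions.push([bar, parts[1], parts[2]].join('.'));
  }
  return positions;
}

console.log(expandBars('*.4.53', 2)); // [ '1.4.53', '2.4.53' ]
console.log(expandBars('1.1.01', 2)); // [ '1.1.01' ]
```

With the default 2 bars per loop, the five hihat expressions above therefore describe ten hits.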

Playing sounds

So, we've got a drum track with some notes, and playback of the loop starts when all sounds are loaded, but we're still not hearing anything. It's time to actually start playing those sound buffers we decoded before.

Shortly before a note begins or ends, an event called step is triggered. It contains an offset time, relative to the audio context's current time, so we can start or stop specific sounds. It also contains any note parameters we may have defined before, in step.args.

We'll implement a simple function to handle starting the oneshot drum sound, via a gain node to control the volume. Since we haven't defined duration for our notes so far, we'll only get step events to start playback, so we can leave stopping sounds for later.

dilla.on('step', function onStep (step) {
  var source = audioContext.createBufferSource();
  source.buffer = sounds[step.id];
  var gainNode = source.gainNode = audioContext.createGain();
  var gainVolume = step.args.gain || 1;
  source.connect(gainNode);
  gainNode.connect(audioContext.destination);
  gainNode.gain.value = gainVolume;
  source.start(step.time);
});

With this, our beat finally makes some noise. At this stage, you should listen and tweak the note positions and gain value, until your head can't stop nodding to the rhythm.
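One subtlety in the handler above: step.args.gain || 1 falls back to 1 for any falsy value, so a note with gain 0 would play at full volume. If silent notes should be allowed, a stricter default could look like this sketch (the helper name is made up):

```javascript
// Return the note's gain, defaulting to 1 only when gain is missing or not a number.
function resolveGain (args) {
  var gain = args && args.gain;
  return typeof gain === 'number' && !isNaN(gain) ? gain : 1;
}

console.log(resolveGain({ gain: 0.8 })); // 0.8
console.log(resolveGain({ gain: 0 }));   // 0
console.log(resolveGain({}));            // 1
console.log(resolveGain(undefined));     // 1
```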

Effects on master out

Before we get started with the melody samples, we should add two simple effects on the master out channel: a high-pass reverb, also known as exciter, and a compressor. The parameters for these are completely arbitrary (my preferences), and you'd do well to use your ears while tweaking these numbers until it sounds good to you.

var compressor = audioContext.createDynamicsCompressor();
compressor.threshold.value = -15;
compressor.knee.value = 30;
compressor.ratio.value = 12;
// Note: compressor.reduction is read-only (it reports the current gain reduction), so we don't set it
compressor.attack.value = 0;
compressor.release.value = 0.25;
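To get a feel for what threshold and ratio do, here is a simplified static curve: below the threshold the level passes through unchanged, and above it every "ratio" dB of input yields 1 dB of output. This sketch deliberately ignores the knee, attack, release and any makeup gain, so it is not the exact curve the browser's compressor applies.

```javascript
// Simplified hard-knee compressor curve, levels in dB.
// Ignores knee, attack, release and makeup gain; for intuition only.
function compressedLevel (inputDb, thresholdDb, ratio) {
  if (inputDb <= thresholdDb) return inputDb;
  return thresholdDb + (inputDb - thresholdDb) / ratio;
}

console.log(compressedLevel(-20, -15, 12)); // -20 (below threshold, unchanged)
console.log(compressedLevel(-3, -15, 12));  // -14
```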

We get a dynamics compressor out of the box from WAA's audio context, but a reverb takes a bit more code to set up. To keep it simple, we'll use soundbank-reverb by Matt McKegg, a talented JS hacker who has written many useful WAA modules.

var Reverb = require('soundbank-reverb');
var reverb = Reverb(audioContext);
reverb.time = 1;
reverb.wet.value = 0.1;
reverb.dry.value = 1;
reverb.filterType = 'highpass';
reverb.cutoff.value = 1000;

Next, we need to hook our effects into the audio graph.

compressor.connect(reverb);
reverb.connect(audioContext.destination);

Finally, we need to change one line in our onStep function, to connect the gain node to the compressor.

gainNode.connect(compressor);
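With more effects, wiring nodes one pair at a time gets repetitive. A small helper can connect a list of nodes in series. This is a sketch with a made-up name; it only relies on each node having a connect method, which lets the snippet run outside the browser with stand-in objects.

```javascript
// Connect each node to the next one in the list. Hypothetical helper.
function connectChain (nodes) {
  for (var i = 0; i < nodes.length - 1; i++) {
    nodes[i].connect(nodes[i + 1]);
  }
  return nodes[nodes.length - 1];
}

// Stand-in nodes so the example runs without an audio context.
function fakeNode (name) {
  return {
    name: name,
    connectedTo: null,
    connect: function (target) { this.connectedTo = target.name; }
  };
}

var chain = [fakeNode('compressor'), fakeNode('reverb'), fakeNode('destination')];
connectChain(chain);
console.log(chain[0].connectedTo, chain[1].connectedTo); // reverb destination
```

In the page, the equivalent call would be connectChain([compressor, reverb, audioContext.destination]).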

Adding the "plong" samples

It's time to start building the melody, and first up are our "plong" samples. We'll need to add them to the list of samples to load.

soundNames.push('plong1', 'plong2');

We'll make the programming very simple, and again, go ahead and adjust this with your own flavor and style.

dilla.set('plong1', [
  ['1.1.01', { 'duration': 95 }]
]);
dilla.set('plong2', [
  ['1.4.90', { 'duration': 60, 'gain': 0.4 }],
  ['2.1.52', { 'duration': 60, 'gain': 0.7 }]
]);

Notice that we've defined duration, measured in ticks, for each note. In other words, these are not oneshots, and because of this we'll need to update the onStep function to also handle stop step events.

We're also going to add a very simplistic attack/release strategy, to avoid ugly cracks and pops when starting and stopping these sounds.

Let's change the onStep function to call onStart or onStop, which we'll implement next.

dilla.on('step', function onStep (step) {
  if (step.event === 'start') onStart(step);
  if (step.event === 'stop') onStop(step);
});

The onStart function will look similar to our previous onStep, but if the note has duration defined, we apply a very quick fade in. We'll also save a reference to the buffer source we create, so that we can stop it in the onStop callback.

function onStart (step) {
  var source = audioContext.createBufferSource();
  source.buffer = sounds[step.id];
  var gainNode = source.gainNode = audioContext.createGain();
  var gainVolume = step.args.gain || 1;
  if (step.args.duration) {
    gainNode.gain.setValueAtTime(0, step.time);
    gainNode.gain.linearRampToValueAtTime(gainVolume, step.time + 0.01);
  } else {
    gainNode.gain.value = gainVolume;
  }
  source.connect(gainNode);
  gainNode.connect(compressor);
  source.start(step.time);
  step.args.source = source;
}

In the stop handler, quickly fade out the sound. Listen and tweak the attack and release speed as you see fit.

function onStop (step) {
  var source = step.args.source;
  var gainVolume = step.args.gain || 1;
  source.gainNode.gain.setValueAtTime(gainVolume, step.time);
  source.gainNode.gain.linearRampToValueAtTime(0, step.time + 0.01);
  // Stop the buffer source once the fade out has finished
  source.stop(step.time + 0.01);
}
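The fade in and the fade out share the same two calls, so they could be folded into one helper. The sketch below uses a made-up name, scheduleFade, and only calls setValueAtTime and linearRampToValueAtTime, both standard AudioParam methods. A recording stand-in replaces the real gain parameter so the snippet runs on its own.

```javascript
// Ramp an AudioParam linearly from one value to another. Hypothetical helper.
function scheduleFade (param, from, to, startTime, fadeSeconds) {
  param.setValueAtTime(from, startTime);
  param.linearRampToValueAtTime(to, startTime + fadeSeconds);
}

// Stand-in for gainNode.gain that records the scheduled automation.
var calls = [];
var fakeParam = {
  setValueAtTime: function (value, time) { calls.push(['set', value, time]); },
  linearRampToValueAtTime: function (value, time) { calls.push(['ramp', value, time]); }
};

scheduleFade(fakeParam, 0, 0.7, 2, 0.5); // fade in
scheduleFade(fakeParam, 0.7, 0, 3, 0.5); // fade out
console.log(calls.length); // 4
```

In onStart this would read scheduleFade(gainNode.gain, 0, gainVolume, step.time, 0.01), with the arguments mirrored for the fade out in onStop.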

Continuing with strings

The next piece of the melody is a few string samples, arranged with different gain values to create a delay-like pattern. The start and stop step callbacks can already handle this, so we just make sure the sounds are also loaded and then register the notes.

soundNames.push('string1', 'string2', 'string3');
dilla.set('string1', [
  ['1.3.75', { 'duration': 90, 'gain': 0.6 }],
  ['1.4.52', { 'duration': 90, 'gain': 0.2 }],
  ['2.3.25', { 'duration': 70, 'gain': 0.6 }],
  ['2.4.01', { 'duration': 85, 'gain': 0.3 }],
  ['2.4.75', { 'duration': 85, 'gain': 0.1 }]
]);
dilla.set('string2', [
  ['2.2.50', { 'duration': 70, 'gain': 0.6 }]
]);
dilla.set('string3', [
  ['1.2.05', { 'duration': 45, 'gain': 0.6 }],
  ['1.2.51', { 'duration': 45, 'gain': 0.4 }],
  ['1.3.05', { 'duration': 45, 'gain': 0.2 }],
  ['1.3.51', { 'duration': 45, 'gain': 0.05 }],
  ['2.2.05', { 'duration': 45, 'gain': 0.6 }]
]);

Balance with bass

Finally, we want to add a counterpart for all that plong and string treble, with a deep bass rhythm.

Instead of loading one sound for each note, we'll re-use the same sample by changing the playback rate, i.e. pitch, of the sound buffer.

Add this line to the onStart function, before the call to source.start:

source.playbackRate.value = step.args.rate || 1;

Then we add the sound name so it gets loaded, and register the notes. Props to Raymond May Jr for helping with the bass programming.

soundNames.push('bass');
dilla.set('bass', [
  ['1.1.01', { 'duration': 60, 'gain': 0.8, 'rate': 0.55 }],
  ['1.2.72', { 'duration': 15, 'gain': 0.5, 'rate': 0.55 }],
  ['1.3.02', { 'duration': 40, 'gain': 0.8, 'rate': 0.55 }],
  ['1.4.01', { 'duration': 40, 'gain': 0.6, 'rate': 0.64 }],
  ['1.4.51', { 'duration': 100, 'gain': 0.8, 'rate': 0.74 }],
  ['2.3.51', { 'duration': 60, 'gain': 0.8, 'rate': 0.46 }],
  ['2.4.51', { 'duration': 40, 'gain': 0.8, 'rate': 0.52 }]
]);
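The rate values above are plain multipliers of the original playback speed. If you would rather think in semitones, rate and pitch are related by a factor of 2 per octave, so a helper along these lines could generate the rates. The name is made up, and the mapping assumes equal temperament.

```javascript
// Playback rate for a shift of n semitones (equal temperament).
// 12 semitones up doubles the rate, 12 down halves it.
function semitonesToRate (semitones) {
  return Math.pow(2, semitones / 12);
}

console.log(semitonesToRate(0));              // 1
console.log(semitonesToRate(12));             // 2
console.log(semitonesToRate(-12));            // 0.5
console.log(semitonesToRate(-10).toFixed(2)); // "0.56"
```

A note could then be written with 'rate': semitonesToRate(-10) instead of a hand-tuned decimal.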

Wrapping up

We've made a minimalist boombap beat with some funky drums, a few samples and a bass line, playing in the browser.

There are some obvious improvements that could be made here: control gain or add effects per channel, pan the samples to widen the soundscape, define attack and release per note, program longer loops and variations, or simplify the pitching of bass notes with a helper function. And lots more.

I'd love to read your feedback and hear your improved versions, so fork the repository, open a pull request and tweet at me.