Optimizing filtered collection getters with prefix tree structure

Imagine a JavaScript program that needs to render at 60 frames per second, and between each frame has to filter a huge collection to find the objects to use in the next call, without ever dropping a single frame.

One solution to this problem could be prefix tree indexing, sometimes known as trie structure. Put simply, an ordered data structure where the keys of a nested object is used to define the path to one or several models.

Setting up the premise

Let's say we have a Note model, extending ampersand state:

const Note = State.extend({
  props: {
    bar: 'number',
    beat: 'number',
    tick: 'number'
  }
});

And to keep our notes, we create a collection from ampersand collection:

const Notes = Collection.extend({
  model: Note
});
const notes = new Notes();

Suppose we fill the notes collection with notes spread over different bars, beats and ticks. On each frame, we must then construct a filtered collection containing only the notes that belongs to the current position and hand that off to an audio engine, while we maintain 60 fps framerate.

One way of doing it would be to simply filter through all models, and create a subcollection of the ones that match our truth test:

const relevant = notes.filter(function (note) {
  return note.bar === 1 && note.beat === 2 && note.tick === 49;
});

This works well with a small collection, but when we imagine a real sequencer application, keeping track of hundreds or thousands of notes, the filtering is simply too slow to keep up 60 fps.

So, let's instead index the models and use an optimized getter.

Managing the indices

First, we'll update the collection constructor we just created with an index object and methods to update the index when models are added, removed or changed.

const Notes = Collection.extend({
  model: Note,
  initialize() {
    Collection.prototype.initialize.apply(this, arguments);
    this.index = {};
    this.on('add', this.addIndex);
    this.on('remove', this.removeIndex);
    this.on('change', this.updateIndex);
  }
});

Next, we need to add an addIndex method to the collection:

addIndex(note) {
  const index = this.index;
  const bar = note.bar;
  const beat = note.beat;
  const tick = note.tick;

  const barIndex = index[bar] = index[bar] || {};
  const beatIndex = barIndex[beat] = barIndex[beat] || {};
  const tickIndex = beatIndex[tick] = beatIndex[tick] || [];
  tickIndex.push(note);
}

Similarly, we need to remove the model from the index object when it is removed from the main collection.

removeIndex(note) {
  const index = this.index;
  const bar = note.bar;
  const beat = note.beat;
  const tick = note.tick;

  const tickIndex = index[bar][beat][tick];
  const noteIndex = tickIndex.indexOf(note);
  tickIndex.splice(noteIndex, 1);
}

And when the note model is updated, we simply run both methods to remove and re-add it in its new position in the tree structure.

updateIndex(note) {
  this.removeIndex(note);
  this.addIndex(note);
}

Adding a getter

Now that we have an indexed tree structure, we just need a method to the collection that returns the relevant models.

findByPosition(bar, beat, tick) {
  const index = this.index;
  const barIndex = index[bar] || {};
  const beatIndex = barIndex[beat] || {};
  const tickIndex = beatIndex[tick] || [];
  return tickIndex;
}

So, we can now do this:

const relevant = collection.findByPosition(1, 2, 49);

But let's say we also want a getter for using with an incomplete position, for example only bar and beat. With the trie structure, this is trivial to add. We simply change the findByPosition method to check the arguments and when possible, return early.

findByPosition(bar, beat, tick) {
  const index = this.index;
  const barIndex = index[bar] || {};
  if (!beat) return barIndex;
  const beatIndex = barIndex[beat] || {};
  if (!tick) return beatIndex;
  const tickIndex = beatIndex[tick] || [];
  return tickIndex;
}

We could then get all notes for a beat or even a full bar:

const beatNotes = collection.findByPosition(1, 3);
const ticks = Object.keys(beatNotes);

Summary

This was one possible solution that will scale, performing extremely well also with huge collections of notes, while still allowing a flexible getter.

It's important to remember that the object or array that our method is returning is a reference to a node in the cache, and modifying it would change the cache object directly.

Also, if there is no need to create collections with only part of the full index key, it would be simpler to keep a flat index using bar, beat and tick concatenated into a string as key.

Finding the right structure for your data means first understanding how it is shaped and how you need to access it, and you might not get it right initially. As the saying goes: if at first you don't succeed, trie, trie again.