Modernizing MML for 2022 (3) leveraging audio plugin capabilities

part 1 / part 2 / part 3 / part 4 / part 5

From chips to audio plugins

(There is nothing informative here if you are not interested in history or rationale, so feel free to skip to the next section.)

MML processors have no standardized destination. Their target devices can be PSG chips, FM chips, or MIDI devices, and the output data was not direct I/O instructions for that sound hardware. Only MIDI has a standardized instruction set (MIDI messages). MIDI was therefore appropriate for DAWs when they implemented the "sequencer" part of their workload. Even when we control FM or PSG emulators, we mostly use MIDI, unless we want to play old sequence files or reuse old toolchains from the 20th century for authoring music.

But when it comes to instruments, MIDI has poor expressiveness. The best it can offer is a Program Change (C0h..CFh) message. A program is just a 7-bit integer; even with "bank select" control change messages, it only gains a 14-bit extension. A program might indicate a GM (General MIDI) instrument (or it might not), and even that only indicates an instrument name, not the actual audible result.

Audio plugins go far beyond that. We can play arbitrary samples or sounds synthesized from scratch, with a lot of modifications at play time. The actual instruments are not standardized, but they are commonized at the plugin framework level so that arbitrary hosts (e.g. DAWs) can control arbitrary audio plugins. In exchange, they have to be set up (installed) in the playback environment, which rarely happens anymore. People only share the recorded results, and the open tips and knowledge that used to travel with the data are lost.

Anyhow, people mostly use audio plugins for instruments nowadays. Fortunately, the way audio plugins are used (instantiated and controlled) is commonized per plugin framework. There are also instruments designed for open standards, as well as quality open source synthesizers that run almost everywhere appropriate, so open source enthusiasts (like me) can depend on them too.

What we can control over audio plugins

Typical DAWs assign a "filter chain", "filter graph" or "audio plugin graph" to each track. A filter chain or an audio graph is a graph of audio plugin nodes, i.e. a chain of one or more audio plugin instance descriptors. Usually a simple array of plugin descriptors suffices, unless you want to support complicated filtergraphs.

Each audio plugin framework specifies its own way to control a plugin, but these interfaces are commonly declared:

plugin parameters: each holds a (usually) 32-bit float value (sometimes 64-bit). They can be controlled anywhere in a sequence.
plugin state (preset) binaries: what the plugin loads and saves for its own customized state that is not expressed as parameters. They usually stay fixed throughout a sequence (track).

Plugin parameter information can be retrieved programmatically by plugin hosts. That is, once we have collected the plugin information, we can control the parameters programmatically, provided we can express them in our MIDI sequence. In DAWs this is typically done through MIDI mappings.

On the other hand, plugin preset/state binaries are produced solely by the plugin itself; they are just binary chunks whose internals we have no way to inspect. Since plugin preset/state binaries are commonly used, we have to be able to put them into the final music (sequence) data file. We discuss this later, but for now let's note that we have no control over them.
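To make the two kinds of data concrete, here is a minimal Python sketch of a per-track plugin chain. The class and field names (`PluginNode`, `Track`, `plugin_id`) are my own illustration and do not come from any plugin framework; the point is only that parameters are numbers we can set at any time, while state is an opaque blob.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PluginNode:
    """One plugin instance in a track's chain (hypothetical model)."""
    plugin_id: str                                   # framework-specific identifier
    parameters: Dict[int, float] = field(default_factory=dict)  # index -> value
    state: bytes = b""                               # opaque blob made by the plugin

    def set_parameter(self, index: int, value: float) -> None:
        # Parameters may change anywhere in a sequence.
        self.parameters[index] = float(value)


@dataclass
class Track:
    name: str
    chain: List[PluginNode] = field(default_factory=list)  # simple linear chain


track = Track("Gong", [PluginNode("sampler"), PluginNode("reverb", state=b"\x00\x01")])
track.chain[1].set_parameter(4, 1.0)
```

A plain list is enough here because the chain is linear; a real filtergraph with branching connections would need explicit edges.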

Defining and editing audio plugin graphs

Since we have no control over the state binaries, what can we do about them? Since they are generated solely by the plugins, the best we can do is launch the plugin (UI) and let it generate the binaries for whatever the song author configured. Since state binaries must be independent of the host, we can use any host application that saves the state binaries in a readable form, isolated from the rest of the data that the host saves for itself.

JUCE AudioPluginHost is therefore what I have chosen for my project so far. It can build a DAG of audio plugins with audio in/out channels and MIDI in/out channels as terminals, and it can save those connections as XML. I guess VST3TestHost, Element, Carla etc. could work as well, but as a first step I picked the one I usually deal with.

[Image: JUCE AudioPluginHost]

As an extra step, we have to "scan the installed audio plugins" first in AudioPluginHost, but that is necessary for any audio plugin host. (If you never do that yourself, your DAW runs the scan automatically, which can often make initialization take too long; AudioPluginHost avoids that.) The scan causes another problem later, but I won't explain it this time.

It saves the connection state as a *.filtergraph file, which is just XML whose content looks like:

[Image: filtergraph XML content excerpt]

I have created a bunch of simple example filtergraphs as augene-ng/samples/Banks so that I can easily reuse those settings in my new MML song files. (Ideally I should come up with demo music, but then I would have to become a productive composer whose songs can be committed directly to the git repo...)

Referencing audio plugins from MIDI files

My MML compiler targets SMF, or an SMF-like format for MIDI 2.0, and MIDI has no concept of audio plugins. MIDI 1.0 has 16 channels, and each of them can be assigned a "program" that is set by a program change message, on any track. A program change is just a number, so it definitely cannot store a string identifier. Audio plugins are identified either by a filename or by some identifier string, so if we want to indicate which audio plugins to use, we need something other than Program Changes.

To include audio plugin settings, I use a "project" file that references them along with the MML files. I could have designed the model so that plugins are indicated by an integer (index) and referenced via Program Change messages, but that is not intuitive: I would have to look up the mapping in the project file to figure out which number is for which instrument. Therefore I used string identifiers instead. To express string values in SMF, we have a few choices:

System exclusive messages
Meta events (in SMF specification, not as MIDI messages)

In SMF there is a standard meta event called "INSTRUMENT NAME", which exactly matches my use case, so I use it to indicate the filtergraph.
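For readers who have not looked at SMF bytes before, the following Python sketch shows what an Instrument Name meta event looks like on disk: the status byte FFh, the type byte 04h, a variable-length length field, then the text. This is not the code of my compiler; the function names are made up for illustration, and encoding the text as UTF-8 is my assumption (SMF text events are nominally ASCII). The delta time that precedes every event in a track chunk is left out.

```python
def encode_vlq(value: int) -> bytes:
    """Encode a non-negative integer as an SMF variable-length quantity."""
    if value < 0:
        raise ValueError("a variable-length quantity cannot be negative")
    groups = [value & 0x7F]            # last byte: high bit clear
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: high bit set
        value >>= 7
    return bytes(reversed(groups))


def instrument_name_event(name: str) -> bytes:
    """Build FF 04 <length> <text> (no delta time included)."""
    data = name.encode("utf-8")        # assumption: UTF-8 text payload
    return b"\xFF\x04" + encode_vlq(len(data)) + data


name = "sfizz_vpo3_cello_sec_ks;simple_reverb"
event = instrument_name_event(name)
```

Since the payload is free-form text, nothing stops us from packing more than one identifier into it, which is what the `;` separator in the MML further down relies on.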

Also, I didn't want to identify plugins by file paths to the *.filtergraph files, so I assign an ID to each filtergraph in the project *.augene file. Thus my MML looks like this:

// ---- Violoncellos -----------------------------------------------------------
_ 30 CH30 K12
        TRACKNAME "Violoncellos" INSTRUMENTNAME "sfizz_vpo3_cello_sec_ks;simple_reverb"
        ...

and those IDs are defined in the project *.augene file like:

    <AudioGraph Id="sfizz_vpo3_cello_sec_ks" Source="sfizz-vpo3-cello-sec-ks.filtergraph" />

I could set up one filtergraph that contains all the plugins I want to use in a track, but then the filtergraph file would not be "reusable" in other tracks or songs. Thus I define them separately, and made the MIDI2-toTracktion converter support multiple filtergraphs when parsing INSTRUMENTNAME meta events.
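The lookup itself is simple. Here is a rough Python sketch of how an INSTRUMENTNAME value could be resolved against the AudioGraph entries, assuming the `;` separator seen in the MML above. The root element name `AugeneProject`, the second filtergraph file name, and both function names are placeholders I made up for this example; the actual converter is not written like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical project content; only the AudioGraph element shape is from the post.
PROJECT_XML = """
<AugeneProject>
  <AudioGraph Id="sfizz_vpo3_cello_sec_ks" Source="sfizz-vpo3-cello-sec-ks.filtergraph" />
  <AudioGraph Id="simple_reverb" Source="simple-reverb.filtergraph" />
</AugeneProject>
"""


def load_audio_graphs(project_xml: str) -> dict:
    """Map each AudioGraph Id to its Source file."""
    root = ET.fromstring(project_xml.strip())
    return {e.get("Id"): e.get("Source") for e in root.iter("AudioGraph")}


def resolve_instrument_name(text: str, graphs: dict) -> list:
    """Split 'id1;id2' and return the filtergraph files in chain order."""
    ids = [part.strip() for part in text.split(";") if part.strip()]
    missing = [i for i in ids if i not in graphs]
    if missing:
        raise KeyError(f"unknown audio graph id(s): {missing}")
    return [graphs[i] for i in ids]


graphs = load_audio_graphs(PROJECT_XML)
chain = resolve_instrument_name("sfizz_vpo3_cello_sec_ks;simple_reverb", graphs)
```

Failing loudly on an unknown ID is a deliberate choice in this sketch: a silently missing reverb is much harder to notice than an error at conversion time.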

And I don't really want to specify those plugins every time, so I created a "bank" project file that I just include in my project, and added support for an <include> feature in the *.augene project format.

It feels like a lot of work, and DAWs can handle things in a much simpler way! It feels like setting up console tools to build a complicated application project for complicated platforms (like UWP or Android) without resorting to existing IDEs. But once everything is set up, it gives me more control. And probably more important is "portability".

Automation from MML (1) understanding what should be generated

Assigning audio plugins to each track is just one part of the concern. We should be able to control plugin parameters at any time in the music sequence, and our MML should support it, since it is the MML that defines the sequence. In the end, we want to write MML like this:

// ---- Gong -------------------------------------------------------------------
#macro 23 GONG len:length { n46,$len }
_ 23 CH9 K2
        TRACKNAME "Gong" INSTRUMENTNAME "sfizz_vpo3_percussion_misc;simple_reverb"
        [GONG64]8 SIMPLEREVERB_FREEZE 1 [GONG8 (16 GONG8 (8 GONG8 GONG8 )24]39 [GONG8]3

This instructs the simple_reverb audio plugin to assign 1 to its "FREEZE" parameter (kind of; I'm not being very precise here, but you get the general point).

How are plugin parameters controlled in the Tracktion Waveform DAW? You first create an "automation track" and assign a parameter to control. Then you can draw the value changes there. When it is saved in the *.tracktionedit file, the data is split between an <AUTOMATIONTRACK> element in the <TRACK> element and an <AUTOMATIONCURVE> element in the <PLUGIN> element in the <TRACK> element, which may feel weird and complicated, but I will skip the explanation for now. What is important here is that we now know we can generate them.

    <AUTOMATIONTRACK currentAutoParamPluginID="-434581932" currentAutoParamTag="4" id="-815843367">
      <MACROPARAMETERS id="558146789" />
      <MODIFIERS />
    </AUTOMATIONTRACK>
    ...
    <PLUGIN uid="52b9494c" name="sfizz_vpo3_percussion_misc" id="-434581932" enabled="0" programNum="0" volume="0.0">
      <AUTOMATIONCURVE paramID="4">
        <POINT t="0.5" v="1.0" c="0.0" />
        <POINT t="200.0" v="0.0" c="0.0" />
      </AUTOMATIONCURVE>
    </PLUGIN>

The plugin is identified by the uid attribute (the name attribute is what I assigned, so it is not for Tracktion Engine internals), and the target parameter is identified by the paramID attribute. We don't know which parameter that is, but we know what we should generate.
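Generating that fragment is mostly a matter of writing XML. Below is a minimal Python sketch that appends an AUTOMATIONCURVE to a PLUGIN element, mirroring only the attributes visible in the excerpt above (paramID, and t, v, c on each POINT). The helper name `add_automation_curve` is hypothetical, the c attribute is simply fixed at "0.0" as in the excerpt, and this is not how my converter is actually implemented.

```python
import xml.etree.ElementTree as ET


def add_automation_curve(plugin: ET.Element, param_id: int, points) -> ET.Element:
    """Append <AUTOMATIONCURVE> with one <POINT> per (time, value) pair."""
    curve = ET.SubElement(plugin, "AUTOMATIONCURVE", paramID=str(param_id))
    for t, v in sorted(points):        # keep points ordered by time
        ET.SubElement(curve, "POINT", t=str(float(t)), v=str(float(v)), c="0.0")
    return curve


plugin = ET.Element("PLUGIN", uid="52b9494c", id="-434581932")
add_automation_curve(plugin, 4, [(200.0, 0.0), (0.5, 1.0)])
xml_text = ET.tostring(plugin, encoding="unicode")
```

The matching AUTOMATIONTRACK element would have to be emitted as well, with its currentAutoParamPluginID and currentAutoParamTag attributes pointing at the same plugin id and parameter.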

Automation from MML (2) reference plugin parameter by name, not index

In the previous section, we could already reference a plugin by identifier. But it seems we need a different identification mechanism for uid. Also, we still don't know what kind of parameters those plugins have, but we would like to know when authoring MML. We MML authors have no idea what "parameter #4" means.

How does Tracktion Engine map the uid to the actual corresponding plugin? It seems a plugin has a unique identifier uid, and it is managed per plugin framework. In the JUCE world, a juce::PluginDescription holds a uniqueId integer, which seems to be provided by each plugin framework. Whether a plugin is loadable depends on the plugin search paths, but the plugin ID seems to be unique per plugin framework on the local machine, across applications.

For JUCE AudioPluginHost, we had to scan the available plugins first. That was to collect plugin information, and each plugin entry actually contains the UID. We can do the same for the MML compiler. Therefore, I added an "Export Plugin Metadata" feature to our "AugenePlayer" app, which is just a simple player for *.tracktionedit files:

[Image: Export Plugin Metadata]

The app also has plugin list settings (via the "Plugins" button), and "Export Plugin Metadata" generates a big plugin metadata list like this:

[Image: plugin-metadata.json excerpt]

Then we can run a simple MML generator tool that generates tailored macros for each audio plugin:

[Image: automation support MMLs generated by tool]

And each MML file looks like:

// generated by generate-automation-helper.js
#macro AUDIO_PLUGIN_USE nameLen:number, ident:string {  __MIDI #F0, #7D, "augene-ng", $nameLen, $ident, #F7 }
#macro AUDIO_PLUGIN_PARAMETER parameterID:number, val:number { \
    __MIDI #F0, #7D, "augene-ng", 0, \
    $parameterID % #80, $parameterID / #80, $val % #80, $val / #80 } 

#macro ADELAY { AUDIO_PLUGIN_USE 10, "2004593855" }
#macro ADELAY_BYPASS val { AUDIO_PLUGIN_PARAMETER 0, $val }
#macro ADELAY_DELAY val { AUDIO_PLUGIN_PARAMETER 1, $val }

Then we can simply use those macros without looking up parameters by index in the plugin information. Now we have achieved what we wanted in the first place! Now you see how SIMPLEREVERB_FREEZE works:

        [GONG64]8 SIMPLEREVERB_FREEZE 1 [GONG8 (16 GONG8 (8 GONG8 GONG8 )24]39 [GONG8]3
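To see which bytes the AUDIO_PLUGIN_PARAMETER macro emits, here is a small Python sketch that follows the macro definition term by term: the parameter ID and the value are each split into a low 7-bit byte (`% #80`) and a high 7-bit byte (`/ #80`), because MIDI data bytes cannot exceed 7Fh. The function names are mine. The macro as shown above does not append #F7, so this sketch does not either.

```python
TAG = b"augene-ng"  # the ASCII marker string used in the generated macros


def split_14bit(value: int) -> tuple:
    """Split a 14-bit value into (low 7 bits, high 7 bits)."""
    if not 0 <= value < 0x4000:
        raise ValueError("value does not fit in 14 bits")
    return value % 0x80, value // 0x80


def parameter_payload(parameter_id: int, value: int) -> bytes:
    """Mirror: __MIDI #F0, #7D, "augene-ng", 0, id%#80, id/#80, val%#80, val/#80"""
    return (bytes([0xF0, 0x7D]) + TAG
            + bytes([0, *split_14bit(parameter_id), *split_14bit(value)]))


payload = parameter_payload(4, 1)
```

With this layout, both the parameter index and the value are limited to the range 0..16383.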

We still have some tooling glitches to fix before including those generated MMLs is convenient enough, but let's focus on the potential first. Note that generating and using the macros in these MML files is important for data portability. Your plugins may not be at the same location on different machines, or they may have different plugin UIDs, but as long as the macro names match, MML written elsewhere still works on your current machine.
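The generator tool itself is small. As a rough idea of what it does, here is a Python sketch that turns one plugin entry into macro lines of the shape shown above. The metadata keys (`name`, `unique_id`, `parameters`, `index`) and the plugin name "ADelay" are assumptions for this example, not the real plugin-metadata.json schema, and the naming rule (strip non-alphanumerics, uppercase) is my guess at a reasonable convention rather than what generate-automation-helper.js necessarily does.

```python
import re


def macro_name(text: str) -> str:
    """Turn a display name into an MML-friendly upper-case identifier."""
    return re.sub(r"[^A-Za-z0-9]", "", text).upper()


def generate_macros(plugin: dict) -> str:
    """Emit one 'use' macro plus one macro per parameter."""
    prefix = macro_name(plugin["name"])
    uid = str(plugin["unique_id"])
    lines = [f'#macro {prefix} {{ AUDIO_PLUGIN_USE {len(uid)}, "{uid}" }}']
    for p in plugin["parameters"]:
        lines.append(
            f'#macro {prefix}_{macro_name(p["name"])} val '
            f'{{ AUDIO_PLUGIN_PARAMETER {p["index"]}, $val }}'
        )
    return "\n".join(lines)


adelay = {  # hypothetical metadata entry
    "name": "ADelay",
    "unique_id": "2004593855",
    "parameters": [{"index": 0, "name": "Bypass"}, {"index": 1, "name": "Delay"}],
}
mml = generate_macros(adelay)
```

Because the macro names derive from plugin and parameter names rather than from UIDs or indexes, regenerating them on another machine keeps existing MML valid even when the numbers underneath differ.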

For parameters that we want to change linearly (like pan, modulation, or pitchbend), we can define relative change macros (e.g. SIMPLEREVERB_SIZE+ and SIMPLEREVERB_SIZE-) and a "spectra" macro (e.g. SIMPLEREVERB_SIZE_), like our default-macro.mml does for some existing operators.

future: referencing audio plugin by identifier, not path

Phew, this was a long post. I'm going to finish this chapter with one remaining concern (there are many, actually, but I will focus on one).

A potential issue here is that those *.filtergraph files store the plugin information as both file paths and a "uniqueId" (which I would mention in more depth later), and the "uniqueId" might not be used when loading plugin instances on other computers. That would prevent audio plugin portability and hence MML portability, which is one key goal I would like to achieve. Ideally we would be able to install and identify the same plugin whether it was built from source, installed from official build packages, distro packages, or services like StudioRack, and have everything still work consistently.

Hot Reloading on playback engine

One of the most annoying problems I encountered in the whole project was that loading a *.tracktionedit into a tracktion_engine::Edit takes too long while editing a music file. Most of the time goes to loading all the plugins in use. One of my demo MML files contains 30 tracks with just JuicySFplugin and SimpleReverb on each, and I used 2GB .sf2 files for those tracks, which took 30+ seconds on Waveform11 on every load. That is not usable at all. Later I created another version using sfizz and .sfz, which significantly reduced the load time (which I believe is a file structure issue, not a matter of software performance), but it is still a problem.

Since we don't frequently change the audio plugin settings, I made a dirty hack: when we "reload" the same file (that is, load the file from the same path), we only alter the MIDI clips without discarding existing edits. The load time went below 1 second. I call it "Hot Reload", which is quite a common technique when building apps for complicated platforms. It's super hacky, but I have survived with it so far. It is specific to my AugenePlayer app, though. If I seriously need shorter load times and have to use the Waveform DAW, I can still use the conditional compilation features implemented in my MML compiler.
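The idea is independent of Tracktion Engine, so here it is as a framework-free Python sketch. The class `HotReloadingPlayer` and the two loader callbacks are hypothetical stand-ins (the real AugenePlayer is a JUCE application and is not structured like this): the expensive plugin setup runs only when the path changes, while the cheap MIDI clip replacement runs on every load.

```python
import os


class HotReloadingPlayer:
    """Hypothetical player that keeps loaded plugins across reloads of one file."""

    def __init__(self, load_plugins, load_midi_clips):
        self._load_plugins = load_plugins        # slow: instantiates every plugin
        self._load_midi_clips = load_midi_clips  # fast: reads note data only
        self._loaded_path = None
        self.plugins = None
        self.clips = None

    def load(self, path):
        """Load a file; return True when the fast (hot reload) path was taken."""
        full_path = os.path.abspath(path)
        hot = full_path == self._loaded_path
        if not hot:
            # Different file: rebuild everything, including the plugins.
            self.plugins = self._load_plugins(full_path)
            self._loaded_path = full_path
        # In both cases the MIDI clips are replaced with fresh content.
        self.clips = self._load_midi_clips(full_path)
        return hot


plugin_loads = []
player = HotReloadingPlayer(
    load_plugins=lambda p: plugin_loads.append(p) or ["plugins for " + p],
    load_midi_clips=lambda p: ["clips for " + p],
)
player.load("song.tracktionedit")  # full load
player.load("song.tracktionedit")  # hot reload: plugins are kept
```

The obvious weakness is that a changed plugin assignment in the same file goes unnoticed, which is exactly why I call it a hack.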

Any MML compiler tool that generates a complete set of music files would suffer from this problem as well, and the hot reloading technique would save you too.