GetDunne Wiki

Notes from the desk of Shane Dunne, software development consultant


SARAH: FFT-based synthesizer plugin

SARAH is an improved and expanded version of VanillaJuce that uses the new DSP classes provided in JUCE 5.1 to address the problem of oscillator aliasing. It works by using juce::dsp::FFT to transform mathematically perfect oscillator waveforms, zeroing out unwanted high-frequency harmonics, and inverse-transforming to produce perfectly band-limited wave tables. Just for fun (and because I wanted to know if it was even possible without killing the CPU), it also implements simulated low-pass filtering (which I call Harmonic Shaping) in the frequency domain.

Because of the use of harmonic analysis via the Fast Fourier Transform, and because Fourier himself was French, and because the really great name Harmor was already taken, and because there's a fabulous singer-songwriter in my hometown whose name is Sarah Harmer, I decided to use the name SARAH, and let it stand for synthèse à rapide analyse harmonique, or “synthesis by fast harmonic analysis”.

I don't suggest the FFT-based approach is in any way superior to more conventional time-domain synthesis approaches. Simplicity is probably its only virtue. I present SARAH as nothing more than a way to demonstrate how to incorporate one small aspect of the new JUCE 5.1 DSP library into a plugin, to present some early ideas about oscillator anti-aliasing, and to show how they can be extended to provide something like filtering without the use of conventional digital filter algorithms.

You can find the SARAH source code (published under the GPL 3.0 license) on GitHub at https://github.com/getdunne/SARAH.

Basic signal flow

[Figure: SARAH signal-flow diagram]

The two oscillators OSC1 and OSC2 are the same in terms of how they are controlled. In addition to the “waveform”, “pitch” and “detune” settings (not shown above), each has one control input for pitch and one for harmonic shaping. The pitch-control signal is the sum of the Pitch LFO output and that oscillator’s Pitch EG output. The shaping signal is the sum of the Shape LFO output and that oscillator’s Shape EG output, the latter scaled by the “envelope amount” setting. There is only one Pitch LFO and only one Shape LFO, which affect both oscillators; all other modules are duplicated.
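The two control sums described above can be sketched in a few lines. This is only an illustration, not SARAH's actual code: the struct and function names are hypothetical, and it assumes the “envelope amount” setting scales only the Shape EG term before the sum.

```cpp
// Hypothetical per-oscillator control values (one set each for OSC1 and OSC2).
struct OscControls
{
    double pitchEg;        // this oscillator's Pitch EG output
    double shapeEg;        // this oscillator's Shape EG output
    double envelopeAmount; // "envelope amount" setting (assumed to scale the Shape EG only)
};

// The single Pitch LFO and single Shape LFO shared by both oscillators.
struct SharedLfos
{
    double pitchLfo;
    double shapeLfo;
};

// Pitch control = Pitch LFO output + this oscillator's Pitch EG output.
inline double pitchControl(const SharedLfos& lfo, const OscControls& osc)
{
    return lfo.pitchLfo + osc.pitchEg;
}

// Shape control = Shape LFO output + (Shape EG output * envelope amount).
inline double shapeControl(const SharedLfos& lfo, const OscControls& osc)
{
    return lfo.shapeLfo + osc.shapeEg * osc.envelopeAmount;
}
```

Both oscillators would be passed the same SharedLfos value but their own OscControls.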

The OSC1 and OSC2 outputs are mixed according to the “Blend” setting. The mixed output is scaled by both the Amp EG output and the “Master volume” setting, and split evenly (no panning) to the Left and Right audio outputs.
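As a rough sketch of this output stage, assuming the “Blend” setting acts as a linear crossfade between 0 (OSC1 only) and 1 (OSC2 only): the crossfade law, the names, and the single precomputed `gain` argument (standing in for the combined Amp EG and Master volume level) are all assumptions made for illustration.

```cpp
#include <cassert>

struct StereoSample
{
    double left;
    double right;
};

// Assumed linear crossfade: blend = 0 gives only OSC1, blend = 1 gives only OSC2.
inline double blendOscillators(double osc1, double osc2, double blend)
{
    assert(blend >= 0.0 && blend <= 1.0);
    return (1.0 - blend) * osc1 + blend * osc2;
}

// Apply the overall level (computed elsewhere from the Amp EG output and the
// Master volume setting) and send the same signal to both channels (no panning).
inline StereoSample toStereo(double mixed, double gain)
{
    const double out = mixed * gain;
    return { out, out };
}
```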

About SARAH's oscillators

The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional. The following is a quick summary; for details see SARAH oscillator details.

SARAH's oscillators are wave-table based. They play out samples from a 1024-element digitized representation of one cycle of the selected waveform—sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.
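For readers unfamiliar with wave-table playback, here is a minimal sketch of reading samples out of a 1024-element table with a phase accumulator. The class name is hypothetical, and the linear interpolation between adjacent entries is my assumption for the example; the text above does not say which interpolation, if any, is used.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kTableSize = 1024;
using WaveTable = std::array<float, kTableSize>;

// Hypothetical reader that steps through one cycle of a stored waveform.
class TableReader
{
public:
    explicit TableReader(const WaveTable& table) : table_(table) {}

    // The phase advances by this many table entries per output sample.
    void setFrequency(double hz, double sampleRate)
    {
        assert(hz >= 0.0 && sampleRate > 0.0);
        increment_ = hz * static_cast<double>(kTableSize) / sampleRate;
    }

    float nextSample()
    {
        // Linear interpolation between the two nearest entries (an assumption).
        const std::size_t i0 = static_cast<std::size_t>(phase_);
        const std::size_t i1 = (i0 + 1) % kTableSize;
        const float frac = static_cast<float>(phase_ - static_cast<double>(i0));
        const float out = table_[i0] + frac * (table_[i1] - table_[i0]);

        // Advance and wrap so the phase stays within [0, kTableSize).
        phase_ += increment_;
        while (phase_ >= static_cast<double>(kTableSize))
            phase_ -= static_cast<double>(kTableSize);
        return out;
    }

private:
    const WaveTable& table_;
    double phase_ = 0.0;
    double increment_ = 0.0;
};
```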

There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing.
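One simple way to get a table that is generated once and shared by every user is a function-local static, sketched below. The function name is hypothetical and this is not necessarily how SARAH organizes its shared table.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

constexpr std::size_t kTableSize = 1024;
using WaveTable = std::array<float, kTableSize>;

// Returns a reference to a single sine table, built on first use and then
// shared by every caller (oscillators and sine-mode LFOs alike).
inline const WaveTable& sharedSineTable()
{
    static const WaveTable table = []
    {
        const double twoPi = 6.283185307179586;
        WaveTable t{};
        for (std::size_t n = 0; n < kTableSize; ++n)
            t[n] = static_cast<float>(std::sin(twoPi * static_cast<double>(n) / kTableSize));
        return t;
    }();
    return table;
}
```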

The triangle, square, and sawtooth wave tables are generated dynamically, as follows:

  1. 1024 samples of each waveform (one cycle) are generated once, using exactly the same mathematical expressions used in the LFOs, resulting in mathematically “exact” waveforms having 512 harmonics.
  2. Each mathematically-exact waveform is transformed using juce::dsp::FFT to produce a frequency-domain representation, which is a new 1024-element array whose elements (“coefficients”) together encode the relative amplitude and phase of each of the 512 harmonics. These three “frequency-domain” tables (one each for triangle, square, and sawtooth wave shapes) are kept in memory.
  3. Each oscillator instance has its own 1024-element array. In preparation to sound a note, it copies coefficient data from the appropriate common frequency-domain table into its own array. To prevent aliasing, only the coefficients for harmonics that, at the note's pitch, fall below the Nyquist frequency (one-half the sampling rate) are copied; higher-order coefficients are set to zero. It then performs an in-place inverse FFT to reconstruct the appropriate time-domain wave table.
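The three steps can be sketched in plain C++ as follows. To keep the example compilable without JUCE, it substitutes a naive O(N²) DFT for juce::dsp::FFT and uses complex-valued spectra rather than JUCE's packed float layout; all function names are hypothetical, only the sawtooth is shown, and keeping the DC term is my own choice.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

constexpr std::size_t kTableSize = 1024;
constexpr double kTwoPi = 6.283185307179586;

// Step 1: one cycle of a mathematically "exact" sawtooth rising from -1 toward +1.
std::vector<double> makeExactSawtooth()
{
    std::vector<double> table(kTableSize);
    for (std::size_t n = 0; n < kTableSize; ++n)
        table[n] = 2.0 * static_cast<double>(n) / kTableSize - 1.0;
    return table;
}

// Step 2: naive forward DFT (a slow stand-in for an FFT).
Spectrum forwardDft(const std::vector<double>& x)
{
    const std::size_t N = x.size();
    Spectrum X(N);
    for (std::size_t k = 0; k < N; ++k)
        for (std::size_t n = 0; n < N; ++n)
            X[k] += x[n] * std::polar(1.0, -kTwoPi * static_cast<double>((k * n) % N)
                                               / static_cast<double>(N));
    return X;
}

// Step 3: copy only the harmonics below Nyquist for this note, leave the rest
// at zero, then inverse-transform back to a time-domain wave table.
std::vector<double> makeBandLimitedTable(const Spectrum& full, double noteHz, double sampleRate)
{
    const std::size_t N = full.size();
    assert(N >= 2 && noteHz > 0.0 && sampleRate > 0.0);

    // Largest harmonic number h with h * noteHz strictly below sampleRate / 2.
    std::size_t maxHarmonic = static_cast<std::size_t>(std::ceil(0.5 * sampleRate / noteHz)) - 1;
    if (maxHarmonic > N / 2 - 1)
        maxHarmonic = N / 2 - 1;

    Spectrum limited(N);   // all bins start at zero
    limited[0] = full[0];  // keep the DC term (my choice for this sketch)
    for (std::size_t h = 1; h <= maxHarmonic; ++h)
    {
        limited[h] = full[h];          // positive-frequency bin
        limited[N - h] = full[N - h];  // its mirror image, so the result is real
    }

    std::vector<double> table(N);
    for (std::size_t n = 0; n < N; ++n)
    {
        std::complex<double> sum;
        for (std::size_t k = 0; k < N; ++k)
            sum += limited[k] * std::polar(1.0, kTwoPi * static_cast<double>((k * n) % N)
                                                    / static_cast<double>(N));
        table[n] = sum.real() / static_cast<double>(N);
    }
    return table;
}
```

For example, `makeBandLimitedTable(forwardDft(makeExactSawtooth()), 1000.0, 44100.0)` keeps harmonics 1 through 22, since the 23rd harmonic of a 1000 Hz note would land at 23 kHz, above the 22.05 kHz Nyquist frequency.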

Harmonic Shaping

Zeroing (or changing in any way) the coefficients of a frequency-domain signal and then performing an inverse FFT is basically a kind of filtering. After I had implemented the dynamic wavetable reconstruction algorithm described above, it occurred to me that I could easily apply some kind of frequency-response curve to the non-zero harmonic components, to obtain an effect similar to running the oscillator output through a conventional filter. This requires many more inverse FFT operations (because it must be done dynamically, to simulate a time-varying filter cutoff), but I found that on modern PC and Mac hardware, this is not impractical.
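To illustrate the idea, the sketch below scales each harmonic's coefficient by a gain taken from a response curve before the inverse transform is run. The Butterworth-style low-pass magnitude curve, the cutoff expressed as a harmonic number, and every name here are assumptions chosen for the example; the text above does not specify the curve SARAH actually uses.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Hypothetical response curve: Butterworth-style low-pass magnitude, with the
// cutoff given as a (possibly fractional) harmonic number.
inline double shapingGain(std::size_t harmonic, double cutoffHarmonic, int slopeOrder)
{
    assert(cutoffHarmonic > 0.0 && slopeOrder > 0);
    const double ratio = static_cast<double>(harmonic) / cutoffHarmonic;
    return 1.0 / std::sqrt(1.0 + std::pow(ratio, 2.0 * slopeOrder));
}

// Scale each harmonic and its mirror-image bin by the curve's gain. The DC bin
// is left untouched. The result still needs an inverse transform to become a
// playable wave table.
Spectrum applyHarmonicShaping(const Spectrum& bandLimited, double cutoffHarmonic, int slopeOrder)
{
    const std::size_t N = bandLimited.size();
    Spectrum shaped(bandLimited);
    for (std::size_t h = 1; h <= N / 2; ++h)
    {
        const double gain = shapingGain(h, cutoffHarmonic, slopeOrder);
        shaped[h] *= gain;
        if (N - h != h)  // avoid scaling the Nyquist bin twice
            shaped[N - h] *= gain;
    }
    return shaped;
}
```

Moving the cutoff over time, for instance from the shaping control signal, would mean recomputing the gains and re-running the inverse transform, which is why this approach needs many more inverse FFT operations.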

Because SARAH is using coefficient adjustments and the inverse FFT to simulate filtering, I decided to call this Harmonic Shaping rather than “filtering”. I'm not at all convinced this is a sensible alternative to conventional (time-domain) digital filtering, but it is certainly interesting, and allowed me to implement simple time-varying timbres (using both an envelope generator and an LFO) with just a few simple changes to the existing anti-aliased oscillator code.

The details

See this page for the gritty details of SARAH's oscillator implementation.

Skinning

As of 29 September, 2017, SARAH uses a single-view GUI instead of the multi-tabbed approach inherited from VanillaJuce. See this page for some thoughts on the two approaches.

sarah.txt · Last modified: 2017/09/30 15:38 by shane