GetDunne Wiki

Notes from the desk of Shane Dunne, software development consultant


sarah

Differences

This shows you the differences between two versions of the page.

sarah [2017/09/05 21:19]
shane [About SARAH's oscillators]
sarah [2017/09/30 15:38] (current)
shane
Line 1: Line 1:
 ====== SARAH: FFT-based synthesizer plugin ======
-**SARAH** is an improved and expanded version of [[VanillaJuce]] which uses the new DSP classes provided in JUCE 5.1 to address the problem of oscillator aliasing. It works by using //juce::dsp::FFT// to transform mathematically-perfect oscillator waveforms, zeroing out unwanted high-frequency harmonics, and reverse-transforming to produce perfectly band-limited wave tables. Just for fun (and because I wanted to know if it was even possible without killing the CPU), it also implements simulated low-pass filtering in the frequency domain.
+**SARAH** is an improved and expanded version of [[VanillaJuce]] which uses the new DSP classes provided in JUCE 5.1 to address the problem of oscillator aliasing. It works by using //juce::dsp::FFT// to transform mathematically-perfect oscillator waveforms, zeroing out unwanted high-frequency harmonics, and reverse-transforming to produce perfectly band-limited wave tables. Just for fun (and because I wanted to know if it was even possible without killing the CPU), it also implements simulated low-pass filtering (which I call //Harmonic Shaping//) in the frequency domain.
  
 Because of the use of the Harmonic Analysis, via the Fast Fourier Transform, and because Fourier himself was French, and because the really great name [[http://www.image-line.com/plugins/Synths/Harmor/|Harmor]] was already taken, and because there's a fabulous singer-songwriter in my hometown whose name is [[https://en.wikipedia.org/wiki/Sarah_Harmer|Sarah Harmer]], I decided to use the name **SARAH**, and let it stand for //synthèse à rapide analyse harmonique//, or "synthesis by fast harmonic analysis".
Line 19: Line 19:
  
 ===== About SARAH's oscillators =====
-The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional.
+The oscillators are the only interesting aspect of SARAH's design. All of the other elements shown in the signal-flow diagram above (Envelope Generators, LFOs, summing and scaling) are entirely conventional. The following is a quick summary; for details see [[SARAH oscillator details]].
  
-SARAH uses two oscillator instances per voice. The oscillators themselves are identically structured, but their settings are independent, i.e., they can be set to produce different waveforms, detuned, and mixed so as to provide at least a basic set of options to create composite timbres, and each oscillator's inherent "harmonic shaping" can also be set up differently, to provide further control. The following explanation (which is just an overview) applies equally to OSC1 and OSC2.
-
-SARAH's oscillators are essentially wave-table based. They simply play out samples from a 1024-element digitized representation of one cycle of the selected waveform: sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.
+SARAH's oscillators are wave-table based. They play out samples from a 1024-element digitized representation of one cycle of the selected waveform: sine, triangle, square, or sawtooth. What is interesting and new is how the 1024-element wave tables are populated.
  
 There is a common sine wave table which is generated once and shared by all oscillator instances (including the LFOs in sine-wave mode). This is adequate, because the sine wave has no higher-order harmonics which need to be suppressed to avoid aliasing.
Line 29: Line 27:
 The triangle, square, and sawtooth wave tables are generated dynamically, as follows:
   - 1024 samples of each waveform (one cycle) are generated once, using exactly the same mathematical expressions used in the LFOs, resulting in mathematically "exact" waveforms having 512 harmonics.
-  - Each mathematically-exact waveform is transformed using //juce::dsp::FFT// to produce a frequency-domain representation, which is a new 1024-element array, whose elements ("coefficients") represent the relative amplitude and phase of each of the 512 harmonics. (The mathematics of the FFT are such that each element is a [[wp>Complex_number|complex number]] having real and imaginary components, and each harmonic other than the 0th and 512th is represented twice, for positive and negative frequencies.)
-  - After the initial //forward// FFT operations, one copy of each of the resulting three complex, frequency-domain arrays (one each for triangle, square, and sawtooth waveforms) is kept in memory, shared by all oscillator instances.
-  - Each oscillator instance has its own 1024-element complex array. In preparation to sound a note, it copies the coefficient data out from the appropriate common frequency-domain table to its own array, then performs an //in-place **inverse** FFT// to re-create the appropriate time-domain wave table.
-  - To sound a note, the oscillator resamples its own 1024-element array (wave table).
-
+  - Each mathematically-exact waveform is transformed using //juce::dsp::FFT// to produce a frequency-domain representation, which is a new 1024-element array, whose elements ("coefficients") represent the relative amplitude and phase of each of the 512 harmonics. These three "frequency-domain" tables (one each for triangle, square, and sawtooth wave shapes) are kept in memory.
+  - Each oscillator instance has its own 1024-element array. In preparation to sound a note, it copies the coefficient data out from the appropriate common frequency-domain table to its own array, then performs an //in-place **inverse FFT**// to reconstruct the appropriate time-domain wave table. To prevent aliasing, only the coefficients for harmonics below the //Nyquist frequency// (one-half the sampling rate) are copied; higher-order coefficients are set to zero.
-The interesting stuff happens at step 4, but to understand it we must first discuss step 5: When the oscillator is assigned a note frequency, the note's frequency in cycles per second (Hertz) is divided by the sampling rate in use (typically 44100 Hz) to yield a ''float''-valued "phase increment" in //cycles per sample//. The oscillator also has a ''float''-valued "phase" variable, restricted to the range 0.0 to 1.0, where 0.0 represents the beginning of the cycle and 1.0 represents the end. Each time the oscillator generates a new sample, it multiplies the phase by 1024 and rounds the result, to obtain a wave-table index in the range 0-1023, plucks that sample out of its wave-table and outputs it, then adds the phase increment to the phase, ensuring the result "wraps around" if necessary, so it remains in the range 0.0 to 1.0.
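The phase-accumulator playback described in this paragraph can be sketched roughly as follows. This is only an illustration in standard C++, not SARAH's actual code: the struct and member names (''WavetableOscillator'', ''phaseDelta'', ''getSample'') are hypothetical, and the modulo on the index is an added assumption so that rounding up near the end of the cycle maps back to entry 0.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch of the technique, not SARAH's code.
struct WavetableOscillator
{
    std::vector<float> table;   // one cycle of the waveform (1024 entries in SARAH)
    float phase = 0.0f;         // position within the cycle, 0.0 <= phase < 1.0
    float phaseDelta = 0.0f;    // fraction of a cycle advanced per output sample

    void setFrequency(float frequencyHz, float sampleRateHz)
    {
        // Note frequency divided by sampling rate gives cycles per sample.
        phaseDelta = frequencyHz / sampleRateHz;
    }

    float getSample()
    {
        // Scale the phase to a table index and round to the nearest entry.
        // The modulo (an assumption of this sketch) keeps a rounded-up index
        // at the very end of the cycle inside the table.
        const std::size_t n = table.size();
        const std::size_t index =
            static_cast<std::size_t>(std::lround(phase * static_cast<float>(n))) % n;
        const float sample = table[index];

        // Advance and wrap so the phase stays in [0.0, 1.0).
        phase += phaseDelta;
        while (phase >= 1.0f)
            phase -= 1.0f;
        return sample;
    }
};
```

At 44100/1024 Hz and a 44100 Hz sampling rate, ''phaseDelta'' is exactly 1/1024, so successive calls step through the table one entry at a time.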
  
-When the phase increment, expressed in wave-table samples (i.e., multiplied by 1024), is exactly 1.0, the oscillator replays its wave-table exactly. At a sampling rate of 44100 Hz, this would happen at an oscillator frequency of 44100/1024 = 43.066 Hz. This is a bit below F1, way down in the bottom octave of the piano range. Even lower notes result in a phase increment a bit lower than 1.0, in which case wave-table samples are occasionally repeated, resulting in slight //quantization artifacts// which are not very noticeable at such low notes.
+===== Harmonic Shaping =====
+Zeroing (or changing in any way) the coefficients of a frequency-domain signal and then performing an inverse FFT is basically a kind of filtering. After I had implemented the dynamic wavetable reconstruction algorithm described above, it occurred to me that I could easily apply some kind of frequency-response curve to the non-zero harmonic components, to obtain a similar effect to running the oscillator output through a conventional filter. This requires many more inverse FFT operations (because it must be done dynamically, to simulate time-varying filter cutoff), but I found that on modern PC and Mac hardware, this is not impractical.
  
-For all notes above F1 (i.e., just about every note you'll ever play), the phase increment (again in wave-table samples) will be greater than 1.0, meaning that some wave-table samples will be skipped. Basically, you are trying to replay the basic 43 Hz note at the higher pitch, and so //all harmonics// of the original tone will be multiplied by the phase increment value. The 512th harmonic of 43.066 Hz is 43.066 x 512 = 22049.79 Hz, just below half the 44100 Hz sampling rate, the so-called //Nyquist frequency//. Playing F2 means a phase increment of around 2.0, so all harmonics above the 256th would be above the Nyquist frequency and would be "aliased" to lower frequencies. Up near the top of the piano range, the aliasing of even very low-numbered harmonics (which have substantial energy) will result in very noticeable aliasing artifacts.
+Because SARAH is using coefficient adjustments and the inverse FFT to //simulate// filtering, I decided to call this //Harmonic Shaping// rather than "filtering". I'm not at all convinced this is a sensible alternative to conventional (time-domain) digital filtering, but it is certainly interesting, and allowed me to implement simple time-varying timbres (using both an envelope generator and an LFO) with just a few simple changes to the existing anti-aliased oscillator code.
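As a rough illustration of applying a frequency-response curve to the harmonic coefficients, the sketch below scales each bin of a complex spectrum by a real-valued gain before the inverse FFT would be run. The function name ''shapeHarmonics'', the ''cutoffHarmonic'' parameter, and the particular first-order low-pass magnitude curve are all assumptions made for this example; the page does not say which curve SARAH actually uses.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Scale every harmonic of an N-point spectrum by a low-pass-like gain.
// cutoffHarmonic (hypothetical parameter) is the harmonic number at which
// the gain has dropped to 1/sqrt(2). The curve itself is an assumed example.
void shapeHarmonics(std::vector<std::complex<float>>& spectrum, float cutoffHarmonic)
{
    const std::size_t n = spectrum.size();
    for (std::size_t k = 1; k <= n / 2; ++k)
    {
        const float ratio = static_cast<float>(k) / cutoffHarmonic;
        const float gain = 1.0f / std::sqrt(1.0f + ratio * ratio);

        spectrum[k] *= gain;              // positive-frequency bin
        if (k != n - k)
            spectrum[n - k] *= gain;      // matching negative-frequency bin
    }
}
```

Because the gain is real and is applied equally to each positive/negative frequency pair, the spectrum keeps its conjugate symmetry, so the subsequent inverse transform still yields a real-valued wave table. Varying ''cutoffHarmonic'' over time (from an envelope generator or LFO) would require re-running the inverse transform for each new value.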
  
-In SARAH, this is avoided at step 4 above, by figuring out the highest-numbered harmonic which will still be less than the Nyquist frequency, and setting all higher-numbered harmonic coefficients to zero. When the resulting table is inverse-FFT-transformed, we obtain a version of the original waveform which is almost perfectly //band-limited//, and plays back at the new rate with no aliasing whatsoever.
+===== The details =====
 +See [[sarah_oscillator_details|this page]] for the gritty details of SARAH's oscillator implementation.
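For readers who want a feel for the band-limiting step before following that link, here is a minimal, self-contained sketch: find the highest harmonic below the Nyquist frequency, zero every coefficient above it, and transform back. It assumes a naive O(N²) DFT as a stand-in for //juce::dsp::FFT// so that it compiles with the standard library alone, and all function names are hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Spectrum = std::vector<std::complex<double>>;

// Highest harmonic number whose frequency is strictly below Nyquist.
// Assumes noteHz > 0.
std::size_t highestHarmonicBelowNyquist(double noteHz, double sampleRateHz)
{
    const double nyquist = 0.5 * sampleRateHz;
    std::size_t h = static_cast<std::size_t>(nyquist / noteHz);
    if (h > 0 && static_cast<double>(h) * noteHz >= nyquist)
        --h;                                   // exclude a harmonic exactly at Nyquist
    return h;
}

// Naive DFT (stand-in for a real FFT). sign = -1: forward; sign = +1: inverse,
// scaled by 1/N.
Spectrum dft(const Spectrum& in, int sign)
{
    const std::size_t n = in.size();
    const double twoPi = 6.283185307179586;
    Spectrum out(n);
    for (std::size_t k = 0; k < n; ++k)
    {
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t)
        {
            const double angle = sign * twoPi * static_cast<double>((k * t) % n)
                                 / static_cast<double>(n);
            sum += in[t] * std::polar(1.0, angle);
        }
        out[k] = (sign > 0) ? sum / static_cast<double>(n) : sum;
    }
    return out;
}

// Zero all harmonics above 'highest', in both positive and negative bins.
void zeroHarmonicsAbove(Spectrum& spectrum, std::size_t highest)
{
    const std::size_t n = spectrum.size();
    for (std::size_t k = highest + 1; k <= n / 2; ++k)
    {
        spectrum[k] = 0.0;
        spectrum[n - k] = 0.0;
    }
}

// Build a band-limited copy of a one-cycle table for a given note frequency.
std::vector<double> bandLimitedTable(const std::vector<double>& oneCycle,
                                     double noteHz, double sampleRateHz)
{
    Spectrum spectrum = dft(Spectrum(oneCycle.begin(), oneCycle.end()), -1);
    zeroHarmonicsAbove(spectrum, highestHarmonicBelowNyquist(noteHz, sampleRateHz));
    const Spectrum time = dft(spectrum, +1);

    std::vector<double> table(time.size());
    for (std::size_t i = 0; i < time.size(); ++i)
        table[i] = time[i].real();             // imaginary parts are ~0 by symmetry
    return table;
}
```

In a real plugin the forward transform would be done once per waveform and only the zeroing and inverse transform repeated per note, as the summary above describes; the sketch folds both into one function purely for brevity.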
  
 +===== Skinning =====
 +As of 29 September, 2017, SARAH uses a single-view GUI instead of the multi-tabbed approach inherited from [[https://github.com/getdunne/VanillaJuce|VanillaJuce]]. See [[sarah_skinning|this page]] for some thoughts on the two approaches.
  
-(More to come...) 
  
  
sarah.1504646348.txt.gz · Last modified: 2017/09/05 21:19 by shane