Re: [fluid-dev] complexity of soundfont synthesis engine

Thanks so much gentlemen, that was very enlightening.

I think we will have a bit of a discussion of what level of sophistication we want for the sampler.

Thanks for highlighting that what we were thinking of doing would not do the soundfonts as originally created any justice. You saved us a lot of time.

Cheers,

Michael

From: S. Christian Collins <address@hidden>
To: address@hidden
Sent: Tuesday, September 20, 2011 1:44 PM
Subject: Re: [fluid-dev] complexity of soundfont synthesis engine

Hi Michael,

I have responded to your individual questions below:

On 09/19/2011 04:20 PM, Michael Geis wrote:

We were under the (probably naive) impression that all a sampler needs to do is loop over wave tables and apply envelopes. Seeing that the soundfont specification actually allows for greater complexity makes us wonder whether, in order to play soundfonts, the sampler needs to be able to do all the things in the synthesis model. Unless there is a discrepancy between what the specification allows for and what most soundfonts look like in the wild. If a substantial fraction of soundfonts just loop over wave tables and apply envelopes, the sampler might still be useful for that subset of soundfonts if it just grabbed their wave tables and envelope parameters. The answer must be trivial for someone who has used soundfonts for a bit, but I must admit it is not clear to me.

As a SoundFont designer, I personally make use of the majority of the features in the SoundFont spec to create my instruments. Playing just the exported samples from most of my SoundFonts will usually sound quite different than playing the instrument designed within the SoundFont bank. I can't speak for other SoundFont designers, but I would imagine the same to be more-or-less true for others as well.

Otherwise (i.e. if soundfonts generally make full use of all the parts of the synthesis engine laid out in the spec), I see 2 options:
1. Implement the entirety of the synthesis model and use the parsed soundfont parameters as input (that sounds like reimplementing a lot of what fluidsynth already does).
2. Play the soundfonts via fluidsynth and record the output. The sampler then loops over that output and applies envelopes. Does 2. even make sense or is it likely to mangle the sounds? If it is reasonable, how many notes should I have for each pitch? One per pitch or one per envelope phase (i.e. 5 for the DAHDSR envelopes since delay doesn't make a sound) per pitch?

IMO, #2 is not a very good idea for the following reasons:

Most sustained samples have loop points which define a section of the waveform that continues to repeat as a note is held down indefinitely. Not only will you lose these helpful loop points when recording the output from a sampler, but your recorded waveform will often use up more memory than the original sample if you record some amount of what is actually looped in the original SoundFont. As an example: the Piano samples in GeneralUser GS are looped, but the instrument envelope causes each note to gradually fade out as the sound loops. If you record the FluidSynth output of one of these notes, you will not be able to recreate this loop due to the fade-out, and your sample will have to be very long to capture the entire note (20 seconds to capture middle-C vs. approx. 4 seconds of original sample data).
Sample stretching is used within SoundFonts to play the same sample over a range of keys. Most instruments will use multiple samples throughout the range of the keyboard (for example a piano that has a new sample every 2-3 keys). If you record FluidSynth's output, you will have to use your ear to determine where the instrument switches from one sample to another (not to mention instruments that have different samples for each velocity layer). If you get this wrong, you may end up unnecessarily taking multiple recordings of the same sample (just at different pitches). This is a very easy mistake to make when some of the SoundFont's more advanced features such as filters and modulators are active. Also, the sample sounds most natural at its "root key", which is the note that the actual sample was taken at. Recording any pitch other than the root key will result in a small loss in audio quality from the original due to the interpolation of sample points when changing the sample's pitch and other factors. Unfortunately, discerning the root pitch is pretty much impossible to do by ear.
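
To make the two points above concrete, here is a minimal Python sketch of looped playback with a decaying gain and of pitch shifting away from the root key. It is an illustration only, not FluidSynth's code: the names `pitch_ratio`, `render_looped` and `decay_per_sample` are hypothetical, the envelope is reduced to a single per-frame gain multiplier, and linear interpolation is assumed for simplicity.

```python
def pitch_ratio(key, root_key):
    """Playback-rate ratio for sounding `key` from a sample whose
    root key is `root_key` (12-tone equal temperament assumed)."""
    return 2.0 ** ((key - root_key) / 12.0)


def render_looped(sample, loop_start, loop_end, ratio, n_out,
                  decay_per_sample=1.0):
    """Render `n_out` frames from `sample`, repeating the region
    [loop_start, loop_end) for as long as the note is held.

    `ratio` is the read-position step per output frame (1.0 = root key).
    `decay_per_sample` is a hypothetical stand-in for the envelope:
    the gain is multiplied by it after every output frame.
    """
    loop_len = loop_end - loop_start
    out = []
    pos = 0.0
    gain = 1.0
    for _ in range(n_out):
        i = int(pos)
        frac = pos - i
        nxt = i + 1
        if nxt >= loop_end:
            # The neighbour used for interpolation wraps into the loop.
            nxt = loop_start + (nxt - loop_end)
        # Linear interpolation between two stored sample points.
        out.append(gain * ((1.0 - frac) * sample[i] + frac * sample[nxt]))
        pos += ratio
        while pos >= loop_end:
            pos -= loop_len
        gain *= decay_per_sample
    return out


# Four stored frames, with frames 1..3 looped, give a note of any length.
stored = [0.0, 1.0, 2.0, 3.0]
held = render_looped(stored, loop_start=1, loop_end=4, ratio=1.0, n_out=8)
```

The rendered note can be made arbitrarily long from a short looped sample, which is why a recording of the output, with the looping and the fade-out baked in, ends up longer than the stored sample data. For any key other than the root the ratio differs from 1.0, so the output frames are interpolated between stored sample points rather than read directly.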

I guess this might be related to how many wave tables are usually used for a given instrument in the soundfont format. One per pitch? One for every envelope phase of every pitch?

This varies from instrument to instrument. Some instruments even use multiple samples or envelope regions that vary depending on how hard you hit the key.
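
As a rough illustration of that, the sketch below picks samples by key range and velocity range. The SoundFont format does let each instrument zone carry a key range, a velocity range and a root key, but the `Zone` class, the `zones_for_note` helper and the example zones are made up for this example and do not come from any real SoundFont.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """Hypothetical, simplified view of one instrument zone."""
    sample_name: str
    key_lo: int    # lowest MIDI key this zone responds to
    key_hi: int    # highest MIDI key this zone responds to
    vel_lo: int    # lowest MIDI velocity this zone responds to
    vel_hi: int    # highest MIDI velocity this zone responds to
    root_key: int  # key at which the sample plays back unshifted


def zones_for_note(zones, key, velocity):
    """Return every zone whose key and velocity ranges contain the note.
    More than one zone may match, in which case the samples are layered."""
    return [z for z in zones
            if z.key_lo <= key <= z.key_hi
            and z.vel_lo <= velocity <= z.vel_hi]


# Invented layout: a new sample every few keys, two velocity layers each.
EXAMPLE_ZONES = [
    Zone("piano_soft_C4", 58, 62, 0, 79, 60),
    Zone("piano_hard_C4", 58, 62, 80, 127, 60),
    Zone("piano_soft_E4", 63, 66, 0, 79, 64),
    Zone("piano_hard_E4", 63, 66, 80, 127, 64),
]

loud_middle_c = zones_for_note(EXAMPLE_ZONES, key=60, velocity=100)
```

With a layout like this, the number of samples per instrument depends on how the designer split the keyboard and the velocity range, not on the number of pitches or envelope phases.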

My apologies if I am somewhat lacking coherence here, I am still trying to get a decent grasp on the subject matter.

No problem. You really should consider using Swami or another SoundFont editor to learn how SoundFonts are built. From there you will come to better understand what a sampler does and what you will need to do for your own project. You can also export the waveforms directly from these editors, which would be much, much better than trying to record them through FluidSynth.

Good luck, and God Bless!
-~Chris

_______________________________________________
fluid-dev mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/fluid-dev

From:	Michael Geis
Subject:	Re: [fluid-dev] complexity of soundfont synthesis engine
Date:	Wed, 21 Sep 2011 22:33:55 -0700 (PDT)