Re: [fluid-dev] Questions about soundfont, voice rendering, pitch shift.

I'm trying to make a soundfont from a single audio sample (pitched somewhere in the C3-C4 octave). The sample is about 4 seconds long, and it is mapped to the whole range of notes (0-127).

What happens is that the lower-octave notes seem to lose some volume.

More problematic are the upper-octave notes, which sound very short (1-2 seconds, or less). That may be OK for short-duration sounds like piano or xylophone, but not for synth strings, sawWave, squareWave, pads... They also sound much less like the original sound.

Both of these problems are why "real" soundfonts do not use a single audio sample for the full 0-127 range of notes. If you strike a C4 ("middle C") piano key, it will sound like a piano. If you double the sound's speed to a C5, it will probably still sound OK; likewise if you halve it to a C3. But if you speed it up 32 times to a C9 (MIDI pitch value 120), it will sound nothing like pressing the C9 key on a real piano -- the synthesiser is not realistically simulating all the changes in physics when a shorter string is struck, merely speeding up a wave file.

The note will also sound much shorter, as the sound has been sped up (both increasing its pitch and decreasing its duration). That's vaguely OK for a piano, as high-pitch piano notes do last a lot less time than low-pitched ones, but whether or not it is acceptable depends on the instrument.
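To put rough numbers on this, here is a small Python sketch of the arithmetic. It assumes plain resampling of an unlooped sample (no loop points, no envelope), with middle C as MIDI note 60. The function names are made up for illustration and are not part of FluidSynth.

```python
def playback_ratio(note: int, root_key: int) -> float:
    """Speed-up factor when a sample recorded at root_key is played as note.

    Each semitone multiplies the frequency by 2**(1/12), so an octave
    (12 semitones) doubles the playback speed.
    """
    return 2.0 ** ((note - root_key) / 12.0)


def resampled_duration(sample_seconds: float, note: int, root_key: int) -> float:
    """Length of the sample after it has been sped up or slowed down."""
    return sample_seconds / playback_ratio(note, root_key)


# A 4-second sample recorded at C4 (MIDI note 60), played at other pitches.
for name, note in [("C3", 48), ("C4", 60), ("C5", 72), ("C9", 120)]:
    ratio = playback_ratio(note, 60)
    seconds = resampled_duration(4.0, note, 60)
    print(f"{name}: speed x{ratio:g}, lasts {seconds:g} s")
```

Five octaves up, the speed factor is 2^5 = 32, so a 4-second sample is over in an eighth of a second.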

This is why a "real" soundfont will use maybe five or six different samples of the instrument, spread out across the range. Each sample is used for a particular range of notes (for example, there might be a C4 piano sample which is used unchanged for the C4 note, and is sped up or slowed down for all other notes in the range C3 to C5). Once you go past C5, it switches to a different, naturally higher-pitched sample. You can often hear this change in samples if you are playing the notes in increasing order, especially with choir instruments (where the high-pitched samples sound nothing like the low-pitched ones).
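As a rough sketch of that idea in Python: each sample gets a root key and a key range, and the synth picks the sample whose range contains the note. The sample names, root keys and ranges below are invented for illustration (only the C3-to-C5 range comes from the example above); a real soundfont stores this in its instrument zones.

```python
from typing import NamedTuple


class Zone(NamedTuple):
    name: str       # hypothetical sample name
    root_key: int   # MIDI note at which the sample was recorded
    low_key: int    # lowest MIDI note this sample is used for
    high_key: int   # highest MIDI note this sample is used for


# Three made-up zones covering the full 0-127 range (C4 = MIDI note 60).
ZONES = [
    Zone("piano_C2", 36, 0, 47),
    Zone("piano_C4", 60, 48, 72),    # C3 to C5, as in the example above
    Zone("piano_C6", 84, 73, 127),
]


def pick_zone(note: int, zones=ZONES) -> Zone:
    """Return the zone whose key range contains the given MIDI note."""
    for zone in zones:
        if zone.low_key <= note <= zone.high_key:
            return zone
    raise ValueError(f"no zone covers MIDI note {note}")


def ratio_for(note: int, zones=ZONES):
    """Sample name and playback speed factor for a MIDI note."""
    zone = pick_zone(note, zones)
    return zone.name, 2.0 ** ((note - zone.root_key) / 12.0)


print(ratio_for(72))   # top of the C4 zone: the C4 sample, sped up
print(ratio_for(73))   # one semitone higher: switches to the C6 sample
```

With zones like these, no sample is ever shifted more than a few octaves from where it was recorded, which is the whole point of using several samples.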

I recommend you use Swami (or your soundfont editor) to explore how an existing soundfont is made -- especially a choir one. You should be able to get access to the individual samples; try allowing, say, a choir sample intended for C2 to be played in the C6 range, and hear how silly it sounds.

I know that in pitch shifting, tempo can be kept constant, which also sounds much closer to the original sound characteristics. Some libraries, like:

www.surina.net/soundtouch/

can do so and are part of apps like mhwaveedit, rezound, ardour, audacity... I have tried it on a whole song (non-realtime) in audacity and it sounds good.

Yeah, this technology is good for speeding up and slowing down a song without changing the pitch, or changing the pitch without affecting the speed. But I think it's a bit expensive to use on individual notes in MIDI synthesis. No MIDI engine that I know of does this, and it probably wouldn't sound right either.

As I was saying above, there is much more to it than just the duration being shortened. For one thing, on a piano, guitar or other string instrument, it is actually realistic for the duration to shorten when the pitch is raised. Just fixing the duration won't make a sample which has been shifted way out of its range sound more realistic. You really need to have a bunch of different samples. (High-quality soundfonts, I think, typically have a different sample for every single note, to avoid any distortion problems at all.)

So, I wonder if FluidSynth voice rendering can be changed to keep the sound length constant (same as the original sample)? Of course, this is just an experiment for the time being. Does anyone have time to look into it? Or perhaps give me some pointers as to where to look in the FluidSynth code so I can try it on my own?

I don't think there is any code to do this.

If you really can't get multiple samples, and you think the "shift-pitch-without-speed" effect will be helpful, why not use Audacity or some other tool to manually adjust the pitch of your sample to about 6 different pitches, and then use those as different samples in the soundfont?
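If it helps with planning, here is a rough Python sketch that splits the 0-127 range into six zones and works out, for each one, a root key and how many semitones to shift the original sample to reach it. It only does the bookkeeping; the pitch shifting itself would still be done in Audacity or a similar tool. The source root key of 54 is purely an assumption (you only said the sample is somewhere between C3 and C4), and the function name is made up.

```python
def plan_zones(source_root: int, count: int = 6, low: int = 0, high: int = 127):
    """Split [low, high] into `count` contiguous key ranges.

    For each range, the root key is placed near the middle, and `shift`
    is the number of semitones the original sample must be moved to
    land on that root key.
    """
    span = high - low + 1
    zones = []
    for i in range(count):
        zone_low = low + (i * span) // count
        zone_high = low + ((i + 1) * span) // count - 1
        root = (zone_low + zone_high) // 2
        zones.append({
            "low": zone_low,
            "high": zone_high,
            "root": root,
            "shift": root - source_root,
        })
    return zones


# Assumed source root key of 54 (somewhere between C3 = 48 and C4 = 60).
for zone in plan_zones(source_root=54):
    print(f"keys {zone['low']:3d}-{zone['high']:3d}: "
          f"root {zone['root']:3d}, shift {zone['shift']:+d} semitones")
```

Within each zone the synth then only has to speed the sample up or slow it down by less than an octave. The outermost zones still need very large shifts from the original, so I would expect those to sound artificial no matter which tool does the shifting.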

Matt

From:	Matt Giuca
Subject:	Re: [fluid-dev] Questions about soundfont, voice rendering, pitch shift...
Date:	Thu, 3 Feb 2011 11:32:08 +1100