fluid-dev

Re: [fluid-dev] Parallelize rendering using openMP


From: Ceresa Jean-Jacques
Subject: Re: [fluid-dev] Parallelize rendering using openMP
Date: Sat, 14 Apr 2018 23:59:55 +0200 (CEST)

Thanks for your answers.

 

>Apparently the soundfonts I used were not polyphonic enough

>Using FluidR3_GM.sf2 the cpu load looks better, but I'm yet quite far from the "perfect" scalability that your profiling interface gives you JJC.

 

Indeed, your machine is fast, and in this case playing a MIDI file to simulate a note (voice) generator isn't efficient. This is why the profiling interface has its own note generator (but it is still limited to 256 x 16 notes!).

The most important thing is that fluidsynth is able to play a constant number of voices during a measurement. That way, consecutive measurements give the same CPU load result, which makes any future performance measurement much more predictable.

Note: During my experiments, I initially noticed that the results of consecutive measurements were not constant. I quickly realized that a background process was running. Its job was to save energy, and it did this by stealing CPU cycles! Of course, no performance measurement is possible with this kind of job or service running silently behind the scenes.

 

>Additionally I want to revise the current implementation, like using a parallel logarithmic buffer reduction to mix audio between threads or rethinking data layout and memory accesses in general, hoping this makes it more efficient.

Interesting. Looking at the code (in the past), I noticed that a lot of things could perhaps be improved around the following subjects:

1) avoiding shared access to the "active list of voices" between the "primary task" and the pool of "extra tasks":

   - breaking the single list into a local list for each task.

   - load balancing (same number of voices in each local list).

2) optimizing the mixing of buffers between the "primary task" and the "extra tasks" (to avoid the synchronization overhead that may currently dominate).

3) optimizing fluid_cond_signal() and fluid_cond_wait() wherever the associated mutex is unnecessary.

Of course, all this is easier said than done :).

jjc

> Message of 14/04/18 17:58
> From: "Tom M." <address@hidden>
> To: address@hidden
> Cc:
> Subject: Re: [fluid-dev] Parallelize rendering using openMP
>
> Thanks for the feedback so far.
>
> > Please are you using a very fast machine ? did you ask to fluidsynth to play sufficient number of notes ?
>
> I'm on an Intel Core i5-3570K @ 3.40GHz. I tested several MIDI files that have instruments playing on all 16 channels. Apparently the soundfonts I used were not polyphonic enough, you're right. Using FluidR3_GM.sf2 the cpu load looks better, but I'm yet quite far from the "perfect" scalability that your profiling interface gives you JJC.
>
> > How did you come to the conclusion that the synchronization overhead dominates?
>
> Admittedly this might be a wrong/premature conclusion based on my observations + looking at the source code. I took a look at the callgraph generated with valgrind --tool=callgrind ./fluidsynth. Synchronization functions like g_mutex_lock() or g_cond_wait() are called quite often by fluid_mixer_thread_func(), although the callgraph also reports them to be not that expensive. Still I think it's worth evaluating what job openMP and other refactorings can do here. David Henningsson once told me that the parallel renderer was more like a (failed) experiment. So please see this current work as my little experiment.
>
> > I do wonder though why OpenMP can do a better job than the current code.
>
> openMP provides different scheduling strategies to process for loops. Also this restriction VOICES_PER_THREAD (==8) to avoid thread overhead seems quite magic to me (it probably worked well when David tested it, still, why is 8 the right number?). Overall I'm not sure whether openMP alone can do a better job. It definitely reduces complexity of the code. Additionally I want to revise the current implementation, like using a parallel logarithmic buffer reduction to mix audio between threads or rethinking data layout and memory accesses in general, hoping this makes it more efficient.
>
> Tom
>
>
