On 06/19/2011 03:31 PM, Paolo Bonzini wrote:
On Wed, Jun 15, 2011 at 21:19, Gwenael Casaccio <address@hidden> wrote:
293 987 770 (in _gst_interpret)
53% of the execution time is spent in __tls_get_addr.
It's time to optimize it a bit, no?
I was thinking of keeping some variables local to _gst_interpret, at
least _gst_mem.
You can put all variables in a single struct, and save the address
(&x) of that struct as a local variable in _gst_interpret. It's also
interesting to try -ftls-model=local-exec. Pick the fastest of the
two. :-)
Finally, the penalty for static linking ought to be very low. That's
the limit to which you should aim.
Paolo
Hehe this is what I've done :)
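
For readers of the archive, here is a minimal sketch of the struct
approach, not the actual GNU Smalltalk code. The struct, its fields and
both function names are hypothetical, and it assumes GCC's __thread
storage class. The point is only that the cached variant takes the
address of the thread-local struct once, so the loop body works through
a plain pointer instead of doing a TLS lookup per access.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-thread interpreter state; the real fields differ. */
struct interp_state {
    unsigned long bytecodes_executed;
    long stack[64];
};

/* One thread-local struct instead of many separate thread-local variables. */
static __thread struct interp_state tls_state;

/* Direct variant: each access names the thread-local object, which in
   position-independent code may cost a __tls_get_addr call per access. */
unsigned long run_direct(unsigned long n)
{
    tls_state.bytecodes_executed = 0;
    for (unsigned long i = 0; i < n; i++)
        tls_state.bytecodes_executed++;
    return tls_state.bytecodes_executed;
}

/* Cached variant: resolve the address once and keep it in a local. */
unsigned long run_cached(unsigned long n)
{
    struct interp_state *st = &tls_state;   /* single TLS lookup */
    assert(st != NULL);
    st->bytecodes_executed = 0;
    for (unsigned long i = 0; i < n; i++)
        st->bytecodes_executed++;
    return st->bytecodes_executed;
}
```

An optimizing compiler may hoist the lookup by itself in a loop this
simple; in a large interpreter function with many calls it usually
cannot, which is where the explicit local pointer helps.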
As for -ftls-model=local-exec, I tried it but got a link error (-fPIC
needed); I added -fPIC but it still failed.
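
A possible explanation, which is an assumption since the exact error is
not quoted here: the local-exec model is only valid for code linked
into the main executable, so it cannot be used in a shared library
whether or not -fPIC is given. GCC also lets the model be chosen per
variable with the tls_model attribute, and initial-exec is the fastest
model that is still allowed in a shared library. A minimal sketch, with
a hypothetical variable and function name:

```c
#include <assert.h>

/* Hypothetical counter; initial-exec avoids __tls_get_addr by reading
   the thread-pointer offset from the GOT. */
static __thread int interp_counter __attribute__((tls_model("initial-exec")));

/* Increment the per-thread counter and return its new value. */
int bump_counter(void)
{
    return ++interp_counter;
}
```

The trade-off is that a library using initial-exec may fail to load
through dlopen if the static TLS area is already full, so this is
something to measure rather than a guaranteed fix.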
I've made another pass over the implementation with helgrind (a
valgrind tool) and removed a **lot** of threading issues. I can now
load 20 images without any crashes :D
There is still an issue with the bootstrapping, but only when the
memory is released.