qemu-devel

Re: [Qemu-devel] tcg: reworking tb_invalidated_flag


From: Sergey Fedorov
Subject: Re: [Qemu-devel] tcg: reworking tb_invalidated_flag
Date: Thu, 31 Mar 2016 17:35:52 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 31/03/16 16:40, Paolo Bonzini wrote:
>
> On 31/03/2016 15:14, Sergey Fedorov wrote:
>> On 30/03/16 21:13, Paolo Bonzini wrote:
>>> On 30/03/2016 19:08, Sergey Fedorov wrote:
>>>> The second approach is to make 'tb_invalidated_flag' per-CPU. This
>>>> would be conceptually similar to what we have, but would give us thread
>>>> safety. With this approach, we need to be careful to correctly clear and
>>>> set the flag.
>>> You can just ensure that setting and clearing it is done under tb_lock.
>> So it could remain sitting in 'tcg_ctx.tb_ctx'. I'm just wondering what
>> could be real benefits for making it per-CPU then?
> All CPUs need to observe it in order to clear their own local next_tb
> variable.  It is not enough to do that once, so it has to be per-CPU.

So each vCPU thread gets its own flag, which it can then clear safely
after resetting its local next_tb. Got it, thanks.
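
Just to check my understanding, here is a rough standalone sketch of the
per-CPU flag with both setting and clearing done under tb_lock. It is not
QEMU code: SketchCPU, SKETCH_MAX_CPUS and the sketch_* functions are
hypothetical names of mine, and a plain pthread mutex stands in for tb_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 4   /* hypothetical fixed vCPU count */

/* Hypothetical stand-in for the per-vCPU state; not the real CPUState. */
typedef struct SketchCPU {
    bool tb_invalidated_flag;   /* per-CPU copy of the flag */
    uintptr_t next_tb;          /* this CPU's local chaining state */
} SketchCPU;

static SketchCPU cpus[SKETCH_MAX_CPUS];
static pthread_mutex_t tb_lock = PTHREAD_MUTEX_INITIALIZER;

/* Invalidation side: mark every CPU while holding tb_lock. */
static void sketch_tb_invalidate_all(void)
{
    pthread_mutex_lock(&tb_lock);
    for (size_t i = 0; i < SKETCH_MAX_CPUS; i++) {
        cpus[i].tb_invalidated_flag = true;
    }
    pthread_mutex_unlock(&tb_lock);
}

/*
 * vCPU side: observe and clear only this CPU's own flag, also under
 * tb_lock. Returns true if next_tb had to be dropped.
 */
static bool sketch_cpu_check_invalidated(SketchCPU *cpu)
{
    bool was_set;

    assert(cpu != NULL);
    pthread_mutex_lock(&tb_lock);
    was_set = cpu->tb_invalidated_flag;
    if (was_set) {
        cpu->tb_invalidated_flag = false;
        cpu->next_tb = 0;       /* do not chain to a stale TB */
    }
    pthread_mutex_unlock(&tb_lock);
    return was_set;
}
```

With a single global flag, the first CPU to clear it would hide the
invalidation from all the others; here each CPU consumes only its own copy.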

>
>>> Because TranslationBlocks live in tcg_ctx.tb_ctx.tbs you need
>>> special code to exit all CPUs at tb_flush time, otherwise you risk that
>>> a tb_alloc reuses a TranslationBlock while it is in use by a VCPU.
>> Looks like no matter which approach we use, it's ultimately necessary to
>> ensure all CPUs have exited from translated code before the translation
>> buffer may be safely flushed.
> My plan was to use some kind of double buffering, where only half of
> code_gen_buffer is in use.  At the end of tb_flush you call cpu_exit()
> on all CPUs, so that CPUs stop executing chained TBs from the old half
> before they can see one from the new half.
>
> If code_gen_buffer is static you have to preallocate two buffers (and
> two tbs arrays) and waste one of them; while it is theoretically
> possible to have CPUs still executing from the old half while you finish
> the new half, it can be more or less ignored.
>
> If it is dynamic, the previously used areas can be freed with call_rcu,
> and you can safely allocate a new code_gen_buffer and tbs array.
>
> I haven't thought much about it; it might require keeping a cache of the
> tbs array per CPU, and possibly changing the code under "if
> (tcg_ctx.tb_ctx.tb_invalidated_flag)" to simply exit cpu_exec.

Maybe save this idea for later? :) We'd better start with a simpler
approach and then move on and optimize. BTW, a few years ago I came
across an interesting paper on code cache eviction granularities [1].
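
For when we get back to it, here is how I read the static double-buffering
variant, as a rough single-threaded sketch. Everything in it is
hypothetical (SketchCodeCache, the sizes, the sketch_* functions); the
exit_request array merely stands in for calling cpu_exit() on each CPU, and
real code would of course do this under tb_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HALF_SIZE 4096   /* hypothetical size of each half */
#define SKETCH_MAX_CPUS  4      /* hypothetical fixed vCPU count */

/* Two preallocated halves; only the active one receives new code. */
typedef struct SketchCodeCache {
    uint8_t buf[2][SKETCH_HALF_SIZE];
    size_t used;                         /* bytes used in the active half */
    unsigned active;                     /* index of the active half: 0 or 1 */
    bool exit_request[SKETCH_MAX_CPUS];  /* stand-in for cpu_exit() */
} SketchCodeCache;

/*
 * Flush: switch to the other half and ask every CPU to leave translated
 * code. The old half is left untouched, so a CPU still running there is
 * only at risk if a second flush happens before it has exited.
 */
static void sketch_flush(SketchCodeCache *cc)
{
    cc->active ^= 1u;
    cc->used = 0;
    for (size_t i = 0; i < SKETCH_MAX_CPUS; i++) {
        cc->exit_request[i] = true;
    }
}

/* Bump-allocate from the active half, flushing first if it is full. */
static uint8_t *sketch_alloc(SketchCodeCache *cc, size_t size)
{
    uint8_t *p;

    assert(size <= SKETCH_HALF_SIZE);
    if (cc->used + size > SKETCH_HALF_SIZE) {
        sketch_flush(cc);
    }
    p = &cc->buf[cc->active][cc->used];
    cc->used += size;
    return p;
}
```

The cost is visible right in the struct: half of the buffer is always
idle, which is one more reason to start with the simpler approach.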

[1]
http://www.cs.virginia.edu/kim/courses/cs851/papers/hazelwood04mediumgrained.pdf

Kind regards,
Sergey


