qemu-devel

From: Richard Henderson
Subject: Re: [QUESTION] tcg: Is concurrent storing and code translation of the same code page considered as racing in MTTCG?
Date: Sun, 31 Jan 2021 13:01:29 -1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 1/31/21 1:38 AM, Liren Wei wrote:
> However, similar to the situation described in:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg02529.html
> 
> When we have 2 vCPUs with one of them writing to the code page while
> the other just translated some code within that same page, the following
> situation might happen:
> 
>    vCPU thread 1 - writing      vCPU thread 2 - translating
>    -----------------------      -----------------------
>    TLB check -> slow path
>      notdirty_write()
>        set dirty flag
>      write to RAM
>                                 tb_gen_code()
>                                   tb_page_add()
>                                     tlb_protect_code()
> 
>    TLB check -> fast path
>                                       set TLB_NOTDIRTY
>      write to RAM
>                                 executing unmodified code for this time
>                                 and maybe also for the next time, never
>                                 re-translate modified TBs.
> 
> 
> My question is:
>   Should the situation described above be considered a bug, or is it
>   intended behavior for QEMU (so that it is the programmer's fault
>   for not flushing the icache after modifying a shared code page)?

Yes, this is a bug, because we are trying to support e.g. x86 which does not
require an icache flush.

I think the page lock, the TLB_NOTDIRTY setting, and a possible sync on the
setting all need to happen before the bytes are read during translation.
Otherwise we don't catch the case above, nor do we catch:

        CPU1                    CPU2
        ------------------      --------------------------
        TLB check -> fast
                                tb_gen_code() -> all of it
          write to ram

Also because of x86 (and other architectures in which a single instruction can
span a page boundary), I think this lock+set+sync sequence needs to happen on
demand, in something called from the set of functions defined in
include/exec/translator.h.

That also means that any target/cpu/ which has not been converted to use that
interface remains broken, and should be converted or deprecated.
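
As a rough illustration of what "on demand" could look like, the sketch below
protects each page the first time a code byte is fetched from it, so an
instruction that straddles a page boundary ends up protecting both pages.
This is a toy model under stated assumptions: ToyMem, toy_protect_page(),
toy_translator_ldub() and toy_fetch_insn() are hypothetical stand-ins, not
the real translator.h interface, and the protect step is only a placeholder
for the lock+set+sync sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_BITS 12
#define TOY_PAGE_SIZE (1u << TOY_PAGE_BITS)
#define TOY_NPAGES    4

/* Toy guest memory; all names are hypothetical. */
typedef struct {
    uint8_t ram[TOY_NPAGES * TOY_PAGE_SIZE];
    bool page_protected[TOY_NPAGES]; /* page already handled */
    int protect_calls;               /* how often we had to protect */
} ToyMem;

/* Placeholder for: take the page lock, set TLB_NOTDIRTY, sync. */
static void toy_protect_page(ToyMem *m, uint32_t page)
{
    m->page_protected[page] = true;
    m->protect_calls++;
}

/* Code-byte load used by the translator: protect first, then read. */
static uint8_t toy_translator_ldub(ToyMem *m, uint32_t addr)
{
    uint32_t page = addr >> TOY_PAGE_BITS;

    assert(page < TOY_NPAGES);
    if (!m->page_protected[page]) {
        toy_protect_page(m, page);
    }
    return m->ram[addr];
}

/* Fetch a little-endian instruction of up to 4 bytes, byte by byte. */
static uint32_t toy_fetch_insn(ToyMem *m, uint32_t pc, unsigned len)
{
    uint32_t insn = 0;

    assert(len <= 4);
    for (unsigned i = 0; i < len; i++) {
        insn |= (uint32_t)toy_translator_ldub(m, pc + i) << (8 * i);
    }
    return insn;
}
```

Fetching a 4-byte instruction at 0xffe with 4 KiB pages touches pages 0 and 1,
so toy_protect_page() runs exactly twice and no other page is protected. Doing
the check inside the load helper means the decoder never has to know in
advance how long the instruction is.
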

Are you planning to work on this?


r~


