Re: [Qemu-devel] [RFC 0/5] Slow-path for atomic instruction translation


From: alvise rigo
Subject: Re: [Qemu-devel] [RFC 0/5] Slow-path for atomic instruction translation
Date: Wed, 6 May 2015 18:21:12 +0200

On Wed, May 6, 2015 at 6:00 PM, Mark Burton <address@hidden> wrote:
> By the way - would it help to send the atomic patch we did separately from 
> the whole MTTCG patch set?

I don't think you should spend time on this. As you said, it's short,
so I can do it myself when necessary.

Thank you,
alvise

> Or have you already taken a look at that - it’s pretty short.
>
> Cheers
>
> Mark.
>
>
>> On 6 May 2015, at 17:51, Paolo Bonzini <address@hidden> wrote:
>>
>> On 06/05/2015 17:38, Alvise Rigo wrote:
>>> This patch series provides an infrastructure for atomic
>>> instruction implementation in QEMU, paving the way for TCG multi-threading.
>>> The adopted design does not rely on host atomic
>>> instructions and is intended to propose a 'legacy' solution for
>>> translating guest atomic instructions.
>>>
>>> The underlying idea is to provide new TCG instructions that guarantee
>>> atomicity to some memory accesses or in general a way to define memory
>>> transactions. More specifically, a new pair of TCG instructions is
>>> implemented, qemu_ldlink_i32 and qemu_stcond_i32, that behave as
>>> LoadLink and StoreConditional primitives (only the 32-bit variant is
>>> implemented). To achieve this, a new bitmap is added to the
>>> ram_list structure (always unique) which flags all memory pages that
>>> cannot be accessed directly through the fast-path because of previous
>>> exclusive operations. This new bitmap is coupled with a new TLB flag
>>> which forces slow-path execution. Any store by another vCPU to the
>>> same memory page that takes place between a LoadLink and its
>>> StoreConditional will make that StoreConditional fail.
>>>
>>> In theory, the provided implementation of TCG LoadLink/StoreConditional
>>> can be used to properly handle atomic instructions on any architecture.
>>>
>>> The new slow-path is implemented such that:
>>> - the LoadLink behaves as a normal load slow-path, except for clearing
>>>  the dirty flag in the bitmap. The TLB entries created from now on will
>>>  force the slow-path. To ensure this, we flush the TLB cache of the
>>>  other vCPUs
>>> - the StoreConditional behaves as a normal store slow-path, except for
>>>  checking the state of the dirty bitmap and returning 0 or 1 depending
>>>  on whether the StoreConditional succeeded (0 when no vCPU has touched
>>>  the same memory in the meantime).
>>>
>>> All write accesses that are forced to follow the 'legacy'
>>> slow-path will mark the accessed memory page as dirty.
>>>
>>> In this series only the ARM ldrex/strex instructions are implemented.
>>> The code was tested with bare-metal test cases and with Linux, using
>>> upstream QEMU.
>>>
>>> This work has been sponsored by Huawei Technologies Dusseldorf GmbH.
>>>
>>> Alvise Rigo (5):
>>>  exec: Add new exclusive bitmap to ram_list
>>>  Add new TLB_EXCL flag
>>>  softmmu: Add helpers for a new slow-path
>>>  tcg-op: create new TCG qemu_ldlink and qemu_stcond instructions
>>>  target-arm: translate: implement qemu_ldlink and qemu_stcond ops
>>
>> That's pretty cool.
>>
>> Paolo
>
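
As a reading aid for the cover letter quoted above, here is a minimal,
single-threaded toy model of the LoadLink/StoreConditional semantics it
describes. It is not code from the patch series: all names (toy_ram,
toy_page_dirty, toy_ldlink, toy_store, toy_stcond, toy_atomic_inc), the
4 KiB page size and the tiny RAM are assumptions made for illustration.
It models only the per-page dirty flag, not the TLB_EXCL flag or the TLB
flush of the other vCPUs, and it assumes that a successful
StoreConditional, being a slow-path store, also marks the page dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define TOY_PAGE_BITS 12u
#define TOY_NUM_PAGES 4u
#define TOY_MEM_BYTES (TOY_NUM_PAGES << TOY_PAGE_BITS)

/* Guest RAM as 32-bit words, plus one "dirty" flag per page. */
uint32_t toy_ram[TOY_MEM_BYTES / 4];
bool toy_page_dirty[TOY_NUM_PAGES];

/* Map a guest address to its page index (must be aligned and in range). */
unsigned toy_page_of(uint32_t addr)
{
    assert(addr < TOY_MEM_BYTES && (addr & 3u) == 0);
    return addr >> TOY_PAGE_BITS;
}

/* LoadLink: a normal load that also clears the page's dirty flag. */
uint32_t toy_ldlink(uint32_t addr)
{
    toy_page_dirty[toy_page_of(addr)] = false;
    return toy_ram[addr / 4];
}

/* Ordinary store forced onto the slow path: marks the page dirty. */
void toy_store(uint32_t addr, uint32_t val)
{
    toy_page_dirty[toy_page_of(addr)] = true;
    toy_ram[addr / 4] = val;
}

/* StoreConditional: returns 0 on success, 1 on failure. */
int toy_stcond(uint32_t addr, uint32_t val)
{
    if (toy_page_dirty[toy_page_of(addr)]) {
        return 1;           /* page was written since the LoadLink */
    }
    toy_store(addr, val);   /* assumption: a successful SC dirties the page */
    return 0;
}

/* Guest-style atomic increment: retry LL/SC until the store succeeds. */
uint32_t toy_atomic_inc(uint32_t addr)
{
    uint32_t old;
    do {
        old = toy_ldlink(addr);
    } while (toy_stcond(addr, old + 1) != 0);
    return old;
}
```

In this model a store to any address in the same page between toy_ldlink
and toy_stcond makes the toy_stcond fail, which mirrors the page
granularity mentioned in the cover letter; a store to a different page
does not interfere.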


