Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues


From: Paolo Bonzini
Subject: Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
Date: Mon, 7 Oct 2019 16:58:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 07/10/19 16:44, dann frazier wrote:
> On Mon, Oct 07, 2019 at 01:06:20PM +0200, Paolo Bonzini wrote:
>> On 02/10/19 11:23, Jan Glauber wrote:
>>> I've looked into this on ThunderX2. The arm64 code generated for the
>>> atomic_[add|sub] accesses of ctx->notify_me doesn't contain any
>>> memory barriers. It is just plain ldaxr/stlxr.
>>>
>>> From my understanding this is not sufficient for SMP sync.
>>>
>>> If I read this comment correctly:
>>>
>>>     void aio_notify(AioContext *ctx)
>>>     {
>>>         /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
>>>          * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
>>>          */
>>>         smp_mb();
>>>         if (ctx->notify_me) {
>>>
>>> it points out that the smp_mb() should be paired. But as
>>> I said, the atomics used don't generate any barriers at all.
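
For anyone following the archive: the pairing Jan describes is the
classic "write A, full barrier, read B" idiom, used here in both
directions.  A rough sketch of the two sides, simplified from the
actual aio code rather than quoted from it:

    /* Waiter side (aio_poll / aio_ctx_prepare): publish notify_me,
     * then check for work.  The increment must imply a full barrier;
     * a plain ldaxr/stlxr loop is only acquire/release, so the later
     * load of bh->scheduled can be reordered before the store to
     * ctx->notify_me. */
    atomic_add(&ctx->notify_me, 2);
    if (bh->scheduled) {
        /* there is work to do, don't block */
    }

    /* Notifier side (aio_notify): publish bh->scheduled, then check
     * whether anyone needs waking. */
    bh->scheduled = 1;
    smp_mb();
    if (ctx->notify_me) {
        event_notifier_set(&ctx->notifier);
    }

If either side's barrier is missing, both can read the other's flag
as stale: the waiter blocks even though bh->scheduled is set, and the
notifier skips the wakeup.
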
>>
>> Based on the rest of the thread, this patch should also fix the bug:
>>
>> diff --git a/util/async.c b/util/async.c
>> index 47dcbfa..721ea53 100644
>> --- a/util/async.c
>> +++ b/util/async.c
>> @@ -249,7 +249,7 @@ aio_ctx_check(GSource *source)
>>      aio_notify_accept(ctx);
>>  
>>      for (bh = ctx->first_bh; bh; bh = bh->next) {
>> -        if (bh->scheduled) {
>> +        if (atomic_mb_read(&bh->scheduled)) {
>>              return true;
>>          }
>>      }
>>
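
For context: atomic_mb_read is QEMU's sequentially consistent read,
so in GCC-builtin terms the patched check is roughly the following
(a sketch, not the literal macro expansion):

    /* read bh->scheduled with sequentially consistent ordering
     * instead of a plain load the CPU may hoist or satisfy early */
    if (__atomic_load_n(&bh->scheduled, __ATOMIC_SEQ_CST)) {
        return true;
    }
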
>>
>> And also the memory barrier in aio_notify can actually be replaced
>> with a SEQ_CST load:
>>
>> diff --git a/util/async.c b/util/async.c
>> index 47dcbfa..721ea53 100644
>> --- a/util/async.c
>> +++ b/util/async.c
>> @@ -349,11 +349,11 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx)
>>  
>>  void aio_notify(AioContext *ctx)
>>  {
>> -    /* Write e.g. bh->scheduled before reading ctx->notify_me.  Pairs
>> -     * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll.
>> +    /* Using atomic_mb_read ensures that e.g. bh->scheduled is written before
>> +     * ctx->notify_me is read.  Pairs with atomic_or in aio_ctx_prepare or
>> +     * atomic_add in aio_poll.
>>       */
>> -    smp_mb();
>> -    if (ctx->notify_me) {
>> +    if (atomic_mb_read(&ctx->notify_me)) {
>>          event_notifier_set(&ctx->notifier);
>>          atomic_mb_set(&ctx->notified, true);
>>      }
>>
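
For the archive: this second patch folds the explicit smp_mb() into a
single sequentially consistent load, which is intended to carry the
same ordering.  Again in GCC-builtin terms, a sketch only:

    /* before: smp_mb(); if (ctx->notify_me) { ... }
     * after:  one SEQ_CST load of the flag */
    if (__atomic_load_n(&ctx->notify_me, __ATOMIC_SEQ_CST)) {
        event_notifier_set(&ctx->notifier);
        atomic_mb_set(&ctx->notified, true);
    }
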
>>
>> Would you be able to test these (one by one possibly)?
> 
> Paolo,
>   I tried them both separately and together on a Hi1620 system, each
> time it hung in the first iteration. Here's a backtrace of a run with
> both patches applied:

Ok, not a great start...  I'll find myself an aarch64 machine and look
at it more closely.  I'd like the patch to be something we can
understand and document, since this is probably the second most-used
memory barrier idiom in QEMU.

Paolo
