Re: aio_wait_bh_oneshot() thread-safety question
From: Kevin Wolf
Subject: Re: aio_wait_bh_oneshot() thread-safety question
Date: Tue, 24 May 2022 14:40:59 +0200

On 24.05.2022 at 09:08, Paolo Bonzini wrote:
> On 5/23/22 18:04, Vladimir Sementsov-Ogievskiy wrote:
> >
> > I have a doubt about how aio_wait_bh_oneshot() works. Specifically, I see
> > that data->done is not accessed atomically and has no barrier
> > protecting it.
> >
> > Is the following possible:
> >
> > main-loop                              iothread
> >     |
> > aio_wait_bh_oneshot()                     |
> >   aio_bh_schedule_oneshot()               |
> >                                           | handle bh:
> >                                           | 1. set data->done = true
> >                                           | 2. call aio_wait_kick(), inserting the
> >                                           |    dummy bh into main context
> >                                           |
> > ... in AIO_WAIT_WHILE():
> >   handle dummy bh, go to next
> >   iteration, but still read
> >   data->done=false due to some
> >   processor data reordering,
> >   go to next iteration of polling
> >   and hang
>
> Yes, barriers are missing:
>
> https://lore.kernel.org/qemu-devel/You6FburTi7gVyxy@stefanha-x1.localdomain/T/#md97146c6eae1fce2ddd687fdc3f2215eee03f6f4
>
> It seems like the issue was never observed, at least on x86.

Why is the barrier in aio_bh_enqueue() not enough? Is the comment there
wrong?

aio_notify() has another barrier. It comes a little too late, but if
I misunderstood the aio_bh_enqueue() one, it could explain why the issue
was never observed.

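
To make the ordering question concrete, here is a minimal standalone sketch of
the pattern under discussion. It is not QEMU code: it uses plain C11 atomics
and pthreads, and the names KickDemo, bh_thread() and run_kick_demo() are
hypothetical. The "kicked" flag stands in for the dummy bh scheduled by
aio_wait_kick(), and "done" stands in for data->done. The assumption is that
publishing the kick with release semantics and consuming it with acquire
semantics is what guarantees the waiter sees done == true afterwards.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state shared by the two threads. */
typedef struct {
    atomic_bool done;    /* plays the role of data->done */
    atomic_bool kicked;  /* plays the role of the dummy bh (the kick) */
} KickDemo;

/* Plays the role of the bh running in the iothread. */
static void *bh_thread(void *opaque)
{
    KickDemo *d = opaque;

    /* 1. Set the completion flag. */
    atomic_store_explicit(&d->done, true, memory_order_relaxed);
    /* 2. Kick the waiter. Release orders the store to done before it. */
    atomic_store_explicit(&d->kicked, true, memory_order_release);
    return NULL;
}

/* Plays the role of the main loop; returns the value of done it observed. */
static bool run_kick_demo(void)
{
    KickDemo d;
    pthread_t tid;
    int ret;
    bool seen;

    atomic_init(&d.done, false);
    atomic_init(&d.kicked, false);

    ret = pthread_create(&tid, NULL, bh_thread, &d);
    assert(ret == 0);

    /* Wait for the kick. Acquire pairs with the release store above. */
    while (!atomic_load_explicit(&d.kicked, memory_order_acquire)) {
        /* busy-wait; a real event loop would block in poll instead */
    }

    /* Having observed the kick, this load must observe done == true. */
    seen = atomic_load_explicit(&d.done, memory_order_relaxed);

    pthread_join(tid, NULL);
    return seen;
}
```

Calling run_kick_demo() from a main() built with -pthread should always return
true. If both the store and the load of "kicked" were relaxed, the C11 memory
model would no longer forbid the waiter from seeing the kick and still reading
done == false, which is the hang described in the quoted scenario.
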
Kevin