|
From: | Paolo Bonzini |
Subject: | Re: [PATCH 4/6] coroutine-lock: reimplement CoRwlock to fix downgrade bug |
Date: | Wed, 24 Mar 2021 17:40:23 +0100 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.0 |
On 24/03/21 17:15, Stefan Hajnoczi wrote:
On Wed, Mar 17, 2021 at 07:00:11PM +0100, Paolo Bonzini wrote:+static void qemu_co_rwlock_maybe_wake_one(CoRwlock *lock) +{ + CoRwTicket *tkt = QSIMPLEQ_FIRST(&lock->tickets); + Coroutine *co = NULL; + + /* + * Setting lock->owners here prevents rdlock and wrlock from + * sneaking in between unlock and wake. + */ + + if (tkt) { + if (tkt->read) { + if (lock->owners >= 0) { + lock->owners++; + co = tkt->co; + } + } else { + if (lock->owners == 0) { + lock->owners = -1; + co = tkt->co; + } + } + } + + if (co) { + QSIMPLEQ_REMOVE_HEAD(&lock->tickets, next); + qemu_co_mutex_unlock(&lock->mutex); + aio_co_wake(co);I find it hard to reason about QSIMPLEQ_EMPTY(&lock->tickets) callers that execute before co is entered. They see an empty queue even though a coroutine is about to run. Updating owners above ensures that the code correctly tracks the state of the rwlock, but I'm not 100% confident about this aspect of the code.
Good point. The invariant when lock->mutex is released should be clarified; a better way to phrase the comment above "if (tkt)" is:
The invariant when lock->mutex is released is that every ticket is tracked in either lock->owners or lock->tickets. By updating lock->owners here, rdlock/wrlock/upgrade will block even if they execute between qemu_co_mutex_unlock and aio_co_wake.
- self->locks_held--; + lock->owners--; + QSIMPLEQ_INSERT_TAIL(&lock->tickets, &my_ticket, next); + qemu_co_rwlock_maybe_wake_one(lock); + qemu_coroutine_yield(); + assert(lock->owners == -1);lock->owners is read outside lock->mutex here. Not sure if this can cause problems.
True. It is okay though because lock->owners cannot change until this coroutine unlocks. A worse occurrence of the issue is in rdlock:
assert(lock->owners >= 1);/* Possibly wake another reader, which will wake the next in line. */
qemu_co_mutex_lock(&lock->mutex);where the assert should be moved after taking the lock, or possibly changed to use qatomic_read. (I prefer the former).
locks_held is kept unchanged across qemu_coroutine_yield() even though the read lock has been released. rdlock() and wrlock() only increment locks_held after acquiring the rwlock. In practice I don't think it matters, but it seems inconsistent. If locks_held is supposed to track tickets (not just coroutines currently holding a lock), then rdlock() and wrlock() should increment before yielding.
locks_held (unlike owners) is not part of the lock, it's part of the Coroutine and only used for debugging (asserting that terminating coroutines are not holding any lock).
Paolo
[Prev in Thread] | Current Thread | [Next in Thread] |