From: Kevin Wolf
Subject: Re: [RFC PATCH 0/5] Removal of AioContext lock, bs->parents and ->children: proof of concept
Date: Wed, 13 Apr 2022 16:51:35 +0200

Am 13.04.2022 um 15:43 hat Emanuele Giuseppe Esposito geschrieben:
> So this is a more concrete and up-to-date header.
> 
> Few things to notice:
> - we have a list of AioContext. An aiocontext is registered in the list
> when it is created, and removed when it is destroyed.
> This list is helpful because each aiocontext only ever modifies its own
> reader counter, avoiding unnecessary cacheline bouncing
> 
> - if a coroutine changes aiocontext, that is fine with regard to the
> per-aiocontext reader counter. As long as the sum is correct, there's no
> issue. The problem comes only once the original aiocontext is deleted:
> at that point we need to move the count it held to a shared global
> variable, otherwise we risk losing track of readers.
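
To make sure I'm reading this right, here is how I picture the
bookkeeping, as a rough sketch (all names are mine; today's AioContext
has no such fields, so imagine a small wrapper hung off each context,
and graph_readers_of() is a hypothetical lookup):

typedef struct GraphReaders {
    AioContext *ctx;
    int reader_count;                   /* readers running in this context */
    QTAILQ_ENTRY(GraphReaders) next;
} GraphReaders;

static QTAILQ_HEAD(, GraphReaders) reader_list =
    QTAILQ_HEAD_INITIALIZER(reader_list);
static QemuMutex reader_list_lock;      /* protects reader_list */
static int orphaned_readers;            /* counts moved from deleted contexts */

void unregister_aiocontext(AioContext *ctx)
{
    GraphReaders *r = graph_readers_of(ctx);    /* hypothetical lookup */

    QEMU_LOCK_GUARD(&reader_list_lock);
    /* Keep the global sum correct: fold this context's count into the
     * shared counter before the counter itself goes away. */
    orphaned_readers += qatomic_read(&r->reader_count);
    QTAILQ_REMOVE(&reader_list, r, next);
    g_free(r);
}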

So the idea is that we can do bdrv_graph_co_rdlock() in one thread and
the corresponding bdrv_graph_co_rdunlock() in a different thread?

Would the unlock somehow remember the original thread, or do you use the
"sum is correct" argument and allow negative counter values, so you can
end up having count +1 in A and -1 in B to represent "no active
readers"? If this happens, it's likely to happen many times, so do we
have to take integer overflows into account then?
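
Spelled out with the invented names from my sketch above, I suppose the
writer would compute something like the following sum, where the
individual per-context values can legitimately be negative:

static int total_readers(void)
{
    GraphReaders *r;
    int total = orphaned_readers;

    QTAILQ_FOREACH(r, &reader_list, next) {
        /* rdlock in context A, rdunlock in context B leaves
         * A.reader_count == +1 and B.reader_count == -1; only the
         * sum (here 0, i.e. no active readers) is meaningful. */
        total += qatomic_read(&r->reader_count);
    }
    return total;
}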

> - All synchronization of the flags explained in this header is of
> course handled in the implementation. But for now it would be nice to
> have feedback on the idea/API.
> 
> So in short we need:
> - per-aiocontext counter
> - global list of aiocontext
> - global additional reader counter (in case an aiocontext is deleted)
> - global CoQueue
> - global has_writer flag
> - global QemuMutex to protect the list access
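
In addition to the state I sketched above, that would add roughly
(again, my names, not yours):

static CoQueue reader_queue;    /* readers waiting for the writer */
static bool has_writer;         /* set while a writer holds or waits
                                 * for the lock */
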
> 
> Emanuele
> 
> #ifndef BLOCK_LOCK_H
> #define BLOCK_LOCK_H
> 
> #include "qemu/osdep.h"
> 
> /*
>  * register_aiocontext:
>  * Add AioContext @ctx to the list of AioContext.
>  * This list is used to obtain the total number of readers
>  * currently reading the graph.
>  */
> void register_aiocontext(AioContext *ctx);
> 
> /*
>  * unregister_aiocontext:
>  * Removes AioContext @ctx from the list of AioContext.
>  */
> void unregister_aiocontext(AioContext *ctx);
> 
> /*
>  * bdrv_graph_wrlock:
>  * Modify the graph. Nobody else is allowed to access the graph.
>  * Set the global has_writer to 1, so that subsequent readers will
>  * wait in a coroutine queue until the writer is done.
>  * Then count the total number of running readers (the sum of all
>  * aiocontext counters), and wait until
>  * they all finish with AIO_WAIT_WHILE.
>  */
> void bdrv_graph_wrlock(void);

Do we need a coroutine version that yields instead of using
AIO_WAIT_WHILE() or are we sure this will only ever be called from
non-coroutine contexts?
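
To spell out what the doc comment describes, I would expect roughly the
following for the non-coroutine version (a sketch building on the
total_readers() helper from above):

void bdrv_graph_wrlock(void)
{
    /* Assumes the main loop thread, outside of coroutine context */
    assert(!qatomic_read(&has_writer));
    qatomic_set(&has_writer, true);
    /* Make the flag visible before we start summing the counters */
    smp_mb();

    /* Wait until all in-flight readers have drained */
    AIO_WAIT_WHILE(qemu_get_aio_context(), total_readers() > 0);
}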

> /*
>  * bdrv_graph_wrunlock:
>  * Write finished, reset global has_writer to 0 and restart
>  * all readers that are waiting.
>  */
> void bdrv_graph_wrunlock(void);
> 
> /*
>  * bdrv_graph_co_rdlock:
>  * Read the bs graph. Increases the reader counter of the current
>  * aiocontext. If has_writer is set, the writer is modifying the
>  * graph, so wait in a coroutine queue.
>  * The writer will then wake this coroutine once it is done.
>  *
>  * This lock cannot be taken recursively.
>  */
> void coroutine_fn bdrv_graph_co_rdlock(void);

What prevents it from being taken recursively when it's just a counter?
(I do see, however, that you can't take a reader lock while you have the
writer lock or vice versa because it would deadlock.)

Does this being a coroutine_fn mean that we would have to convert QMP
command handlers to coroutines so that they can take the rdlock while
they don't expect the graph to change? Or should we have a non-coroutine
version, too, that works with AIO_WAIT_WHILE()?

Or should this only be taken for very small pieces of code directly
accessing the BdrvChild objects, and high-level users like QMP commands
shouldn't even consider themselves readers?
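
Either way, the reader's fast path I have in mind looks something like
this (again with my invented names from above):

void coroutine_fn bdrv_graph_co_rdlock(void)
{
    GraphReaders *r = graph_readers_of(qemu_get_current_aio_context());

    while (true) {
        qatomic_inc(&r->reader_count);
        /* Pairs with the writer's smp_mb() before it sums the counters */
        smp_mb();
        if (!qatomic_read(&has_writer)) {
            return;     /* fast path: no writer around */
        }

        /* A writer is in flight: back out and sleep until it is done */
        qatomic_dec(&r->reader_count);
        aio_wait_kick();    /* let a waiting writer re-check the sum */
        qemu_mutex_lock(&reader_list_lock);
        if (qatomic_read(&has_writer)) {
            qemu_co_queue_wait(&reader_queue, &reader_list_lock);
        }
        qemu_mutex_unlock(&reader_list_lock);
    }
}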

> /*
>  * bdrv_graph_co_rdunlock:
>  * Read finished, decrease the count of readers in the current aiocontext.
>  * If the writer is waiting for reads to finish (has_writer == 1), signal
>  * the writer that we are done via aio_wait_kick() to let it continue.
>  */
> void coroutine_fn bdrv_graph_co_rdunlock(void);
> 
> #endif /* BLOCK_LOCK_H */
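
The unlock side then seems straightforward (same invented names):

void coroutine_fn bdrv_graph_co_rdunlock(void)
{
    GraphReaders *r = graph_readers_of(qemu_get_current_aio_context());

    qatomic_dec(&r->reader_count);
    smp_mb();
    if (qatomic_read(&has_writer)) {
        aio_wait_kick();    /* wake up the writer's AIO_WAIT_WHILE() */
    }
}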

I expect that in the final version, we might want to have some sugar
like a WITH_BDRV_GRAPH_RDLOCK_GUARD() macro, but obviously that doesn't
affect the fundamental design.
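
Something modelled on WITH_QEMU_LOCK_GUARD(), I'd imagine. A sketch,
only valid in coroutine context; the g_autoptr() trick makes the unlock
run even if the block is left with break or goto:

typedef struct BdrvGraphRdLockGuard BdrvGraphRdLockGuard;

static inline BdrvGraphRdLockGuard *bdrv_graph_rdlock_guard_lock(void)
{
    bdrv_graph_co_rdlock();
    return (BdrvGraphRdLockGuard *)(uintptr_t)1;    /* non-NULL token */
}

static inline void bdrv_graph_rdlock_guard_unlock(BdrvGraphRdLockGuard *g)
{
    bdrv_graph_co_rdunlock();
}

G_DEFINE_AUTOPTR_CLEANUP_FUNC(BdrvGraphRdLockGuard,
                              bdrv_graph_rdlock_guard_unlock)

#define WITH_BDRV_GRAPH_RDLOCK_GUARD() \
    for (g_autoptr(BdrvGraphRdLockGuard) guard_ = \
             bdrv_graph_rdlock_guard_lock(); \
         guard_; \
         bdrv_graph_rdlock_guard_unlock(guard_), guard_ = NULL)

so that callers could write:

    WITH_BDRV_GRAPH_RDLOCK_GUARD() {
        /* walk bs->children / bs->parents here */
    }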

Kevin



