[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 00/25] block layer: split block APIs in global state and I

From: Hanna Reitz
Subject: Re: [PATCH v4 00/25] block layer: split block APIs in global state and I/O
Date: Mon, 15 Nov 2021 17:03:28 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:
Currently, block layer APIs like block-backend.h contain a mix of
functions that are either running in the main loop and under the
BQL, or are thread-safe functions and run in iothreads performing I/O.
The functions running under BQL also take care of modifying the
block graph, by using drain and/or aio_context_acquire/release.
This makes it very confusing to understand where each function
runs, and what assumptions it provided with regards to thread

We call the functions running under BQL "global state (GS) API", and
distinguish them from the thread-safe "I/O API".

The aim of this series is to split the relevant block headers in
global state and I/O sub-headers.

Despite leaving quite some comments, the series and the split seem reasonable to me overall.  (This is a pretty big series, after all, so those “some comments” stack up against a majority of changes that seem OK to me. :))

One thing I noticed while reviewing is that it’s really hard to verify that no I/O function calls a GS function.  What would be wonderful is some function marker like coroutine_fn that marks GS functions (or I/O functions) and that we could then verify the call paths.  But AFAIU we’ve always wanted precisely that for coroutine_fn and still don’t have it, so this seems like extremely wishful thinking... :(

I think most of the issues I found can be fixed (or are even irrelevant), the only thing that really worries me are the two places that are clearly I/O paths that call permission functions: Namely first block_crypto_amend_options_generic_luks() (part of the luks block driver’s .bdrv_co_amend implementation), which calls bdrv_child_refresh_perms(); and second fuse_do_truncate(), which calls blk_set_perm().

In the first case, we need this call so that we don’t permanently hog the WRITE permission for the luks file, which used to be a problem, I believe.  We want to unshare the WRITE permission (and apparently also CONSISTENT_READ) during the key update, so we need some way to temporarily update the permissions.

I only really see four solutions for this:
(1) We somehow make the amend job run in the main context under the BQL and have it prevent all concurrent I/O access (seems bad) (2) We can make the permission functions part of the I/O path (seems wrong and probably impossible?) (3) We can drop the permissions update and permanently require the permissions that we need when updating keys (I think this might break existing use cases) (4) We can acquire the BQL around the permission update call and perhaps that works?

I don’t know how (4) would work but it’s basically the only reasonable solution I can come up with.  Would this be a way to call a BQL function from an I/O function?

As for the second case, the same applies as above, with the differences that we have no jobs, so this code must always run in the block device’s AioContext (I think), which rules out (1); but (3) would become easier (i.e. require the RESIZE permission all the time), although that too might have an impact on existing users (don’t think so, though).  In any case, if we could do (4), that would solve the problem here, too.

And finally, another notable thing I noticed is that the way how create-related functions are handled is inconsistent.  I believe they should all be GS functions; qmp_blockdev_create() seems to agree with me on this, but we currently seem to have some bugs there.  It’s possible to invoke blockdev-create on a block device that’s in an I/O thread, and then qemu crashes.  Oops.  (The comment in qmp_blockdev_create() says that the block drivers’ implementations should prevent this, but apparently they don’t...?) In any case, that’s a pre-existing bug, of course, that doesn’t concern this series (other than that it suggests that “create” functions should be classified as GS).


reply via email to

[Prev in Thread] Current Thread [Next in Thread]