[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 19/25] block_int-common.h: split function pointers in Bloc
From: |
Hanna Reitz |
Subject: |
Re: [PATCH v4 19/25] block_int-common.h: split function pointers in BlockDriver |
Date: |
Mon, 15 Nov 2021 13:00:07 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0 |
On 25.10.21 12:17, Emanuele Giuseppe Esposito wrote:
Similar to the header split, also the function pointers in BlockDriver
can be split in I/O and global state.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
---
include/block/block_int-common.h | 458 ++++++++++++++++---------------
1 file changed, 237 insertions(+), 221 deletions(-)
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 79a3d801d2..9857e775fe 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -96,6 +96,7 @@ typedef struct BdrvTrackedRequest {
struct BlockDriver {
+ /* Fields initialized in struct definition and never changed. */
I find this a bit difficult to understand. Perhaps we could be more
verbose to make it easier to read? Like
“These fields are initialized when this object is created, and are never
changed afterwards”
And I’d also add an empty line below to make visually clear that this
does not describe the single subsequent field (format_name) but the
whole chunk of fields.
I also wonder if we could substantiate the claim “never changed
afterwards” by making these fields consts. Then again, I suppose
technically none of the fields in BlockDriver is supposed to be mutable
(except for .list), neither these fields nor the function pointers...
Yeah, maybe scratch that idea.
const char *format_name;
int instance_size;
[...]
+ int coroutine_fn (*bdrv_save_vmstate)(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ int64_t pos);
+ int coroutine_fn (*bdrv_load_vmstate)(BlockDriverState *bs,
+ QEMUIOVector *qiov,
+ int64_t pos);
In patch 18, you classified bdrv_co_readv_vmstate() and
bdrv_co_writev_vmstate() as I/O functions. They call these methods,
though, so something doesn’t seem quite right.
Now I also remember you classified the global bdrv_save_vmstate() and
bdrv_load_vmstate() functions as GS. I don’t mind either way; AFAIU
they are I/O functions that are generally called under the BQL. But the
classification should be consistent, of course.
+
+ int (*bdrv_change_backing_file)(BlockDriverState *bs,
+ const char *backing_file, const char *backing_fmt);
+
+ void (*bdrv_eject)(BlockDriverState *bs, bool eject_flag);
+ void (*bdrv_lock_medium)(BlockDriverState *bs, bool locked);
Above bdrv_is_inserted(), there’s a comment (“removable device
specific”) that I think should be duplicated here.
+
+ /* TODO Better pass a option string/QDict/QemuOpts to add any rule? */
+ int (*bdrv_debug_breakpoint)(BlockDriverState *bs, const char *event,
+ const char *tag);
+ int (*bdrv_debug_remove_breakpoint)(BlockDriverState *bs,
+ const char *tag);
+ int (*bdrv_debug_resume)(BlockDriverState *bs, const char *tag);
+ bool (*bdrv_debug_is_suspended)(BlockDriverState *bs, const char *tag);
+ void (*bdrv_refresh_limits)(BlockDriverState *bs, Error **errp);
Originally, there was an empty line before bdrv_refresh_limits(), and
I’d keep it.
[...]
+ /**
+ * Try to get @bs's logical and physical block size.
+ * On success, store them in @bsz and return zero.
+ * On failure, return negative errno.
+ */
+ int (*bdrv_probe_blocksizes)(BlockDriverState *bs, BlockSizes *bsz);
Originally, there was a comment above this function (“I/O API, even
though if it's a filter jumps on parent”) that you added in patch 6 and
that I didn’t understand. Given this, I’m not unhappy to see it go
again, but now I wonder about its purpose even more...
+ /**
+ * Try to get @bs's geometry (cyls, heads, sectors)
+ * On success, store them in @geo and return 0.
+ * On failure return -errno.
+ * Only drivers that want to override guest geometry implement this
+ * callback; see hd_geometry_guess().
+ */
+ int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
(Here, too)
[...]
+ /**
+ * Register/unregister a buffer for I/O. For example, when the driver is
+ * interested to know the memory areas that will later be used in iovs, so
+ * that it can do IOMMU mapping with VFIO etc., in order to get better
+ * performance. In the case of VFIO drivers, this callback is used to do
+ * DMA mapping for hot buffers.
+ */
+ void (*bdrv_register_buf)(BlockDriverState *bs, void *host, size_t size);
+ void (*bdrv_unregister_buf)(BlockDriverState *bs, void *host);
+ QLIST_ENTRY(BlockDriver) list;
This field is interesting because it’s the only mutable field in the
whole struct. I think it should be separated from the function pointers
above and given a comment like “This field is modified only under the
BQL, and is part of the global state”.
+
+ /*
+ * I/O API functions. These functions are thread-safe.
+ *
+ * See include/block/block-io.h for more information about
+ * the I/O API.
+ */
+
+ int coroutine_fn (*bdrv_co_create)(BlockdevCreateOptions *opts,
+ Error **errp);
+ int coroutine_fn (*bdrv_co_create_opts)(BlockDriver *drv,
+ const char *filename,
+ QemuOpts *opts,
+ Error **errp);
Now this is really interesting. Technically I suppose these should work
in any thread, but trying to do so results in:
$ touch /tmp/iothread-create-test.qcow2
$ ./qemu-system-x86_64 -object iothread,id=iothr0 -qmp stdio <<EOF
{"execute": "qmp_capabilities"}
{"execute":"blockdev-add","arguments":{"node-name":"proto","driver":"file","filename":"/tmp/iothread-create-test.qcow2"}}
{"execute":"x-blockdev-set-iothread","arguments":{"node-name":"proto","iothread":"iothr0"}}
{"execute":"blockdev-create","arguments":{"job-id":"create","options":{"driver":"qcow2","file":"proto","size":0}}}
EOF
{"QMP": {"version": {"qemu": {"micro": 90, "minor": 1, "major": 6},
"package": "v6.2.0-rc0-40-gd02d5fe5fb-dirty"}, "capabilities": ["oob"]}}
{"return": {}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": 1636973542, "microseconds": 338117}, "event":
"JOB_STATUS_CHANGE", "data": {"status": "created", "id": "create"}}
{"timestamp": {"seconds": 1636973542, "microseconds": 338197}, "event":
"JOB_STATUS_CHANGE", "data": {"status": "running", "id": "create"}}
{"return": {}}
qemu: qemu_mutex_unlock_impl: Operation not permitted
[1] 86154 IOT instruction (core dumped) ./qemu-system-x86_64 -object
iothread,id=iothr0 -qmp stdio <<<''
So something’s fishy and perhaps we should investigate this... I mean,
I can’t really imagine a case where someone would need to run a
blockdev-create job in an I/O thread, but right now the interface allows
for it.
And then bdrv_create() is classified as global state, and also
bdrv_co_create_opts_simple(), which is supposed to be a drop-in function
for this .bdrv_co_create_opts function. So that can’t work.
Also, I believe there might have been some functions you classified as
GS that are called from .create* implementations. I accepted that,
given the abort I sketched above. However, if we classify image
creation as I/O, then those would need to be re-evaluated. For example,
qcow2_co_create_opts() calls bdrv_create_file(), which is a GS function.
Some of this issues could be addressed by making .bdrv_co_create_opts a
GS function and .bdrv_co_create an I/O function. I believe that would
be the ideal split, even though as shown above .bdrv_co_create doesn’t
work in an I/O thread, and then you have the issue of probably all
format drivers’ .bdrv_co_create implementations calling
bdrv_open_blockdev_ref(), which is a GS function.
(VMDK even calls blk_new_open(), blk_new_with_bs(), and blk_unref(),
none of which can ever be I/O functions, I think.)
I believe in practice the best is to for now classify all create-related
functions as GS functions. This is supported by the fact that
qmp_blockdev_create() specifically creates the create job in the main
context (with a TODO comment) and requires block drivers to error out
when they encounter a node in a different AioContext.
+ int coroutine_fn (*bdrv_co_amend)(BlockDriverState *bs,
+ BlockdevAmendOptions *opts,
+ bool force,
+ Error **errp);
+
/* aio */
BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs,
int64_t offset, int64_t bytes, QEMUIOVector *qiov,
[...]
@@ -443,47 +632,20 @@ struct BlockDriver {
Right above here (i.e. line 631), there’s an attribute
`has_variable_length`. I believe that should go up to the immutable
fields section.
int64_t (*bdrv_get_allocated_file_size)(BlockDriverState *bs);
BlockMeasureInfo *(*bdrv_measure)(QemuOpts *opts, BlockDriverState *in_bs,
Error **errp);
-
I’d keep this empty line.
Hanna
int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs,
int64_t offset, int64_t bytes, QEMUIOVector *qiov);
int coroutine_fn (*bdrv_co_pwritev_compressed_part)(BlockDriverState *bs,
int64_t offset, int64_t bytes, QEMUIOVector *qiov,
size_t qiov_offset);
- Re: [PATCH v4 19/25] block_int-common.h: split function pointers in BlockDriver,
Hanna Reitz <=