qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 03/25] assertions for block global state API


From: Emanuele Giuseppe Esposito
Subject: Re: [PATCH v4 03/25] assertions for block global state API
Date: Mon, 15 Nov 2021 13:27:46 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0


@@ -5958,6 +6043,7 @@ const char *bdrv_get_parent_name(const BlockDriverState *bs)
  /* TODO check what callers really want: bs->node_name or blk_name() */
  const char *bdrv_get_device_name(const BlockDriverState *bs)
  {
+    assert(qemu_in_main_thread());
      return bdrv_get_parent_name(bs) ?: "";
  }

This function is invoked from qcow2_signal_corruption(), which comes generally from an I/O path.  Is it safe to assert that we’re in the main thread here?

Well, the question is probably rather whether this needs really be a considered a global-state function, or whether putting it in common or I/O is fine.  I believe you’re right given that it invokes bdrv_get_parent_name(), it cannot be thread-safe, but then we’ll have to change qcow2_signal_corruption() so it doesn’t invoke this function.


I think that the assertion is wrong here. As long as a single caller is not under BQL, we cannot keep the function in global-state headers. It should go into block-io.h, together with bdrv_get_parent_name and bdrv_get_device_or_node_name (caller of bdrv_get_parent_name).

Since bdrv_get_parent_name only scans through the list of bs->parents, as long as we can assert that the parent list is modified only under BQL + drain, it is safe to be read and can be I/O.

[...]

diff --git a/block/io.c b/block/io.c
index bb0a254def..c5d7f8495e 100644
--- a/block/io.c
+++ b/block/io.c

[...]

@@ -544,6 +546,7 @@ void bdrv_drained_end(BlockDriverState *bs)
  void bdrv_drained_end_no_poll(BlockDriverState *bs, int *drained_end_counter)
  {
+    assert(qemu_in_main_thread());
      bdrv_do_drained_end(bs, false, NULL, false, drained_end_counter);
  }

Why is bdrv_drained_end an I/O function and this is a GS function, even though it does just a subset?

Right this is clearly called in bdrv_drained_end. I don't know how is it possible that no test managed to trigger this assertion... This is actually invoked in both unit and qemu-iothread tests...


@@ -586,12 +589,14 @@ void bdrv_unapply_subtree_drain(BdrvChild *child, BlockDriverState *old_parent)
  void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
  {
      assert(qemu_in_coroutine());
+    assert(qemu_in_main_thread());
      bdrv_drained_begin(bs);
      bdrv_drained_end(bs);
  }
  void bdrv_drain(BlockDriverState *bs)
  {
+    assert(qemu_in_main_thread());
      bdrv_drained_begin(bs);
      bdrv_drained_end(bs);
  }

Why are these GS functions when both bdrv_drained_begin() and bdrv_drained_end() are I/O functions?

I can understand making the drain_all functions GS functions, but it seems weird to say it’s an I/O function when a single BDS is drained via bdrv_drained_begin() and bdrv_drained_end(), but not via bdrv_drain(), which just does both.

(I can see that there are no I/O path callers, but I still find it strange.)

The problem as you saw is on the path callers: bdrv_drain seems to be called only in main thread, while others or similar drains are also coming from I/O. I would say maybe it's a better idea to just put everything I/O, maybe (probably) there are cases not caught by the tests.

The only ones in global-state will be:
- bdrv_apply_subtree_drain and bdrv_unapply_subtree_drain (called only in .attach and .detach callbacks of BdrvChildClass, run under BQL).
- bdrv_drain_all_end_quiesce (called only by bdrv_close thus under BQL).
- bdrv_drain_all_begin, bdrv_drain_all_end and bdrv_drain_all (have already BQL assertions)

In general, this is the underlying problem with these assertions: if a test doesn't test a specific code path, an unneeded assertion might pass undetected.


[...]

@@ -2731,6 +2742,7 @@ int bdrv_block_status_above(BlockDriverState *bs, BlockDriverState *base,   int bdrv_block_status(BlockDriverState *bs, int64_t offset, int64_t bytes,                         int64_t *pnum, int64_t *map, BlockDriverState **file)
  {
+    assert(qemu_in_main_thread());
      return bdrv_block_status_above(bs, bdrv_filter_or_cow_bs(bs),
                                     offset, bytes, pnum, map, file);
  }

Why is this a GS function as opposed to all other block-status functions?  Because of the bdrv_filter_or_cow_bs() call?

This function is in block.io but has the assertion. I think it's a proper I/O but I forgot to remove the assertion in my various trial&errors.

Let me know if you disagree on anything of what I said.

Thank you,
Emanuele




reply via email to

[Prev in Thread] Current Thread [Next in Thread]