qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] block: bdrv_set_backing_hd(): use drained section


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH] block: bdrv_set_backing_hd(): use drained section
Date: Tue, 25 Jan 2022 13:12:19 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.0

25.01.2022 12:24, Paolo Bonzini wrote:
On 1/24/22 18:37, Vladimir Sementsov-Ogievskiy wrote:
Graph modifications should be done in drained section. stream_prepare()
handler of block stream job call bdrv_set_backing_hd() without using
drained section and it's theoretically possible that some IO request
will interleave with graph modification and will use outdated pointers
to removed block nodes.

Some other callers use bdrv_set_backing_hd() not caring about drained
sections too. So it seems good to make a drained section exactly in
bdrv_set_backing_hd().

Emanuele has a similar patch in his series to protect all graph
modifications with drains:

@@ -3456,6 +3478,11 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,

      assert(qemu_in_main_thread());

+    bdrv_subtree_drained_begin_unlocked(bs);
+    if (backing_hd) {
+        bdrv_subtree_drained_begin_unlocked(backing_hd);
+    }
+
      ret = bdrv_set_backing_noperm(bs, backing_hd, tran, errp);
      if (ret < 0) {
          goto out;
@@ -3464,6 +3491,10 @@ int bdrv_set_backing_hd(BlockDriverState *bs, 
BlockDriverState *backing_hd,
      ret = bdrv_refresh_perms(bs, errp);
  out:
      tran_finalize(tran, ret);
+    if (backing_hd) {
+        bdrv_subtree_drained_end_unlocked(backing_hd);
+    }
+    bdrv_subtree_drained_end_unlocked(bs);

      return ret;
  }

so the idea at least is correct.

I don't object to fixing this independently, but please check
1) if a subtree drain would be more appropriate, 2) whether
backing_hd should be drained as well, 3) whether we're guaranteed
to be holding the AioContext lock as required for
bdrv_drained_begin/end.


Hmm.

1. Subtree draining of backing_hd will not help, as bs is not drained, we still 
may have in-fight request in bs, touching old bs->backing.

2. I think non-recursive drain of bs is enough. We modify only bs node, so we 
should drain it. backing_hd itself is not modified. If backing_hd participate in 
some other backing chain - it's not touched, and in-flight requests in that other 
chain are not broken by modification, so why to drain it? Same for old 
bs->backing and other bs children. We are not interested in in-flight requests 
in subtree which are not part of request in bs. So, if no inflight requests in bs, 
we can modify bs and not care about requests in subtree.

3. Jobs are bound to aio context, so I believe that they care to hold AioContext 
lock. For example, on path job_prepare may be called through job_exit(), 
job_exit() does aio_context_acquire(job->aio_context), or it may be called 
through job_cancel(), which seems to be called under aio_context_acquire() as 
well. So, seems in general we care about it, and of course bdrv_set_backing_hd() 
must be called with AioContext lock held. If for some code path it isn't, it's a 
bug..


--
Best regards,
Vladimir



reply via email to

[Prev in Thread] Current Thread [Next in Thread]