qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Questions about losing the write lock of raw-format disks after migr


From: Hanna Reitz
Subject: Re: Questions about losing the write lock of raw-format disks after migration
Date: Thu, 25 Nov 2021 13:52:23 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.3.0

On 24.11.21 13:56, Peng Liang via wrote:
Hi folks,

When we test migration with raw-format disk, we found that the QEMU
process in the dst will lose the write lock after migration.  However,
the QEMU process in the dst will still hold the write lock for
qcow2-format disk.

After reading some block layer's code, I found that the first
blk_set_perm in blk_root_activate will set blk->shared_perm to
BLK_PERM_ALL (disable all shared permissions?).  Then in
blk_vm_state_changed, blk_set_perm will set shared_perm to
blk->shared_perm, which is BLK_PERM_ALL.  And it makes
raw_handle_perm_lock not to get the write lock.

So I try the following patch and it will fix the problem:
diff --git a/block/block-backend.c b/block/block-backend.c
index 12ef80ea17..96518fd1f0 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -197,13 +197,6 @@ static void blk_root_activate(BdrvChild *child,
Error **errp)

      blk->disable_perm = false;

-    blk_set_perm(blk, blk->perm, BLK_PERM_ALL, &local_err);
-    if (local_err) {
-        error_propagate(errp, local_err);
-        blk->disable_perm = true;
-        return;
-    }
-
      if (runstate_check(RUN_STATE_INMIGRATE)) {
          /* Activation can happen when migration process is still
active, for
           * example when nbd_server_add is called during non-shared storage

I'm new to the block layer and I'm not sure that it's a right fix to the
problem.  Any idea about the problem and the patch?

Hi Peng,

Thanks for your report!  I can reproduce this problem.

It appears to me the problem is that blk_set_perm(), well, sets blk->perm and blk->shared_perm.  So by once calling it with BLK_PERM_ALL, we override blk->shared_perm to from then on be BLK_PERM_ALL, even though the guest device has not set that at all. We later call blk_set_perm(blk->blk_perm, blk->shared_perm) (in blk_vm_state_changed()), however, blk->shared_perm is now BLK_PERM_ALL, so this is a no-op.  That means that any restrictions the guest device has imposed (like the default share-rw=off) is not reflected in the block device’s permissions.

This is not apparent with qcow2, because the qcow2 format imposes its own restrictions in addition to the guest device.

I think the right way to fix this is to save blk->shared_perm somewhere and then restore it after the blk_set_perm(BLK_PERM_ALL) call.  I’ll send a patch (with a test case).

Hanna




reply via email to

[Prev in Thread] Current Thread [Next in Thread]