Re: RBD images and exclusive locking
From: Peter Lieven
Subject: Re: RBD images and exclusive locking
Date: Fri, 25 Mar 2022 12:35:01 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
On 24.03.22 at 15:53, Peter Lieven wrote:
> On 24.03.22 at 12:51, Peter Lieven wrote:
>> Hi,
>>
>>
>> following the thread on the ceph ml about rbd and exclusive locks, I was
>> wondering if we should rewrite the rbd driver to try to acquire an
>> exclusive lock on open and not magically on the first write. This way we
>> could directly terminate qemu if the rbd image is in use, as has been done
>> with ordinary image files on a filesystem for some time now.
>
>
> Digging a bit into the code and testing, it seems that the exclusive-lock
> on an rbd image is not that exclusive. We can easily start several qemu
> instances accessing the same image. Watching the locks on the image, the
> owner of the one exclusive lock toggles between the instances. This has
> potential for severe corruption.
>
> I am not thinking of the case where a qemu instance gets killed or the
> host dies. I am thinking of network issues where the disconnected old
> instance of a VM suddenly kicks back in.
>
>
> However, if I manually acquire an exclusive lock in qemu_rbd_open_image,
> all other callers get EROFS.
>
> So what's the difference between the exclusive lock that a write request
> obtains and an exclusive lock created by rbd_lock_acquire? From the
> rbd lock ls perspective they are identical.
>
>
> Best
>
> Peter
>
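The difference can be probed directly against librbd. The following is a minimal sketch, not part of the patch below: it tries to take the managed exclusive lock and, on -EROFS, reports the current owner. The pool name "rbd", the default config lookup, and a reachable cluster are all assumptions; the image name comes from argv[1].

```c
/* Sketch: probe the managed exclusive lock on an RBD image.
 * Assumes a reachable Ceph cluster, the default ceph.conf location,
 * a pool named "rbd", and an image name in argv[1] (all assumptions).
 * Build with: gcc probe_lock.c -lrbd -lrados */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>
#include <rbd/librbd.h>

int main(int argc, char **argv)
{
    rados_t cluster;
    rados_ioctx_t ioctx;
    rbd_image_t image;
    int r;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <image>\n", argv[0]);
        return 1;
    }
    if (rados_create(&cluster, NULL) < 0 ||
        rados_conf_read_file(cluster, NULL) < 0 ||
        rados_connect(cluster) < 0) {
        fprintf(stderr, "cannot connect to cluster\n");
        return 1;
    }
    if (rados_ioctx_create(cluster, "rbd", &ioctx) < 0 ||
        rbd_open(ioctx, argv[1], &image, NULL) < 0) {
        fprintf(stderr, "cannot open image %s\n", argv[1]);
        rados_shutdown(cluster);
        return 1;
    }

    /* Try to grab the managed exclusive lock; -EROFS means another
     * client currently owns it. */
    r = rbd_lock_acquire(image, RBD_LOCK_MODE_EXCLUSIVE);
    if (r == -EROFS) {
        rbd_lock_mode_t mode;
        char *owners[1];
        size_t num_owners = 1;
        if (rbd_lock_get_owners(image, &mode, owners, &num_owners) == 0 &&
            num_owners > 0) {
            printf("lock held by %s\n", owners[0]);
            rbd_lock_get_owners_cleanup(owners, num_owners);
        }
    } else if (r == 0) {
        printf("acquired exclusive lock\n");
        rbd_lock_release(image);
    } else {
        fprintf(stderr, "rbd_lock_acquire: %s\n", strerror(-r));
    }

    rbd_close(image);
    rados_ioctx_destroy(ioctx);
    rados_shutdown(cluster);
    return 0;
}
```

Running two copies of this against the same image shows the -EROFS path on the second one, which is exactly the behavior the patch below relies on in qemu_rbd_open.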
To whom it may concern: this is my proof-of-concept code to add safe locking.
Tested with live migration, including migration from an old version without
the locking support.
rbd_lock_acquire can take the lock that has been installed on the first write.
Considerations:
1) add a switch to enable/disable locking in case someone mounts the image
multiple times on purpose (default on?)
2) we might need to check if we already have the lock in bdrv_invalidate_cache
3) I am aware of bdrv_{check,set,abort}_perm. If these are the better hooks,
how do we use them with only a single lock?
4) rbd_lock_release is automatically invoked on image close.
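Regarding consideration 2, librbd can report whether this client already owns the managed exclusive lock, so the acquire could be skipped when it is already held. A hedged sketch of that check (the helper name is mine, and `image` is assumed to be an already opened rbd_image_t handle):

```c
/* Sketch for consideration 2: only call rbd_lock_acquire() when we do
 * not already own the managed exclusive lock (e.g. taken on an earlier
 * write or a previous invalidate). The helper name is hypothetical;
 * `image` is an already opened rbd_image_t handle (assumption). */
#include <rbd/librbd.h>

static int acquire_lock_if_needed(rbd_image_t image)
{
    int owned = 0;
    int r = rbd_is_exclusive_lock_owner(image, &owned);

    if (r < 0) {
        return r;   /* could not even query the lock state */
    }
    if (owned) {
        return 0;   /* nothing to do, we already hold the lock */
    }
    return rbd_lock_acquire(image, RBD_LOCK_MODE_EXCLUSIVE);
}
```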
Happy weekend
Peter
---8<---
diff --git a/block/rbd.c b/block/rbd.c
index 240b267c27..13a597e526 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1047,6 +1047,24 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
goto failed_open;
}
+ if (!(bdrv_get_flags(bs) & BDRV_O_INACTIVE)) {
+ error_report("qemu_rbd_open: rbd_lock_acquire");
+ r = rbd_lock_acquire(s->image, RBD_LOCK_MODE_EXCLUSIVE);
+ if (r < 0) {
+ if (r == -EROFS) {
+ error_setg(errp, "failed to acquire excl lock for %s",
+ s->image_name);
+ error_append_hint(errp,
+ "Is another process using the image?\n");
+ } else {
+ error_setg_errno(errp, -r, "failed to acquire excl lock for %s",
+ s->image_name);
+ }
+ goto failed_post_open;
+ }
+ error_report("qemu_rbd_open: rbd_lock_acquire success");
+ }
+
r = rbd_stat(s->image, &info, sizeof(info));
if (r < 0) {
error_setg_errno(errp, -r, "error getting image info from %s",
@@ -1266,17 +1284,50 @@ static int qemu_rbd_snap_list(BlockDriverState *bs,
return snap_count;
}
-#ifdef LIBRBD_SUPPORTS_INVALIDATE
static void coroutine_fn qemu_rbd_co_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
- int r = rbd_invalidate_cache(s->image);
+ int r;
+
+ if (!(bdrv_get_flags(bs) & BDRV_O_INACTIVE)) {
+ error_report("qemu_rbd_co_invalidate_cache: rbd_lock_acquire");
+ r = rbd_lock_acquire(s->image, RBD_LOCK_MODE_EXCLUSIVE);
+ if (r < 0) {
+ if (r == -EROFS) {
+ error_setg(errp, "failed to acquire excl lock for %s",
+ s->image_name);
+ error_append_hint(errp,
+ "Is another process using the image?\n");
+ } else {
+ error_setg_errno(errp, -r, "failed to acquire excl lock for %s",
+ s->image_name);
+ }
+ return;
+ }
+ error_report("qemu_rbd_co_invalidate_cache: rbd_lock_acquire success");
+ }
+
+ r = rbd_invalidate_cache(s->image);
if (r < 0) {
error_setg_errno(errp, -r, "Failed to invalidate the cache");
}
}
-#endif
+
+static int qemu_rbd_inactivate(BlockDriverState *bs)
+{
+ BDRVRBDState *s = bs->opaque;
+ int r;
+ error_report("qemu_rbd_inactivate: rbd_lock_release");
+ r = rbd_lock_release(s->image);
+ if (r < 0) {
+ error_report("failed to release lock for %s (%s)", s->image_name,
+ strerror(-r));
+ } else {
+ error_report("qemu_rbd_inactivate: rbd_lock_release success");
+ }
+ return r;
+}
static QemuOptsList qemu_rbd_create_opts = {
.name = "rbd-create-opts",
@@ -1338,11 +1389,12 @@ static BlockDriver bdrv_rbd = {
.bdrv_co_pwrite_zeroes = qemu_rbd_co_pwrite_zeroes,
.bdrv_co_block_status = qemu_rbd_co_block_status,
- .bdrv_snapshot_create = qemu_rbd_snap_create,
- .bdrv_snapshot_delete = qemu_rbd_snap_remove,
- .bdrv_snapshot_list = qemu_rbd_snap_list,
- .bdrv_snapshot_goto = qemu_rbd_snap_rollback,
+ .bdrv_snapshot_create = qemu_rbd_snap_create,
+ .bdrv_snapshot_delete = qemu_rbd_snap_remove,
+ .bdrv_snapshot_list = qemu_rbd_snap_list,
+ .bdrv_snapshot_goto = qemu_rbd_snap_rollback,
.bdrv_co_invalidate_cache = qemu_rbd_co_invalidate_cache,
+ .bdrv_inactivate = qemu_rbd_inactivate,
.strong_runtime_opts = qemu_rbd_strong_runtime_opts,
};
--->8---
Re: RBD images and exclusive locking, Ilya Dryomov, 2022/03/25