qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] hw/nvme: Fix deallocate when metadata is present


From: Jonathan Derrick
Subject: [PATCH] hw/nvme: Fix deallocate when metadata is present
Date: Fri, 3 Jun 2022 13:14:40 -0600

When metadata is present in the namespace and deallocates are issued, the first
deallocate could fail to zero the block range, resulting in another
deallocation to be issued. Normally after the deallocation completes and the
range is checked for zeroes, a deallocation is then issued for the metadata
space. In the failure case where the range is not zeroed, deallocation is
reissued for the block range (and followed with metadata deallocation), but the
original range deallocation task will also issue a metadata deallocation:

nvme_dsm_cb()
  *range deallocation*
  nvme_dsm_md_cb()
    if (nvme_block_status_all()) (range deallocation failure)
      nvme_dsm_cb()
      *range deallocation*
        nvme_dsm_md_cb()
          if (nvme_block_status_all()) (no failure)
          *metadata deallocation*
    *metadata deallocation*

This sequence results in reentry of nvme_dsm_cb() before the metadata has been
deallocated. During reentry, the metadata is deallocated in the reentrant task.
nvme_dsm_bh() is called which deletes and sets iocb->bh to NULL. When reentry
returns from nvme_dsm_cb(), metadata deallocation takes place again, and
results in a null pointer dereference on the iocb->bh:

BH deletion:
#0  nvme_dsm_bh (opaque=0x55ef893e2f10) at ../hw/nvme/ctrl.c:2316
#1  0x000055ef868eb333 in aio_bh_call (bh=0x55ef8a441b30) at ../util/async.c:141
#2  0x000055ef868eb441 in aio_bh_poll (ctx=0x55ef892c6e40) at 
../util/async.c:169
#3  0x000055ef868d2789 in aio_dispatch (ctx=0x55ef892c6e40) at 
../util/aio-posix.c:415
#4  0x000055ef868eb896 in aio_ctx_dispatch (source=0x55ef892c6e40, 
callback=0x0, user_data=0x0) at ../util/async.c:311
#5  0x00007f5bfe4ab17d in g_main_context_dispatch () at 
/lib/x86_64-linux-gnu/libglib-2.0.so.0
#6  0x000055ef868fcd98 in glib_pollfds_poll () at ../util/main-loop.c:232
#7  0x000055ef868fce16 in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:255
#8  0x000055ef868fcf27 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:531
#9  0x000055ef864a2442 in qemu_main_loop () at ../softmmu/runstate.c:726
#10 0x000055ef860f957a in main (argc=29, argv=0x7ffdc9705508, 
envp=0x7ffdc97055f8) at ../softmmu/main.c:50

nvme_dsm_cb() called for metadata after nvme_dsm_bh() completes from reentrant 
task:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x000055ef868eb07c in aio_bh_enqueue (bh=0x0, new_flags=2) at ../util/async.c:70
70          AioContext *ctx = bh->ctx;
(gdb) backtrace
#0  0x000055ef868eb07c in aio_bh_enqueue (bh=0x0, new_flags=2) at 
../util/async.c:70
#1  0x000055ef868eb4cf in qemu_bh_schedule (bh=0x0) at ../util/async.c:186
#2  0x000055ef862db21e in nvme_dsm_cb (opaque=0x55ef897b41a0, ret=0) at 
../hw/nvme/ctrl.c:2423
#3  0x000055ef8665a662 in blk_aio_complete (acb=0x55ef89c6d8c0) at 
../block/block-backend.c:1419
#4  0x000055ef8665a940 in blk_aio_write_entry (opaque=0x55ef89c6d8c0) at 
../block/block-backend.c:1486
#5  0x000055ef868edcf2 in coroutine_trampoline (i0=-536848976, i1=32602) at 
../util/coroutine-ucontext.c:173
#6  0x00007f5bfe0bc510 in __start_context () at 
../sysdeps/unix/sysv/linux/x86_64/__start_context.S:91
#7  0x00007f5bf757bb40 in  ()
#8  0x0000000000000000 in  ()

The fix is to return when an nvme_dsm_cb() is reentered due to failure to
deallocate the block range, so that metadata deallocate is then only issued in
the reentrant task and prevent doing it again when the reentrant task returns
to the original task.

Reproduction steps (with emulated namespace):
nvme format --lbaf=1 -f /dev/nvme0n1
mkfs.ext4 /dev/nvme0n1
mkfs.ext4 -F /dev/nvme0n1

Signed-off-by: Francis Pravin AntonyX Michael Raj <francis.michael@solidigm.com>
Signed-off-by: Michael Kropaczek <michael.kropaczek@solidigm.com>
Signed-off-by: Jonathan Derrick <jonathan.derrick@solidigm.com>
---
 hw/nvme/ctrl.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 03760ddeae..74540a03d5 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -2372,6 +2372,7 @@ static void nvme_dsm_md_cb(void *opaque, int ret)
         }
 
         nvme_dsm_cb(iocb, 0);
+        return;
     }
 
     iocb->aiocb = blk_aio_pwrite_zeroes(ns->blkconf.blk, nvme_moff(ns, slba),
-- 
2.25.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]