Re: [BUG qemu 4.0] segfault when unplugging virtio-blk-pci device
From: Stefan Hajnoczi
Subject: Re: [BUG qemu 4.0] segfault when unplugging virtio-blk-pci device
Date: Mon, 13 Jan 2020 16:38:55 +0000

On Thu, Jan 09, 2020 at 12:58:06PM +0800, Eryu Guan wrote:
> On Tue, Jan 07, 2020 at 03:01:01PM +0100, Julia Suvorova wrote:
> > On Tue, Jan 7, 2020 at 2:06 PM Eryu Guan <address@hidden> wrote:
> > >
> > > On Thu, Jan 02, 2020 at 10:08:50AM +0800, Eryu Guan wrote:
> > > > On Tue, Dec 31, 2019 at 11:51:35AM +0100, Igor Mammedov wrote:
> > > > > On Tue, 31 Dec 2019 18:34:34 +0800
> > > > > Eryu Guan <address@hidden> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > > I'm using qemu 4.0 and hit segfault when tearing down kata sandbox, I
> > > > > > > think it's because io completion hits use-after-free when device is
> > > > > > > already gone. Is this a known bug that has been fixed? (I went through
> > > > > > > the git log but didn't find anything obvious).
> > > > > >
> > > > > > gdb backtrace is:
> > > > > >
> > > > > > Core was generated by `/usr/local/libexec/qemu-kvm -name
> > > > > > sandbox-5b8df8c6c6901c3c0a9b02879be10fe8d69d6'.
> > > > > > Program terminated with signal 11, Segmentation fault.
> > > > > > #0 object_get_class (obj=obj@entry=0x0) at
> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903
> > > > > > 903 return obj->class;
> > > > > > (gdb) bt
> > > > > > #0 object_get_class (obj=obj@entry=0x0) at
> > > > > > /usr/src/debug/qemu-4.0/qom/object.c:903
> > > > > > #1 0x0000558a2c009e9b in virtio_notify_vector (vdev=0x558a2e7751d0,
> > > > > > vector=<optimized out>) at
> > > > > > /usr/src/debug/qemu-4.0/hw/virtio/virtio.c:1118
> > > > > > #2 0x0000558a2bfdcb1e in virtio_blk_discard_write_zeroes_complete (
> > > > > > opaque=0x558a2f2fd420, ret=0)
> > > > > > at /usr/src/debug/qemu-4.0/hw/block/virtio-blk.c:186
> > > > > > #3 0x0000558a2c261c7e in blk_aio_complete (acb=0x558a2eed7420)
> > > > > > at /usr/src/debug/qemu-4.0/block/block-backend.c:1305
> > > > > > #4 0x0000558a2c3031db in coroutine_trampoline (i0=<optimized out>,
> > > > > > i1=<optimized out>) at
> > > > > > /usr/src/debug/qemu-4.0/util/coroutine-ucontext.c:116
> > > > > > #5 0x00007f45b2f8b080 in ?? () from /lib64/libc.so.6
> > > > > > #6 0x00007fff9ed75780 in ?? ()
> > > > > > #7 0x0000000000000000 in ?? ()
> > > > > >
> > > > > > It seems like qemu was completing a discard/write_zero request, but
> > > > > > parent BusState was already freed & set to NULL.
> > > > > >
> > > > > > > Do we need to drain all pending requests before unrealizing the
> > > > > > > virtio-blk device? Like the following patch proposed?
> > > > > >
> > > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg02945.html
> > > > > >
> > > > > > If more info is needed, please let me know.
> > > > >
> > > > > may be this will help: https://patchwork.kernel.org/patch/11213047/
> > > >
> > > > Yeah, this looks promising! I'll try it out (though it's a one-time
> > > > crash for me). Thanks!
> > >
> > > > After applying this patch, I don't see the original segfault and
> > > > backtrace, but I see this crash
> > >
> > > [Thread debugging using libthread_db enabled]
> > > Using host libthread_db library "/lib64/libthread_db.so.1".
> > > Core was generated by `/usr/local/libexec/qemu-kvm -name
> > > sandbox-a2f34a11a7e1449496503bbc4050ae040c0d3'.
> > > Program terminated with signal 11, Segmentation fault.
> > > #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0,
> > > addr=0, val=<optimized out>, size=<optimized out>) at
> > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
> > > 1324 VirtIOPCIProxy *proxy =
> > > VIRTIO_PCI(DEVICE(vdev)->parent_bus->parent);
> > > Missing separate debuginfos, use: debuginfo-install
> > > glib2-2.42.2-5.1.alios7.x86_64 glibc-2.17-260.alios7.x86_64
> > > libgcc-4.8.5-28.alios7.1.x86_64 libseccomp-2.3.1-3.alios7.x86_64
> > > libstdc++-4.8.5-28.alios7.1.x86_64 numactl-libs-2.0.9-5.1.alios7.x86_64
> > > pixman-0.32.6-3.1.alios7.x86_64 zlib-1.2.7-16.2.alios7.x86_64
> > > (gdb) bt
> > > #0 0x0000561216a57609 in virtio_pci_notify_write (opaque=0x5612184747e0,
> > > addr=0, val=<optimized out>, size=<optimized out>) at
> > > /usr/src/debug/qemu-4.0/hw/virtio/virtio-pci.c:1324
> > > #1 0x0000561216835b22 in memory_region_write_accessor (mr=<optimized
> > > out>, addr=<optimized out>, value=<optimized out>, size=<optimized out>,
> > > shift=<optimized out>, mask=<optimized out>, attrs=...) at
> > > /usr/src/debug/qemu-4.0/memory.c:502
> > > #2 0x0000561216833c5d in access_with_adjusted_size (addr=addr@entry=0,
> > > value=value@entry=0x7fcdeab1b8a8, size=size@entry=2,
> > > access_size_min=<optimized out>, access_size_max=<optimized out>,
> > > access_fn=0x561216835ac0 <memory_region_write_accessor>,
> > > mr=0x56121846d340, attrs=...)
> > > at /usr/src/debug/qemu-4.0/memory.c:568
> > > #3 0x0000561216837c66 in memory_region_dispatch_write
> > > (mr=mr@entry=0x56121846d340, addr=0, data=<optimized out>, size=2,
> > > attrs=attrs@entry=...) at /usr/src/debug/qemu-4.0/memory.c:1503
> > > #4 0x00005612167e036f in flatview_write_continue
> > > (fv=fv@entry=0x56121852edd0, addr=addr@entry=841813602304, attrs=...,
> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>,
> > > len=len@entry=2, addr1=<optimized out>, l=<optimized out>,
> > > mr=0x56121846d340)
> > > at /usr/src/debug/qemu-4.0/exec.c:3279
> > > #5 0x00005612167e0506 in flatview_write (fv=0x56121852edd0,
> > > addr=841813602304, attrs=..., buf=0x7fce7dd97028 <Address 0x7fce7dd97028
> > > out of bounds>, len=2) at /usr/src/debug/qemu-4.0/exec.c:3318
> > > #6 0x00005612167e4a1b in address_space_write (as=<optimized out>,
> > > addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized
> > > out>) at /usr/src/debug/qemu-4.0/exec.c:3408
> > > #7 0x00005612167e4aa5 in address_space_rw (as=<optimized out>,
> > > addr=<optimized out>, attrs=..., attrs@entry=...,
> > > buf=buf@entry=0x7fce7dd97028 <Address 0x7fce7dd97028 out of bounds>,
> > > len=<optimized out>, is_write=<optimized out>) at
> > > /usr/src/debug/qemu-4.0/exec.c:3419
> > > #8 0x0000561216849da1 in kvm_cpu_exec (cpu=cpu@entry=0x56121849aa00) at
> > > /usr/src/debug/qemu-4.0/accel/kvm/kvm-all.c:2034
> > > #9 0x000056121682255e in qemu_kvm_cpu_thread_fn
> > > (arg=arg@entry=0x56121849aa00) at /usr/src/debug/qemu-4.0/cpus.c:1281
> > > #10 0x0000561216b794d6 in qemu_thread_start (args=<optimized out>) at
> > > /usr/src/debug/qemu-4.0/util/qemu-thread-posix.c:502
> > > #11 0x00007fce7bef6e25 in start_thread () from /lib64/libpthread.so.0
> > > #12 0x00007fce7bc1ef1d in clone () from /lib64/libc.so.6
> > >
> > > And I searched and found
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1706759 , which has the same
> > > backtrace as above, and it seems commit 7bfde688fb1b ("virtio-blk: Add
> > > blk_drain() to virtio_blk_device_unrealize()") is to fix this particular
> > > bug.
> > >
> > > But I can still hit the bug even after applying the commit. Do I miss
> > > anything?
> >
> > Hi Eryu,
> > This backtrace seems to be caused by this bug (there were two bugs in
> > 1706759): https://bugzilla.redhat.com/show_bug.cgi?id=1708480
> > Although the solution hasn't been tested on virtio-blk yet, you may
> > want to apply this patch:
> > https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg05197.html
> > Let me know if this works.
>
> Unfortunately, I still see the same segfault & backtrace after applying
> commit 421afd2fe8dd ("virtio: reset region cache when on queue
> deletion")
>
> Anything I can help to debug?

Please post the QEMU command-line and the QMP commands used to remove the
device.

The backtrace shows a vcpu thread submitting a request. The device
seems to be partially destroyed. That's surprising because the monitor
and the vcpu thread should use the QEMU global mutex to avoid race
conditions. Maybe seeing the QMP commands will make it clearer...

Stefan
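
The locking expectation described above can be illustrated with a small standalone sketch. It is not QEMU code: `Bus`, `Device`, `big_lock`, `device_notify()`, `device_unplug()` and `run_demo()` are hypothetical names, and the plain pthread mutex only plays the role of the QEMU global mutex. The NULL check in `device_notify()` is also an assumption made for illustration; the backtrace shows the real notify path dereferencing `parent_bus` directly. The point is only that when both the unplug path and the vcpu path run under the same lock, the vcpu side sees the device either fully attached or fully detached, never half torn down.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for a bus and a device; not the real QEMU types. */
typedef struct Bus { int notify_count; } Bus;
typedef struct Device { Bus *parent_bus; } Device;

/* Plays the role of the QEMU global mutex in this sketch. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;

/* vcpu side: handle a guest notify. Returns 0 if delivered, -1 if the
 * device has already been detached from its bus. */
static int device_notify(Device *dev)
{
    int ret = -1;

    pthread_mutex_lock(&big_lock);
    if (dev->parent_bus != NULL) {      /* assumed check, for illustration */
        dev->parent_bus->notify_count++;
        ret = 0;
    }
    pthread_mutex_unlock(&big_lock);
    return ret;
}

/* Monitor side: detach the device from its bus under the same lock. */
static void device_unplug(Device *dev)
{
    pthread_mutex_lock(&big_lock);
    dev->parent_bus = NULL;
    pthread_mutex_unlock(&big_lock);
}

static void *vcpu_thread(void *opaque)
{
    Device *dev = opaque;

    for (int i = 0; i < 100000; i++) {
        device_notify(dev);             /* may succeed or be rejected */
    }
    return NULL;
}

static void *monitor_thread(void *opaque)
{
    device_unplug(opaque);
    return NULL;
}

/* Races a notifying thread against an unplugging thread.
 * Returns 0 if a notify after the unplug is rejected cleanly. */
int run_demo(void)
{
    Bus bus = { 0 };
    Device dev = { &bus };
    pthread_t vcpu, monitor;

    if (pthread_create(&vcpu, NULL, vcpu_thread, &dev) != 0) {
        return -1;
    }
    if (pthread_create(&monitor, NULL, monitor_thread, &dev) != 0) {
        pthread_join(vcpu, NULL);
        return -1;
    }
    pthread_join(vcpu, NULL);
    pthread_join(monitor, NULL);

    assert(dev.parent_bus == NULL);     /* unplug has completed */
    return device_notify(&dev) == -1 ? 0 : 1;
}
```

To try it, call `run_demo()` from a `main` and build with `cc -pthread`. If either path skipped the lock, the notify side could observe `parent_bus` changing mid-access, which is the kind of partially destroyed state the backtrace suggests.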