

From: WANG Chao
Subject: [Qemu-devel] [Bug 1754542] Re: colo: vm crash with segmentation fault
Date: Fri, 27 Jul 2018 04:53:07 -0000

Hi, Zhang Chen

It seems virtio-blk isn't working.

I tested COLO FT against https://github.com/zhangckid/qemu/tree/qemu-colo-18jul22
and got the following error at a very early stage:

On primary:
qemu-system-x86_64: Can't receive COLO message: Input/output error

On secondary:
qemu-system-x86_64: block.c:4893: bdrv_detach_aio_context: Assertion 
`!bs->walking_aio_notifiers' failed.

I ran the test as follows:

1. Setup primary:

# qemu-img create -b centos6base.img -f qcow2 centos6sp.img
# qemu-system-x86_64 -machine dump-guest-core=off -accel kvm -m 128 \
-smp 2 -name primary -serial stdio \
-qmp unix://root/wangchao/pvm.monitor.sock,server,nowait -vnc :10 \
-netdev tap,id=hn0,vhost=off,script=no,downscript=no \
-drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=/root/wangchao/images/centos6sp.img,children.0.driver=qcow2 \
-S -nodefaults
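
For orientation (my summary of docs/COLO-FT.txt, not part of the original
commands): with read-pattern=fifo and vote-threshold=1 the quorum driver
reads only from children.0, the local qcow2 image, while writes go to every
child. After step 3 attaches the NBD replication node, the primary's graph
should look roughly like:

  primary-disk0 (quorum, read-pattern=fifo, vote-threshold=1)
    |- children.0: qcow2 on /root/wangchao/images/centos6sp.img (local disk)
    `- children.1: nbd_client0 (replication, mode=primary, over NBD) -- attached in step 3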

2. Setup secondary:

# qemu-img create -b centos6base.img -f qcow2 centos6sp.img
# qemu-img create -f qcow2 /dev/shm/active.img 20G
# qemu-img create -f qcow2 /dev/shm/hidden.img 20G
# qemu-system-x86_64 -machine dump-guest-core=off -accel kvm -m 128 \
-smp 2 -name secondary -serial stdio \
-qmp unix://root/wangchao/svm.monitor.sock,server,nowait -vnc :10 \
-netdev tap,id=hn0,vhost=off,script=no,downscript=no \
-drive if=none,id=secondary-disk0,file.filename=/root/wangchao/images/centos6sp.img,driver=qcow2,node-name=node0 \
-drive if=virtio,id=active-disk0,driver=replication,mode=secondary,top-id=active-disk0,file.driver=qcow2,file.file.filename=/dev/shm/active.img,file.backing.driver=qcow2,file.backing.file.filename=/dev/shm/hidden.img,file.backing.backing=secondary-disk0 \
-incoming tcp:0:8888 -nodefaults
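
A rough sketch of the secondary's block graph as configured above, per
docs/block-replication.txt (again my summary, not from the original report):
writes replicated from the primary land in secondary-disk0, the data they
would overwrite is first preserved in the hidden image, and the secondary
guest's own writes go to the active image:

  active-disk0 (replication, mode=secondary)
    `- qcow2: /dev/shm/active.img
         `- backing: qcow2: /dev/shm/hidden.img
              `- backing: secondary-disk0 (qcow2 on centos6sp.img,
                 the target of the NBD export in step 3)

The 20G passed to the qemu-img create commands above is meant to match the
guest image's virtual size, as in the COLO-FT.txt setup.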

3. Issue the following QMP commands:

On secondary:
{'execute':'qmp_capabilities'}
{'execute': 'nbd-server-start', 'arguments': {'addr': {'type': 'inet', 'data': {'host': 'x.x.x.x', 'port': '8889'} } } }
{'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } }

On primary:
{'execute': 'qmp_capabilities'}
{'execute': 'human-monitor-command', 'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=x.x.x.x,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0'}}
{'execute': 'x-blockdev-change', 'arguments': {'parent': 'primary-disk0', 'node': 'nbd_client0' } }
{'execute': 'migrate-set-capabilities', 'arguments': {'capabilities': [{'capability': 'x-colo', 'state': true } ] } }
{'execute': 'migrate', 'arguments': {'uri': 'tcp:x.x.x.x:8888'}}
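
(A quick sanity check I'd add here -- my suggestion, not part of the
documented steps: query-status on the primary after 'migrate' returns tells
you whether the VM is still running:

{'execute': 'query-status'}

In this run it never gets that far; the primary instead prints the "Can't
receive COLO message: Input/output error" shown above once the secondary
dies.)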

4. Then the secondary immediately crashed:
qemu-system-x86_64: block.c:4893: bdrv_detach_aio_context: Assertion 
`!bs->walking_aio_notifiers' failed.

(gdb) bt
#0  0x00007fb50d241277 in raise () from /lib64/libc.so.6
#1  0x00007fb50d242968 in abort () from /lib64/libc.so.6
#2  0x00007fb50d23a096 in __assert_fail_base () from /lib64/libc.so.6
#3  0x00007fb50d23a142 in __assert_fail () from /lib64/libc.so.6
#4  0x0000000000706ae9 in bdrv_detach_aio_context (bs=0x2e84000) at block.c:4893
#5  0x0000000000706ab8 in bdrv_detach_aio_context (address@hidden) at 
block.c:4911
#6  0x0000000000706c16 in bdrv_set_aio_context (bs=0x315d400, 
new_context=0x2e17180) at block.c:4960
#7  0x000000000070a43d in block_job_attached_aio_context 
(new_context=<optimized out>, opaque=0x2d92000) at blockjob.c:111
#8  0x0000000000706b93 in bdrv_attach_aio_context (bs=0x2e84000, 
address@hidden) at block.c:4942
#9  0x0000000000706b2b in bdrv_attach_aio_context (bs=0x315d400, 
address@hidden) at block.c:4930
#10 0x0000000000706b2b in bdrv_attach_aio_context (bs=0x2ff8800, 
address@hidden) at block.c:4930
#11 0x0000000000706b2b in bdrv_attach_aio_context (address@hidden, 
address@hidden) at block.c:4930
#12 0x0000000000706c29 in bdrv_set_aio_context (bs=0x2ff5400, 
new_context=0x2e17180) at block.c:4966
#13 0x0000000000748a17 in blk_set_aio_context (blk=<optimized out>, 
new_context=<optimized out>) at block/block-backend.c:1894
#14 0x000000000049b60a in virtio_blk_data_plane_start (vdev=<optimized out>) at 
/root/wangchao/qemu-colo-18jul22/hw/block/dataplane/virtio-blk.c:215
#15 0x000000000069ceda in virtio_bus_start_ioeventfd (address@hidden) at 
hw/virtio/virtio-bus.c:223
#16 0x00000000006a2480 in virtio_pci_start_ioeventfd (proxy=0x4620000) at 
hw/virtio/virtio-pci.c:288
#17 virtio_pci_common_write (opaque=0x4620000, addr=<optimized out>, 
val=<optimized out>, size=<optimized out>) at hw/virtio/virtio-pci.c:1288
#18 0x00000000004673b8 in memory_region_write_accessor (mr=0x46209d0, addr=20, 
value=<optimized out>, size=1, shift=<optimized out>, mask=<optimized out>, 
attrs=...) at /root/wangchao/qemu-colo-18jul22/memory.c:527
#19 0x0000000000466c63 in access_with_adjusted_size (address@hidden, 
address@hidden, address@hidden, access_size_min=<optimized out>, 
access_size_max=<optimized out>, address@hidden <memory_region_write_accessor>, 
address@hidden, address@hidden) at /root/wangchao/qemu-colo-18jul22/memory.c:594
#20 0x0000000000469388 in memory_region_dispatch_write (address@hidden, 
address@hidden, data=15, size=1, address@hidden) at 
/root/wangchao/qemu-colo-18jul22/memory.c:1473
#21 0x000000000041ada0 in flatview_write_continue (address@hidden, 
address@hidden, attrs=..., address@hidden, address@hidden <Address 
0x7fb51019a028 out of bounds>, address@hidden, addr1=20, l=1, mr=0x46209d0) at 
/root/wangchao/qemu-colo-18jul22/exec.c:3255
#22 0x000000000041af62 in flatview_write (fv=0x459ec80, addr=4273963028, 
attrs=..., buf=0x7fb51019a028 <Address 0x7fb51019a028 out of bounds>, len=1) at 
/root/wangchao/qemu-colo-18jul22/exec.c:3294

It seems we were trying to do another aio_notifiers walk *inside an
aio_notifiers walk*.
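
For context, the assertion is a re-entrancy guard in block.c. A simplified
sketch of the pattern (paraphrased from the qemu-colo-18jul22 tree, not a
verbatim copy; bdrv_attach_aio_context() uses the same guard around its own
walk):

void bdrv_detach_aio_context(BlockDriverState *bs)
{
    BdrvAioNotifier *baf;

    assert(!bs->walking_aio_notifiers);    /* block.c:4893 -- fires here */
    bs->walking_aio_notifiers = true;
    QLIST_FOREACH(baf, &bs->aio_notifiers, list) {
        /* notifier callbacks may re-enter the block layer */
        baf->detach_aio_context(baf->opaque);
    }
    bs->walking_aio_notifiers = false;

    /* ... then recurse into bs->children and detach from the driver ... */
}

Reading the frames above: bdrv_attach_aio_context() is mid-walk on
bs=0x2e84000 (frame #8) when the block job's attached_aio_context notifier
(frame #7) calls bdrv_set_aio_context() on another node (frame #6); the
detach side of that call recurses through the children back to bs=0x2e84000
(frame #4), whose walk is still in progress, so the assertion trips.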

Thanks
WANG Chao

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1754542

Title:
  colo:  vm crash with segmentation fault

Status in QEMU:
  New

Bug description:
  I use Arch Linux x86_64 and Zhang Chen's tree
  (https://github.com/zhangckid/qemu/tree/qemu-colo-18mar10).
  Following the document 'COLO-FT.txt', I tested the COLO feature on my hosts.

  I ran these commands:
  Primary:
  sudo /usr/local/bin/qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -qmp stdio -name primary \
  -device piix3-usb-uhci \
  -device usb-tablet -netdev tap,id=hn0,vhost=off \
  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
  -drive if=virtio,id=primary-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,\
  children.0.file.filename=/var/lib/libvirt/images/1.raw,\
  children.0.driver=raw -S

  Secondary:
  sudo /usr/local/bin/qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 -qmp stdio -name secondary \
  -device piix3-usb-uhci \
  -device usb-tablet -netdev tap,id=hn0,vhost=off \
  -device virtio-net-pci,id=net-pci0,netdev=hn0 \
  -drive if=none,id=secondary-disk0,file.filename=/var/lib/libvirt/images/2.raw,driver=raw,node-name=node0 \
  -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
  file.driver=qcow2,top-id=active-disk0,\
  file.file.filename=/mnt/ramfs/active_disk.img,\
  file.backing.driver=qcow2,\
  file.backing.file.filename=/mnt/ramfs/hidden_disk.img,\
  file.backing.backing=secondary-disk0 \
  -incoming tcp:0:8888

  Secondary:
  {'execute':'qmp_capabilities'}
  { 'execute': 'nbd-server-start',
    'arguments': {'addr': {'type': 'inet', 'data': {'host': '192.168.0.34', 'port': '8889'} } } }
  {'execute': 'nbd-server-add', 'arguments': {'device': 'secondary-disk0', 'writable': true } }

  Primary:
  {'execute':'qmp_capabilities'}
  { 'execute': 'human-monitor-command',
    'arguments': {'command-line': 'drive_add -n buddy driver=replication,mode=primary,file.driver=nbd,file.host=192.168.0.34,file.port=8889,file.export=secondary-disk0,node-name=nbd_client0'}}
  { 'execute': 'x-blockdev-change', 'arguments': {'parent': 'primary-disk0', 'node': 'nbd_client0' } }
  { 'execute': 'migrate-set-capabilities',
    'arguments': {'capabilities': [ {'capability': 'x-colo', 'state': true } ] } }
  { 'execute': 'migrate', 'arguments': {'uri': 'tcp:192.168.0.34:8888' } }
  And both VMs crashed:
  Primary:
  {"timestamp": {"seconds": 1520763655, "microseconds": 511415}, "event": 
"RESUME"}
  [1]    329 segmentation fault  sudo /usr/local/bin/qemu-system-x86_64 -boot c 
-enable-kvm -m 2048 -smp 2 -qm

  Secondary:
  {"timestamp": {"seconds": 1520763655, "microseconds": 510907}, "event": 
"RESUME"}
  [1]    367 segmentation fault  sudo /usr/local/bin/qemu-system-x86_64 -boot c 
-enable-kvm -m 2048 -smp 2 -qm
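
  (A generic suggestion from me, not from the thread: to turn the bare
  'segmentation fault' into a backtrace like the one earlier in this bug,
  run the same command under gdb:

  $ sudo gdb --args /usr/local/bin/qemu-system-x86_64 -enable-kvm -m 2048 ...
  (gdb) run
  ...reproduce the crash...
  (gdb) bt)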



