qemu-devel

Re: The problems about COLO


From: Daniel Cho
Subject: Re: The problems about COLO
Date: Fri, 8 Nov 2019 11:43:17 +0800

Lukas Straub <address@hidden> wrote on Thu, Nov 7, 2019 at 9:34 PM:
On Thu, 7 Nov 2019 16:14:43 +0800
Daniel Cho <address@hidden> wrote:

> Hi  Lukas,
> Thanks for your reply.
>
> However, we tested question 1 using the steps listed below the error
> message, and we noticed that the secondary VM's image becomes corrupted
> when it reboots.
> Here is the error message.
> -------------------------------------------------------------------
> [    1.280299] XFS (sda1): Mounting V5 Filesystem
> [    1.428418] input: ImExPS/2 Generic Explorer Mouse as
> /devices/platform/i8042/serio1/input/input2
> [    1.501320] XFS (sda1): Starting recovery (logdev: internal)
> [    1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
> [    1.505534] Switched to clocksource tsc
> [    2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line
> 1635 of file fs/xfs/libxfs/xfs_alloc.c.  Caller xfs_free_extent+0xfc/0x130
> [xfs]
> [    2.032743] CPU: 0 PID: 300 Comm: mount Not tainted
> 3.10.0-693.11.6.el7.x86_64 #1
> [    2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [    2.035882] Call Trace:
> [    2.036494]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> [    2.037315]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> [    2.038150]  [<ffffffffc0138e6c>] ? xfs_free_extent+0xfc/0x130 [xfs]
> [    2.039046]  [<ffffffffc01362da>] xfs_free_ag_extent+0x20a/0x780 [xfs]
> [    2.039920]  [<ffffffffc0138e6c>] xfs_free_extent+0xfc/0x130 [xfs]
> [    2.040768]  [<ffffffffc01a7736>] xfs_trans_free_extent+0x26/0x60 [xfs]
> [    2.041642]  [<ffffffffc019fade>] xlog_recover_process_efi+0x17e/0x1c0
> [xfs]
> [    2.042558]  [<ffffffffc01a1e37>]
> xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [    2.043771]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> [    2.044650]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> [    2.045518]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [    2.046341]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80
> [xfs]
> [    2.047260]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [    2.048116]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> [    2.048881]  [<ffffffffc01919b0>] ?
> xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [    2.050105]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> [    2.050906]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
> [    2.051963]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> [    2.059431]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> [    2.060283]  [<ffffffff81226483>] do_mount+0x233/0xaf0
> [    2.061081]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> [    2.061844]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> [    2.062619]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> [    2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of
> file fs/xfs/xfs_trans.c.  Caller xlog_recover_process_efi+0x18e/0x1c0 [xfs]
> [    2.065260] CPU: 0 PID: 300 Comm: mount Not tainted
> 3.10.0-693.11.6.el7.x86_64 #1
> [    2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [    2.068023] Call Trace:
> [    2.068590]  [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> [    2.069403]  [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> [    2.070318]  [<ffffffffc019faee>] ? xlog_recover_process_efi+0x18e/0x1c0
> [xfs]
> [    2.071538]  [<ffffffffc019594d>] xfs_trans_cancel+0xbd/0xe0 [xfs]
> [    2.072429]  [<ffffffffc019faee>] xlog_recover_process_efi+0x18e/0x1c0
> [xfs]
> [    2.073339]  [<ffffffffc01a1e37>]
> xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [    2.074561]  [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> [    2.075421]  [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> [    2.076301]  [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [    2.077128]  [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80
> [xfs]
> [    2.078049]  [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [    2.078900]  [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> [    2.079667]  [<ffffffffc01919b0>] ?
> xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [    2.080883]  [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> [    2.081687]  [<ffffffff81207349>] mount_fs+0x39/0x1b0
> [    2.082457]  [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> [    2.083258]  [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> [    2.084057]  [<ffffffff81226483>] do_mount+0x233/0xaf0
> [    2.084797]  [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> [    2.085568]  [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> [    2.086324]  [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> [    2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 985
> of file fs/xfs/xfs_trans.c.  Return address = 0xffffffffc0195966
> [    2.088795] XFS (sda1): Corruption of in-memory data detected.  Shutting
> down filesystem
> [    2.090273] XFS (sda1): Please umount the filesystem and rectify the
> problem(s)
> [    2.091519] XFS (sda1): Failed to recover EFIs
> [    2.092299] XFS (sda1): log mount finish failed
> [FAILED] Failed to mount /sysroot.
> .
> .
> .
> Generating "/run/initramfs/rdsosreport.txt"
> [    2.178103] blk_update_request: I/O error, dev fd0, sector 0
> [    2.246106] blk_update_request: I/O error, dev fd0, sector 0
>   -------------------------------------------------------------------
>
> Here are the steps to reproduce:
> *1. Start the primary VM with the following command, then do whatever you want on the PVM*
>         qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp
> stdio -vnc :5 \
>   -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
>   -netdev
> tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
>   -device rtl8139,id=e0,netdev=hn0 \
>   -drive
> if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2
> *2. Add the devices and objects to the PVM with QMP commands*
>       {'execute':'qmp_capabilities'}
>       {"execute":"chardev-add", "arguments":{ "id" : "mirror0", "backend" :
> { "type" : "socket", "data" : { "server": true, "wait": false, "addr": {
> "type": "inet", "data":{ "host": "127.0.0.1", "port": "9003" } } } } }}
>       {"execute":"chardev-add", "arguments":{ "id" : "compare1", "backend"
> : { "type" : "socket", "data" : { "server": true, "wait": true, "addr": {
> "type": "inet", "data":{ "host": "127.0.0.1", "port": "9004" } } } } }}
>       {"execute":"chardev-add", "arguments":{ "id" : "compare0", "backend"
> : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": {
> "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
>       {"execute":"chardev-add", "arguments":{ "id" : "compare0-0",
> "backend" : { "type" : "socket", "data" : { "server": false, "addr": {
> "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
>       {"execute":"chardev-add", "arguments":{ "id" : "compare_out",
> "backend" : { "type" : "socket", "data" : { "server": true, "wait": false,
> "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } }
> } } }}
>       {"execute":"chardev-add", "arguments":{ "id" : "compare_out0",
> "backend" : { "type" : "socket", "data" : { "server": false, "addr": {
> "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } } }
>       {"execute":"object-add", "arguments":{ "qom-type" : "filter-mirror",
> "id" : "m0", "props": { "netdev": "hn0", "outdev" : "mirror0", "queue" :
> "tx" } } }
>       {"execute":"object-add", "arguments":{ "qom-type" :
> "filter-redirector", "id" : "redire0", "props": { "netdev": "hn0", "indev"
> : "compare_out", "queue" : "rx" } } }
>       {"execute":"object-add", "arguments":{ "qom-type" :
> "filter-redirector", "id" : "redire1", "props": { "netdev": "hn0", "outdev"
> : "compare0", "queue" : "rx" } } }
>       {"execute":"object-add", "arguments":{ "qom-type" : "iothread", "id"
> : "iothread1", "props": {} } }
>       {"execute":"object-add", "arguments":{ "qom-type" : "colo-compare",
> "id" : "comp0", "props": { "primary_in" : "compare0-0", "secondary_in" :
> "compare1", "outdev" : "compare_out0", "iothread" : "iothread1"} } }
> *3. Start the secondary VM with command*
>         qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp
> stdio \
>   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
>   -netdev
> tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
>   -device rtl8139,id=e0,netdev=hn0 \
>   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
>   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
>   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
>   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
>   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
>   -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,\
> node-name=node1 \
>   -drive
> if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
> top-id=active-disk0,file.file.filename=active-disk.qcow2,\
> file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
> file.backing.backing=colo-disk0,node-name=node2 \
>   -incoming tcp:0:9998
> *4. As described in the document, create the rbd server and migrate with QMP commands*
> [image: image.png]
> *5. Kill the PVM and failover to SVM*
> [image: image.png]
> *6. Reboot the secondary VM; the error then appears.*
> This error occurs with high probability.
>
> We can work around the image problem with *xfs_repair*; after that, the VM
> reboots and works.
> Command:
> xfs_repair -L /dev/sda1
>
> Do you have any idea what causes this problem?

Hi Daniel,
The disks have to be synchronized before you can start COLO. So try something like this:

{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://SECONDARY:?/colo-disk0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Then, after the job is ready:
{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }

And then you can add the replication driver and start COLO.
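
The resync sequence above can be sketched as a small Python helper that builds the same QMP messages as strict JSON. This is only an illustration, not part of QEMU: the function names, the example host address, and the example port are assumptions (the thread leaves the NBD port as "?"). It assumes QMP reports the mirror job becoming ready through a BLOCK_JOB_READY event whose "device" field carries the job ID.

```python
import json


def resync_commands(secondary_host, nbd_port, export="colo-disk0",
                    device="colo-disk0", job_id="resync"):
    """Build the QMP messages for a full disk resync.

    Returns (start, finish): send `start`, wait until the job is ready,
    then send `finish`. Host and port must be supplied by the caller.
    """
    target = f"nbd://{secondary_host}:{nbd_port}/{export}"
    start = [{
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "job-id": job_id,
            "target": target,
            "mode": "existing",
            "format": "raw",
            "sync": "full",
        },
    }]
    finish = [
        # Pause the guest so no new writes arrive after the mirror is in sync.
        {"execute": "stop"},
        # End the ready mirror job, as in the commands quoted above.
        {"execute": "block-job-cancel", "arguments": {"device": job_id}},
    ]
    return start, finish


def is_job_ready(message, job_id="resync"):
    """Return True if a decoded QMP message reports that job_id is ready."""
    return (message.get("event") == "BLOCK_JOB_READY"
            and message.get("data", {}).get("device") == job_id)


if __name__ == "__main__":
    # Placeholder host and port; the thread does not specify either.
    start, finish = resync_commands("192.0.2.10", 10809)
    for cmd in start + finish:
        print(json.dumps(cmd))
```

Each printed line can be pasted into a `-qmp stdio` session after `qmp_capabilities`; the `finish` commands should only be sent once `is_job_ready` matches an incoming event.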

Regards,
Lukas Straub

Hi Lukas,
It works well. Thanks for your help.

Separately, could we change the secondary VM's replication driver to the quorum driver
to achieve continuous VM replication?

Here is the start command.
Original:
qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev  tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,node-name=node1 \
   -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,top-id=active-disk0,file.file.filename=active-disk.qcow2,file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,file.backing.backing=colo-disk0,node-name=node2 \
   -incoming tcp:0:9998
 
Modified:
  qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
   -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
   -netdev  tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
   -device rtl8139,id=e0,netdev=hn0 \
   -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
   -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
   -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
   -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
   -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
   -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2 \
   -incoming tcp:0:9998

Best regards,
Daniel Cho
