Re: [Qemu-stable] [Qemu-devel] Data corruption in Qemu 2.7.1


From: Fabian Grünbichler
Subject: Re: [Qemu-stable] [Qemu-devel] Data corruption in Qemu 2.7.1
Date: Wed, 18 Jan 2017 17:19:41 +0100
User-agent: NeoMutt/20161126 (1.7.1)

On Wed, Jan 18, 2017 at 12:50:50PM +0100, Fabian Grünbichler wrote:
> On 17/01/2017 16:03, Paolo Bonzini wrote:
> > On 17/01/2017 12:22, Fabian Grünbichler wrote:
> >> 6) repeat 3-5 until md5sum does not match, kernel spews error
> >> messages, or you are convinced that everything is OK
> >>
> >> sample kernel message (for ext3):
> >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Sense Key : Illegal Request [current]
> >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 Add. Sense: Invalid field in cdb
> >> Jan 17 11:39:32 ubuntu kernel: sd 2:0:0:0: [sda] tag#32 CDB: Write(10) 2a 00 0f 3a 90 00 00 07 d8 00
> >> Jan 17 11:39:32 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 255496192
> > 
> > Can you reproduce it if QEMU runs under "strace -e ioctl -ff" in the 
> > host?  Or also using this systemtap script.
> > 
> > The important bit would be the lines with a nonzero status, but the
> > others can be useful to see what the surroundings look like.
> > 
> 
> OT: systemtap is not working with your script under Debian Jessie (or
> maybe in general under Debian Jessie? not sure).
> 
> after some further testing it seems like this change in Qemu exposes
> some subtle issue with our specific kernel (it works fine with the
> upstream Ubuntu 4.4 one which ours is based on). I am currently
> debugging further to narrow down potential causes - if I need further
> input from your side or if I suspect Qemu to be at fault I'll resurrect
> this thread (and include the strace output).
> 
> thanks for your quick reaction anyhow!
> 

okay, so this looks like a bug in either Qemu or the upstream kernel.

disabling THP on the hypervisor host with

# echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled

allows reproducing the bug very reliably; shutting the VM down,
re-enabling THP (with 'always') and trying again makes it go away.
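
To record which THP mode was active during a given test run, something like the following could be used. It is only a sketch: the helper names are made up for illustration, and it assumes the usual sysfs format in which the active mode is the bracketed word (e.g. "always madvise [never]").

```python
from pathlib import Path

# sysfs file that the 'echo never > ...' command in this mail writes to
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text: str) -> str:
    """Return the active mode, i.e. the bracketed word in the sysfs text."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active THP mode found in {text!r}")


def current_thp_mode(path: Path = THP_ENABLED) -> str:
    """Read and parse the THP mode of the running host (hypothetical helper)."""
    return parse_thp_mode(path.read_text())


if __name__ == "__main__":
    # Only meaningful on a Linux host that exposes the THP sysfs file.
    if THP_ENABLED.exists():
        print("THP mode:", current_thp_mode())
```

Logging the result next to each reproduction attempt makes it easier to correlate failures with the 'never' setting.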

Qemu was compiled with:
../configure --with-confsuffix=/kvm --target-list=x86_64-softmmu
--disable-xen --enable-gnutls --enable-sdl --enable-uuid
--enable-linux-aio --enable-libiscsi --disable-smartcard
--audio-drv-list=alsa --enable-spice --enable-usb-redir --enable-libusb
--disable-gtk --enable-xfsctl --enable-numa --disable-strip
--enable-jemalloc --disable-libnfs --disable-fdt

attached is an strace taken with qemu master and a mainline 4.9 kernel,
both running on Debian Jessie. I will try to test it with Fedora or
CentOS tomorrow.

journal in the VM says the following:

Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal Request [current]
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid field in cdb
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00 0d d6 51 48 00 08 00 00
Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232149320
Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29018921)
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018409
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018410
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018411
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018412
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018413
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018414
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018415
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018416
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018417
Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block 29018418
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal Request [current]
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid field in cdb
Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00 0d d6 59 48 00 08 00 00
Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev sda, sector 232151368
Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 starting block 29019177)
Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8
Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file data on sda1-8
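
For reading the CDB bytes in the kernel messages, a small decoder may help. This is a sketch rather than anything from the Qemu or kernel sources: `decode_write10` and `Write10` are illustrative names, and the 512-byte logical block size is an assumption about this disk. The field offsets follow the standard Write(10) layout (opcode 0x2a, 4-byte big-endian LBA at bytes 2-5, 2-byte big-endian transfer length at bytes 7-8).

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumption: the guest disk uses 512-byte logical blocks


@dataclass(frozen=True)
class Write10:
    lba: int     # starting logical block address
    blocks: int  # transfer length in logical blocks

    @property
    def nbytes(self) -> int:
        """Transfer size in bytes, under the SECTOR_SIZE assumption."""
        return self.blocks * SECTOR_SIZE


def decode_write10(cdb_hex: str) -> Write10:
    """Decode a Write(10) CDB given as hex bytes, e.g. copied from dmesg."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte Write(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8
    return Write10(lba, blocks)


if __name__ == "__main__":
    for cdb in ("2a 00 0d d6 51 48 00 08 00 00",
                "2a 00 0d d6 59 48 00 08 00 00"):
        w = decode_write10(cdb)
        print(cdb, "-> lba", w.lba, "blocks", w.blocks, "bytes", w.nbytes)
```

Decoded this way, the LBAs of the two failed writes equal the sector numbers that blk_update_request reports (232149320 and 232151368), and each request covers 2048 blocks, i.e. 1 MiB with 512-byte blocks.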


strace (with some random grep-ing):
[pid  1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument)
[pid  1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576, timeout=4294967295, flags=0x1, data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]}) = -1 EINVAL (Invalid argument)

Attachment: host-strace.gz
Description: application/gzip

