From: Fabian Grünbichler
Subject: Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1
Date: Thu, 19 Jan 2017 12:59:58 +0100
User-agent: NeoMutt/20161126 (1.7.1)

On Wed, Jan 18, 2017 at 05:30:17PM +0100, Paolo Bonzini wrote:
> 
> 
> On 18/01/2017 17:19, Fabian Grünbichler wrote:
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result: 
> > hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : 
> > Illegal Request [current]
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: 
> > Invalid field in cdb
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 
> > 00 0d d6 51 48 00 08 00 00
> > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, 
> > dev sda, sector 232149320
> > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): 
> > ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 
> > starting block 29018921)
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018409
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018410
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018411
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018412
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018413
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018414
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018415
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018416
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018417
> > Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical 
> > block 29018418
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result: 
> > hostbyte=DID_OK driverbyte=DRIVER_SENSE
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : 
> > Illegal Request [current]
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: 
> > Invalid field in cdb
> > Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 
> > 00 0d d6 59 48 00 08 00 00
> > Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, 
> > dev sda, sector 232151368
> > Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1): 
> > ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0 
> > starting block 29019177)
> > Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file 
> > data on sda1-8
> > Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file 
> > data on sda1-8
> > 
> > 
> > strace (with some random grep-ing):
> > [pid  1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, 
> > d6, 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, 
> > dxfer_len=1048576, timeout=4294967295, flags=0x1, 
> > data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]})
> >  = -1 EINVAL (Invalid argument)
> > [pid  1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, 
> > d6, 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, 
> > dxfer_len=1048576, timeout=4294967295, flags=0x1, 
> > data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]})
> >  = -1 EINVAL (Invalid argument)
> 
> This is useful, thanks.  I suspect blk_rq_map_user_iov is failing,
> meaning that the scatter/gather list has too many segments for the HBA
> in the host.  (The limit can be found in /sys/block/sda/queue/max_segments).

the limit is 168 for all the disks I tested with.
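
for reference, the value can be read per device (sda here is just an example,
168 is what I see on my disks):

# cat /sys/block/sda/queue/max_segments
168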

> 
> This is consistent with your finding here:
> 
> > disabling THP on the hypervisor host with
> > 
> > # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
> > 
> > allows reproducing the bug very reliably, shutting the VM down, then
> > enabling THP (with 'always') and trying again makes it go away.
> 
> because no THP means more memory fragmentation and thus more segments.

it is also very easily reproducible with both THP enabled and defrag set
to madvise or always, when tested under fragmented or low-memory conditions.
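
the settings I am referring to are the usual sysfs knobs, e.g. with madvise
selected (output illustrative, the exact defrag option list depends on the
kernel version):

# cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
# cat /sys/kernel/mm/transparent_hugepage/defrag
always defer [madvise] never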

my test host has 64G of memory, my test VM 4G, and huge pages are 2 MiB (2048 kB) in size.

if I simulate some memory load by repeatedly reserving 50G of memory
(50 workers of 1G each) using stress-ng:

# stress-ng --vm 50 --vm-bytes=1G --vm-hang 30

and then start the test VM and the dd write load, I can watch the big chunk
of AnonHugePages allocated to the qemu process grow:

# grep -E 'AnonHugePages:[[:space:]]+[0-9]{5,} kB' /proc/$(pidof qemu-system-x86_64)/smaps

up to about 3G (of 4G), and hit the issue.
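
the matching smaps lines look like this (value illustrative only):

AnonHugePages:   3145728 kB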

without the additional load and fragmentation from stress-ng, the
AnonHugePages allocated to the qemu process grow to the expected 4G, and
the issue does not occur.

> 
> I'm not sure how to fix it, unfortunately. :(

so this means either using non-transparent (static) huge pages when using
scsi-block (haven't verified, but that should work?), or using aggressive THP
settings and/or always leaving enough memory reserves? :-/ this is very
unfortunate IMHO (and probably also not a very realistic usage
scenario?)
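
the static hugepage route would presumably be something like this (untested
sketch; page count sized for my 4G test VM, paths just examples):

# echo 2048 > /proc/sys/vm/nr_hugepages
# mount -t hugetlbfs hugetlbfs /dev/hugepages
# qemu-system-x86_64 -m 4096 -mem-path /dev/hugepages [...]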

> 
> Paolo
> 



