Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1
From: Paolo Bonzini
Subject: Re: [Qemu-devel] [Qemu-stable] Data corruption in Qemu 2.7.1
Date: Wed, 18 Jan 2017 17:30:17 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
On 18/01/2017 17:19, Fabian Grünbichler wrote:
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Sense Key : Illegal
> Request [current]
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 Add. Sense: Invalid
> field in cdb
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#109 CDB: Write(10) 2a 00
> 0d d6 51 48 00 08 00 00
> Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev
> sda, sector 232149320
> Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1):
> ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0
> starting block 29018921)
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018409
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018410
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018411
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018412
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018413
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018414
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018415
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018416
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018417
> Jan 18 17:07:51 ubuntu kernel: Buffer I/O error on device sda1, logical block
> 29018418
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 FAILED Result:
> hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Sense Key : Illegal
> Request [current]
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 Add. Sense: Invalid
> field in cdb
> Jan 18 17:07:51 ubuntu kernel: sd 2:0:0:0: [sda] tag#106 CDB: Write(10) 2a 00
> 0d d6 59 48 00 08 00 00
> Jan 18 17:07:51 ubuntu kernel: blk_update_request: critical target error, dev
> sda, sector 232151368
> Jan 18 17:07:51 ubuntu kernel: EXT4-fs warning (device sda1):
> ext4_end_bio:329: I/O error -121 writing to inode 125 (offset 0 size 0
> starting block 29019177)
> Jan 18 17:07:52 ubuntu kernel: JBD2: Detected IO errors while flushing file
> data on sda1-8
> Jan 18 17:07:58 ubuntu kernel: JBD2: Detected IO errors while flushing file
> data on sda1-8
>
>
> strace (with some random grep-ing):
> [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6,
> 51, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=17, dxfer_len=1048576,
> timeout=4294967295, flags=0x1,
> data[1048576]=["\0`\235=c\177\0\0\0\0\1\0\0\0\0\0\0`\236=c\177\0\0\0\0\1\0\0\0\0\0"...]})
> = -1 EINVAL (Invalid argument)
> [pid 1794] ioctl(19, SG_IO, {'S', SG_DXFER_TO_DEV, cmd[10]=[2a, 00, 0d, d6,
> 59, 48, 00, 08, 00, 00], mx_sb_len=252, iovec_count=16, dxfer_len=1048576,
> timeout=4294967295, flags=0x1,
> data[1048576]=["\0`-=c\177\0\0\0\0\1\0\0\0\0\0\0`.=c\177\0\0\0\0\1\0\0\0\0\0"...]})
> = -1 EINVAL (Invalid argument)
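The CDB bytes in the quoted log and strace output can be decoded to check that both describe the same request. The following is a minimal sketch, not taken from the thread: the helper name `decode_write10` is hypothetical, and the 512-byte logical block size is an assumption (it is consistent with dxfer_len=1048576 in the strace output).

```python
import struct

WRITE_10 = 0x2A  # SCSI WRITE(10) opcode, the first CDB byte in the log


def decode_write10(cdb: bytes):
    """Return (lba, blocks) from a 10-byte WRITE(10) CDB.

    Hypothetical helper for illustration only.
    Layout: opcode, flags, 4-byte big-endian LBA, group number,
    2-byte big-endian transfer length, control.
    """
    if len(cdb) != 10 or cdb[0] != WRITE_10:
        raise ValueError("not a WRITE(10) CDB")
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return lba, blocks


# CDB from the first failing request: 2a 00 0d d6 51 48 00 08 00 00
lba, blocks = decode_write10(bytes.fromhex("2a000dd6514800080000"))

# Assumption: 512-byte logical blocks.
print(lba, blocks, blocks * 512)
```

The decoded LBA is 232149320, the sector reported by blk_update_request, and 2048 blocks of 512 bytes give the 1048576-byte transfer seen in the SG_IO call, so each failing request is a single 1 MiB write.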
This is useful, thanks. I suspect blk_rq_map_user_iov() is failing in the
host kernel, which would explain the EINVAL from SG_IO: the scatter/gather
list has too many segments for the host's HBA. (The limit can be found in
/sys/block/sda/queue/max_segments.)
This is consistent with your finding here:
> disabling THP on the hypervisor host with
>
> # echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
>
> allows reproducing the bug very reliably, shutting the VM down, then
> enabling THP (with 'always') and trying again makes it go away.
because no THP means more memory fragmentation and thus more segments.
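To put rough numbers on that, here is a minimal sketch under stated assumptions, not a statement of what the kernel actually computes. It assumes 4 KiB base pages and 2 MiB huge pages (the usual x86_64 sizes), and it assumes the worst case in which no two pages of the buffer are physically contiguous; the kernel merges contiguous pages, so the real count can be lower. The function names are hypothetical.

```python
from pathlib import Path


def worst_case_segments(length: int, page_size: int, offset: int = 0) -> int:
    """Upper bound on scatter/gather segments for a buffer of `length` bytes
    starting `offset` bytes into a page, assuming no two pages are
    physically contiguous (hypothetical helper)."""
    first_page = offset // page_size
    last_page = (offset + length - 1) // page_size
    return last_page - first_page + 1


def read_max_segments(dev: str = "sda", sysfs_root: str = "/sys/block"):
    """Read the limit from <sysfs_root>/<dev>/queue/max_segments.
    Returns None if the file does not exist or cannot be read."""
    path = Path(sysfs_root) / dev / "queue" / "max_segments"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


DXFER_LEN = 1048576  # dxfer_len from the strace output

# Assumed page sizes: 4 KiB without THP, 2 MiB with THP.
without_thp = worst_case_segments(DXFER_LEN, 4096)
with_thp = worst_case_segments(DXFER_LEN, 2 * 1024 * 1024)
print(without_thp, with_thp)  # 256 1
```

On the host, comparing `read_max_segments("sda")` with the first number shows whether a fully fragmented 1 MiB buffer could exceed the limit; the device name is whatever backs the passed-through disk on that host.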
I'm not sure how to fix it, unfortunately. :(
Paolo