Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang

qemu-devel

From:	Kevin Wolf
Subject:	Re: [RFC PATCH 0/7] block-backend: Introduce I/O hang
Date:	Mon, 28 Sep 2020 12:57:11 +0200

On 27.09.2020 at 15:04, Ying Fang wrote:
> A VM in a cloud environment may use a virtual disk as its backend storage,
> and there are usually filesystems on the virtual block device. When the
> backend storage is temporarily down, any I/O issued to the virtual block
> device will cause an error. For example, an error in an ext4 filesystem
> would make the filesystem read-only. However, cloud backend storage can
> soon be recovered.
> For example, an IP-SAN may be down due to a network failure and will be back
> online soon after the network recovers. The error in the filesystem may not
> be recoverable without a device reattach or system restart. So I/O
> rehandling is needed to implement a self-healing mechanism.
> 
> This patch series proposes a feature called I/O hang. It can rehandle AIOs
> that fail with EIO without sending the error back to the guest. From the
> guest's perspective, it is just as if an I/O is hanging and has not returned.
> With this feature enabled, the guest can resume running smoothly once I/O
> has recovered.

What is the problem with setting werror=stop and rerror=stop for the
device? Is it that QEMU won't automatically retry, but management tool
interaction is required to resume the guest?
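
To make the alternative concrete, here is a minimal sketch of the management
tool side, assuming the device was started with werror=stop,rerror=stop and
that the tool reads QMP events as one JSON object per line. The helper names
(qmp_command, resume_decisions) are hypothetical and not part of QEMU. A real
tool would also have to complete the qmp_capabilities handshake and would
probably wait or back off before resuming, which this sketch leaves out.

```python
import json


def qmp_command(name, **arguments):
    """Encode one QMP command as a single JSON line (hypothetical helper)."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg) + "\n"


def resume_decisions(lines):
    """Yield a 'cont' command for every BLOCK_IO_ERROR event that stopped the VM.

    `lines` is any iterable of JSON text lines as read from a QMP monitor.
    Events with action 'report' or 'ignore' do not stop the guest, so they
    are skipped, as are unrelated events and blank lines.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        if msg.get("event") != "BLOCK_IO_ERROR":
            continue
        if msg.get("data", {}).get("action") == "stop":
            # Assumption: resuming immediately is acceptable. The guest device
            # then resubmits the failed request when the VM continues.
            yield qmp_command("cont")
```

With this approach the retry policy lives outside QEMU, so the QEMU process
itself stays responsive while the guest is stopped.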

I haven't checked your patches in detail yet, but implementing this
functionality in the backend means that blk_drain() will hang (or if it
doesn't hang, it doesn't do what it's supposed to do), making the whole
QEMU process unresponsive until the I/O succeeds again. Among other things,
this would make it impossible to migrate away from a host with storage
problems.
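
The drain concern can be illustrated with a toy model. This is not QEMU code:
ToyBackend, poll() and the max_polls cap are assumptions made up for the
sketch, and the real blk_drain() has no such cap, which is exactly why it
would hang. The model only shows that a backend which silently holds failed
requests never reaches "nothing in flight" while storage is down, whereas one
that completes them with an error does.

```python
import errno


class ToyBackend:
    """Toy stand-in for a block backend; not QEMU's BlockBackend."""

    def __init__(self, hold_failed_requests):
        self.hold_failed_requests = hold_failed_requests
        self.storage_up = False   # storage starts out unavailable
        self.pending = []         # requests still in flight
        self.completed = []       # (request, result) pairs

    def submit(self, request):
        self.pending.append(request)

    def poll(self):
        """One event-loop iteration: try every in-flight request once."""
        still_pending = []
        for request in self.pending:
            if self.storage_up:
                self.completed.append((request, 0))
            elif self.hold_failed_requests:
                # "I/O hang" behaviour: keep the request and retry later.
                still_pending.append(request)
            else:
                # Complete with an error and let the caller decide.
                self.completed.append((request, -errno.EIO))
        self.pending = still_pending

    def drain(self, max_polls):
        """Poll until nothing is in flight; False if the cap was reached."""
        for _ in range(max_polls):
            if not self.pending:
                return True
            self.poll()
        return not self.pending
```

In this model, drain() on a backend created with hold_failed_requests=True
only returns True after storage_up has been set, which corresponds to QEMU
being stuck until the I/O succeeds again.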

Kevin
