[Qemu-devel] [RFC PATCH 0/4] Virtio command timeouts (qemu part)


From: Hannes Reinecke
Subject: [Qemu-devel] [RFC PATCH 0/4] Virtio command timeouts (qemu part)
Date: Fri, 12 May 2017 12:20:53 +0200

Hi all,

we've run into a really awkward customer situation where the guest would
hang forever because an SG_IO ioctl on the host never returned.
Looking into it, we found that qemu submits direct I/O requests with
an _infinite_ timeout (well, actually UINT_MAX, which due to a kernel
bug gets translated into (ULONG)-2, resulting in a timeout of
4.2 years :-).
This particular I/O ran into a timeout on the wire due to a flaky
connection. As a result, the 'normal' block-level timeout on the host
was effectively disabled, and the SCSI stack never sent any aborts, as
the block layer was still waiting for the I/O timeout to expire.
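
To illustrate the difference between an unbounded and a bounded SG_IO
request, here is a minimal sketch (not code from this series) that fills
an sg_io_hdr with a finite timeout. The helper name
example_prepare_sg_io() and the 30 second fallback are assumptions made
up for this example; only the sg_io_hdr fields come from <scsi/sg.h>.

```c
#include <string.h>
#include <scsi/sg.h>

/* Hypothetical fallback for this sketch; not a value from the patches. */
#define EXAMPLE_TIMEOUT_MS 30000u

/*
 * Fill an SG_IO header for a command without data transfer, using a
 * finite timeout instead of UINT_MAX. sg_io_hdr.timeout is given in
 * milliseconds.
 */
static void example_prepare_sg_io(sg_io_hdr_t *hdr,
                                  unsigned char *cdb, unsigned char cdb_len,
                                  unsigned char *sense, unsigned char sense_len,
                                  unsigned int timeout_ms)
{
    memset(hdr, 0, sizeof(*hdr));
    hdr->interface_id = 'S';              /* required by the sg v3 interface */
    hdr->dxfer_direction = SG_DXFER_NONE; /* no data phase in this sketch */
    hdr->cmdp = cdb;
    hdr->cmd_len = cdb_len;
    hdr->sbp = sense;
    hdr->mx_sb_len = sense_len;
    /* Fall back to the example default when the caller passes 0. */
    hdr->timeout = timeout_ms ? timeout_ms : EXAMPLE_TIMEOUT_MS;
}
```

A caller would pass the prepared header to ioctl(fd, SG_IO, &hdr); with
a finite value in hdr.timeout the host block layer gets a chance to
expire the request and start error handling.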

Unfortunately I didn't find a way to create a stand-alone patch; the
fix I'm proposing relies on changes to both qemu running on the host
and the kernel running in the guest.

The proposed fix consists of several parts:
- Make the standard device timeout user-settable via a 'timeout'
  attribute to 'scsi-disk' and 'scsi-generic'.
- Add a kernel patch to implement an eh_timeout_handler() for
  virtio_scsi; this patch just checks if the command is still pending
  and resets the timer if so.
- Add a request timeout to allow drivers to modify the timeout
  on a per-request basis.
- Implement a new VIRTIO_SCSI_F_TIMEOUT feature allowing virtio-scsi
  to pass in a timeout via the otherwise unused 'crn' field.
- Add a kernel patch to implement the VIRTIO_SCSI_F_TIMEOUT feature
  so that the timeout is added per virtio request.
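
The guest-side timeout handling described in the list above can be
modelled roughly as follows. This is a stand-alone sketch and not the
actual kernel patch: the enum, the struct and example_timed_out() are
hypothetical stand-ins for the kernel's timeout-handler interface.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's timeout-handler return codes. */
enum example_eh_result {
    EXAMPLE_EH_RESET_TIMER,  /* keep waiting, re-arm the I/O timer */
    EXAMPLE_EH_NOT_HANDLED   /* let the regular error handling proceed */
};

/* Hypothetical per-command state for this sketch. */
struct example_cmd {
    bool pending;              /* still outstanding on the host? */
    unsigned int timer_resets; /* how often the timer was re-armed */
};

/*
 * Called when the guest I/O timer fires. If the command is still
 * pending on the host, the guest does not abort it but only re-arms
 * the timer; the host is expected to abort it and return it.
 */
static enum example_eh_result example_timed_out(struct example_cmd *cmd)
{
    if (cmd->pending) {
        cmd->timer_resets++;
        return EXAMPLE_EH_RESET_TIMER;
    }
    return EXAMPLE_EH_NOT_HANDLED;
}
```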

With that, virtio-scsi on the guest can pass the timeout it uses to
qemu on the host side, which can then use this timeout when issuing I/O
requests to the host.
The host can then properly abort a command if the timeout is hit, and
the aborted command will be returned to the guest.
The guest itself doesn't need to (and, in fact, in most cases can't) abort
any commands anymore, so it just needs to reset the I/O timer until the
requests are returned.
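
On the host side, the choice of timeout could look roughly like the
sketch below. It is not taken from the patches: the function name and
parameters are hypothetical, the unit is left to the caller, and
treating a request timeout of 0 as "not set" is an assumption of this
example.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Pick the timeout for one request on the host.
 *  - feature_negotiated: whether the guest negotiated
 *    VIRTIO_SCSI_F_TIMEOUT
 *  - dev_default: the user-settable per-device default
 *  - req_timeout: the value passed in by the guest; 0 is treated as
 *    "not set" (assumption of this sketch)
 */
static uint32_t example_effective_timeout(bool feature_negotiated,
                                          uint32_t dev_default,
                                          uint32_t req_timeout)
{
    if (feature_negotiated && req_timeout != 0) {
        return req_timeout;
    }
    return dev_default;
}
```

Falling back to the device default keeps guests without the new feature
working with a bounded timeout as well.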

However, as this is quite an elaborate construct, I'd like to get some
feedback on it.

Hannes Reinecke (4):
  scsi: make default command timeout user-settable
  scsi: use host default timeouts for SCSI commands
  scsi: per-request timeouts
  virtio: implement VIRTIO_SCSI_F_TIMEOUT feature

 hw/scsi/scsi-bus.c                           |  1 +
 hw/scsi/scsi-disk.c                          | 16 ++++++++++++----
 hw/scsi/scsi-generic.c                       | 11 +++++++++--
 hw/scsi/virtio-scsi.c                        | 16 ++++++++++++++++
 include/hw/scsi/scsi.h                       |  2 ++
 include/standard-headers/linux/virtio_scsi.h |  1 +
 6 files changed, 41 insertions(+), 6 deletions(-)

-- 
2.12.0



