Thanks for collecting the data!

The fact that both virtio-blk and virtio-scsi failed suggests it's not a
virtqueue element leak in the virtio-blk or virtio-scsi device emulation
code.

The hung task error messages from inside the guest are a consequence of
QEMU hitting the "Virtqueue size exceeded" error. QEMU refuses to
process further requests after the error, causing tasks inside the guest
to get stuck on I/O.

I don't have a good theory regarding the root cause. Two ideas:
1. The guest is corrupting the vring or submitting more requests than
   will fit into the ring. Somewhat unlikely because it happens with
   both Windows and Linux guests.
2. QEMU's virtqueue code is buggy, maybe the memory region cache which
   is used for fast guest RAM accesses.

Here is an expanded version of the debug patch which might help identify
which of these scenarios is likely. Sorry, it requires running the guest
again!

This time let's make QEMU dump core so both QEMU state and guest RAM are
captured for further debugging. That way it will be possible to extract
more information using gdb without rerunning.

Stefan
---
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index a1ff647a66..28d89fcbcb 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -866,6 +866,7 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
         return NULL;
     }
     rcu_read_lock();
+    uint16_t old_shadow_avail_idx = vq->shadow_avail_idx;
     if (virtio_queue_empty_rcu(vq)) {
         goto done;
     }
@@ -879,6 +880,12 @@ void *virtqueue_pop(VirtQueue *vq, size_t sz)
     max = vq->vring.num;
 
     if (vq->inuse >= vq->vring.num) {
+        fprintf(stderr, "vdev %p (\"%s\")\n", vdev, vdev->name);
+        fprintf(stderr, "vq %p (idx %u)\n", vq, (unsigned int)(vq - vdev->vq));
+        fprintf(stderr, "inuse %u vring.num %u\n", vq->inuse, vq->vring.num);
+        fprintf(stderr, "old_shadow_avail_idx %u last_avail_idx %u avail_idx %u\n", old_shadow_avail_idx, vq->last_avail_idx, vq->shadow_avail_idx);
+        fprintf(stderr, "avail %#" HWADDR_PRIx " avail_idx (cache bypassed) %u\n", vq->vring.avail, virtio_lduw_phys(vdev, vq->vring.avail + offsetof(VRingAvail, idx)));
+        fprintf(stderr, "used_idx %u\n", vq->used_idx);
+        abort(); /* <--- core dump! */
         virtio_error(vdev, "Virtqueue size exceeded");
         goto done;
     }
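
For reading the numbers the patch prints, here is a rough sketch of how they might be compared to tell the two ideas apart. The helper names are hypothetical and not part of QEMU. It assumes the indices are free-running 16-bit counters, so distances have to be computed modulo 65536. Both checks are hints only: the guest may publish new buffers between the cached read and the cache-bypassing read, so a mismatch alone does not prove a bug in the memory region cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for interpreting the debug output; not QEMU code. */

/* Distance from 'from' to 'to' on a free-running 16-bit ring index.
 * The cast truncates modulo 65536, so index wraparound is handled. */
static uint16_t vring_idx_delta(uint16_t to, uint16_t from)
{
    return (uint16_t)(to - from);
}

/* Hint for idea 1: the guest has made more buffers available than the
 * ring can hold (avail_idx ran more than 'num' entries ahead of used_idx). */
static bool guest_overfilled_ring(uint16_t avail_idx, uint16_t used_idx,
                                  uint16_t num)
{
    return vring_idx_delta(avail_idx, used_idx) > num;
}

/* Hint for idea 2: the cached avail index differs from the value read
 * directly from guest RAM. Assumption: both values were sampled close
 * together; a racing guest update can also cause a difference. */
static bool cache_looks_stale(uint16_t shadow_avail_idx,
                              uint16_t avail_idx_bypassed)
{
    return shadow_avail_idx != avail_idx_bypassed;
}
```

For example, with vring.num 256, the "avail_idx (cache bypassed)" value and used_idx from the output would go into guest_overfilled_ring(), while avail_idx and the cache-bypassed value would go into cache_looks_stale().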