Re: [Qemu-devel] drive_del vs. device_del: what should come first?


From: Heinz Graalfs
Subject: Re: [Qemu-devel] drive_del vs. device_del: what should come first?
Date: Wed, 02 Apr 2014 16:25:09 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.4.0

On 01/04/14 17:48, Markus Armbruster wrote:
> Heinz Graalfs <address@hidden> writes:
>
>> Hi Kevin,
>>
>> doing a
>>
>>       virsh detach-device ...
>>
>> ends up in the following QEMU monitor commands:
>>
>> 1. device_del ...
>> 2. drive_del ...
>>
>> qmp_device_del() performs the device unplug path.
>> In case of a block device do_drive_del() tries to
>> prevent further IO against the host device.
>>
>> However, bdrv_find() during drive_del() results in
>> an error, because the device is already gone. Due to
>> this error all the bdrv_xxx calls to quiesce the block
>> driver as well as all other processing is skipped.
>>
>> Is the sequence that libvirt triggers OK?
>> Shouldn't drive_del be executed first?
>
> No.

OK, I see. The drive is deleted implicitly (release_drive()).
After a device_del(), re-attaching the disk requires another
drive_add() AND device_add(). (Doing just a device_add() complains
about the missing drive. A subsequent info qtree makes QEMU abort.)
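
The bookkeeping described above can be sketched with a toy model. This is only an illustration, not QEMU code: ToyMachine and try_call are hypothetical names, and the only behaviour modelled is the assumption that unplugging a device implicitly releases its drive.

```python
class ToyMachine:
    """Hypothetical stand-in for the drive/device bookkeeping (not QEMU code)."""

    def __init__(self):
        self.drives = set()   # ids of host-side drives
        self.devices = {}     # device id -> id of the drive it uses

    def drive_add(self, drive_id):
        self.drives.add(drive_id)

    def device_add(self, device_id, drive_id):
        if drive_id not in self.drives:
            raise LookupError(f"drive {drive_id!r} not found")
        self.devices[device_id] = drive_id

    def device_del(self, device_id):
        # Assumption taken from the mail: the unplug implicitly
        # releases the drive that backed the device.
        drive_id = self.devices.pop(device_id)
        self.drives.discard(drive_id)

    def drive_del(self, drive_id):
        if drive_id not in self.drives:
            raise LookupError(f"drive {drive_id!r} not found")
        self.drives.remove(drive_id)


def try_call(fn, *args):
    """Run fn and report 'ok' or the lookup error as a string."""
    try:
        fn(*args)
        return "ok"
    except LookupError as exc:
        return f"error: {exc}"


vm = ToyMachine()
vm.drive_add("drive0")
vm.device_add("disk0", "drive0")
vm.device_del("disk0")                             # drive0 goes away too
print(try_call(vm.drive_del, "drive0"))            # lookup fails
print(try_call(vm.device_add, "disk0", "drive0"))  # lookup fails
vm.drive_add("drive0")
print(try_call(vm.device_add, "disk0", "drive0"))  # ok after re-adding the drive
```

In this model a drive_del issued after device_del has nothing left to look up, which mirrors the failing bdrv_find() from the original report, and a plain device_add fails until the drive has been added again.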


> drive_del is nasty.  Its purpose is to revoke access to an image even
> when the guest refuses to cooperate.  To the guest, this looks like
> hardware failure.

Deleting a device during active IO is nasty and it should look like a
hardware failure. I would expect lots of errors.


> If you drive_del before device_del, even a perfectly well-behaved
> guest is exposed to a terminally broken device between drive_del and
> completion of unplug.

The early drive_del() would mean that no further IO would be
possible.


> Always try a device_del first, and only if that does not complete within
> reasonable time, and you absolutely must revoke access to the image,
> then whack it over the head with drive_del.

What is this reasonable time?
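
The escalation order suggested above (device_del first, drive_del only as a last resort) could look roughly like the following sketch. The monitor object and its device_del, drive_del and device_present methods are a hypothetical interface invented for this example, not a real QEMU or libvirt API. The thread does not give a value for the timeout, so it is left as a caller-supplied parameter.

```python
import time


def detach(mon, device_id, drive_id, timeout_s, must_revoke,
           poll_s=0.1, clock=time.monotonic, sleep=time.sleep):
    """Ask for a cooperative unplug first; force drive_del only if required.

    `mon` is a hypothetical monitor wrapper offering device_del(),
    drive_del() and device_present().
    """
    mon.device_del(device_id)
    deadline = clock() + timeout_s
    while clock() < deadline:
        if not mon.device_present(device_id):
            return "unplugged"
        sleep(poll_s)
    if not mon.device_present(device_id):
        return "unplugged"
    if must_revoke:
        # Last resort: to the guest this looks like a hardware failure.
        mon.drive_del(drive_id)
        return "revoked"
    return "pending"


class FakeMonitor:
    """Test double: a cooperating guest completes the unplug immediately."""

    def __init__(self, guest_cooperates):
        self.guest_cooperates = guest_cooperates
        self.present = True
        self.log = []

    def device_del(self, device_id):
        self.log.append(("device_del", device_id))
        if self.guest_cooperates:
            self.present = False

    def drive_del(self, drive_id):
        self.log.append(("drive_del", drive_id))

    def device_present(self, device_id):
        return self.present
```

With a cooperating guest the function returns "unplugged" and drive_del is never issued. With a guest that ignores the unplug request it returns "revoked" after the timeout if must_revoke is set, and "pending" otherwise.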

On 390 we experience problems (QEMU abort) when asynchronous block IO
completes after the virtqueues are already gone. I suppose the
bdrv_drain_all() in bdrv_close() comes a little late. I don't see such
problems with an early bdrv_drain_all() (drive_del) and an unplug
(device_del) afterwards.
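
The ordering problem suspected above can be illustrated with a toy model. This is not QEMU code: ToyBlockDevice and its methods are hypothetical, and the model only encodes the assumption that a completion needs the virtqueue to still exist.

```python
class ToyBlockDevice:
    """Hypothetical model: in-flight requests complete into a virtqueue."""

    def __init__(self):
        self.virtqueue = []   # completed requests are placed here
        self.in_flight = []   # submitted but not yet completed

    def submit(self, request):
        self.in_flight.append(request)

    def drain(self):
        # Complete everything still in flight (stands in for a drain call).
        while self.in_flight:
            request = self.in_flight.pop(0)
            if self.virtqueue is None:
                raise RuntimeError(
                    f"completion of {request!r} after virtqueue teardown")
            self.virtqueue.append(request)

    def unplug(self):
        # Tear down the guest-visible queue.
        self.virtqueue = None


def drain_then_unplug(dev):
    dev.drain()
    dev.unplug()
    return "ok"


def unplug_then_drain(dev):
    dev.unplug()
    try:
        dev.drain()
    except RuntimeError as exc:
        return f"abort: {exc}"
    return "ok"
```

In this model, draining before the unplug completes every request while the queue still exists. Draining after the unplug hits a completion with no queue to deliver it to, which is the kind of failure described above.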





