qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] BlockDriverState member in_use (was: KVM Forum 2012 Block B


From: Markus Armbruster
Subject: [Qemu-devel] BlockDriverState member in_use (was: KVM Forum 2012 Block BoF minutes)
Date: Mon, 19 Nov 2012 18:03:55 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.1 (gnu/linux)

Marcelo Tosatti <address@hidden> writes:

> On Thu, Nov 15, 2012 at 02:31:53PM +0100, Markus Armbruster wrote:
[...]
>> BlockDriverState member in_use and DriveInfo member refcount are a mess:
>> 
>> - in_use is used to avoid running certain things concurrently, but the
>>   rules are unclear, except they're clearly incomplete; possible rules
>>   could be about need for consistent views of image contents
>
>     block: enable in_use flag
>
>     Set block device in use during block migration, disallow drive_del and
>     bdrv_truncate for in use devices.
>
> 1) drive_del is not allowed because memory object used by block
> streaming/migration can disappear.
> 2) bdrv_truncate changes device size, which the block migration code was
> unable to handle at the time.
>
> These are the rules (accordingly to the changeset).

Yup.  Your commits db593f25^..8591675f.

>                                                     IIRC new rules have
> been added (new uses for bdrv_in_use).

Stefan's commit 2d3735d3.

Let's have a look at the users of in_use (from notes I took back in
August, possibly stale now):

* bdrv_commit()

  Since commit 2d3735d3:

    block: check bdrv_in_use() before blockdev operations
    
    Long-running block operations like block migration and image streaming
    must have continual access to their block device.  It is not safe to
    perform operations like hotplug, eject, change, resize, commit, or
    external snapshot while a long-running operation is in progress.
    
    This patch adds the missing bdrv_in_use() checks so that block migration
    and image streaming never have the rug pulled out from underneath
    them.

  Users are monitor command commit and terminal escape 's', directly and
  via bdrv_commit_all()

  What exactly would go wrong if someone commited during block migration
  / image streaming?

  Strange: also tests whether backing_hd is in_use.  The other users
  don't, and I can't see how the test could ever succeed.

  Cock-up: bdrv_commit_all() stops on first error.

* bdrv_truncate()

  Since commit 8591675f:
  
    block: enable in_use flag
    
    Set block device in use during block migration, disallow drive_del and
    bdrv_truncate for in use devices.

  Users are monitor command block_resize and a bunch of block drivers.

  I don't think block drivers are affected by in_use.  They resize their
  internal BDSs such as bs->file, and I can't see how in_use could be
  set there.

  "No resize while block migration is running" is (was?) an artificial
  restriction; it should migrate the resize just like any other change.

  Image streaming, too, I guess.

* block_job_create()

  Since commit eeec61f2:

    block: add BlockJob interface for long-running operations

  Use is monitor command block-stream.

  Why does the block job infrastructure set in_use?  I suspect just
  because its only user "image streaming" needs it, but future users
  might not.  If that's correct, it's set in the wrong place.

* qmp_transaction()

  Since commit 2d3735d3 (see above).  Function was called
  qmp_blockdev_snapshot_sync() then.

  User is monitor command snapshot_blkdev.

  What exactly would go wrong if someone snaphotted during block
  migration / image streaming?

* eject_device()

  Since commit 2d3735d3 (see above).

  Users are monitor command eject, change.

  What would go wrong is obvious, for a change: we can't just kill the
  BDS while a long-running operation is using it.

  Could we just disassociate the BDS from the drive without killing it?
  So that when the job completes, the BDS's reference count drops to
  zero, and the BDS gets destroyed.

* drive_del

  Since commit 8591675f (see above).

  In theory, in_use is unnecessary here: reference counting should delay
  the delete just fine.  In practice, do_drive_del() *ignores* the
  reference count.  Even if that was fixed, the count only delays
  drive_uninit(), not bdrv_drain_all()..bdrv_close().  Another fine
  mess.

  Note how we treat the device model's reference to the drive: if it
  exists, we make the drive anonymous instead of destroying it.  It gets
  destroyed only when the device model drops the reference.  Similar to
  reference counting.

Tests of in_use are spread over block.c and monitor commands.  Smells
bad.  If it was only about excluding monitor commands during certain
long-running operations, the montor commands would be the appropriate
place.  Is it?

> "Consistent views of image contents" is not defined.

Imprecise language for "in_use seems to be about changes to the image,
while refcount is for keeping objects alive".  We were trying to make
sense of the mess.

> Question: why are the rules "clearly incomplete" ?

Are we sure we catch everything that could interfere with block
migration / image streaming?  Why only these two?  What about other
long-running jobs?  Do they need interference protection, too?

Shouldn't there be clear rules on what changes can happen while a job
runs, and which ones can be prevented, and how?

>> - refcount is only used together with in_use, and appears to be for
>>   avoiding premature deletion
>
> Refcount is used to stop block auto deletion when device is unplugged
> by the guest. If you can reach BlockDriverState and use its in_use
> flag, then you can kill refcount.

Except it'll come right back when we switch to QOM :)



reply via email to

[Prev in Thread] Current Thread [Next in Thread]