[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name
From: |
Markus Armbruster |
Subject: |
Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name |
Date: |
Thu, 04 Dec 2014 16:56:18 +0100 |
User-agent: |
Gnus/5.13 (Gnus v5.13) Emacs/24.3 (gnu/linux) |
Kevin Wolf <address@hidden> writes:
> [ CCed Benoît and Max, this is blockdev work ]
> [ CCed Jeff, we're also talking about op blockers ]
>
> Not stripping quoted text for their convenience.
>
> Am 02.12.2014 um 20:06 hat Markus Armbruster geschrieben:
>> = Introduction =
>>
>> The block layer and its monitor commands have evolved, resulting in a
>> bit of a mess when it comes to referring to block layer objects in
>> commands.
>>
>> Semantically, we refer to two different kinds of objects: backends and
>> nodes within backends.
>>
>> Example: eject parameter @device names a backend.
>>
>> Example: change-backing-file parameter @image-node-name names a node.
>>
>> Backend and node names share a name space. Names are unique.
>> Corollary: a name unambiguously identifies either a backend, or a node,
>> or nothing.
>>
>> Example: blockdev-add parameter @options is a union with discriminator
>> "driver". With its value is "raw", member "file" is an anonymous union.
>> Its string value can name either a backend or a node.
>>
>> Node names are a fairly recent feature. Before, we referenced nodes by
>> their file name. Pretty schlocky. Should be replaced by node name.
>>
>> Example: block-commit parameter @base is the "file name of the backing
>> image to write data into". In other words, it identifies a node. We
>> should add a node name parameter, and deprecate @base.
>>
>> Some commands predating the node name feature can reference only
>> backends even though nodes could make sense, too. Such restrictions
>> should be lifted. Others reference backends where only nodes make
>> sense. Should be cleaned up.
>>
>> Example: drive-mirror parameter @device names a backend. This restricts
>> mirroring to backends. If we want to support mirroring nodes, we need
>> to extend the command to permit referencing nodes.
>>
>> Example: block_passwd parameter @device names a backend. However,
>> backends aren't encryped, nodes are. In 2.0, we cleaned this up: we
>> added parameter @node-name. We kept @device for backward compatibility.
>> Either @device or @node-name must be given.
>>
>> Note: in block_passwd, we have separate parameters for backend and node
>> name. In the blockdev-add example above, we have only one, and use its
>> value to figure out what it is.
>>
>> I find this inconsistency rather ugly. We discussed cleaning it up in
>> the "BB name vs. BDS name in the QAPI schema" thread.
>>
>> A backend has a tree of nodes. We call the tree's root the backend's
>> root node.
>>
>> For convenience, we sometimes accept a backend name when a node name is
>> wanted, and resolve it to the backend's root node.
>>
>> Example: block_passwd again; it's how its @device parameter works.
>>
>> Recent development (blockdev-add) permits nodes to be shared by multiple
>> backends. Op blockers should ensure the sharing is safe. I wouldn't
>> bet a nickel on this to work today.
>>
>>
>> = Block layer data structures =
>>
>> A backend is represented by a BlockBackend object (short: BB).
>>
>> A node is represented by a BlockDriverState object (short: BDS).
>>
>> A backend (BB) has a tree of nodes (BDSes).
>>
>> Two of a nodes children carry special meaning: the one stored in BDS
>> member file (sometimes called "parent"), and the one stored in BDS
>> member backing_hd (sometimes called "backing"). These used to be the
>> only children a node could have.
>>
>> A BB always has a name. We sometimes call it device name. Stored in BB
>> member name.
>>
>> A BDS may have a name. We call it node name. Stored in BDS member
>> node_name[], empty when the node has no name.
>>
>> A BDS has a file name. The actual matching of a command's file name
>> argument against a BDS's file name is complex, but we don't have to
>> worry about that here.
>>
>> BBs are a fairly recent invention. Before, a backend was represented by
>> a BDS with a non-empty member device_name[].
>>
>>
>> = Commands =
>>
>> I'm going to discuss all QMP commands whose handlers are defined in
>> block*. To make sense of it, you'll probably have to peruse the command
>> documentation in the schema and/or qmp-commands.hx.
>>
>> When I think it's fairly obvious what needs to be done for a command, I
>> write it down as TODO. Please challenge it if you think I'm wrong.
>>
>> When it's not so obvious, I write it down as questions. Answers
>> appreciated.
>>
>> == block-core.json ==
>>
>> * block-commit
>>
>> @device names a backend, @top and @base each name one of its nodes via
>> file name matching.
>>
>> TODO: support specifying the two nodes via node name.
>>
>> Why do we need @device when a top node is specified?
>>
>> * We currently need the backend to set op blockers, and finding a
>> node's backend isn't easy. Implementation detail shows through the
>> interface, blech.
>>
>> * We need to know whether the top node is the active layer.
>>
>> Complication: if it's shared by multiple backends, it may be the
>> active layer in one but not the other. Specifying the backend
>> resolves the ambiguity. Whether that makes sense I don't know.
>
> It doesn't.
>
> The real requirement for the commit job (for inactive layers) is that
> base...top (I'm using git-rev-parse syntax for describing nodes and
> subtrees from here on) is read-only for the duration of the commit
> operation. For committing the active layer, we use a mirror job
> internally, which works for images that are written during the
> operation.
>
> In theory, the mirror should work in all cases, it's just less efficient
> (and the implementation of the completion code which reconfigures the
> tree would have to be changed). Jeff, is this correct?
>
> What commit should be doing:
> - Check whether it can block (as in op blockers) writes to top, i.e. no
> other user is currently requiring the ability to write
> - If it can, block writes on base...top and start a commit job
> - If it can't, block writes on base...top^ and start a mirror job
> - If it can't block at least base...top^, fail
>
> This automatically solves the problem of multiple parents, as long as
> these parents advertise correctly which op blocker capabilities they are
> using (no, they don't, and we don't have the infrastructure for them to
> do so yet - but I think it's becoming clearer what it would have to look
> like).
Makes sense.
>> * block-job-cancel
>> * block-job-complete
>> * block-job-pause
>> * block-job-resume
>> * block-job-set-speed
>>
>> @device names a backend.
>>
>> The question whether we need to support a node name here is moot,
>> because jobs shouldn't be tied to a single backend (or node) in the
>> first place. They should have their own, independent job name.
>
> Correct.
>
> TODO: add a @name argument to commands starting block jobs
> TODO: add a @job-name argument to these job management commands and
> deprecate @device
>
> How to get rid of bs->job to actually allow multiple jobs on one BDS and
> make the job IDs useful is a separate question that is harder to answer.
Good point: we can fix the interface before we fix the implementation.
>> * block-stream
>>
>> @device names a backend, @base names a node via file name matching.
>>
>> TODO: support specifying the node via node name.
>>
>> I think backend is inappropriate here, we should support streaming up
>> to a node, just like block-commit supports committing down from a
>> node.
>
> Yes. I think Benoît suggested this before. I don't remember if he already
> coded up something.
>
>> * block_passwd
>> * block_resize
>>
>> @node-name names a node.
>>
>> @device names a backend, and references its root node, for
>> compatibility.
>>
>> Either @device or @node-name must be given.
>>
>> TODO: should have single mandatory parameter instead of two optionals.
>
> How is this "fairly obvious what needs to be done"? We can't get rid of
> any field because of compatibility. Do you mean this:
>
> TODO: Make @device accept node names as well
Obviousness is subjective.
I like to think about a nice, clean interface first, and worry about
compatibility second. Compatibility may force us to compromise on
cleanliness. I feel this helps finding the least unclean interface.
Furthermore, I like to deprecate unwanted interfaces. Advertize the
clean core, not the compatibility gunk.
Let's try this approach here.
I think we can agree that the clean interface is "single mandatory
parameter". Ironically, that's what we had until we screwed it up in
2.0.
@device is a sub-optimal name for this single parameter. Either we
accept that and move on, or we deprecate it in favor of a new parameter
with a better name. I guess the better name isn't worth that much
trouble, in particular since the command name is already ugly.
When @node-name is given, @device must not be given. So @device is
mandatory *except* in the @node-name usage we retain for compatibility.
Deprecate the usage.
Command documentation could then look like this:
##
# @block-resize
#
# Resize a block image while a guest is running.
#
# @device: the name of the block backend or node to resize (node names
# supported since 2.3)
#
# @size: new image size in bytes
#
# Deprecated usage (since 2.3):
#
# @device: #optional the name of the block backend to resize
#
# @node-name: #optional name of the node to resize (since 2.0)
#
# Either @device or @node-name must be set but not both.
#
# Since: 0.14.0
>> * blockdev-snapshot-sync
>>
>> @node-name and @device as for block_passwd, including TODO.
>>
>> @snapshot-node-name defines the new node's node name.
>>
>> * block_set_io_throttle
>>
>> @device names a backend.
>>
>> TODO: support nodes.
>>
>> Note: we'd like to redo throttling as a block filter node, so maybe
>> we'll have a completely different command instead.
>>
>> * blockdev-add
>>
>> This command defines a backend and its node tree, where sub-trees may
>> be references to existing nodes.
>>
>> @id defines the backend's name.
>>
>> @node-name defines its root node's node name.
>>
>> Note: blockdev-add always defines a backend. When you build up a
>> backend bottom-up with several commands, you get a bunch of unwanted
>> backends "in the middle". I'd like to make @id optional, and omit the
>> backend when it's missing.
>
> Yes, this makes sense, as we discussed privately a while ago.
>
>> * change-backing-file
>>
>> @device names a backend, @image-node-name names a node.
>>
>> As far as I can tell, we need the backend only to set op blockers.
>> Implementation detail shows through the interface.
>
> Once we have the new op blockers, we'll make @device optional then and
> completely ignore its value.
>
>> * drive-backup
>>
>> @device names a backend.
>>
>> Do we want to be able to back up any node, or only a backend?
>>
>> Note: documentation of @target sounds like it could somehow name a
>> backend, but as far as I can tell it's always interpreted as file
>> name.
>
> I can't think of a reasonable use case for backing up non-roots because
> the only thing that would write to them are other block jobs. So you
> could backup the old snapshot contents while committing to it.
> Usefulness questionable.
>
> That said, while there's no urgent need for it, it would be nicer to
> allow all operations that can safely be performed, and this is one of
> them. Might fall out almost natually once we have op blockers, so that
> we really only need to add a @node-name option.
>
> But that's something for later.
>
>> * drive-mirror
>>
>> @device names a backend, @replaces names a node, and @node-name
>> defines the name of the new node.
>>
>> Do we want to be able to mirror any node, or only a backend?
>>
>> Note: documentation of @target sounds like it could somehow name a
>> backend, but as far as I can tell it's always interpreted as file
>> name.
>>
>> Note: drive-mirror supports @replaces, but drive-backup doesn't. Odd.
>
> We probably want to mirror any node (Benoît has a use case for this with
> replacing broken quorum children).
>
>> * query-block
>>
>> Returns information on all backends as array of BlockInfo. BlockInfo
>> member @device is the backend name.
>>
>> * query-named-block-nodes
>>
>> Returns information on all named nodes as an array of BlockDeviceInfo
>> (we suck at naming). BlockDeviceInfo member @node-name is the node
>> name.
>>
>> You can't get information on nodes that don't have a node name.
>>
>> * query-blockstats
>>
>> Returns some stats for all backends as array of BlockStats.
>> BlockStats member @device is the backend name. Member @stats contains
>> the stats for the backend's root node. Member @parent is BlockStats
>> for the child node that is stored in BDS member file. Member @backing
>> is BlockStats for the chold node that is stored in BDS member
>> backing_hd. Stats for other children aren't returned.
>>
>> TODO: generalize this to the full tree, include node names.
>
> Do we want the same thing for query-block?
There are a couple of ways to skin the query-cat.
Let's get the easy case out of the way first: a query that reports only
backend properties and not node properties can return an array where
each element carries a backend name, like query-block does now.
For queries that report node properties, we have a couple of options:
* Flat array with node names
Like current query-named-block-nodes.
Can't report on nodes without names. Jeff's project to give all nodes
names would moot this issue.
* Array of trees mirroring the actual node forest,
Similar to current query-blockstats.
Tree roots correspond to backends, and backends have names.
Unfortunately, the nodes don't actually form a forest: node trees may
be shared. To turn it into make a forest, we'd have to duplicate the
shared trees.
* Hybrid: array of trees, but named sub-trees are represented by name
Like the previous one, except the recursion stops at named nodes.
Instead of including such a sub-tree, we reference it by name, then
add it to the array if it's not already there.
"Flat array" seems simplest.
>> * query-block-jobs
>>
>> Returns information on block jobs as array of BlockJobInfo. A block
>> job is always tied to a backend, and BlockJobInfo member @device is
>> its name.
>>
>> The question whether we need a node name here is moot; see
>> block-job-cancel above.
>>
>> == block.json ==
>>
>> * blockdev-snapshot-delete-internal-sync
>> * blockdev-snapshot-internal-sync
>>
>> @device names a backend.
>>
>> Do we want to be able to snapshot any node, or only a backend?
>
> Probably only a backend. At least until someone comes up with a
> reasonable use case. Internal snapshots and backing files don't mix very
> well anyway.
>
>> * eject
>>
>> @device names a backend. Appropriate, because this is really a
>> backend operation.
>>
>> * nbd-server-add
>>
>> @device names a backend.
>>
>> I think backend is appropriate here. The NBD server sits on top of
>> the block layer just like device models do. It should therefore use
>> the BB API. See Max's [PATCH v2 0/6] nbd: Use BlockBackend.
>
> Agreed.
>
>> * nbd-server-start
>> * nbd-server-stop
>>
>> No backend or node name parameters.
>>
>> == qapi-schema.json ==
>>
>> * transaction
>>
>> This is a wrapper around a list of transaction-capable commands.
>> Thus, nothing new here.
>
> Surprisingly few really hard questions.
Good :)
- [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Markus Armbruster, 2014/12/02
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Fam Zheng, 2014/12/03
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Kevin Wolf, 2014/12/03
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name,
Markus Armbruster <=
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Markus Armbruster, 2014/12/05
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Kevin Wolf, 2014/12/05
- Re: [Qemu-devel] Review of monitor commands identifying BDS / BB by name, Markus Armbruster, 2014/12/05
[Qemu-devel] Can we make monitor commands identify BDS / BB by name consistently? (was: Review of monitor commands identifying BDS / BB by name), Markus Armbruster, 2014/12/16