qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] Block Filters


From: Fam Zheng
Subject: Re: [Qemu-devel] Block Filters
Date: Mon, 16 Sep 2013 15:41:45 +0800
User-agent: Mutt/1.5.21 (2010-09-15)

On Sun, 09/15 20:10, Benoît Canet wrote:
> Le Friday 06 Sep 2013 à 11:55:38 (+0200), Kevin Wolf a écrit :
> > Am 06.09.2013 um 11:18 hat Fam Zheng geschrieben:
> > > On Fri, 09/06 10:45, Kevin Wolf wrote:
> > > > Am 06.09.2013 um 09:56 hat Fam Zheng geschrieben:
> > > > > Since BlockDriver.bdrv_snapshot_create() is an optional operation, 
> > > > > blockdev.c
> > > > > can navigate down the tree from top node, until hitting some layer 
> > > > > where the op
> > > > > is implemented (the QCow2 bs), so we get rid of this 
> > > > > top_node_below_filter
> > > > > pointer.
> > > > 
> > > > Is it even inherent to a block driver (like a filter), if a snapshot is
> > > > to be taken at its level? Or is it rather a policy decision that should
> > > > be made by the user?
> > > > 
> > > OK, getting the point that user should have full flexibility and fine 
> > > operation
> > > granularity. It also stands against block_backend->top_node_below_filter. 
> > > Do we
> > > really have the assumption that all the filters are on top of the tree 
> > > and linear?
> > > Shouldn't this be possible?
> > > 
> > >                    Block Backend
> > >                          |
> > >                          |
> > >                     Quodrum BDS
> > >                     /    |    \
> > >              throttle filter  |     \
> > >                   /      |      \
> > >                 qcow2   qcow2   qcow2
> > > 
> > > So we throttle only a particular image, not the whole device. But this 
> > > will
> > > make a top_node_below_filter pointer impossible.
> > 
> > I was assuming that Benoît's model works for the special case of
> > snapshotting in one predefined way, but this is actually a very good
> > example of why it doesn't.
> > 
> > The approach relies on snapshotting siblings together, and in this case
> > the siblings would be throttle/qcow2/qcow2, while throttle is still a 
> > filter. This
> > would mean that either throttle needs to be top_node_below_filter and
> > throttling doesn't stay on top, or the left qcow2 is
> > top_node_below_filter and the other Quorum images aren't snapshotted.
> > 
> > > > In our example, the quorum driver, it's not at all clear to me that you
> > > > want to snapshot all children. In order to roll back to a previous
> > > > state, one snapshot is enough, you don't need multiple copies of the
> > > > same one. Perhaps you want two so that we can still compare them for
> > > > verification. Or all of them because you can afford the disk space and
> > > > want ultimate safety. I don't think qemu can know which one is true.
> > > > 
> > > Only if quorum ever knows about and operates on snapshots, it should be
> > > considered specifically, but no. So we need to achieve this in the general
> > > design: allow user to take snapshot, or set throttle limits on particular
> > > BDSes, as above graph.
> > > 
> > > > In the same way, in a typical case you may want to keep I/O throttling
> > > > for the whole drive, including the new snapshot. But what if the
> > > > throttling was used in order to not overload the network where the image
> > > > is stored, and you're now doing a local snapshot, to which you want to
> > > > stream the image? The I/O throttling should apply only to the backing
> > > > file, not the new snapshot.
> > > > 
> > > Yes, and OTOH, throttling really suits to be a filter only if it can be a 
> > > non
> > > top one, otherwise it's no better than what we have now.
> > 
> > Well, it would be a cleaner architecture in any case, but having it in
> > the middle of the stack feels useful indeed, so we should support it.
> > 
> > > > So perhaps what we really need is a more flexible snapshot/BDS tree
> > > > manipulation command that describes in detail which structure you want
> > > > to have in the end.
> > 
> > Designing the corresponding QMP command is the hard part, I guess.
> 
> During my vacation I though about the fact that JSON is pretty good to build a
> tree.
> 
> QMP, HMP and the command line could take a "block-tree" argument which would
> look like the following.
> 
> block-tree = { 'quorum': [
>                               {
>                                   'throttle' : {
>                                       'qcow2' : { 'filename': "img1.qcow2" }
>                                       'snapshotable': true,

What's the 'snapshotable' for?

>                                   },
>                                   'throttle-iops' : 150,
>                                   'throttle-iops-max' : 1000,
>                               },
>                               {
>                                   'qcow2' : { 'filename': "img2.qcow2" },
>                                   'snapshotable': true,
>                               },
>                               {
>                                   'qcow2' : { 'filename': "img3.qcow2" }
>                                   'snapshotable': false,
>                               }
>                           ] };
> 

It's not very clear to me. Does this mean a key associated with a dict value
means creating type? What do you put in the inner dict (i.e. why filename here)
and what to put outter besides the key (i.e. snapshotable)? Where to put 'id'?

I think JSON is flexible enough to specify anything we can take, but the format
needs to be designed carefully. And do we really want to use JSON in the
command line options? Very hard to imagine that. :)

Thanks,

Fam

> This would be passed to QEMU in a compact form without carriage return and
> spaces.
> 
> The block layer would convert this to C structs like the QMP code would do 
> for a
> QMP command and the bs tree would be recursively build from top to bottom by
> the Block Backend and each Block driver in the path using the C structs.
> 
> Each level would instanciate the lower level until a raw or protocol driver is
> reached.
> 
> What about this ?
> 
> Best regards
> 
> Benoît
> 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]