qemu-block
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-block] KVM Forum block no[td]es


From: Max Reitz
Subject: Re: [Qemu-block] KVM Forum block no[td]es
Date: Fri, 16 Nov 2018 16:19:30 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0

On 16.11.18 16:03, Alberto Garcia wrote:
> On Fri 16 Nov 2018 01:14:37 PM CET, Max Reitz wrote:
>>>>>> Permission system
>>>>>> =================
>>>>>>
>>>>>> GRAPH_MOD
>>>>>> ---------
>>>>>>
>>>>>> We need some way for the commit job to prevent graph changes on its
>>>>>> chain while it is running.  Our current blocker doesn’t do the job,
>>>>>> however.  What to do?
>>>>>>
>>>>>> - We have no idea how to make a *permission* work.  Maybe the biggest
>>>>>>   problem is that it just doesn’t work as a permission, because the
>>>>>>   commit job doesn’t own the BdrvChildren that would need to be
>>>>>>   blocked (namely the @backing BdrvChild).
>>>>>
>>>>> What I'm doing at the moment in my blockdev-reopen series is check
>>>>> whether all parents of the node I want to change share the GRAPH_MOD
>>>>> permission. If any of them doesn't then the operation fails.
>>>>>
>>>>> This can be checked by calling bdrv_get_cumulative_perm() or simply
>>>>> bdrv_check_update_perm(..., BLK_PERM_GRAPH_MOD, ...).
>>>>
>>>> Sure.
>>>>
>>>>> It solves the problem because e.g. the stream block job only allows
>>>>> BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED, so no graph
>>>>> modifications allowed.
>>>>
>>>> But the problem is that the commit job needs to unshare the permission
>>>> on all backing links between base and top, so none of the backing
>>>> links can be broken.  But it cannot do that because those backing
>>>> links do not belong to it (but to the respective overlay nodes).
>>>
>>> I'm not sure if I'm following you. The commit block job has references
>>> to all intermediate nodes (with block_job_add_bdrv()) and only shares
>>> BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE. Anyone trying to obtain
>>> BLK_PERM_GRAPH_MOD on an intermediate node would see that the block job
>>> doesn't allow it.
>>
>> But first of all, why would anyone try to obtain it?  You only obtain
>> a permission when you attach a new parent (or when an existing parent
>> tries to change the ones it has).
> 
> You don't need to obtain it, you only need to see that the permission is
> shared (with bdrv_get_cumulative_perm()). Or, as you suggest if I got it
> right, attach a new "reopen" parent during bdrv_reopen_prepare() and
> release it on commit / abort.

OK, but that's just completely different from how all other permissions
work.

>> For instance, in the commit case, if you just remove a backing link,
>> then the chain is clearly broken and the graph has been modified, but
>> noone will have taken any permission to do so.
> 
> If you do what I said in my previous paragraph then you can't remove the
> backing link (this is one of my test cases).
> 
>> But even beyond that, commit doesn't care about all of the graph.  All
>> it cares about are the backing links.  So it should not prevent anyone
>> else from changing any links outside of the backing chain (which a
>> permission would clearly do).
>>
>> Let me give a concrete example:
>>
>> top (T) -> intermediate (I) -> base (B)
>>
>> Now you commit from top to base.  Clearly you don't want the backing
>> chain between top and base to change.  So say you unshare the GRAPH_MOD
>> permission (implying that whenever a parent doesn't care, it just shares
>> it).  But as said above, if someone just drops the backing link from I
>> to B, the permission system won't catch that.
> 
> Yes it will, because the block job (either mirror or commit depending on
> the case) is the parent of all three nodes (BlockJob.nodes is where the
> list is stored) and does not alow GRAPH_MOD in any of them[*]. So
> dropping that backing link fails (I also have a test case for that).
> 
>> The next thing is that how would you even try to catch it in the first
>> place?  The normal case would be for everything to both share and
>> claim the GRAPH_MOD permission, the latter so it can see if something
>> unshared it.  But that would mean that everything is holding that
>> permission all the time and it's impossible to unshare it.
> 
> As things are now, GRAPH_MOD is shared by default on backing files
> (bdrv_format_default_perms()), but I don't understand why you need that
> everything takes GRAPH_MOD, you only need it when you're actually
> planning to change the graph (e.g a block job).
> 
>> So the solution might be to take it first, and release it after having
>> attached the child.  And maybe you'd say that for the removal case,
>> you'd have to take the GRAPH_MOD permission before you remove any
>> child.
> 
> Yes, something like that.
> 
>> But I think that's weird, and also, there's another issue still.
>>
>> What if you try to attach a read-only NBD server to B?  That should be
>> OK in my mind.  But it is a graph change, so it wouldn't be allowed if
>> GRAPH_MOD is a permission, even though commit doesn't care at all.
> 
> I haven't examined that case (I'm not familiar with NBD),

It doesn't matter whether it's NBD.  What matters is that commit doesn't
care about non-backing relationships.

> but if the
> only problem is that the permission system is unnecessarily strict in
> certain corner cases then maybe it's reasonable compromise _if_ there's
> no simpler alternative (I'm not saying that there isn't!).

But I'm very much saying there is.  GRAPH_MOD is complicated to
understand; to make it work, it would have to work differently from all
other permissions; and it isn't even what we want.

>> I don't think anything needs a way to generally block graph changes
>> around some node.  We only need to prevent changes to very specific
>> sets of edges.  This is something that the permission system just
>> cannot do.
> 
> But what would you do then?

What I wrote in the notes.  Have e.g. a counter in every BdrvChild that
can be incremented by everyone that means "If this is non-zero, do not
modify this BdrvChild".

At the start of the job, commit would go through the backing chain and
increment this counter on every backing link; and it would decrement it
after it's done.

>>>>>>   (We never quite knew what “taking the GRAPH_PERMISSION” or
>>>>>>   “unsharing the GRPAH_MOD permission” was supposed to mean.
>>>>>>   Figuring that out always took like half an our in any
>>>>>>   face-to-face meeting, and then we decided it was pretty much
>>>>>>   useless for any case we had at hand.)
>>>>>
>>>>> Yeah it's a bit unclear. It can mean "none of this node's children
>>>>> can be replaced / removed", which is what I'm using it for at the
>>>>> moment.
>>>>
>>>> Yes, but that's just not useful.  I don't think we have any case where
>>>> something would be annoyed by having its immediate child replaced
>>>> (other than the visible data changing, potentially).  Usually when we
>>>> say something (e.g. a block job) wants to prevent graph modification,
>>>> it's about changing edges that exist independently of the job (such as
>>>> a backing chain).
>>>
>>> Isn't that what I just said? You cannot change other edges <=> you
>>> cannot replace a node's children (or parents).
>>
>> Hm, true, but I think it's (1) hard to understand, (2) would require
>> non-trivial changes to the current model anyway (take GRAPH_MOD before
>> any modification, release it afterwards (which is quite different from
>> any other permission anyway)), and (3) would be too restrictive.
> 
> I don't understand (2), block-stream is already doing that (takes
> GRAPH_MOD when you create the block job, releases when the job is
> completed).

And isn't that just wrong?

You don't have to take it if you want to prevent modifications, you'd
have to unshare it.  Well, it does that, too, but there is no reason for
it to take that permission for a prolonged period of time.  You seem to
agree with me that you'd only take GRAPH_MOD while you actually do some
modification.  I suppose the idea may be that it takes the permission at
the start of the job so it can definitely successfully perform the
change at the end, because it has taken the permission already.  But
that's just wrong, because it takes the permission on the root node,
which is completely unaffected by stream's graph modifications anyway.

What is correct is that it doesn't share GRAPH_MOD for the intermediate
nodes it attaches to the block job.  But that probably doesn't do
anything because nothing cares to set/query/whatever GRAPH_MOD when
modifying the graph.


You seem to be implying that GRAPH_MOD works perfectly well.  But I
think it just does not.  I think most parts of the block layer
completely ignore it, and the rest is torn up in confusion what it's
even supposed to mean.

So even if we decided that GRAPH_MOD is good enough, there'd still be a
ton of work to do.  I very much think it'd be much more work (and not
just boring, tedious, but brain-mangling work that brings you to the
brink of a nervous breakdown) than just adding a BdrvChild.freeze
counter that actually makes sense and actually does what we need.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]