|
From: | Vladimir Sementsov-Ogievskiy |
Subject: | Re: [PATCH v5 22/45] block: implemet bdrv_unref_tran() |
Date: | Tue, 21 Jun 2022 00:16:32 +0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.8.1 |
On 6/13/22 12:07, Hanna Reitz wrote:
On 30.03.22 23:28, Vladimir Sementsov-Ogievskiy wrote:Now nodes are removed during block-graph update transactions now? Look at bdrv_replace_child_tran: bdrv_unref() is simply postponed to commit phase. What is the problem with it? We want to make copy-before-write permissions strict: it should unshare write always, not only when it has at least one parent.Looking over this patch in not too much detail (because I find it rather complicated), it looks OK to me; but this reason for why we need it doesn’t really satisfy me. What is the problem with how CBW permissions work? Is that really the only reason for this patch?
Currently, CBW don't unshare write, when have no parent. It's kind of "inactive" state. That's not quite correct. For example, if we just don't have parents on start of the job, nothing prevents user of adding new parents that write directly to source, ignoring CBW filter. Of course, user is responsible for his actions. But ideally, CBW filter should guarantee, that we are doing correct thing. It become more important when we consider "snapshot-access" interface. CBW filter provides this interface, and it should guarantee that it works correctly. And to achieve this we want to effectively remove nodes during transaction, not just postpone removal to commit(). And that's in good sync with global concept: do all modifications first, than update permissions. The other way could be removing permissions from nodes "to be removed", but that seems less correct to me. Does these strong permissions for CBW worh a complexity? Good question) And actually it's hard to estimate it in such a big series. I can try to split this thing out of the series and see, could we at least postpone it, keeping for now only the interfaces with weaker protection.
But if so, we can't neither insert the filter nor remove it: To insert the filter, we should first do blockdev-add, and filter will unshare write on the child, so, blockdev-add will fail if disk is in use by guest. To remove the filter, we should first do a replace operations, which again leads to situation when the filter and old parent share one child, and all parent want write permission when the filter unshare it. The solution is first do both graph-modifying operations (add & replace, or replace & remove) and only then update permissions. But that is not possible with current method to transactionally remove the block node: if we just postpone bdrv_unref() to commit phase, than on prepare phase the node is not removed, and it still keep all permissions on its children. What to do? In general, I don't know. But it's possible to solve the problem for the block drivers that doesn't need access to their children on .bdrv_close(). For such drivers we can detach their children on prepare stage (still, postponing bdrv_close() call to commit phase). For this to work we of course should effectively reduce bs->refcnt on prepare phase as well. So, the logic of new bdrv_unref_tran() is: prepare: decrease refcnt and detach children if possible (and if refcnt is 0) commit: do bdrv_delete() if refcnt is 0 abort: restore children and refcnt What's the difficulty with it? If we want to transactionally (and with no permission change) remove nodes, we should understand that some nodes may be removed recursively, and finally we get several possible not deleted leaves, where permissions should be updated. How caller will know what to update? That leads to additional transaction-wide refresh_list variable, which is filled by various graph modifying function. So, user should declare referesh_list variable and do one or several block-graph modifying operations (that may probably remove some nodes), then user call bdrv_list_refresh_perms on resulting refresh_list. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@openvz.org>
-- Best regards, Vladimir
[Prev in Thread] | Current Thread | [Next in Thread] |