qemu-block
Re: [PATCH v2 15/36] block: use topological sort for permission update


From: Kevin Wolf
Subject: Re: [PATCH v2 15/36] block: use topological sort for permission update
Date: Wed, 27 Jan 2021 19:38:09 +0100

On 27.11.2020 at 15:45, Vladimir Sementsov-Ogievskiy wrote:
> Rewrite bdrv_check_perm(), bdrv_abort_perm_update() and bdrv_set_perm()
> to update nodes in topological sort order instead of simple DFS. With
> topologically sorted nodes, we update a node only when all its parents
> already updated. With DFS it's not so.
> 
> Consider the following example:
> 
>     A -+
>     |  |
>     |  v
>     |  B
>     |  |
>     v  |
>     C<-+
> 
> A is parent for B and C, B is parent for C.
> 
> Obviously, to update permissions, we should go in order A B C, so, when
> we update C, all parent permissions already updated.

I wondered for a moment why this order is obvious. Taking a permission
on A may mean that we need to take the permission on C, too.

The answer is (or so I think) that the whole operation is atomic so the
half-updated state will never be visible to a caller, but this is about
calculating the right permissions. Permissions a node needs on its
children may depend on what its parents requested, but parent
permissions never depend on what children request.

Ok, makes sense.
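
To convince myself of the ordering, here is a minimal standalone sketch.
It is not QEMU code: Node, topo_dfs() and example_order() are made-up
names, and a fixed-size array stands in for the GSList. It builds the
A/B/C graph from the commit message, with A listing C before B, and
shows that a post-order DFS that stores each node in front of the
already stored ones still yields A B C.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical stand-in for a block node; not QEMU's BlockDriverState. */
typedef struct Node {
    char name;
    struct Node *children[MAX_CHILDREN];
    size_t n_children;
    int visited;
} Node;

/*
 * Post-order DFS that fills @out from the back: a node is stored only
 * after all of its children, so reading @out front to back yields every
 * parent before its children. Returns the new index of the first entry.
 */
static size_t topo_dfs(Node *n, Node **out, size_t head)
{
    if (n->visited) {
        return head;
    }
    n->visited = 1;
    for (size_t i = 0; i < n->n_children; i++) {
        head = topo_dfs(n->children[i], out, head);
    }
    assert(head > 0);
    out[--head] = n;
    return head;
}

/* Builds the A/B/C graph from the commit message and writes the order. */
static void example_order(char order[4])
{
    Node c = { .name = 'C' };
    Node b = { .name = 'B', .children = { &c }, .n_children = 1 };
    /* A lists C first, the case where plain recursion visits C too early. */
    Node a = { .name = 'A', .children = { &c, &b }, .n_children = 2 };
    Node *out[3];
    size_t head = topo_dfs(&a, out, 3);

    assert(head == 0);
    for (size_t i = 0; i < 3; i++) {
        order[i] = out[i]->name;
    }
    order[3] = '\0';
}
```

Calling example_order() on a 4-byte buffer fills it with the node names
in update order. C is stored first but ends up last, because both of its
parents are placed in front of it afterwards.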

> But with current
> approach (simple recursion) we can update in sequence A C B C (C is
> updated twice). On first update of C, we consider old B permissions, so
> doing wrong thing. If it succeed, all is OK, on second C update we will
> finish with correct graph. But if the wrong thing failed, we break the
> whole process for no reason (it's possible that updated B permission
> will be less strict, but we will never check it).
> 
> Also new approach gives a way to simultaneously and correctly update
> several nodes, we just need to run bdrv_topological_dfs() several times
> to add all nodes and their subtrees into one topologically sorted list
> (next patch will update bdrv_replace_node() in this manner).
> 
> Test test_parallel_perm_update() is now passing, so move it out of
> debugging "if".
> 
> We also need to support ignore_children in
> bdrv_check_parents_compliance().
> 
> For test 283 order of parents compliance check is changed.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
>  block.c                     | 103 +++++++++++++++++++++++++++++-------
>  tests/test-bdrv-graph-mod.c |   4 +-
>  tests/qemu-iotests/283.out  |   2 +-
>  3 files changed, 86 insertions(+), 23 deletions(-)
> 
> diff --git a/block.c b/block.c
> index 92bfcbedc9..81ccf51605 100644
> --- a/block.c
> +++ b/block.c
> @@ -1994,7 +1994,9 @@ static bool bdrv_a_allow_b(BdrvChild *a, BdrvChild *b, Error **errp)
>      return false;
>  }
>  
> -static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
> +static bool bdrv_check_parents_compliance(BlockDriverState *bs,
> +                                          GSList *ignore_children,
> +                                          Error **errp)
>  {
>      BdrvChild *a, *b;
>  
> @@ -2005,7 +2007,9 @@ static bool bdrv_check_parents_compliance(BlockDriverState *bs, Error **errp)
>       */
>      QLIST_FOREACH(a, &bs->parents, next_parent) {
>          QLIST_FOREACH(b, &bs->parents, next_parent) {
> -            if (a == b) {
> +            if (a == b || g_slist_find(ignore_children, a) ||
> +                g_slist_find(ignore_children, b))

'a' should be checked in the outer loop; there is no reason to repeat
the same check on every iteration of the inner loop.
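
Roughly what I mean, as a standalone sketch rather than a patch against
block.c: plain int arrays stand in for the parents list and for
ignore_children, and is_ignored() and count_checked_pairs() are
hypothetical helpers that only exist for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for g_slist_find(ignore_children, x) != NULL. */
static bool is_ignored(const int *ignored, size_t n_ignored, int x)
{
    for (size_t i = 0; i < n_ignored; i++) {
        if (ignored[i] == x) {
            return true;
        }
    }
    return false;
}

/* Returns how many ordered (a, b) pairs reach the compatibility check. */
static size_t count_checked_pairs(const int *parents, size_t n_parents,
                                  const int *ignored, size_t n_ignored)
{
    size_t pairs = 0;

    for (size_t i = 0; i < n_parents; i++) {
        /* 'a' is tested once here instead of once per inner iteration. */
        if (is_ignored(ignored, n_ignored, parents[i])) {
            continue;
        }
        for (size_t j = 0; j < n_parents; j++) {
            if (i == j || is_ignored(ignored, n_ignored, parents[j])) {
                continue;
            }
            pairs++; /* the bdrv_a_allow_b(a, b, errp) call would go here */
        }
    }
    return pairs;
}
```

The set of pairs that get compared is the same as with the combined
condition; only the number of list lookups for 'a' goes down.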

> +            {
>                  continue;
>              }
>  
> @@ -2034,6 +2038,29 @@ static void bdrv_child_perm(BlockDriverState *bs, BlockDriverState *child_bs,
>      }
>  }
>  
> +static GSList *bdrv_topological_dfs(GSList *list, GHashTable *found,
> +                                    BlockDriverState *bs)

It would be good to have a comment that explains the details of the
contract.

In particular, this seems to require that @list is already topologically
sorted, and it's complete in the sense that if a node is in the list,
all of its children are in the list, too.
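
As a sketch of the contract I have in mind (my reading of the patch, so
treat the wording as an assumption), here is a standalone model. Node,
SortedList, topo_add() and position_of() are made up for illustration;
a per-node flag stands in for the GHashTable and an array filled from
the back stands in for g_slist_prepend().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_NODES 8

typedef struct Node {
    struct Node *children[MAX_CHILDREN];
    size_t n_children;
    bool found;                 /* stands in for the "found" hash table */
} Node;

typedef struct SortedList {
    Node *items[MAX_NODES];
    size_t head;                /* first valid entry; grows toward index 0 */
} SortedList;

static void sorted_list_init(SortedList *l)
{
    l->head = MAX_NODES;
}

/*
 * Assumed contract:
 *  - On entry, @l is topologically sorted (parents before children).
 *  - @l is complete: if a node is in @l, all of its children are too.
 *  - The found flag is set exactly for the nodes that are in @l.
 * On return, the same holds with @n and its whole subtree added.
 */
static void topo_add(SortedList *l, Node *n)
{
    if (n->found) {
        return;
    }
    n->found = true;
    for (size_t i = 0; i < n->n_children; i++) {
        topo_add(l, n->children[i]);
    }
    assert(l->head > 0);
    l->items[--l->head] = n;
}

/* Returns the index of @n in @l, or MAX_NODES if it is not in the list. */
static size_t position_of(const SortedList *l, const Node *n)
{
    for (size_t i = l->head; i < MAX_NODES; i++) {
        if (l->items[i] == n) {
            return i;
        }
    }
    return MAX_NODES;
}
```

Calling topo_add() for two roots that share a child keeps the combined
list sorted only because of the completeness requirement: a node that
is already in the list can never be a parent of a newly added one.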

> +{
> +    BdrvChild *child;
> +    g_autoptr(GHashTable) local_found = NULL;
> +
> +    if (!found) {
> +        assert(!list);
> +        found = local_found = g_hash_table_new(NULL, NULL);
> +    }
> +
> +    if (g_hash_table_contains(found, bs)) {
> +        return list;
> +    }
> +    g_hash_table_add(found, bs);
> +
> +    QLIST_FOREACH(child, &bs->children, next) {
> +        list = bdrv_topological_dfs(list, found, child->bs);
> +    }
> +
> +    return g_slist_prepend(list, bs);
> +}
> +
>  static void bdrv_child_set_perm_commit(void *opaque)
>  {
>      BdrvChild *c = opaque;
> @@ -2098,10 +2125,10 @@ static void bdrv_child_set_perm_safe(BdrvChild *c, uint64_t perm,
>   * A call to this function must always be followed by a call to bdrv_set_perm()
>   * or bdrv_abort_perm_update().
>   */

One big source of confusion for me when trying to understand this was
that bdrv_check_perm() is a misnomer since commit f962e96150e and the
above comment isn't really accurate any more.

The function doesn't only check the validity of the new permissions in
advance of actually making the change, but it already updates the
permissions of all child nodes (though not of its root node).

So we have gone from the original check/set/abort model (which the
function names still suggest) to a prepare/commit/rollback model.

I think some comment updates are in order, and possibly we should rename
some functions, too.
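
To spell out what I mean by prepare/commit/rollback, a toy model of a
single child follows. It is not how block.c implements it: ChildPerm,
perm_prepare(), perm_commit() and perm_rollback() are hypothetical
names, and only one permission mask is tracked.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-child state; the real BdrvChild looks different. */
typedef struct ChildPerm {
    uint64_t perm;        /* value currently visible in the graph */
    uint64_t old_perm;    /* value to restore on rollback */
    bool has_backup;      /* true between prepare and commit/rollback */
} ChildPerm;

/* Prepare: apply the new value right away, remembering the original. */
static void perm_prepare(ChildPerm *c, uint64_t new_perm)
{
    if (!c->has_backup) {
        c->old_perm = c->perm;
        c->has_backup = true;
    }
    c->perm = new_perm;
}

/* Commit: keep the new value and drop the backup. */
static void perm_commit(ChildPerm *c)
{
    c->has_backup = false;
}

/* Rollback: restore the value from before the first prepare. */
static void perm_rollback(ChildPerm *c)
{
    if (c->has_backup) {
        c->perm = c->old_perm;
        c->has_backup = false;
    }
}
```

The point is that the "check" step already modifies state, so every
prepare must be followed by exactly one commit or rollback, which is
what the function names no longer convey.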

> -static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> -                           uint64_t cumulative_perms,
> -                           uint64_t cumulative_shared_perms,
> -                           GSList *ignore_children, Error **errp)
> +static int bdrv_node_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                                uint64_t cumulative_perms,
> +                                uint64_t cumulative_shared_perms,
> +                                GSList *ignore_children, Error **errp)
>  {
>      BlockDriver *drv = bs->drv;
>      BdrvChild *c;
> @@ -2166,21 +2193,43 @@ static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
>      /* Check all children */
>      QLIST_FOREACH(c, &bs->children, next) {
>          uint64_t cur_perm, cur_shared;
> -        GSList *cur_ignore_children;
>  
>          bdrv_child_perm(bs, c->bs, c, c->role, q,
>                          cumulative_perms, cumulative_shared_perms,
>                          &cur_perm, &cur_shared);
> +        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);

This "added" line is actually old code. What is removed here is the
recursive call of bdrv_check_update_perm(). This is what the code below
will have to replace.

> +    }
> +
> +    return 0;
> +}
> +
> +static int bdrv_check_perm(BlockDriverState *bs, BlockReopenQueue *q,
> +                           uint64_t cumulative_perms,
> +                           uint64_t cumulative_shared_perms,
> +                           GSList *ignore_children, Error **errp)
> +{
> +    int ret;
> +    BlockDriverState *root = bs;
> +    g_autoptr(GSList) list = bdrv_topological_dfs(NULL, NULL, root);
> +
> +    for ( ; list; list = list->next) {
> +        bs = list->data;
> +
> +        if (bs != root) {
> +            if (!bdrv_check_parents_compliance(bs, ignore_children, errp)) {
> +                return -EINVAL;
> +            }

At this point bs still has the old permissions, but we don't access
them. As we're going in topological order, the parents have already been
updated if they were a child covered in bdrv_node_check_perm(), so we're
checking the relevant values. Good.

What about the root node? If I understand correctly, the parents of the
root node wouldn't have been checked in the old code. In the new state,
the parent BdrvChild already has to contain the new permission.

In bdrv_refresh_perms(), we already check parent conflicts, so no change
for all callers going through it. Good.

bdrv_reopen_multiple() is less obvious. It passes permissions from the
BDRVReopenState, without applying the permissions first. Do we check the
old parent permissions instead of the new state here?

> +            bdrv_get_cumulative_perm(bs, &cumulative_perms,
> +                                     &cumulative_shared_perms);
> +        }
>  
> -        cur_ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
> -        ret = bdrv_check_update_perm(c->bs, q, cur_perm, cur_shared,
> -                                     cur_ignore_children, errp);
> -        g_slist_free(cur_ignore_children);
> +        ret = bdrv_node_check_perm(bs, q, cumulative_perms,
> +                                   cumulative_shared_perms,
> +                                   ignore_children, errp);

We use the original ignore_children for every node in the sorted list.
The old code extends it with all nodes in the path to each node.

For the bdrv_check_update_perm() call that is now replaced with
bdrv_check_parents_compliance(), I think this was necessary because
bdrv_check_update_perm() always assumes adding a new edge, so if you
update one instead of adding it, you have to ignore it so that it can't
conflict with itself. This isn't necessary any more now because we just
update and then check for consistency.

For passing to bdrv_node_check_perm() it doesn't make a difference
anyway because the parameter is now unused (and should probably be
removed).

>          if (ret < 0) {
>              return ret;
>          }
> -
> -        bdrv_child_set_perm_safe(c, cur_perm, cur_shared, NULL);
>      }
>  
>      return 0;

A tricky patch to understand, but I think it's right for the most part.

Kevin