Re: [PATCH 10/22] quorum: Implement .bdrv_recurse_can_replace()
From: Max Reitz
Subject: Re: [PATCH 10/22] quorum: Implement .bdrv_recurse_can_replace()
Date: Thu, 26 Sep 2019 13:17:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0

On 25.09.19 16:12, Vladimir Sementsov-Ogievskiy wrote:
> 20.09.2019 18:27, Max Reitz wrote:
>> Signed-off-by: Max Reitz <address@hidden>
>> ---
>> block/quorum.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 62 insertions(+)
>>
>> diff --git a/block/quorum.c b/block/quorum.c
>> index 207054a64e..81b57dbae2 100644
>> --- a/block/quorum.c
>> +++ b/block/quorum.c
>> @@ -825,6 +825,67 @@ static bool quorum_recurse_is_first_non_filter(BlockDriverState *bs,
>>      return false;
>>  }
>>
>> +static bool quorum_recurse_can_replace(BlockDriverState *bs,
>> +                                       BlockDriverState *to_replace)
>> +{
>> +    BDRVQuorumState *s = bs->opaque;
>> +    int i;
>> +
>> +    for (i = 0; i < s->num_children; i++) {
>> +        /*
>> +         * We have no idea whether our children show the same data as
>> +         * this node (@bs).  It is actually highly likely that
>> +         * @to_replace does not, because replacing a broken child is
>> +         * one of the main use cases here.
>> +         *
>> +         * We do know that the new BDS will match @bs, so replacing
>> +         * any of our children by it will be safe.  It cannot change
>> +         * the data this quorum node presents to its parents.
>> +         *
>> +         * However, replacing @to_replace by @bs in any of our
>> +         * children's chains may change visible data somewhere in
>> +         * there.  We therefore cannot recurse down those chains with
>> +         * bdrv_recurse_can_replace().
>> +         * (More formally, bdrv_recurse_can_replace() requires that
>> +         * @to_replace will be replaced by something matching the @bs
>> +         * passed to it.  We cannot guarantee that.)
>> +         *
>> +         * Thus, we can only check whether any of our immediate
>> +         * children matches @to_replace.
>> +         *
>> +         * (In the future, we might add a function to recurse down a
>> +         * chain that checks that nothing there cares about a change
>> +         * in data from the respective child in question.  For
>> +         * example, most filters do not care when their child's data
>> +         * suddenly changes, as long as their parents do not care.)
>> +         */
>> +        if (s->children[i].child->bs == to_replace) {
>> +            Error *local_err = NULL;
>> +
>> +            /*
>> +             * We now have to ensure that there is no other parent
>> +             * that cares about replacing this child by a node with
>> +             * potentially different data.
>> +             */
>> +            s->children[i].to_be_replaced = true;
>> +            bdrv_child_refresh_perms(bs, s->children[i].child, &local_err);
>> +
>
> So we are trying to answer the question "is it OK to replace it?" by
> cheating the permission system... Possibly this is a problem of the general
> design, and instead of examining one subtree, we should ask all parents of
> the to_replace node whether they are OK with such a replacement.

I’m not sure whether it’s cheating.

We want to replace some node.  A parent should be A-OK with that as long
as it hasn’t frozen its child link, and as long as it doesn’t care about
data changes (it should not have taken CONSISTENT_READ, and it must have
shared WRITE).

The only actual problem we have is that currently basically everything
takes CONSISTENT_READ (which is completely fine), but the only thing
that doesn’t is the mirror_top_bs, and that has exactly the problem of
“I can only get away without CONSISTENT_READ if it was me who unshared it”.

But that’s a different problem.  I don’t think this is cheating.

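To make the condition above concrete, here is a minimal sketch of the per-parent test I have in mind.  It is a toy model, not QEMU code: the ParentLink struct, the PERM_* constants and parent_tolerates_replace() are hypothetical stand-ins for the real BdrvChild fields and BLK_PERM_* flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in permission bits (assumption: modeled on, not copied from, QEMU) */
enum {
    PERM_CONSISTENT_READ = 1u << 0,
    PERM_WRITE           = 1u << 1,
};

/* Hypothetical, simplified view of one parent's link to a child node */
typedef struct ParentLink {
    uint64_t perm;        /* permissions the parent has taken on the child */
    uint64_t shared_perm; /* permissions the parent allows others to take */
    bool frozen;          /* the parent has frozen this child link */
} ParentLink;

/*
 * Return true if this parent does not mind the child being replaced
 * by a node that may show different data.
 */
static bool parent_tolerates_replace(const ParentLink *link)
{
    if (link->frozen) {
        /* A frozen link must not be changed at all */
        return false;
    }
    if (link->perm & PERM_CONSISTENT_READ) {
        /* The parent relies on the data staying consistent */
        return false;
    }
    if (!(link->shared_perm & PERM_WRITE)) {
        /* The parent does not accept the data being modified */
        return false;
    }
    return true;
}
```

A replacement would then be acceptable only if every parent of the node passes this test, which is the effect that letting quorum take its permissions is meant to achieve.
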
> Another idea is that it's strange to check permissions somewhere other than
> in the generic permission check functions. But I have no idea how to handle
> it in the permission system.

I don’t check the permissions, though.  I let quorum take what it needs
to allow changing one of its children.

What is a problem is that I should keep the permissions tightened until
the node is actually replaced and only then release them.  But that
turned out to be a huge mess so I resorted to just double-checking
before mirror actually completes.

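As a rough illustration of the double-check described above: the job checks once when it starts and once more right before it swaps the node, because the graph may have changed in between.  This is only a sketch on a toy single-child graph; Node, ReplaceJob, can_replace(), job_start() and job_complete() are hypothetical names and do not correspond to the actual mirror code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy graph node with a single child */
typedef struct Node Node;
struct Node {
    Node *child;
};

/* Hypothetical state of a job that will replace a node on completion */
typedef struct ReplaceJob {
    Node *top;         /* node whose child is to be replaced */
    Node *to_replace;  /* node expected to be top's child */
    Node *replacement; /* node to put in its place */
} ReplaceJob;

/* Toy check: @target may be replaced only if it is still @top's child */
static bool can_replace(const Node *top, const Node *target)
{
    return top->child == target;
}

/* First check: refuse to start the job if replacing is not possible now */
static bool job_start(ReplaceJob *job, Node *top, Node *to_replace,
                      Node *replacement)
{
    if (!can_replace(top, to_replace)) {
        return false;
    }
    job->top = top;
    job->to_replace = to_replace;
    job->replacement = replacement;
    return true;
}

/*
 * Second check: nothing was held between start and completion, so the
 * graph may have changed.  Check again immediately before replacing.
 */
static bool job_complete(ReplaceJob *job)
{
    if (!can_replace(job->top, job->to_replace)) {
        return false;
    }
    job->top->child = job->replacement;
    return true;
}
```

The trade-off is that a job can start successfully and still fail at completion, whereas keeping the permissions tightened for the whole duration would prevent the graph change in the first place.
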
Max