qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v3] block/file-posix: do not fail on unlock byte


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [Qemu-devel] [PATCH v3] block/file-posix: do not fail on unlock bytes
Date: Fri, 29 Mar 2019 17:54:00 +0000

29.03.2019 20:39, Max Reitz wrote:
> On 29.03.19 18:31, Vladimir Sementsov-Ogievskiy wrote:
>> 29.03.2019 20:15, Max Reitz wrote:
>>> On 29.03.19 12:04, Vladimir Sementsov-Ogievskiy wrote:
>>>> bdrv_replace_child() calls bdrv_check_perm() with error_abort on
>>>> loosening permissions. However file-locking operations may fail even
>>>> in this case, for example on NFS. And this leads to Qemu crash.
>>>>
>>>> Let's avoid such errors. Note, that we ignore such things anyway on
>>>> permission update commit and abort.
>>>>
>>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>
>>>> ---
>>>>    block/file-posix.c | 12 ++++++++++++
>>>>    1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>>> index db4cccbe51..1cf4ee49eb 100644
>>>> --- a/block/file-posix.c
>>>> +++ b/block/file-posix.c
>>>> @@ -815,6 +815,18 @@ static int raw_handle_perm_lock(BlockDriverState *bs,
>>>>    
>>>>        switch (op) {
>>>>        case RAW_PL_PREPARE:
>>>> +        if ((s->perm | new_perm) == s->perm &&
>>>> +            (s->shared_perm & new_shared) == s->shared_perm)
>>>> +        {
>>>> +            /*
>>>> +             * We are going to unlock bytes, it should not fail. If it 
>>>> fail due
>>>> +             * to some fs-dependent permission-unrelated reasons (which 
>>>> occurs
>>>> +             * sometimes on NFS and leads to abort in bdrv_replace_child) 
>>>> we
>>>> +             * can't prevent such errors by any check here. And we ignore 
>>>> them
>>>> +             * anyway in ABORT and COMMIT.
>>>> +             */
>>>> +            return 0;
>>>> +        }
>>>>            ret = raw_apply_lock_bytes(s, s->fd, s->perm | new_perm,
>>>>                                       ~s->shared_perm | ~new_shared,
>>>>                                       false, errp);
>>>
>>> Help me understand the exact issue, please.  I understand that there are
>>> operations like bdrv_replace_child() that pass &error_abort to
>>> bdrv_check_perm() because they just loosen the permissions, so it should
>>> not fail.
>>>
>>> However, if the whole effect really would be to loosen permissions,
>>> raw_apply_lock_bytes() wouldn't have failed here in PREPARE anyway:
>>> @unlock is passed as false, so no bytes will be unlocked.  And if
>>> permissions are just loosened (as your condition checks), it should not
>>> lock any bytes.
>>
>> but it does qemu_lock_fd unconditionally on any locked byte (it may be 
>> loosening,
>> but not of all bytes).
> 
> It only locks bytes that aren't already locked ("!(locked_perm & bit)").
>   So if (s->perm | new_perm) & ~s->locked_perm == 0 (same for shared), it

Oops, missed that. We don't have this code in rhel base..

> won't lock anything.  That's the same as s->perm | new_perm == s->perm
> and s->perm == s->locked_perm.  So if this patch prevents qemu_lock_fd()
> errors in raw_apply_lock_bytes(), s->perm must be different from
> s->locked_perm.
> 
>> and fcntl is called, which may fail.
> 
> Hm, yeah, and as Kevin said, raw_check_lock_bytes() if new_perm and
> new_shared are already covered (though this could be fixed by passing
> new_perm & ~s->perm and... new_shared | ~s->shared_perm? to
> raw_check_lock_bytes).
> 
> It just seems to me that we sometimes do end up in a position where
> s->perm and s->locked_perm (or ~s->shared_perm and
> s->locked_shared_perm) do not align, which should not be happening: This
> would mean that the block layer thinks the permission locks are taken,
> but file-posix disagrees.

So if it is it's possibly not my case, as all my crashes are on old code, which
don't have this additional checking  in raw_apply_lock_bytes

> 
> Max
> 
>>> So why does it attempt lock any bytes in the first place?  There must be
>>> some discrepancy between s->perm and s->locked_perm, or ~s->shared_perm
>>> and s->locked_shared_perm.  How does that occur?
>>>
>>> Max
>>>
>>
>>
> 
> 


-- 
Best regards,
Vladimir

reply via email to

[Prev in Thread] Current Thread [Next in Thread]