qemu-block

Re: [RFC 0/3] block/file-posix: Work around XFS bug


From: Vladimir Sementsov-Ogievskiy
Subject: Re: [RFC 0/3] block/file-posix: Work around XFS bug
Date: Mon, 28 Oct 2019 10:07:13 +0000

28.10.2019 12:56, Max Reitz wrote:
> On 28.10.19 10:30, Max Reitz wrote:
>> On 28.10.19 10:24, Max Reitz wrote:
>>> On 27.10.19 13:35, Stefan Hajnoczi wrote:
>>>> On Fri, Oct 25, 2019 at 11:58:46AM +0200, Max Reitz wrote:
>>>>> As for how we can address the issue, I see three ways:
>>>>> (1) The one presented in this series: On XFS with aio=native, we extend
>>>>>      tracked requests for post-EOF fallocate() calls (i.e., write-zero
>>>>>      operations) to reach until infinity (INT64_MAX in practice), mark
>>>>>      them serializing and wait for other conflicting requests.
>>>>>
>>>>>      Advantages:
>>>>>      + Limits the impact to very specific cases
>>>>>        (And that means it wouldn’t hurt too much to keep this workaround
>>>>>        even when the XFS driver has been fixed)
>>>>>      + Works around the bug where it happens, namely in file-posix
>>>>>
>>>>>      Disadvantages:
>>>>>      - A bit complex
>>>>>      - A bit of a layering violation (should file-posix have access to
>>>>>        tracked requests?)
>>>>
>>>> Your patch series is reasonable.  I don't think it's too bad.
>>>>
>>>> The main question is how to detect the XFS fix once it ships.  XFS
>>>> already has a ton of ioctls, so maybe they don't mind adding a
>>>> feature/quirk bit map ioctl for publishing information about bug fixes
>>>> to userspace.  I didn't see another obvious way of doing it, maybe a
>>>> mount option that the kernel automatically sets and that gets reported
>>>> to userspace?
>>>
>>> I’ll add a note to the RH BZ.
>>>
>>>> If we imagine that XFS will not provide a mechanism to detect the
>>>> presence of the fix, then could we ask QEMU package maintainers to
>>>> ./configure --disable-xfs-fallocate-beyond-eof-workaround at some point
>>>> in the future when their distro has been shipping a fixed kernel for a
>>>> while?  It's ugly because it doesn't work if the user installs an older
>>>> custom-built kernel on the host.  But at least it will cover 98% of
>>>> users...
>>>
>>> :-/
>>>
>>> I don’t like it, but I suppose it would work.  We could also
>>> automatically enable this disabling option in configure when we detect
>>> uname to report a kernel version that must include the fix.  (This
>>> wouldn’t work for kernel with backported fixes, but those disappear over
>>> time...)
>> I just realized that none of this is going to work for the gluster case
>> brought up by Nir.  The affected kernel is the remote one and we have no
>> insight into that.  I don’t think we can do ioctls to XFS over gluster,
>> can we?
> 
> On third thought, we could try to detect whether the file is on a remote
> filesystem, and if so enable the workaround unconditionally.  I suppose
> it wouldn’t hurt performance-wise, given that it’s a remote filesystem
> anyway.
> 

I think that for remote filesystems the difference may be even larger than for
local ones, because writing real zeroes over the wire is much more expensive
than issuing a fast zero command.
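
To illustrate the two paths being compared, here is a minimal sketch. It is not
QEMU code: zero_past_eof() is a hypothetical helper, and it assumes Linux with
glibc's fallocate(). The fast path is a single fallocate() call with mode 0,
which is only valid for a range that starts at or after EOF. The fallback
writes every zero byte explicitly, and on a remote filesystem all of those
bytes go over the network.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from QEMU: zero [offset, offset + len),
 * where the range must start at or after the current end of file.
 * Sets *used_fallocate to 1 if the fast path worked, 0 if zeroes were
 * written explicitly. Returns 0 on success or a negative errno value.
 */
static int zero_past_eof(int fd, off_t offset, off_t len, int *used_fallocate)
{
    static const char zeroes[64 * 1024]; /* zero-initialized */
    struct stat st;

    assert(used_fallocate != NULL);
    if (len <= 0) {
        return -EINVAL;
    }
    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    if (offset < st.st_size) {
        /* Mode 0 does not zero data that already exists in the file. */
        return -EINVAL;
    }

    /* Fast path: one call, no data transferred. */
    if (fallocate(fd, 0, offset, len) == 0) {
        *used_fallocate = 1;
        return 0;
    }
    if (errno != EOPNOTSUPP && errno != ENOSYS) {
        return -errno;
    }

    /* Slow path: write real zeroes chunk by chunk. */
    *used_fallocate = 0;
    while (len > 0) {
        size_t chunk = len < (off_t)sizeof(zeroes) ? (size_t)len
                                                   : sizeof(zeroes);
        ssize_t n = pwrite(fd, zeroes, chunk, offset);
        if (n < 0) {
            if (errno == EINTR) {
                continue;
            }
            return -errno;
        }
        if (n == 0) {
            return -EIO;
        }
        offset += n;
        len -= n;
    }
    return 0;
}
```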

Really, can we live with a simple config option? Is it so bad?
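
A rough sketch of what I mean, under stated assumptions: the macro
CONFIG_XFS_EOF_WORKAROUND and both function names are hypothetical (nothing
like them exists in the series), and the magic numbers are the values listed
in statfs(2), defined locally so that the example is self-contained. With the
option disabled the workaround is never applied. Otherwise it is applied on
XFS and, following the "remote filesystem" suggestion quoted above, on NFS
and FUSE mounts, where the backing filesystem cannot be inspected.

```c
#include <stdbool.h>
#include <sys/vfs.h>

/* Filesystem magic numbers as listed in statfs(2), defined locally. */
#define SKETCH_XFS_MAGIC  0x58465342L
#define SKETCH_NFS_MAGIC  0x6969L
#define SKETCH_FUSE_MAGIC 0x65735546L

/*
 * Hypothetical build-time switch standing in for a configure option:
 * 1 enables the workaround, 0 disables it everywhere.
 */
#ifndef CONFIG_XFS_EOF_WORKAROUND
#define CONFIG_XFS_EOF_WORKAROUND 1
#endif

/* Decide from the statfs f_type value alone (hypothetical helper). */
static bool fs_type_needs_workaround(long f_type)
{
    if (!CONFIG_XFS_EOF_WORKAROUND) {
        return false;
    }
    switch (f_type) {
    case SKETCH_XFS_MAGIC:  /* local XFS: directly affected */
    case SKETCH_NFS_MAGIC:  /* remote: backing filesystem unknown */
    case SKETCH_FUSE_MAGIC: /* e.g. a FUSE mount: backing filesystem unknown */
        return true;
    default:
        return false;
    }
}

/* Same decision for an open file descriptor (hypothetical helper). */
static bool fd_needs_workaround(int fd)
{
    struct statfs sfs;

    if (fstatfs(fd, &sfs) < 0) {
        /* Filesystem unknown: stay conservative if the option is on. */
        return CONFIG_XFS_EOF_WORKAROUND != 0;
    }
    return fs_type_needs_workaround((long)sfs.f_type);
}
```

Building with -DCONFIG_XFS_EOF_WORKAROUND=0 makes both helpers return false,
which is the whole point of the option: a distro that ships a fixed kernel can
turn the workaround off without any runtime detection.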


-- 
Best regards,
Vladimir
