Re: [PATCH v2 03/20] fuse: Implement standard FUSE operations


From: Max Reitz
Subject: Re: [PATCH v2 03/20] fuse: Implement standard FUSE operations
Date: Thu, 15 Oct 2020 18:04:37 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.11.0

On 15.10.20 17:58, Kevin Wolf wrote:
> On 15.10.2020 at 17:18, Max Reitz wrote:
>> On 15.10.20 11:46, Kevin Wolf wrote:
>>> On 22.09.2020 at 12:49, Max Reitz wrote:
>>>> This makes the export actually useful instead of only producing errors
>>>> whenever it is accessed.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
> 
>>>> +/**
>>>> + * Handle client reads from the exported image.
>>>> + */
>>>> +static void fuse_read(fuse_req_t req, fuse_ino_t inode,
>>>> +                      size_t size, off_t offset, struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int64_t length;
>>>> +    void *buf;
>>>> +    int ret;
>>>> +
>>>> +    /**
>>>> +     * Clients will expect short reads at EOF, so we have to limit
>>>> +     * offset+size to the image length.
>>>> +     */
>>>> +    length = blk_getlength(exp->common.blk);
>>>> +    if (length < 0) {
>>>> +        fuse_reply_err(req, -length);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    size = MIN(size, FUSE_MAX_BOUNCE_BYTES);
>>>
>>> "Read should send exactly the number of bytes requested except on EOF or
>>> error, otherwise the rest of the data will be substituted with zeroes."
>>
>> :(
>>
>>> Do we corrupt huge reads with this, so that read() succeeds, but the
>>> buffer is zeroed above 64M instead of containing the correct data? Maybe
>>> we should return an error instead?
>>
>> Hm.  It looks like there is a max_read option obeyed by the kernel
>> driver, and it appears it’s set by implementing .init() and setting
>> fuse_conn_info.max_read (also, “for the time being”, it also has to be
>> set in the mount options passed to fuse_session_new()).
>>
>> I’m not sure whether that does anything, though.  It appears that
>> whenever I do a cached read, it caps out at 128k (which is what
>> cuse_prep_data() in libfuse sets; though increasing that number there
>> doesn’t change anything, so I think that’s just a coincidence), and with
>> O_DIRECT, it caps out at 1M.
>>
>> But at least that would be grounds to return an error for >64M requests.
>>  (Limiting max_read to 64k does limit all read requests to 64k.)
> 
> Yes, setting max_read and making larger requests an error seems like a
> good solution.
> 
>> Further investigation is needed.
> 
> If you want :-)

Not really, but 128k per request is a bit sad.
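For reference, the max_read approach could look roughly like this with
libfuse 3's low-level API (a sketch only; the 64 MiB constant, the handler
names, and the choice of EINVAL are illustrative assumptions, not taken
from the patch):

    #define FUSE_USE_VERSION 31
    #include <fuse_lowlevel.h>
    #include <errno.h>
    #include <stdlib.h>

    /* Illustrative cap on a single read request (64 MiB) */
    #define MAX_READ_BYTES (64 * 1024 * 1024)

    /*
     * Advertise the cap to the kernel driver.  Per the libfuse
     * documentation, "for the time being" the same value also has to be
     * passed to fuse_session_new() via the "-o max_read=<n>" mount
     * option.
     */
    static void example_init(void *userdata, struct fuse_conn_info *conn)
    {
        conn->max_read = MAX_READ_BYTES;
    }

    static void example_read(fuse_req_t req, fuse_ino_t inode, size_t size,
                             off_t offset, struct fuse_file_info *fi)
    {
        /*
         * Reject oversized requests instead of silently zero-padding
         * everything beyond the bounce buffer.  With max_read in effect,
         * the kernel should not send such requests in the first place,
         * so this is just a safety net.
         */
        if (size > MAX_READ_BYTES) {
            fuse_reply_err(req, EINVAL);
            return;
        }

        /*
         * Toy implementation: reply with a zero-filled buffer.  A real
         * export would clamp offset+size to the image length and read
         * from the image instead.
         */
        char *buf = calloc(1, size ? size : 1);
        if (!buf) {
            fuse_reply_err(req, ENOMEM);
            return;
        }
        fuse_reply_buf(req, buf, size);
        free(buf);
    }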

[...]

>>>> +/**
>>>> + * Let clients flush the exported image.
>>>> + */
>>>> +static void fuse_flush(fuse_req_t req, fuse_ino_t inode,
>>>> +                       struct fuse_file_info *fi)
>>>> +{
>>>> +    FuseExport *exp = fuse_req_userdata(req);
>>>> +    int ret;
>>>> +
>>>> +    ret = blk_flush(exp->common.blk);
>>>> +    fuse_reply_err(req, ret < 0 ? -ret : 0);
>>>> +}
>>>
>>> This seems to be an implementation for .fsync rather than for .flush.
>>
>> Wouldn’t fsync entail a drain?
> 
> Hm, the libfuse documentation doesn't say anything about draining. I
> suppose this is because if requests need to be drained, it will do so in
> the kernel. But I haven't checked the code.

Hmm, well, yeah...  A sync doesn’t necessarily mean settling all
in-flight requests.

> So I expect that calling fsync() on the lower layer does everything that
> is needed. Which is bdrv_flush() in QEMU.
> 
>> Or is it the opposite, flush should just drain and not invoke
>> blk_flush()?  (Sorry, this all gets me confused every time.)
> 
> I'm just relying on the libfuse documentation there for flush:
> 
> "This is called on each close() of the opened file."

Ah, OK.

> and
> 
> "NOTE: the name of the method is misleading, since (unlike fsync) the
> filesystem is not forced to flush pending writes. One reason to flush
> data is if the filesystem wants to return write errors during close.
> However, such use is non-portable because POSIX does not require close
> to wait for delayed I/O to complete."
> 
>> (Though I do think .fsync should both flush and drain.)
>>
>>> Hmm, or maybe actually for both? We usually do bdrv_flush() during
>>> close, so it would be consistent to do the same here. It's our last
>>> chance to report an error to the user before the file is closed.
>>
>> I don’t understand what you mean.  What is “the same”?  Closing the
>> image?  Or indeed having .flush() be implemented with blk_flush()?
> 
> Implementing .flush(), which will be called when the image is closed,
> with blk_flush().
> 
> And still doing blk_flush() for .fsync, of course.

OK.
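Spelled out, the two handlers would then be nearly identical (a sketch;
ignoring the datasync flag in .fsync is an assumption on my part, since
blk_flush() draws no data/metadata distinction):

    /* Called on each close() of the exported image; flush so that write
     * errors can still be reported to the closing process */
    static void fuse_flush(fuse_req_t req, fuse_ino_t inode,
                           struct fuse_file_info *fi)
    {
        FuseExport *exp = fuse_req_userdata(req);
        int ret = blk_flush(exp->common.blk);

        fuse_reply_err(req, ret < 0 ? -ret : 0);
    }

    /* Explicit fsync()/fdatasync() from the client; the same operation
     * on the QEMU side */
    static void fuse_fsync(fuse_req_t req, fuse_ino_t inode, int datasync,
                           struct fuse_file_info *fi)
    {
        FuseExport *exp = fuse_req_userdata(req);
        int ret = blk_flush(exp->common.blk);

        fuse_reply_err(req, ret < 0 ? -ret : 0);
    }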

>> Do you mean that other parties may do the same as qemu does, i.e.
>> flush files before they are closed, which is why we should anticipate
>> the same and give users a chance to see errors?
> 
> I'm not exactly sure what failure of .flush() would look like for users.
> Probably close() failing, so they don't have to do anything special,
> just check their return values.

Checking return values on close()?  Sounds special to me. O:)
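(Concretely, a client that wants to see such errors would have to do
something like the following; the mount path is made up:)

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical mount point of the FUSE export */
        int fd = open("/mnt/fuse-export/image", O_WRONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        /* ... write()s to the image ... */

        /* An error from the export's .flush() handler surfaces here */
        if (close(fd) < 0) {
            perror("close");
            return 1;
        }
        return 0;
    }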

Max
