qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH] curl: Allow reading after EOF


From: Kevin Wolf
Subject: Re: [RFC PATCH] curl: Allow reading after EOF
Date: Wed, 17 Mar 2021 17:38:08 +0100

Am 17.03.2021 um 16:46 hat Eric Blake geschrieben:
> On 3/17/21 10:32 AM, Eric Blake wrote:
> > On 3/17/21 10:17 AM, Kevin Wolf wrote:
> >> This makes the curl driver more consistent with file-posix in that it
> >> doesn't return errors any more for reading after the end of the remote
> >> file. Instead, zeros are returned for these areas.
> >>
> >> This inconsistency was reported in:
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1935061
> >>
> >> Note that the image used in this bug report has a corrupted snapshot
> >> table, which means that the qcow2 driver tries to do a zero-length read
> >> after EOF on its image file.
> >>
> >> The old behaviour of the curl driver can hardly be called a bug, but the
> >> inconsistency turned out to be confusing.
> >>
> >> Reported-by: Alice Frosi <afrosi@redhat.com>
> >> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >> ---
> >>
> >> It is not entirely clear to me if this is something we want to do. If we
> >> do care about consistency between protocol drivers, something like this
> >> should probably be done in block/io.c eventually - but that would
> >> require converting bs->total_sectors to byte granularity first.
> > 
> > Something that's been (low priority) on my todo list for a while.  NBD
> > has the same problem.
> 
> Actually, NBD has already been patched to fuzz around the lack of
> byte-accurateness in the block layer; see commit 9cf638508.  So doing
> something similar in the curl driver as a workaround until the block
> layer does it for everyone is tolerable, but does not scale.

Right, duplicating the same code in every protocol driver wouldn't be
great.

> >> Any opinions on what the most desirable semantics would be and whether
> >> we should patch individual drivers until we can have a generic solution?
> > 
> > In nbdkit, we took the following approach in the 'truncate' driver:
> > 
> > If presented with an image that is not a multiple of the desired block
> > size, we round the image size up (corner cases for images with sizes
> > near 2^63 where rounding would wrap to negative; and since qemu enforces
> > a max image size at 2^63-2^32 to avoid 32-bit operations ever
> > overflowing).  Reads of the virtual tail come back as zero, writes to
> > the virtual tail are allowed if they would write zero into the tail, and
> > fail with ENOSPC otherwise.
> 
> The current code in block/nbd.c does this for reads, but fails on EIO
> without regards to the content of what is being attempted to write into
> that tail.  I like the nbdkit behavior better.

I don't, honestly. Making the success of a write depend on the content
written is just too surprising. Is there a real use case behind this
that would justify it?

I think the file-posix behaviour (which is really just the kernel
behaviour) is the most sane: On regular files, which can be extended,
extend the image file on writes. On block devices, return -ENOSPC.

> > Doing that in the block layer makes more sense than doing it per-driver.
> > 
> > Thus, I'm not sure if I'm a fan of this patch.

What we need to decide is the semantics that we want to have.

If the behaviour implemented by this patch (or a similar patch) is what
we want, then I would say we should commit it, even if we're planning to
refactor things in the long term so that it's implemented in the generic
block layer.

> >>
> >>  block/curl.c | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/block/curl.c b/block/curl.c
> >> index 50e741a0d7..a8d87a1813 100644
> >> --- a/block/curl.c
> >> +++ b/block/curl.c
> >> @@ -898,6 +898,7 @@ out:
> >>  static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
> >>          uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
> >>  {
> >> +    BDRVCURLState *s = bs->opaque;
> >>      CURLAIOCB acb = {
> >>          .co = qemu_coroutine_self(),
> >>          .ret = -EINPROGRESS,
> >> @@ -906,6 +907,15 @@ static int coroutine_fn 
> >> curl_co_preadv(BlockDriverState *bs,
> >>          .bytes = bytes
> >>      };
> >>  
> >> +    if (offset > s->len || bytes > s->len - offset) {
> >> +        uint64_t req_bytes = offset > s->len ? 0 : s->len - offset;
> >> +        qemu_iovec_memset(qiov, req_bytes, 0, bytes - req_bytes);
> >> +        bytes = req_bytes;
> 
> In nbd.c, I also have:
>    if (offset >= client->info.size) {
>         assert(bytes < BDRV_SECTOR_SIZE);
> 
>     if (offset + bytes > client->info.size) {
>         assert(slop < BDRV_SECTOR_SIZE);
> 
> With those assertions added, I can give it

Hm, that requires splitting it into two blocks. Maybe I should just copy
the whole thing from the NBD driver. Makes it a bit longer, but then
it's more obviously doing the same.

Anyway, these assertions show that we have already decided on the
desired behaviour for reads because we already read zeros in
bdrv_aligned_preadv() as long as things are sector aligned. Changing
bs->total_sectors to bytes would result in the same behaviour as this
patch.

What we can try is how hard this conversion is really. If we can just do
it right away, it's probably preferable.

Kevin




reply via email to

[Prev in Thread] Current Thread [Next in Thread]