qemu-devel
From: Vladimir Sementsov-Ogievskiy
Subject: Re: [PATCH 2/4] block: Split padded I/O vectors exceeding IOV_MAX
Date: Wed, 5 Apr 2023 12:59:05 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.9.0

On 04.04.23 20:32, Hanna Czenczek wrote:
On 04.04.23 10:10, Vladimir Sementsov-Ogievskiy wrote:
On 03.04.23 16:33, Hanna Czenczek wrote:
(Sorry for the rather late reply... Thanks for the review!)

On 20.03.23 11:31, Vladimir Sementsov-Ogievskiy wrote:
On 17.03.23 20:50, Hanna Czenczek wrote:

[...]

diff --git a/block/io.c b/block/io.c
index 8974d46941..1e9cdba17a 100644
--- a/block/io.c
+++ b/block/io.c

[..]

+    pad->write = write;
+
      return true;
  }
  @@ -1545,6 +1561,18 @@ zero_mem:
    static void bdrv_padding_destroy(BdrvRequestPadding *pad)

Maybe, rename to _finalize, to stress that it's not only freeing memory.

Sounds good!

[...]

@@ -1552,6 +1580,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
      memset(pad, 0, sizeof(*pad));
  }
  +/*
+ * Create pad->local_qiov by wrapping @iov in the padding head and tail, while
+ * ensuring that the resulting vector will not exceed IOV_MAX elements.
+ *
+ * To ensure this, when necessary, the first couple of elements (up to three)

maybe, "first two-three elements"

Sure (here and...

[...]

+    /*
+     * If padded_niov > IOV_MAX, we cannot just concatenate everything.
+     * Instead, merge the first couple of elements of @iov to reduce the number

maybe, "first two-three elements"

...here).


+     * of vector elements as necessary.
+     */
+    if (padded_niov > IOV_MAX) {


[..]
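For what it's worth, the slot arithmetic being discussed here can be sketched in isolation. This is an illustrative helper, not QEMU's actual code; the function name and parameters are made up for the example:

```c
#include <assert.h>

/*
 * Illustrative helper (not QEMU code): with @niov guest vector
 * elements plus an optional head and an optional tail padding
 * element, how many leading elements must be merged into a single
 * bounce buffer so that the padded vector fits into @iov_max slots?
 */
static int elements_to_merge(int niov, int has_head, int has_tail,
                             int iov_max)
{
    int padded_niov = niov + !!has_head + !!has_tail;

    if (padded_niov <= iov_max) {
        return 0;               /* plain concatenation fits */
    }
    /* Merging k leading elements into one buffer frees k - 1 slots */
    return padded_niov - iov_max + 1;
}
```

Since @niov itself can be at most IOV_MAX, padded_niov exceeds the limit by at most two, so the result is at most three, which matches the "up to three" wording in the patch's comment.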

@@ -1653,8 +1786,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
          flags |= BDRV_REQ_COPY_ON_READ;
      }
  -    ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                           NULL, &flags);
+    ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false,
+                           &pad, NULL, &flags);
      if (ret < 0) {
          goto fail;
      }

a bit later:

tracked_request_end(&req);
bdrv_padding_destroy(&pad);


Now the request is formally finished inside bdrv_padding_destroy(). I'm not sure whether it really violates anything, but it seems safer to swap these two calls.

I’d rather not, for two reasons: First, tracked requests are (as far as I 
understand) only there to implement request serialization, and so only care 
about metadata (offset, length, and type), which is not changed by changes to 
the I/O vector.

Second, even if the state of the I/O vector were relevant to tracked requests, 
I think it would actually be the other way around, i.e. the tracked request 
must be ended before the padding is finalized/destroyed.  The tracked request 
is about the actual request we submit to `child` (which is why 
tracked_request_begin() is called after bdrv_pad_request()), and that request 
is done using the modified I/O vector.  So if the tracked request had any 
connection to the request’s I/O vector (which it doesn’t), it would be to this 
modified one, so we mustn’t invalidate it via bdrv_padding_finalize() while the 
tracked request lives.

Or, said differently: I generally try to clean up things in the inverse way 
they were set up, and because bdrv_pad_requests() comes before 
tracked_request_begin(), I think tracked_request_end() should come before 
bdrv_padding_finalize().
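To make that pairing concrete, here is a toy model of the call order argued for above. All four functions are stubs that only record when they ran; none of this is QEMU code, and the names merely echo the real functions:

```c
#include <assert.h>
#include <string.h>

/* Call-order log for the stubs below (toy model, not QEMU code) */
static const char *call_log[4];
static int call_n;

static void record(const char *what) { call_log[call_n++] = what; }

static void pad_request(void)      { record("pad_request"); }
static void tracked_begin(void)    { record("tracked_begin"); }
static void tracked_end(void)      { record("tracked_end"); }
static void padding_finalize(void) { record("padding_finalize"); }

/* Mirrors the setup/teardown nesting of bdrv_co_preadv_part() */
static void preadv_part(void)
{
    pad_request();        /* adjusts offset/bytes/qiov first */
    tracked_begin();      /* tracks the *padded* request */
    /* ... actual I/O on the padded vector ... */
    tracked_end();        /* serialization ends first ... */
    padding_finalize();   /* ... then the padding is undone (LIFO) */
}
```

The teardown calls are the setup calls in reverse, which is the "clean up in the inverse order of setup" rule stated above.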

Note that it's vice versa in bdrv_co_pwritev_part().

Well, and it’s this way here.  We agree that for clean-up, the order doesn’t 
functionally matter, so either way is actually fine.

For me it's just simpler to think that the whole request, including filling the user-given qiov with data on the read path, happens inside tracked_request_begin() / tracked_request_end().

It isn’t, though, because padding must be done before the tracked request is 
created.  The tracked request uses the request’s actual offset and length, 
after padding, so bdrv_pad_request() must always be done before (i.e., outside) 
tracked_request_begin().

And moving the last manipulation of the qiov out of it breaks this simple model. The guest should not care, as it doesn't know about request tracking. But what about internal code? Some code may depend on requests being finished after a bdrv_drained_begin() call, but now they may not be fully finished, and some data may not yet be copied back to the original qiov.

You didn't answer here. Do you think that's a wrong assumption for users of drained sections?


I agree with your point about the order of finalization, but maybe that just shows that copying data back to the qiov should not be part of bdrv_padding_finalize(), but instead a separate function, called before tracked_request_end().

But my thought is that copying back shouldn’t be done before 
tracked_request_end(), because copying back is not part of the tracked request. 
 What we track is the padded request, which uses a modified I/O vector, so 
undoing that modification shouldn’t be done while the tracked request lives.

I know I’m inconsistent with regards to bdrv_co_pwritev_part(), which is 
because it doesn’t matter.  My actual position is that tracked requests are 
about metadata, so undoing/finalizing the padding (including potentially 
copying data back) has nothing to do with a tracked request, and so the order 
of finalizing the two cannot matter.

But you’re arguing for consistency, and my position on that is, if we want 
consistency, I’d finalize the tracked request first, and the padding second.  
This is also because tracking is done for serialization, so we should end it as 
soon as possible, so that concurrent requests are resumed quickly.  (Though I’m 
not sure if delaying it by a memcpy() matters for an essentially 
single-threaded block layer at this time.)

Hanna


--
Best regards,
Vladimir



