From: Hanna Czenczek
Subject: Re: [PATCH 2/4] block: Split padded I/O vectors exceeding IOV_MAX
Date: Mon, 3 Apr 2023 15:33:08 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1

(Sorry for the rather late reply... Thanks for the review!)

On 20.03.23 11:31, Vladimir Sementsov-Ogievskiy wrote:
On 17.03.23 20:50, Hanna Czenczek wrote:

[...]

diff --git a/block/io.c b/block/io.c
index 8974d46941..1e9cdba17a 100644
--- a/block/io.c
+++ b/block/io.c

[..]

+    pad->write = write;
+
     return true;
 }

@@ -1545,6 +1561,18 @@ zero_mem:

 static void bdrv_padding_destroy(BdrvRequestPadding *pad)

Maybe rename it to _finalize, to stress that it's not only freeing memory.

Sounds good!

[...]

@@ -1552,6 +1580,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
     memset(pad, 0, sizeof(*pad));
 }

+/*
+ * Create pad->local_qiov by wrapping @iov in the padding head and tail, while
+ * ensuring that the resulting vector will not exceed IOV_MAX elements.
+ *
+ * To ensure this, when necessary, the first couple of elements (up to three)

maybe, "first two-three elements"

Sure (here and...

[...]

+    /*
+     * If padded_niov > IOV_MAX, we cannot just concatenate everything.
+     * Instead, merge the first couple of elements of @iov to reduce the number

maybe, "first two-three elements"

...here).


+     * of vector elements as necessary.
+     */
+    if (padded_niov > IOV_MAX) {


[..]
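(To make the merging concrete: when padded_niov exceeds IOV_MAX, the data of
the first few @iov entries is copied into a single bounce buffer, which then
stands in for those entries in the final vector.  A minimal sketch of that
step, with hypothetical names, not the patch code itself:

    /*
     * Merge the first @to_collapse entries of @iov into @bounce_buf, which
     * must be large enough to hold their combined data, and describe the
     * result as the single entry @merged.  For read requests, the data must
     * be copied back into the original entries once the request completes.
     */
    static void collapse_first_iovs(const struct iovec *iov, int to_collapse,
                                    void *bounce_buf, struct iovec *merged)
    {
        size_t off = 0;
        int i;

        for (i = 0; i < to_collapse; i++) {
            memcpy((char *)bounce_buf + off, iov[i].iov_base, iov[i].iov_len);
            off += iov[i].iov_len;
        }

        merged->iov_base = bounce_buf;
        merged->iov_len = off;
    }

The patch keeps the caller's vector untouched and instead builds the merged
result into pad->local_qiov.)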

@@ -1653,8 +1786,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
          flags |= BDRV_REQ_COPY_ON_READ;
     }

-    ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
-                           NULL, &flags);
+    ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false,
+                           &pad, NULL, &flags);
      if (ret < 0) {
          goto fail;
      }

a bit later:

    tracked_request_end(&req);
    bdrv_padding_destroy(&pad);

Now the request is formally finished by the time we are inside bdrv_padding_destroy().  I'm not sure whether that really violates anything, but it seems safer to swap these two calls.

I’d rather not, for two reasons: First, tracked requests are (as far as I understand) only there to implement request serialization, and so only care about metadata (offset, length, and type), which is not changed by changes to the I/O vector.

Second, even if the state of the I/O vector were relevant to tracked requests, I think it would actually be the other way around, i.e. the tracked request must be ended before the padding is finalized/destroyed.  The tracked request is about the actual request we submit to `child` (which is why tracked_request_begin() is called after bdrv_pad_request()), and that request is done using the modified I/O vector.  So if the tracked request had any connection to the request’s I/O vector (which it doesn’t), it would be to this modified one, so we mustn’t invalidate it via bdrv_padding_finalize() while the tracked request lives.

Or, said differently: I generally try to clean things up in the inverse order of how they were set up, and because bdrv_pad_request() comes before tracked_request_begin(), I think tracked_request_end() should come before bdrv_padding_finalize().
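Schematically, the pairing I have in mind (a simplified sketch of the call
order in bdrv_co_preadv_part(), not the literal code):

    bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, ...);
                                       /* may replace qiov with the padded vector */
    tracked_request_begin(&req, bs, offset, bytes, BDRV_TRACKED_READ);

    ret = bdrv_aligned_preadv(child, &req, offset, bytes, ...);
                                       /* I/O is submitted with the padded vector */

    tracked_request_end(&req);         /* inverse order: end tracking first... */
    bdrv_padding_finalize(&pad);       /* ...then invalidate the padded vector */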

With that:

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>



PS: I feel there is still some room for optimization here:

The question is whether any optimization is really worth it, and I’m not sure it is.  The bug has been in qemu for over two years, and because the only report I’ve seen about it came from our QE department, it seems like a very rare case, so I find it more important for the code to be as simple as possible than to optimize.

Move the logic to bdrv_init_padding(), and:

1. allocate only one buffer;
2. attach the new collapse area to the head or tail padding (roughly sketched below);
3. avoid creating an extra iov slice, maybe with the help of some new qemu_iovec_* API that can control the number of copied/to-be-copied iovs and/or calculate the number of iovs in a qiov/qiov_offset/bytes slice.
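For (2.), roughly something like this (purely illustrative, not actual code):

    /*
     * One allocation could serve both the head padding and the collapsed
     * data, so no separate collapse bounce buffer would be needed:
     *
     *   | head padding | collapsed data from the first @iov entries |
     *   ^ pad->buf       described together with the padding by a
     *                    single iovec entry
     */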

I actually began by trying to reuse the padding buffer and to collapse head/tail into it, but found it rather complicated.  See also my reply to Stefan here: https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04774.html

Hanna



