qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte


From: Jason Dillaman
Subject: Re: [PATCH 7/7] block/rbd: change request alignment to 1 byte
Date: Thu, 21 Jan 2021 14:42:00 -0500

On Wed, Jan 20, 2021 at 6:01 PM Peter Lieven <pl@kamp.de> wrote:
>
>
> > Am 19.01.2021 um 15:20 schrieb Jason Dillaman <jdillama@redhat.com>:
> >
> > On Tue, Jan 19, 2021 at 4:36 AM Peter Lieven <pl@kamp.de> wrote:
> >>> Am 18.01.21 um 23:33 schrieb Jason Dillaman:
> >>> On Fri, Jan 15, 2021 at 10:39 AM Peter Lieven <pl@kamp.de> wrote:
> >>>> Am 15.01.21 um 16:27 schrieb Jason Dillaman:
> >>>>> On Thu, Jan 14, 2021 at 2:59 PM Peter Lieven <pl@kamp.de> wrote:
> >>>>>> Am 14.01.21 um 20:19 schrieb Jason Dillaman:
> >>>>>>> On Sun, Dec 27, 2020 at 11:42 AM Peter Lieven <pl@kamp.de> wrote:
> >>>>>>>> since we implement byte interfaces and librbd supports aio on byte 
> >>>>>>>> granularity we can lift
> >>>>>>>> the 512 byte alignment.
> >>>>>>>> Signed-off-by: Peter Lieven <pl@kamp.de>
> >>>>>>>> ---
> >>>>>>>> block/rbd.c | 2 --
> >>>>>>>> 1 file changed, 2 deletions(-)
> >>>>>>>> diff --git a/block/rbd.c b/block/rbd.c
> >>>>>>>> index 27b4404adf..8673e8f553 100644
> >>>>>>>> --- a/block/rbd.c
> >>>>>>>> +++ b/block/rbd.c
> >>>>>>>> @@ -223,8 +223,6 @@ done:
> >>>>>>>> static void qemu_rbd_refresh_limits(BlockDriverState *bs, Error 
> >>>>>>>> **errp)
> >>>>>>>> {
> >>>>>>>>    BDRVRBDState *s = bs->opaque;
> >>>>>>>> -    /* XXX Does RBD support AIO on less than 512-byte alignment? */
> >>>>>>>> -    bs->bl.request_alignment = 512;
> >>>>>>> Just a suggestion, but perhaps improve discard alignment, max discard,
> >>>>>>> optimal alignment (if that's something QEMU handles internally) if not
> >>>>>>> overridden by the user.
> >>>>>> Qemu supports max_discard and discard_alignment. Is there a call to 
> >>>>>> get these limits
> >>>>>> from librbd?
> >>>>>> What do you mean by optimal_alignment? The object size?
> >>>>> krbd does a good job of initializing defaults [1] where optimal and
> >>>>> discard alignment is 64KiB (can actually be 4KiB now), max IO size for
> >>>>> writes, discards, and write-zeroes is the object size * the stripe
> >>>>> count.
> >>>> Okay, I will have a look at it. If qemu issues a write, discard, 
> >>>> write_zero greater than
> >>>> obj_size  * stripe count will librbd split it internally or will the 
> >>>> request fail?
> >>> librbd will handle it as needed. My goal is really just to get the
> >>> hints down the guest OS.
> >>>> Regarding the alignment it seems that rbd_dev->opts->alloc_size is 
> >>>> something that comes from the device
> >>>> configuration and not from rbd? I don't have that information inside the 
> >>>> Qemu RBD driver.
> >>> librbd doesn't really have the information either. The 64KiB guess
> >>> that krbd uses was a compromise since that was the default OSD
> >>> allocation size for HDDs since Luminous. Starting with Pacific that
> >>> default is going down to 4KiB.
> >> I will try to adjust these values as far as it is possible and makes sense.
> >> Is there a way to check the minimum supported OSD release in the backend 
> >> from librbd / librados?
> >
> > It's not a minimum -- RADOS will gladly access 1 byte writes as well.
> > It's really just the optimal (performance and space-wise). Sadly,
> > there is no realistic way to query this data from the backend.
>
> So you would suggest to advertise an optimal transfer length of 64k and max 
> transfer length of obj size * stripe count to the guest unless we have an API 
> in the future to query these limits from the backend?

I'll open a Ceph tracker ticket to expose these via the API in a future release.

> I would leave request alignment at 1 byte as otherwise Qemu will issue RMWs 
> for all write requests that do not align. Everything that comes from a guest 
> OS is very likely 4k aligned anyway.

My goal is definitely not to have QEMU do any extra work for splitting
or aligning IOs. I am really only trying to get hints passed down the
guest via the virtio drivers. If there isn't the plumbing in QEMU for
that yet, disregard.

> Peter
>
>


-- 
Jason




reply via email to

[Prev in Thread] Current Thread [Next in Thread]