[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually wor
From: |
Tom Yan |
Subject: |
Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg |
Date: |
Sun, 6 Sep 2020 23:26:45 +0800 |
I don't disagree with your proposal, but the thing is, do we even need
another field/limit for case 1? For example, do we *really* need to
clamp sizes[2] (NBD_INFO_BLOCK_SIZE) in nbd/server.c and
max_io_sectors (SCSIBlockLimits) in hw/scsi/scsi-disk.c to to any kind
of "dynamic" limit?
Either way I can add another field (`max_pwrite`, maybe?) to
BlockLimits, as an infrastructure/hint, but what should be switched to
it, and what value should each block driver set it to, is probably
beyond what I can take.
IMHO my patches should be merged first (they are really just fixing a
regression and some bugs). If anyone finds it necessary and is capable
and willing to fix the insufficiency of BlockLimits, they can do it
afterwards with another patch series as an improvement. I can add a
patch to remove the clamping in nbd/server.c as a perhaps-temporary
measure to address the original issue, if desired.
On Sun, 6 Sep 2020 at 20:53, Maxim Levitsky <mlevitsk@redhat.com> wrote:
>
> On Sun, 2020-09-06 at 19:04 +0800, Tom Yan wrote:
> > Maybe you want to add some condition for this:
> > https://github.com/qemu/qemu/blob/v5.1.0/nbd/server.c#L659
> > Or not clamp it at all.
> >
> > On Sun, 6 Sep 2020 at 18:58, Tom Yan <tom.ty89@gmail.com> wrote:
> > > In commit 867eccfed84f96b54f4a432c510a02c2ce03b430, Levitsky appears
> > > to have assumed that the only "SCSI Passthrough" is `-device
> > > scsi-generic`, while the fact is there's also `-device scsi-block`
> > > (passthrough without the sg driver). Unlike `-device scsi-hd`, getting
> > > max_sectors is necessary to it (more precisely, hw_max_sectors might
> > > what matters, but BLKSECTGET reports max_sectors, so).
> > >
> > > I'm unsure about how qemu-nbd works, but the commit clearly wasn't the
> > > right approach to fix the original issue it addresses. (It should for
> > > example ignore the max_transfer if it will never matter in to it, or
> > > overrides it in certain cases; when I glimpsed over
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1647104, I don't see how
> > > it could be file-posix problem when it is reporting the right thing,
> > > regardless of whether "removing" the code helps.)
> > >
> > > I don't think we want to "mark" `-device scsi-block` as sg either. It
> > > will probably bring even more unexpected problems, because they are
> > > literally different sets of things behind the scene / in the kernel.
>
> Yes, I did overlook the fact that scsi-block is kind of hybrid passthough
> device,
> doing SG_IO for some things and regular IO for others.
>
> I don't think that my approach was wrong way to fix the problem, but as you
> found
> out, abusing 'bs->sg' hack (which I would be very happy to remove completely)
> wasn't a good idea.
> I actualy was aware of scsi-block and that it does SG_IO but it
> was forgotten some where on the way.
>
> So in summary what the problem is and what I think is the right solution:
>
>
> Each qemu block driver exposes block limits and assumes that they are the same
> for two IO interfaces the block driver can expose:
>
> 1. Regular qemu blk_pread/pwrite alike functions
> 2. blk_ioctl (tiny wrapper against SG_IO), supported by posix-raw on
> host block devices/sg char devices, and by iscsi
>
> The problem is that these two interfaces can have different block limits.
>
> I don't know about iscsi, but for files, doing regular IO is always unlimited,
> since it passes through the kernel block layer and segemented there on
> demand which is faster that doing it in userspace, while SG_IO is passed as is
> to the underlying SCSI device and lacks this segmentation.
>
> Regardless of how NBD uses these limits, I think that these limits should be
> correctly
> exposed by the block drivers, and therefore I propose is that each qemu block
> driver
> would expose a pair of block limits.
> One for the regular block IO, and other for SG_IO.
>
> Then block driver clients (like scsi devices that you mention, nbd, etc)
> can choose which limit to use, depending on which IO api they use.
> The scsi-generic, and scsi-block can use the SG_IO limits,
> while the rest can use the normal (unlimited for file I/O) limits, including
> the NBD.
>
> Best regards,
> Maxim Levitsky
>
> > >
> > > On Fri, 4 Sep 2020 at 10:09, Tom Yan <tom.ty89@gmail.com> wrote:
> > > > sg devices have different major/minor than their corresponding
> > > > block devices. Using sysfs to get max segments never really worked
> > > > for them.
> > > >
> > > > Fortunately the sg driver provides an ioctl to get sg_tablesize,
> > > > which is apparently equivalent to max segments.
> > > >
> > > > Signed-off-by: Tom Yan <tom.ty89@gmail.com>
> > > > ---
> > > > block/file-posix.c | 17 ++++++++++++++++-
> > > > 1 file changed, 16 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/block/file-posix.c b/block/file-posix.c
> > > > index 411a23cf99..9e37594145 100644
> > > > --- a/block/file-posix.c
> > > > +++ b/block/file-posix.c
> > > > @@ -1178,6 +1178,21 @@ static int sg_get_max_transfer_length(int fd)
> > > > #endif
> > > > }
> > > >
> > > > +static int sg_get_max_segments(int fd)
> > > > +{
> > > > +#ifdef SG_GET_SG_TABLESIZE
> > > > + long max_segments = 0;
> > > > +
> > > > + if (ioctl(fd, SG_GET_SG_TABLESIZE, &max_segments) == 0) {
> > > > + return max_segments;
> > > > + } else {
> > > > + return -errno;
> > > > + }
> > > > +#else
> > > > + return -ENOSYS;
> > > > +#endif
> > > > +}
> > > > +
> > > > static int get_max_transfer_length(int fd)
> > > > {
> > > > #if defined(BLKSECTGET) && defined(BLKSSZGET)
> > > > @@ -1268,7 +1283,7 @@ static void hdev_refresh_limits(BlockDriverState
> > > > *bs, Error **errp)
> > > > bs->bl.max_transfer = pow2floor(ret);
> > > > }
> > > >
> > > > - ret = get_max_segments(s->fd);
> > > > + ret = bs->sg ? sg_get_max_segments(s->fd) :
> > > > get_max_segments(s->fd);
> > > > if (ret > 0) {
> > > > bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
> > > > ret *
> > > > qemu_real_host_page_size);
> > > > --
> > > > 2.28.0
> > > >
>
>
- [PATCH v4 1/2] file-posix: split hdev_refresh_limits from raw_refresh_limits, Tom Yan, 2020/09/03
- [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Tom Yan, 2020/09/03
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Tom Yan, 2020/09/06
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Tom Yan, 2020/09/06
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Maxim Levitsky, 2020/09/06
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg,
Tom Yan <=
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Maxim Levitsky, 2020/09/06
- RE: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Dmitry Fomichev, 2020/09/06
- Re: [PATCH v4 2/2] file-posix: add sg_get_max_segments that actually works with sg, Tom Yan, 2020/09/06