[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v5 3/6] QIOChannelSocket: Implement io_writev_zerocopy & io_f
From: |
Daniel P . Berrangé |
Subject: |
Re: [PATCH v5 3/6] QIOChannelSocket: Implement io_writev_zerocopy & io_flush_zerocopy for CONFIG_LINUX |
Date: |
Tue, 23 Nov 2021 09:55:54 +0000 |
User-agent: |
Mutt/2.1.3 (2021-09-10) |
On Tue, Nov 23, 2021 at 01:46:44AM -0300, Leonardo Bras Soares Passos wrote:
> Hello Daniel,
>
> On Fri, Nov 12, 2021 at 7:54 AM Daniel P. Berrangé <berrange@redhat.com>
> wrote:
> [...]
> > > @@ -561,12 +577,15 @@ static ssize_t
> > > qio_channel_socket_writev_flags(QIOChannel *ioc,
> > > retry:
> > > ret = sendmsg(sioc->fd, &msg, flags);
> > > if (ret <= 0) {
> > > - if (errno == EAGAIN) {
> > > + switch (errno) {
> > > + case EAGAIN:
> > > return QIO_CHANNEL_ERR_BLOCK;
> > > - }
> > > - if (errno == EINTR) {
> > > + case EINTR:
> > > goto retry;
> > > + case ENOBUFS:
> > > + return QIO_CHANNEL_ERR_NOBUFS;
> >
> > Why does ENOBUFS need handling separately instead of letting
> > the error_setg_errno below handle it ?
> >
> > The caller immediately invokes error_setg_errno() again,
> > just with different error message.
> >
> > No code in this series ever looks at QIO_CHANNEL_ERR_NOBUFS
> > either, so we don't even need that special error return code
> > added AFAICT ?
> >
>
> The idea was to add a custom message for ENOBUFS return when sending
> with MSG_ZEROCOPY.
> I mean, having this message is important for the user to understand
> why the migration is failing, but it would
> not make any sense to have this message while a non-zerocopy sendmsg()
> returns with ENOBUFS.
>
> ENOBUFS : The output queue for a network interface was full. This
> generally indicates that the interface has stopped sending, but may be
> caused by transient congestion.
>
> As an alternative, I could add this message inside the switch, inside
> an if (flags & MSG_ZEROCOPY) on qio_channel_socket_writev_flags()
> instead of in it's caller.
> But for me it looks bloated, I mean, dealing with an error for
> ZEROCOPY only in the general function.
It is perfectly reasonable to check flags in this method.
> OTOH, if you think that it's a better idea to deal with every error in
> qio_channel_socket_writev_flags() instead of in the caller, I will
> change it for v6. Please let me know.
Yes, this method is already taking an ERror **errp parameter and
reporting a user facing error. If we need to report different
message text for ENOBUFS, it should be done in this method too.
The reason QIO_CHANNEL_ERR_BLOCK is special is because we are
explicitly not treating it as an error scenario at all. That's
different to the ENOBUFS case.
>
> > > }
> > > +
> > > error_setg_errno(errp, errno,
> > > "Unable to write to socket");
> > > return -1;
> > > @@ -670,6 +689,127 @@ static ssize_t qio_channel_socket_writev(QIOChannel
> > > *ioc,
> > > }
> > > #endif /* WIN32 */
> > >
> > > +
> > > +#ifdef CONFIG_LINUX
> > > +
> > > +static int qio_channel_socket_poll(QIOChannelSocket *sioc, bool zerocopy,
> > > + Error **errp)
> >
> > There's only one caller and it always passes zerocopy=true,
> > so this parmeter looks pointless.
>
> I did that for possible reuse of this function in the future:
> - As of today, this is certainly compiled out, but if at some point
> someone wants to use poll for something other
> than the reading of an zerocopy errqueue, it could be reused.
>
> But sure, if that's not desirable, I can remove the parameter (and the
> if clause for !zerocopy).
>
> >
> > > +{
> > > + struct pollfd pfd;
> > > + int ret;
> > > +
> > > + pfd.fd = sioc->fd;
> > > + pfd.events = 0;
> > > +
> > > + retry:
> > > + ret = poll(&pfd, 1, -1);
> > > + if (ret < 0) {
> > > + switch (errno) {
> > > + case EAGAIN:
> > > + case EINTR:
> > > + goto retry;
> > > + default:
> > > + error_setg_errno(errp, errno,
> > > + "Poll error");
> > > + return ret;
> >
> > return -1;
> >
> > > + }
> > > + }
> > > +
> > > + if (pfd.revents & (POLLHUP | POLLNVAL)) {
> > > + error_setg(errp, "Poll error: Invalid or disconnected fd");
> > > + return -1;
> > > + }
> > > +
> > > + if (!zerocopy && (pfd.revents & POLLERR)) {
> > > + error_setg(errp, "Poll error: Errors present in errqueue");
> > > + return -1;
> > > + }
> >
> > > +
> > > + return ret;
> >
> > return 0;
>
> In the idea of future reuse I spoke above, returning zero here would
> make this function always look like the poll timed out. Some future
> users may want to repeat the waiting if poll() timed out, or if
> (return > 0) stop polling.
Now that I'm looking again, we should not really use poll() at all,
as GLib provides us higher level APIs. We in fact already have the
qio_channel_wait() method as a general purpose helper for waiting
for an I/O condition to occur.;
> I understand the idea of testing SO_EE_CODE_ZEROCOPY_COPIED to be able
> to tell whenever zerocopy fell back to copying for some reason, but I
> don't see how this can be helpful here.
>
> Other than that I would do rv++ instead of rv=1 here, if I want to
> keep track of how many buffers were sent with zerocopy and how many
> ended up being copied.
Sure, we could do "ret > 0 == number of buffers that were copied"
as the API contract, rather than just treating it as a boolean.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
[PATCH v5 5/6] migration: Add migrate_use_tls() helper, Leonardo Bras, 2021/11/12
[PATCH v5 6/6] multifd: Implement zerocopy write in multifd migration (multifd-zerocopy), Leonardo Bras, 2021/11/12