qemu-block
Re: [PATCH v2] nbd/server: Suppress Broken pipe errors on abrupt disconn


From: Eric Blake
Subject: Re: [PATCH v2] nbd/server: Suppress Broken pipe errors on abrupt disconnection
Date: Tue, 14 Sep 2021 14:06:55 -0500
User-agent: NeoMutt/20210205-772-2b4c52

On Tue, Sep 14, 2021 at 03:52:00PM +0100, Richard W.M. Jones wrote:
> On Tue, Sep 14, 2021 at 05:40:59PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > 13.09.2021 18:19, Richard W.M. Jones wrote:
> > >$ rm -f /tmp/sock /tmp/pid
> > >$ qemu-img create -f qcow2 /tmp/disk.qcow2 1M
> > >$ qemu-nbd -t --format=qcow2 --socket=/tmp/sock --pid-file=/tmp/pid /tmp/disk.qcow2 &
> > >$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()'
> > >qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe
> > >$ killall qemu-nbd
> > >
> > >nbdsh is abruptly dropping the NBD connection here which is a valid
> > >way to close the connection.  It seems unnecessary to print an error
> > >in this case so this commit suppresses it.
> > >
> > >Note that if you call the nbdsh h.shutdown() method then the message
> > >is not printed:
> > >
> > >$ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()' -c 'h.shutdown()'
> >
> > My personal opinion is that this warning doesn't hurt in general. I
> > think that in production, tools should gracefully shut down any
> > connection, and an abrupt shutdown is a sign of something wrong,
> > i.e., worth warning about.
> >
> > Shouldn't nbdsh do graceful shutdown by default?
> 
> On the client side the only difference is that nbd_shutdown sends
> NBD_CMD_DISC to the server (versus simply closing the socket).  On the
> server side when the server receives NBD_CMD_DISC it must complete any
> in-flight requests, but there's no requirement for the server to
> commit anything to disk.  IOW you can still lose data even though you
> took the time to disconnect.

If you use NBD_CMD_FLUSH as the last command before NBD_CMD_DISC, then
you shouldn't have data loss (but it requires the server to support
flush).  And in general, while a server that does not flush data on
CMD_DISC is compliant, it is poor quality of implementation if it
strands data that easily, for a client that tried hard to exit
gracefully.
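
To make the ordering concrete, here is a minimal sketch of the two
request headers a client would put on the wire for a "flush, then
disconnect" exit. It assumes the simple (non-extended) 28-byte request
layout and the constant values from the NBD protocol document; the
helper names (put_be, nbd_pack_request, nbd_pack_graceful_exit) and the
cookie values are made up for illustration and are not libnbd or qemu
APIs. A real client would send the flush, wait for its reply, and only
then send NBD_CMD_DISC; the sketch only shows the bytes and the order.

```c
#include <stddef.h>
#include <stdint.h>

/* Values as given in the NBD protocol document. */
#define NBD_REQUEST_MAGIC   0x25609513u
#define NBD_CMD_DISC        2
#define NBD_CMD_FLUSH       3
#define NBD_FLAG_SEND_FLUSH (1u << 2)   /* transmission flag: server supports flush */
#define NBD_REQUEST_SIZE    28          /* simple request header, no payload */

/* Hypothetical helper: store the low nbytes of v at p in big-endian order. */
static void put_be(uint8_t *p, uint64_t v, int nbytes)
{
    for (int i = nbytes - 1; i >= 0; i--) {
        p[i] = (uint8_t)(v & 0xff);
        v >>= 8;
    }
}

/* Hypothetical helper: serialize one request header with zero offset/length. */
static void nbd_pack_request(uint8_t *out, uint16_t type, uint64_t cookie)
{
    put_be(out + 0, NBD_REQUEST_MAGIC, 4);
    put_be(out + 4, 0, 2);        /* command flags */
    put_be(out + 6, type, 2);     /* command type */
    put_be(out + 8, cookie, 8);   /* cookie (handle) echoed in the reply */
    put_be(out + 16, 0, 8);       /* offset */
    put_be(out + 24, 0, 4);       /* length */
}

/*
 * Pack the requests for a graceful exit into buf (at least 56 bytes).
 * NBD_CMD_FLUSH is emitted only if the server advertised flush support;
 * NBD_CMD_DISC is always last. Returns the number of bytes written.
 */
size_t nbd_pack_graceful_exit(uint8_t *buf, uint16_t transmission_flags)
{
    size_t off = 0;

    if (transmission_flags & NBD_FLAG_SEND_FLUSH) {
        nbd_pack_request(buf + off, NBD_CMD_FLUSH, 1);
        off += NBD_REQUEST_SIZE;
    }
    nbd_pack_request(buf + off, NBD_CMD_DISC, 2);
    off += NBD_REQUEST_SIZE;
    return off;
}
```

With libnbd the same sequence is a flush call followed by a shutdown
call; the point of the sketch is only that the flush has to come before
the disconnect, and that it is skipped when the server does not
advertise it.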

> 
> So I don't think there's any reason for libnbd to always gracefully
> shut down (especially in this case where there are no in-flight
> requests), and anyway it would break ABI to make that change and slow
> down the client in cases when there's nothing to clean up.

I agree that we don't want libnbd to always gracefully shut down by
default; end users can already choose a graceful shutdown when they
want.

At the same time, I would not be opposed to improving the libnbd and
nbdkit testsuite usage of libnbd to request graceful shutdown in
places where it is currently getting an abrupt disconnect merely
because we were lazy when writing the test.

> 
> > >Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
> > >---
> > >  nbd/server.c | 7 ++++++-
> > >  1 file changed, 6 insertions(+), 1 deletion(-)
> > >
> > >diff --git a/nbd/server.c b/nbd/server.c
> > >index 3927f7789d..e0a43802b2 100644
> > >--- a/nbd/server.c
> > >+++ b/nbd/server.c
> > >@@ -2669,7 +2669,12 @@ static coroutine_fn void nbd_trip(void *opaque)
> > >          ret = nbd_handle_request(client, &request, req->data, &local_err);
> > >      }
> > >      if (ret < 0) {
> > >-        error_prepend(&local_err, "Failed to send reply: ");
> > >+        if (errno != EPIPE) {
> >
> > Both nbd_handle_request() and nbd_send_generic_reply() declare that
> > they return -errno on failure in communication with the client. I
> > think you should use ret here: if (ret != -EPIPE). It's safer: who
> > knows whether errno is really set on all error paths of the called
> > functions? If not, we may see here the errno of some other, earlier
> > operation.
> 
> Should we set errno = 0 earlier in nbd_trip()?  I don't really know
> how coroutines in qemu interact with thread-local variables though.

No, we don't need to set errno to 0 prior to a call except at points
where we expect errno to be reliable after the call; but nbd_trip()
does not have any guarantees of reliable errno in the first place
(instead, it captured errno into the return value prior to any point
where errno loses its reliability).
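
As a standalone illustration of that pattern (not the qemu code), the
sketch below captures errno into a negative return value right at the
failing write(), then deliberately clobbers errno with an unrelated
failing call before the caller looks at the result. The names
send_reply and demo_broken_pipe are hypothetical, and the socketpair is
just a stand-in for a client that dropped the connection abruptly.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success, -errno on failure. */
static int send_reply(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n < 0) {
        return -errno;   /* capture errno while it is still reliable */
    }
    return 0;
}

/*
 * Write to a socket whose peer has already gone away, then clobber
 * errno before inspecting the outcome. The return value still carries
 * the original error; errno itself no longer does.
 */
int demo_broken_pipe(void)
{
    int sv[2];
    int ret;

    signal(SIGPIPE, SIG_IGN);   /* get EPIPE from write() instead of a signal */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -errno;
    }
    close(sv[1]);               /* the "client" disconnects abruptly */

    ret = send_reply(sv[0], "x", 1);

    close(-1);                  /* unrelated failure: errno is now EBADF */
    close(sv[0]);
    return ret;                 /* caller compares against -EPIPE, not errno */
}
```

A caller that tests ret == -EPIPE sees the broken pipe, whereas one
that tests errno == EPIPE at the same point would see the EBADF left
behind by the later close(-1).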


-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



