qemu-block

Re: [PATCH] nbd/server: Suppress Broken pipe errors on abrupt disconnection


From: Kevin Wolf
Subject: Re: [PATCH] nbd/server: Suppress Broken pipe errors on abrupt disconnection
Date: Mon, 13 Sep 2021 17:07:12 +0200

Am 23.07.2021 um 17:49 hat Eric Blake geschrieben:
> On Thu, Jul 22, 2021 at 11:45:52AM +0100, Richard W.M. Jones wrote:
> > $ rm -f /tmp/sock /tmp/pid
> > $ qemu-img create -f qcow2 /tmp/disk.qcow2 1M
> > $ qemu-nbd -t --format=qcow2 --socket=/tmp/sock --pid-file=/tmp/pid /tmp/disk.qcow2 &
> > $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()'
> > qemu-nbd: Disconnect client, due to: Failed to send reply: Unable to write to socket: Broken pipe
> > $ killall qemu-nbd
> > 
> > nbdsh is abruptly dropping the NBD connection here which is a valid
> > way to close the connection.  It seems unnecessary to print an error
> > in this case so this commit suppresses it.
> > 
> > Note that if you call the nbdsh h.shutdown() method then the message
> > is not printed:
> > 
> > $ nbdsh -u 'nbd+unix:///?socket=/tmp/sock' -c 'h.get_size()' -c 'h.shutdown()'
> 
> A client not shutting down cleanly might cause the server to leave the
> disk in an unspecified state prior to the next client (more
> concretely, a client that just disconnects instead of waiting for a
> flush to land may result in data loss from the point of view of that
> client when it reconnects, although the server was never in the
> wrong).

I think in such cases, clients must assume that all in-flight requests
have failed. Request failure means that the state is undefined. You
could have the old content, you could have the new content, or you could
have some random corruption.

> But for your _specific_ example here of a client that only performs
> read actions and does not modify the disk, there is obviously no data
> loss possible.
> 
> But you are also correct that a client that disconnects abruptly
> instead of cleanly is a common enough event that warning about it can
> just feel noisy.  Is this the sort of thing that users would want a
> command-line knob to opt in or out of those warnings (and what default
> should that knob take), or should this be something we just always
> ignore?  Or maybe we make the warning conditional on whether the
> client attempted any modification to the image, staying silent by
> default for a client that merely reads, and only noisy for a client
> that attempted at least one write but disconnected before we could
> reply that the write or subsequent flush was complete.
> 
> qemu-storage-daemon has to answer the same question, so I'd like
> Kevin's take on the matter to make sure we pick an answer we are
> consistently happy with.
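
For reference, the last option quoted above (warn only when a client
disconnects abruptly with a modification still unacknowledged) might look
roughly like the sketch below. The struct and function names are
hypothetical and nothing here is taken from nbd/server.c; it only
illustrates the decision being discussed.

```c
#include <stdbool.h>

/* Hypothetical per-client bookkeeping; not a structure from nbd/server.c. */
typedef struct ClientState {
    bool clean_shutdown;        /* client disconnected cleanly */
    bool unacked_modification;  /* a write or flush was requested but the
                                 * reply had not been sent yet */
} ClientState;

/*
 * Assumed policy: stay silent for clean shutdowns and for clients that
 * only read; warn only when an abrupt disconnect leaves a write or
 * flush unacknowledged.
 */
static bool should_warn_on_disconnect(const ClientState *c)
{
    if (c->clean_shutdown) {
        return false;
    }
    return c->unacked_modification;
}
```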

I don't think I would distinguish between read-only and read-write
clients. Whether we print an error message matters mostly when we are
debugging a bug that we can't reproduce and only have a bunch of logs to
go on. I feel that abrupt disconnects could in some cases be useful
information to have there.

Essentially it's something that you would configure with log levels, but
we don't really have that (and even if we had it, in practice management
tools would use one default setting). So I feel we have to decide one
way or the other. Since bugs involving NBD are probably something
you'll have to debug, maybe you should pick. I don't really mind either
way.

> > 
> > Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
> > ---
> >  nbd/server.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/nbd/server.c b/nbd/server.c
> > index b60ebc3ab6..0f86535b88 100644
> > --- a/nbd/server.c
> > +++ b/nbd/server.c
> > @@ -2668,7 +2668,11 @@ static coroutine_fn void nbd_trip(void *opaque)
> >          ret = nbd_handle_request(client, &request, req->data, &local_err);
> >      }
> >      if (ret < 0) {
> > -        error_prepend(&local_err, "Failed to send reply: ");
> > +        if (errno != EPIPE) {
> > +            error_prepend(&local_err, "Failed to send reply: ");
> > +        } else {
> > +            local_err = NULL;
> 
> This line should be error_free(local_err) to avoid a memleak.

Actually, you want both error_free(local_err) and local_err = NULL.

> > +        }
> >          goto disconnect;
> >      }
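
To spell out why both steps are needed: freeing the object avoids the
leak, and clearing the pointer keeps later code from reporting or freeing
it a second time. Below is a minimal, self-contained sketch of that
pattern. The Error struct and the mock_* helpers are stand-ins invented
for illustration, not QEMU's real definitions, and
suppress_if_broken_pipe() is a hypothetical helper, not part of the patch.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error type; not the real definition. */
typedef struct Error {
    char *msg;
} Error;

/* Stand-in allocator so the sketch is self-contained. */
static Error *mock_error_new(const char *msg)
{
    Error *err = malloc(sizeof(*err));
    if (!err) {
        abort();
    }
    err->msg = malloc(strlen(msg) + 1);
    if (!err->msg) {
        abort();
    }
    strcpy(err->msg, msg);
    return err;
}

/* Stand-in for error_free(): releases the object; NULL is a no-op here. */
static void mock_error_free(Error *err)
{
    if (err) {
        free(err->msg);
        free(err);
    }
}

/*
 * Hypothetical helper mirroring the EPIPE branch of the patch: free the
 * error AND reset the caller's pointer, so nothing downstream sees a
 * dangling pointer. Returns 1 if the error was suppressed, 0 otherwise.
 */
static int suppress_if_broken_pipe(int saved_errno, Error **errp)
{
    if (saved_errno != EPIPE) {
        return 0;               /* leave the error for the caller to report */
    }
    mock_error_free(*errp);     /* avoids the leak */
    *errp = NULL;               /* avoids a later report or double free */
    return 1;
}
```

With only the assignment the object leaks; with only the free, the
disconnect path would still hold a pointer to freed memory.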

Kevin



