qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/3] nbd: Use shutdown(SHUT_WR) after last item sent


From: Eric Blake
Subject: Re: [PATCH 3/3] nbd: Use shutdown(SHUT_WR) after last item sent
Date: Fri, 27 Mar 2020 12:42:21 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0

On 3/27/20 11:35 AM, Daniel P. Berrangé wrote:
On Fri, Mar 27, 2020 at 11:19:36AM -0500, Eric Blake wrote:
Although the remote end should always be tolerant of a socket being
arbitrarily closed, there are situations where it is a lot easier if
the remote end can be guaranteed to read EOF even before the socket
has closed.  In particular, when using gnutls, if we fail to inform
the remote end about an impending teardown, the remote end cannot
distinguish between our closing the socket as intended vs. a malicious
intermediary interrupting things, and may result in spurious error
messages.

Does this actually matter in the NBD case ?

It has an explicit NBD command for requesting shutdown, and once
that's processed, it is fine to just close the socket abruptly - I
don't see a benefit to a TLS shutdown sequence on top.

You're right that the NBD protocol has ways for the client to advertise it will be shutting down, AND documents that the server must be robust to clients that just abruptly disconnect after that point. But we don't have control over all such servers, and there may very well be a server that logs an error on abrupt closure, where it would be silent if we did a proper gnutls_bye. Which is more important: maximum speed in disconnecting after we expressed intent, or maximum attempt at catering to all sorts of remote implementations that might not be as tolerant as qemu is of an abrupt termination?

AFAIK, the TLS level clean shutdown is only required if the
application protocol does not have any way to determine an
unexpected shutdown itself.

'man gnutls_bye' states:

Note that not all implementations will properly terminate a TLS connec‐ tion. Some of them, usually for performance reasons, will terminate only the underlying transport layer, and thus not distinguishing between a malicious party prematurely terminating the connection and
       normal termination.

You're right that because the protocol has an explicit message, we can reliably distinguish any early termination prior to NBD_OPT_ABORT/NBD_CMD_DISC as being malicious, so the only case where it matters is if we have a premature termination after we asked for clean shutdown, at which point a malicious termination didn't lose any data. So on that front, I guess you are right that not using gnutls_bye isn't going to have much impact.


This is relevant for HTTP where the connection data stream may not
have a well defined end condition.

In the NBD case though, we have an explicit NBD_CMD_DISC to trigger
the disconnect. After processing that message, an EOF is acceptable
regardless of whether ,
before processing that message, any EOF is a unexpected.

           Or, we can end up with a deadlock where both ends are stuck
on a read() from the other end but neither gets an EOF.

If the socket has been closed abruptly why would it get stuck in
read() - it should see EOF surely ?

That's what I'm trying to figure out: the nbdkit testsuite definitely hung even though 'qemu-nbd --list' exited, but I haven't yet figured out whether the bug lies in nbdkit proper or in libnbd, nor whether a cleaner tls shutdown would have prevented the hang in a more reliable manner. https://www.redhat.com/archives/libguestfs/2020-March/msg00191.html

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




reply via email to

[Prev in Thread] Current Thread [Next in Thread]