Re: lynx-dev RFC959 non-compliance in Lynx hangs the client
From: Klaus Weide
Subject: Re: lynx-dev RFC959 non-compliance in Lynx hangs the client
Date: Sat, 11 Sep 1999 20:14:45 -0500 (CDT)
On Sat, 11 Sep 1999, Gregory A Lundberg wrote:
> I believe the problem with the libwww FTP implementation is what is called
> 'nominal realism'.
Actually, I think you are suffering from it too, only more so. :)
(see below)
Besides, it would make more sense to claim that *my* interpretation of the
RFCs suffers from it. There's not much of a connection between that and
how libwww implements an FTP client. (I believe the 'close' calls in the
_client_ code are not in question, only the meaning of 'close' to the
server, which affects the client rather indirectly. Accuse the libwww
implementation of 'pragmatism' instead if you wish, it's doing things in
a way that used to reliably work with existing servers...)
> In this case I mean the confusion of the RFC 959 (et al) term 'close' with
> the C function of the same name: close().
>
> The FTP RFCs do not make reference to C. Rather, they quite specifically
> refer the reader to the TELNET and TCP RFCs.
Yes that's true. It is also true that they do not make explicit reference
to CLOSED or any other internal states of the TCP protocol machine.
> In those documents we find the definition of 'closed' in reference to a TCP
> connection:
>
> RFC 793 3.2 Terminology (pg 21)
>
> A connection progresses through a series of states during its
> lifetime. The states are: LISTEN, SYN-SENT, SYN-RECEIVED,
> ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, LAST-ACK,
> TIME-WAIT, and the fictional state CLOSED. CLOSED is fictional
> because it represents the state when there is no TCB, and therefore,
> no connection. Briefly the meanings of the states are:
> [...]
>
> CLOSED - represents no connection state at all.
>
> A TCP connection progresses from one state to another in response to
> events. The events are the user calls, OPEN, SEND, RECEIVE, CLOSE,
> ABORT, and STATUS; the incoming segments, particularly those
> containing the SYN, ACK, RST and FIN flags; and timeouts.
>
>
> Hence, we find that the definition of 'close' for the FTP is "no connection
> state at all".
You are looking at the wrong CLOSE here. Or rather, you are looking at the
wrong word entirely, CLOSED not CLOSE. You are even quoting the sentence
that mentions the one you, as a TCP user, should be concerned with, the
'user call' CLOSE.
RFC 793:
1.4
The interface between an application process and the TCP is
illustrated in reasonable detail. This interface consists of a set of
calls much like the calls an operating system provides to an
application process for manipulating files. For example, there are
calls to open and close connections and to send and receive data on
established connections.
2.4 Interfaces
The TCP/user interface provides for calls made by the user on the TCP
to OPEN or CLOSE a connection, to SEND or RECEIVE data, or to obtain
STATUS about a connection. These calls are like other calls from user
programs on the operating system, for example, the calls to open, read
from, and close a file.
This CLOSE doesn't have much to do with the state CLOSED, except that
it triggers a series of TCP state transitions that normally should
end up in state CLOSED. To the TCP it is just one event. The other
side of the interface is not well specified, for example when the 'call'
should return. But there is certainly no requirement here that CLOSE has
to wait until a certain TCP state has been reached. In the absence of
such requirements, which *would* have been made if they were necessary,
it stands to reason that CLOSE (as an action performed by the TCP's user)
has done its job as soon as CLOSE, as an event seen by the TCP layer, has
been delivered.
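
To make the distinction between the CLOSE event and the CLOSED state concrete, here is a minimal sketch of what the CLOSE user call does to the local TCP state, following the state diagram in RFC 793. The enum and the function name on_user_close are made up for illustration; they are not taken from any real TCP stack. The sketch also ignores the case where queued data delays the FIN.

```c
/* Illustrative names only: a toy model of the RFC 793 state diagram. */
enum tcp_state {
    CLOSED, LISTEN, SYN_SENT, SYN_RECEIVED, ESTABLISHED,
    FIN_WAIT_1, FIN_WAIT_2, CLOSE_WAIT, CLOSING, LAST_ACK, TIME_WAIT
};

/* Returns the state the local TCP enters when its user issues CLOSE.
 * Simplification: assumes no data is still queued for sending. */
enum tcp_state on_user_close(enum tcp_state s)
{
    switch (s) {
    case LISTEN:
    case SYN_SENT:
        return CLOSED;      /* no connection was established yet */
    case SYN_RECEIVED:
    case ESTABLISHED:
        return FIN_WAIT_1;  /* a FIN is sent; its ACK is still pending */
    case CLOSE_WAIT:
        return LAST_ACK;    /* peer closed first; now our FIN is sent */
    default:
        return s;           /* CLOSE already issued, or nothing to close */
    }
}
```

Note that for an established connection the CLOSE event never leads directly to CLOSED; everything after FIN-WAIT-1 or LAST-ACK is driven by incoming segments and timers, not by the user.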
This is what RFC 793 defines as the *interface* between TCP and its user.
All protocols layered on top of TCP should be assumed to use this 'public'
meaning of CLOSE etc., not some knowledge about TCP internals like a
connection state, when their specifications talk about closing
connections (unless they specifically override this meaning).
Whether or not the Unix (etc.) close(), or some other function like
shutdown(), does the job of CLOSE in this sense is still open. Is the
application program done with the 'close' that's required in various parts
of the FTP spec when close() has returned? Probably not always. If there
is still data waiting to be transmitted out, close() may return before
the CLOSE event has happened in the TCP state diagram. In that case
return from close() does not mean yet that the caller has 'CLOSE'd the
connection. (But if the state reached is FIN-WAIT-1 or later, the
caller *has* 'CLOSE'd the connection as far as the caller is concerned.)
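
As a rough illustration of the two calls in question, a sender could explicitly signal end-of-data with shutdown() before releasing the descriptor with close(). The helper name close_data_connection is hypothetical, and whether this sequence amounts to CLOSE in the RFC 793 sense on a given system is exactly the open point above; treat it as a sketch, not a statement about what any particular stack guarantees.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: signal end-of-data, then release the descriptor.
 * Returns 0 if both calls succeeded, -1 otherwise. */
int close_data_connection(int fd)
{
    int rc = 0;

    /* On a TCP socket, SHUT_WR asks the stack to send a FIN once any
     * data still queued for sending has gone out. */
    if (shutdown(fd, SHUT_WR) < 0)
        rc = -1;

    /* Release the descriptor whether or not shutdown() succeeded. */
    if (close(fd) < 0)
        rc = -1;

    return rc;
}
```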
On the other hand, interpreting the word 'close' in the FTP spec with
an implied requirement that a specific connection state has to be reached
before 'close' can be said to be done, just looks unreasonable, arbitrary,
and untenable. Some problems with it:
(1) _Which side_'s connection state?
Although LISTEN .. CLOSED are introduced as states of 'the connection',
they clearly only describe one side's view of the connection. In general
the exact state on the other side is unknown and unknowable.
In particular, one side may be in the 'no state' CLOSED state while the
other isn't.
(2) If that's the meaning of the verb 'to close' in the FTP spec, then it
ought to apply everywhere. I.e. for every mention of closing, of either
the control or data connection, by either the client or server (or 2nd
server), a close-that-waits-for-some-state ought to be implied.
I don't think that's reasonable at all. There'd be a lot more hanging
processes around if people implemented it...
(3) _Which_ state?
You say CLOSED, but then you have to immediately take back most of it
with the rationalization below:
> Pragmatically, this was found to be a _bit_ too restrictive. The TCP
> LAST-ACK and TIME-WAIT states were causing far too long a wait between data
> transfers. For this reason, the requirement was added that a new PORT or
> PASV command must precede each transfer command.
This is not a new problem that was 'found' after RFC 959 was published.
RFC 959 is well aware of the problem. The difference between it and
RFC 1123 is only that the latter mandates (SHOULD level) the PORT or PASV
and is more explicit about when it needs to be repeated, while the former
(in 3.3) only tells the reader that this is one solution.
I don't see how this can be used to argue for your meaning of the verb
'close'. I understand as your meaning that 'to close' "really" means
to (also) wait until state CLOSED is reached, and that waiting only
until LAST-ACK or TIME-WAIT is reached is just a pragmatic optimization.
But the formulation in RFC 1123 4.1.2.5 supports just the opposite:
DISCUSSION:
This is required because of the long delay after a TCP
connection is closed until its socket pair can be
reused, to allow multiple transfers during a single FTP
session.
The delay occurs *after* the connection is closed. IOW, the states
that cause delay are not conceptually part of closing, otherwise it
should say "the long delay while a TCP connection is closed" or similar.
> By choosing a new port
> pair, the connection was, by definition, CLOSED and ready to be established
> again.
For the topic at hand, i.e. what does the server have to do to 'close' a
connection, this just avoids the question. It's not the same connection.
> To properly implement the FTP, the server (who is assigned this
> responsibility) must have some way to determine the data connection is in
> the CLOSED state.
This really is at the heart of the question. Is this or is this not the
server's responsibility?
I say, of course not. It's the TCP layer's responsibility to react to the
CLOSE as required by TCP. The FTP server's responsibility is only to do
the CLOSE call where the FTP protocol asks for closing.
The FTP protocol doesn't have a requirement to wait for a specific state,
and if this was meant to be required the RFCs would say it.
Why would it be important that TCP has reached the CLOSED state?
Don't you trust the TCP layer to do its job? Wouldn't it be more
important to know that, if anything, the *other* side's stack has
reached a specific state?
> Or at least that the connection is in the LAST-ACK or
> TIME-WAIT states, since those represent simply timeouts and no network
> traffic.
But they are there, and distinct from CLOSED, for good reasons. As I
understand it they exist exactly because there *might be* still network
traffic, in form of errant packets etc. Not that that's terribly relevant
here. But if 'to close the connection' really encompasses 'to wait until
a specific state', then that final state ought to be really CLOSED, not
something else.
> The question is how to detect this. And, for WU-FTPD, how to detect it in
> a portable way. By use of the shutdown() function, we can begin the TCP
> CLOSE on the connection. What we need is some way to determine when the
> remote (client or server) has also closed (via shutdown() or close()) the
> connection.
The question is first _whether_ to detect it.
Think of it this way: if it's hard to do in a portable way, there is
probably a good reason for it. It violates the separation of tasks
between protocol layers. It can't be something the FTP protocol requires.
As I said, you may still have to do something to ensure that the close()
actually becomes a CLOSE (i.e. does send a FIN). As well as ensuring
that all data are sent before that. But that's a different and hopefully
smaller problem. Shouldn't SO_LINGER do just that?
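
For what it's worth, a sketch of the SO_LINGER idea follows. The helper name close_with_linger and its timeout parameter are assumptions made for the example. POSIX describes close() on a lingering socket as blocking until queued data has been transmitted or the timeout expires, but the exact behaviour differs between systems, so this is not a guarantee that the FIN has been acknowledged.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: enable lingering, then close the socket.
 * With l_onoff set, close() may block for up to `seconds` while
 * unsent data is still queued. Returns 0 on success, -1 on error. */
int close_with_linger(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;         /* turn lingering on */
    lg.l_linger = seconds;  /* upper bound on how long close() waits */

    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) < 0) {
        close(fd);          /* still release the descriptor */
        return -1;
    }
    return close(fd);
}
```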
> The suggested method .. and I see no other .. was to read()
> from the data connection, awaiting an EOF indication, with an appropriate
> timeout, of course, to allow for complete loss of communications.
>
> If someone has some better means to _portably_ detect that the remote end
> has FIN/ACK'd our FIN, I would be very interested to hear about it.
Don't do it, just make sure you have done your part of 'closing'. If you
know you have sent the data and know TCP has got the CLOSE event, consider
it done. TCP will retry sending the FIN until it gets ACK'd.
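
For reference only, the read-until-EOF method proposed in the quoted text could look roughly like the sketch below. The helper name wait_for_peer_eof and the use of poll() for the timeout are assumptions for the example, not WU-FTPD code. Note what it actually observes: an EOF from read() shows that the peer has closed its sending side, which is not the same as knowing the local TCP has reached CLOSED.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: send our end-of-data, then wait for the peer's.
 * Returns 1 on EOF from the peer, 0 on timeout, -1 on error.
 * The timeout applies to each wait, so it restarts if stray data arrives. */
int wait_for_peer_eof(int fd, int timeout_ms)
{
    char buf[512];

    if (shutdown(fd, SHUT_WR) < 0)
        return -1;

    for (;;) {
        struct pollfd p = { .fd = fd, .events = POLLIN };
        int n = poll(&p, 1, timeout_ms);

        if (n < 0)
            return -1;      /* poll() failed */
        if (n == 0)
            return 0;       /* peer stayed silent for timeout_ms */

        ssize_t r = read(fd, buf, sizeof buf);
        if (r < 0)
            return -1;
        if (r == 0)
            return 1;       /* EOF: peer closed its sending side */
        /* unexpected data from the peer: discard it and keep waiting */
    }
}
```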
To quote once more from RFC 959:
3.5. ERROR RECOVERY [AND RESTART]
There is no provision for detecting bits lost or scrambled in data
transfer; this level of error control is handled by the TCP.
and from RFC 793:
2.6. Reliable Communication
A stream of data sent on a TCP connection is delivered reliably and in
order at the destination.
Transmission is made reliable via the use of sequence numbers and
acknowledgments.
AFAIK, delivery of the FIN (EOF for data sent) is part of what's covered
by the reliability guarantee of TCP. If this 'bit' gets lost or scrambled
TCP will retry; the client has the means necessary to detect whether it
has received the full transmission yet (just as for data bits). If the
client defeats this it's not your problem, it's out of your hands.
(The most _portable_ way to examine the connection state would probably
be to call /s?bin/netstat externally and parse its output... No this
is not a serious suggestion.)
There may be some serious holes in my understanding of TCP, if you find
them please point them out. I haven't (re-)read all of the TCP specs,
and am quite fuzzy about what *exactly* close does. Please let me know
what I missed.
Klaus