From: David Empson
Subject: Re: [lwip-users] socket: LWIP_ACCEPT problem?
Date: Fri, 07 Mar 2008 11:06:16 +1300
The issue you are encountering is due to standard
behaviour of TCP. When a client connects to
a server, the client specifies a local port number, and the server's well known
port number. These are used along with the IP addresses to uniquely
identify a TCP connection.
When the server end closes the connection first,
and the client reacts by immediately closing its side, the sequence of packets
that goes back and forth is like this:
1. Server sends FIN
2. Client responds with ACK
3. Client sends FIN (can be merged with the
previous ACK packet if the client is fast enough closing its
connection)
4. Server responds with ACK
Consider what happens if any of these packets is
lost.
1. If the server's FIN is lost, the server
will time out on the ACK response and resend the FIN. The client end
doesn't know the connection is closing until it gets the retried
FIN.
2. If the client's ACK is lost, the server
will time out on the ACK response and resend the FIN. The client end discards
the repeat FIN as a duplicate and resends the ACK.
3. If the client's FIN is lost, the client
will time out on the ACK response and resend the FIN. The server end doesn't
know that the client has closed its half of the connection until it gets the
retried FIN.
4. If the server's ACK is lost, the client will
time out on the ACK response and resend the FIN. The server end discards the
repeat FIN as a duplicate and resends the ACK.
After sending the final ACK, the station which closed the connection
first (in this case, the server) enters the TIME_WAIT state. There is a
timeout on this state, typically two minutes. During this time, the server
is waiting in case the client missed the final ACK and it needs to be
resent. At the end of that timeout, it is assumed that the client got the
ACK and the server fully closes its connection.
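
To make the sequence concrete, here is a rough trace of the states each end passes through when the server closes first. The state names are the standard ones from RFC 793; the transition table and the `run` helper are made up for this illustration, cover only this one scenario, and say nothing about how lwIP implements it.

```python
# Simplified trace of an orderly close where the server closes first.
# Only the transitions needed for this scenario are listed.
TRANSITIONS = {
    # Side that closes first (the server in this example)
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "timeout"): "CLOSED",
    # Side that closes second (the client in this example)
    ("ESTABLISHED", "recv FIN, send ACK"): "CLOSE_WAIT",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("LAST_ACK", "recv ACK"): "CLOSED",
}

def run(events, state="ESTABLISHED"):
    """Apply the events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        visited.append(state)
    return visited

server_states = run(["send FIN", "recv ACK", "recv FIN, send ACK", "timeout"])
client_states = run(["recv FIN, send ACK", "send FIN", "recv ACK"])

print("server:", " -> ".join(server_states))
print("client:", " -> ".join(client_states))
```

Only the end that closed first passes through TIME_WAIT; the other end goes to CLOSED as soon as its FIN is acknowledged.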
If the server is in this state, and the client
tries to establish a new connection using the same IP address and port number at
both ends, then it is regarded as part of the same connection as the previous
one. A connection can't be reopened while it is established or in the closing
states (SYN is only valid in the opening stages), so the server rejects or
ignores the client's connection attempt.
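
A minimal sketch of why the reconnect fails, assuming the server keeps a table keyed by the four values that identify a connection. `ConnectionTable`, its methods, the 120 second constant and the example addresses are all hypothetical and only model the behaviour described above; a real stack is considerably more involved.

```python
import time

TIME_WAIT_SECONDS = 120  # assumed value, matching the "two minutes" above

class ConnectionTable:
    """Hypothetical, much simplified model of a server's connection table."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        # (server_ip, server_port, client_ip, client_port) -> expiry time
        self._time_wait = {}

    def close_first(self, conn):
        """The server closed this connection first, so it lingers for a while."""
        self._time_wait[conn] = self._clock() + TIME_WAIT_SECONDS

    def accept_syn(self, conn):
        """Return True if a SYN for this 4-tuple may start a new connection."""
        expiry = self._time_wait.get(conn)
        if expiry is not None and self._clock() < expiry:
            return False  # still regarded as part of the old connection
        self._time_wait.pop(conn, None)
        return True

# A fake clock keeps the example deterministic.
now = [0.0]
table = ConnectionTable(clock=lambda: now[0])

old_conn = ("192.0.2.1", 80, "192.0.2.2", 50000)
new_port = ("192.0.2.1", 80, "192.0.2.2", 50001)

table.close_first(old_conn)
same_port_early = table.accept_syn(old_conn)   # same client port, too soon
other_port_early = table.accept_syn(new_port)  # different client port
now[0] = 121.0
same_port_late = table.accept_syn(old_conn)    # after the timeout

print(same_port_early, other_port_early, same_port_late)
```

With the same client port the SYN is refused until the timeout has passed, while a different client port is accepted straight away.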
Some TCP/IP stacks might be clever about this and
recognise a new connection (SYN) with the same port number by cutting short the
TIME_WAIT delay and immediately closing the old connection. lwIP probably
doesn't do that.
The problem can easily be avoided by having the
client use a different port number for opening a new connection. The normal
convention for TCP is that the client allocates a random or incrementing port
number in the high range (49152+) for each new connection. Unless it happens to
pick the one which was used to connect to the same port on the same server
within the last two minutes, this won't be confused with a previous connection
in the TIME_WAIT state.
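
With a BSD-style sockets API the usual way to get a fresh local port is simply not to choose one. The sketch below uses Python's standard `socket` module against a throwaway listener on the loopback interface; `connect_with_fresh_port` is a name invented for this example, and the exact range the ports come from depends on the operating system.

```python
import socket

def connect_with_fresh_port(server_addr):
    """Connect without choosing a local port, so the stack picks one."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("", 0))  # port 0 asks the stack for an unused local port
    sock.connect(server_addr)
    return sock

# Throwaway local listener so the example is self-contained.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
server_addr = listener.getsockname()

first = connect_with_fresh_port(server_addr)
first_port = first.getsockname()[1]
first.close()

second = connect_with_fresh_port(server_addr)
second_port = second.getsockname()[1]
second.close()

listener.close()
print("local ports used:", first_port, second_port)
```

The explicit `bind` to port 0 is only there to make the point visible; calling `connect` on an unbound socket has the same effect.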
If the client end closes the connection first, then
the server goes straight to the CLOSED state as soon as it gets the final ACK.
In that case, it is the client which is waiting for two minutes in the
TIME_WAIT state. This will prevent the client reusing the same port number,
because it hasn't been fully closed yet.
If the server and client close the connection at
the same time (their FINs cross in transit) then they will both be in the
TIME_WAIT state for two minutes after both ACKs have been sent and
received.
The SO_LINGER socket option (with a linger time of zero) is typically used to
modify the close sequence so that it will do an "abortive close" by sending a
RST instead of FIN. This can result in loss of data which hasn't been sent or
acknowledged yet. It also doesn't seem to be a complete solution: if the client
uses SO_LINGER and sends RST but the server doesn't receive it, then the client
still won't be able to open a new connection using the same local port number,
because the server won't have fully closed the previous connection. If the
server uses SO_LINGER then the client might miss the RST and not know
the connection was closed.
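
For reference, an abortive close looks roughly like this with Python's standard `socket` module. This is a sketch under assumptions: the `struct linger` layout of two C ints is what Linux and most POSIX systems use (other platforms may differ), `abortive_close` is a name invented here, and whether a given lwIP build supports SO_LINGER is not something this example establishes.

```python
import socket
import struct

def abortive_close(sock):
    """Close with linger enabled and a zero timeout, so RST is sent, not FIN."""
    # Assumed layout: struct linger { int l_onoff; int l_linger; }
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()

# Loopback demonstration: the peer sees a reset rather than an orderly close.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

client = socket.create_connection(listener.getsockname())
server_side, _ = listener.accept()
server_side.settimeout(5)  # avoid hanging forever if no reset arrives

abortive_close(client)

try:
    server_side.recv(1)
    peer_saw_reset = False  # an orderly close would return b"" here
except ConnectionResetError:
    peer_saw_reset = True

server_side.close()
listener.close()
print("peer saw reset:", peer_saw_reset)
```

On loopback the RST is not going to be lost, so this shows only the normal case and not the failure modes described above.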
Is there a particular reason that you need to use
the same port number for re-establishing the connection? It is far easier if you
just wait two minutes before reconnecting, or use a different local port number
for each connection.