Re: [lwip-users] netconn_write blocking
From: Frédéric BERNON
Subject: Re: [lwip-users] netconn_write blocking
Date: Tue, 9 Oct 2007 22:44:09 +0200
Hi,
When you say "lots of messages about the queue being full", can you tell me
exactly which messages you got?
I think that if you wait long enough (several minutes), you should get an
error. lwIP does have a kind of "timeout" for TCP "write", but the default
values in opt.h and tcp.h are too big (from my point of view). I am mainly
thinking of TCP_MAXRTX, which you could reduce. This is the number of
retransmissions lwIP performs before it aborts a TCP connection whose
segments are not acknowledged. When you unplug your "peer", your server
continues to send packets until it fills the "tcp send buffer". Even then,
your "write" doesn't return as long as the segment is not "enqueued" (it
retries with do_writemore each time the tcp_sent callback is invoked). Since
the cable is unplugged, the TCP segments you send are never acknowledged, so
the "slow" TCP timer tries to resend them (TCP assumes these segments may
have been lost in the network, so this is a normal TCP retransmission). It
resends them TCP_MAXRTX times, with delays that grow "exponentially" rather
than "linearly". After that, it aborts the connection. If you can do a
Wireshark capture, I suppose you will see these retransmissions (that is
what I did, see below). So the "solution" is to reach TCP_MAXRTX faster. To
do that, you can:
- Reduce TCP_MAXRTX in your lwipopts.h (you can try 4)
- Reduce TCP_TMR_INTERVAL in your lwipopts.h (you can try 100)
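As a minimal sketch, the two overrides could look like this in lwipopts.h,
using the values suggested above. It assumes that the limit for data segment
retransmissions is the TCP_MAXRTX option (the name used in the attached
capture files; TCP_SYNMAXRTX covers SYN segments only) and that
TCP_TMR_INTERVAL is expressed in milliseconds. Check opt.h and tcp.h in your
lwIP version before relying on it.

```c
/* lwipopts.h (excerpt) - values are the suggestions from this message,
 * not tuned recommendations. */

/* Assumed option name: maximum retransmissions of a data segment
 * before lwIP aborts the connection. */
#define TCP_MAXRTX        4

/* Assumed unit: milliseconds. The TCP timers are derived from this
 * interval, so a smaller value makes the retransmissions come sooner. */
#define TCP_TMR_INTERVAL  100
```

Lower values make a dead peer get detected sooner, but they also make lwIP
give up earlier on a link that is only slow or lossy, so try them against
your real network first.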
I discussed lwIP's retransmission implementation with Kieran in these
emails (it is not exactly the same case, but the cause is the same; be
careful, I describe a dirty hack there, don't use it, it was just an
experiment):
http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00061.html
http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00062.html
http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00063.html
I attach some captures I made during these tests, but I can't remember the
TCP_TMR_INTERVAL value I used. What you can see in "TCP_MAXRTX=12.cap" is
that it takes up to 412 seconds before the connection is aborted (we can't
see anything in the capture at that point: lwIP doesn't send any RST packet
when it aborts the connection). You can also see that the delay between
retransmissions increases (it doubles at first, until it reaches a maximum
value). It uses the tcp_backoff table that can be found in tcp.c:
const u8_t tcp_backoff[13] ={ 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7};
In "TCP_MAXRTX=6.cap", you can see that the abort is reached faster.
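To get a feel for why halving the retransmission count shortens the wait so
much, here is a rough sketch. It assumes the entries of tcp_backoff are
used as left-shift counts applied to a base retransmission timeout, and
that the n-th wait lasts base_rto << tcp_backoff[n]. The helper
estimate_abort_delay is hypothetical (not part of lwIP), and the real
timing also depends on the RTT estimate and on the timer granularity, so
don't expect it to reproduce the captures exactly.

```c
#include <stdint.h>

/* Shift table as quoted from tcp.c in this message. */
static const uint8_t backoff_shift[13] =
    { 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7 };

/*
 * Hypothetical helper, not an lwIP function: rough total wait before the
 * abort, in the same unit as base_rto, assuming the n-th wait lasts
 * base_rto << backoff_shift[n]. max_rtx is clamped to the table size.
 */
uint32_t estimate_abort_delay(uint32_t base_rto, unsigned max_rtx)
{
    uint32_t total = 0;
    unsigned n;

    if (max_rtx > 13) {
        max_rtx = 13;
    }
    for (n = 0; n < max_rtx; n++) {
        total += base_rto << backoff_shift[n];
    }
    return total;
}
```

Under these assumptions, estimate_abort_delay(1, 6) gives 126 units and
estimate_abort_delay(1, 12) gives 894 units, so going from 12 to 6
retransmissions cuts the wait by much more than half.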
I hope this helps...
----- Original Message -----
From: <address@hidden>
To: "Mailing list for lwIP users" <address@hidden>
Sent: Tuesday, October 09, 2007 9:28 PM
Subject: Re: [lwip-users] netconn_write blocking
Hi,
Hi!
I have an application that is sending out TCP data to several clients
using the sequential API. When a client disconnects gracefully,
netconn_write returns a negative value, and I can close the connection.
However, if any of the clients locks up (I'm using embedded clients), or
a cable gets unplugged, etc., netconn_write keeps queuing packets until it
fills up the buffer, and then blocks. I've been playing around with
debugging, and so far all I get is lots of messages about the queue being
full.
It complains about the queue being full?? That would be a misconfiguration
and maybe an error in your port! The queues should never be full! That's
why sys_arch_mbox_post has no return value, and the port should assert to
check that a queue is never full. Misconfiguration could lead to this: too
big TCP windows vs. too small queues...
But to be sure about this, could you post an excerpt of your debug output
so that I know which function / file complains?
My question is what is the proper way to deal with ungraceful
disconnections using the sequential API? Am I doing something wrong,
should netconn_write return an error for ungraceful disconnections, or is
there any other way to check for connection timeouts?
Unfortunately, there is only RX timeout currently. TX timeout is planned,
I think...
Simon
TCP_MAXRTX=12.cap
Description: Binary data
TCP_MAXRTX=6.cap
Description: Binary data