Re: [lwip-users] netconn_write blocking


From: Frédéric BERNON
Subject: Re: [lwip-users] netconn_write blocking
Date: Tue, 9 Oct 2007 22:44:09 +0200

Hi,

When you say "lots of messages about the queue being full", can you tell me exactly which messages you got?

I think that if you wait long enough (several minutes), you should get an error. There is in fact a kind of "timeout" in lwIP for TCP "write", but the default values in opt.h and tcp.h are too big (from my point of view). I am thinking mainly of TCP_SYNMAXRTX, which you could reduce. This is the number of retransmissions you have to wait for before lwIP aborts a TCP connection whose segments are not acknowledged.

When you unplug your "peer", your server continues to send packets until it fills the "tcp send buffer". Even then, your "write" doesn't return as long as the segment is not "enqueued" (it retries each time the tcp_sent callback is invoked, with do_writemore). Since the cable is unplugged, the TCP segments you send are never acknowledged, so the "slow" TCP timer tries to resend them (TCP considers that these segments may have been lost in the network, so this is a normal TCP retransmission). It tries to resend them TCP_SYNMAXRTX times, not at a "linear" rate but at an "exponential" one. After that, it aborts the connection. If you can do a Wireshark capture, I suppose you will see these retransmissions (that's what I did, see below).

So, the "solution" is to reach TCP_SYNMAXRTX faster. To do that, you can:

- Reduce TCP_SYNMAXRTX in your lwipopts.h (you can try 4)
- Reduce TCP_TMR_INTERVAL in your lwipopts.h (you can try 100)
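
A minimal lwipopts.h sketch of the two suggestions above. The values are only the starting points mentioned in this mail, not tuned recommendations. TCP_MAXRTX is added here as an assumption: the attached capture files are named after it, and in opt.h it is the retransmission limit for data segments, while TCP_SYNMAXRTX is the limit for SYN segments.

```c
/* lwipopts.h (fragment): illustrative values only */

/* Retransmission limit named in this mail (the opt.h default is larger). */
#define TCP_SYNMAXRTX     4

/* Assumption: limit for data segments, as in the capture file names. */
#define TCP_MAXRTX        4

/* TCP timer period in milliseconds; a smaller value makes the
   retransmission timers expire sooner. */
#define TCP_TMR_INTERVAL  100
```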

Kieran and I discussed the lwIP retransmission implementation in these emails. It is not exactly the same case, but the cause is the same. Be careful: I talk about a dirty hack there. Don't use it, it was just an experiment:

http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00061.html
http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00062.html
http://lists.nongnu.org/archive/html/lwip-devel/2007-09/msg00063.html

I attach some captures I made during these tests, but I can't remember the TCP_TMR_INTERVAL value I used. What you can see in "TCP_MAXRTX=12.cap" is that it takes up to 412 seconds before the connection is aborted (we can't see anything in the capture at that point, because lwIP doesn't send any RST packet when it aborts the connection). You can also see that the delay between retransmissions increases (it doubles at first, until it reaches a maximum value). It uses the tcp_backoff table that can be found in tcp.c:

const u8_t tcp_backoff[13] ={ 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7};
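
To get a feeling for how this table stretches the delay, here is a small standalone sketch (not lwIP code). It assumes that each table entry is used as a left-shift count on a base retransmission timeout, which matches the doubling seen in the captures. The function name, the initial un-shifted wait and the abstract time unit are assumptions made for illustration; RTT estimation and timer granularity are ignored.

```c
#include <stdint.h>

/* Same values as the table quoted above, with a standard C type. */
static const uint8_t tcp_backoff[13] =
    { 1, 2, 3, 4, 5, 6, 7, 7, 7, 7, 7, 7, 7 };

/*
 * Hypothetical helper: estimate the total wait (in the same unit as
 * base_rto) before the connection is aborted after max_rtx
 * retransmissions. Keep base_rto small enough that base_rto << 7
 * still fits in 32 bits.
 */
uint32_t estimate_abort_time(uint32_t base_rto, unsigned max_rtx)
{
    uint32_t total = base_rto;   /* assumed wait before the first resend */
    unsigned n;

    if (max_rtx > 13) {
        max_rtx = 13;            /* never index past the end of the table */
    }
    for (n = 0; n < max_rtx; n++) {
        /* wait that follows retransmission number n + 1 */
        total += base_rto << tcp_backoff[n];
    }
    return total;
}
```

With a base of 1 unit, this model gives 895 units for 12 retransmissions and 127 units for 6, so halving the limit shortens the wait by roughly a factor of seven.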

In "TCP_MAXRTX=6.cap", you can see that the abort is reached faster.

I hope this helps...


----- Original Message ----- From: <address@hidden>
To: "Mailing list for lwIP users" <address@hidden>
Sent: Tuesday, October 09, 2007 9:28 PM
Subject: Re: [lwip-users] netconn_write blocking



Hi,

Hi!
I have an application that is sending out TCP data to several clients using the sequential API. When a client disconnects gracefully, netconn_write returns a negative value, and I can close the connection. However, if any of the clients locks up (I'm using embedded clients), or a cable gets unplugged, etc., netconn_write keeps queuing packets until it fills up the buffer, and then blocks. I've been playing around with debugging, and so far all I get is lots of messages about the queue being full.
It complains about the queue being full?? That would be a misconfiguration and maybe an error in your port! The queues should never be full! That's why sys_arch_mbox_post has no return value, and the port should assert to check that a queue is never full. Misconfiguration could lead to this: too big TCP windows vs. too small queues...

But to be sure about this, could you post an excerpt of your debug output so that I know which function / file complains?
My question is: what is the proper way to deal with ungraceful disconnections using the sequential API? Am I doing something wrong, should netconn_write return an error for ungraceful disconnections, or is there any other way to check for connection timeouts?
Unfortunately, there is only RX timeout currently. TX timeout is planned, I think...


Simon



Attachment: TCP_MAXRTX=12.cap
Description: Binary data

Attachment: TCP_MAXRTX=6.cap
Description: Binary data

