Re: [lwip-users] TCP retransmission flooding at end of stream

From: Michael Steinecke
Subject: Re: [lwip-users] TCP retransmission flooding at end of stream
Date: Thu, 18 Sep 2014 02:55:34 -0700 (MST)

Sergio R. Caprile wrote
>>>/ The FW Library bug in the Ethernet IRQ, eating fast packets is fixed./
>>So no, this does not seem to be the standard STM issue...
> Oh, I see, missed that part. Should we believe the vendor ? (terrified
> face)

I think there is another related bug as well. The semaphore that signals new
packets is a binary one; it should be a counting one. At least once I saw
the semaphore already full because packets arrived in quick succession, and
the later packet was lost.
ethernetif.c Line 280 in low_level_init():
  s_xSemaphore = osSemaphoreCreate(osSemaphore(SEM) , 1 );
should be:
  s_xSemaphore = xSemaphoreCreateCounting(ETH_RXBUFNB, 0);
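
To illustrate why the binary semaphore loses wake-ups, here is a minimal sketch in plain C. It is only a toy model, not the FreeRTOS or CMSIS-RTOS implementation; all names (model_sem_t, model_sem_give, model_sem_take, wakeups_after_burst) are made up for this example. It assumes that a "give" on a full semaphore simply fails, and that several frames arrive before the RX task gets to run.

```c
#include <stddef.h>

/* Toy model of a semaphore (hypothetical, NOT the RTOS implementation). */
typedef struct {
    unsigned count;      /* number of pending signals */
    unsigned max_count;  /* 1 models a binary semaphore, >1 a counting one */
} model_sem_t;

/* Models the "give" from the RX interrupt: it fails once the semaphore is full. */
int model_sem_give(model_sem_t *s)
{
    if (s->count >= s->max_count) {
        return 0;        /* signal is lost */
    }
    s->count++;
    return 1;
}

/* Models a non-blocking "take" in the RX task. */
int model_sem_take(model_sem_t *s)
{
    if (s->count == 0) {
        return 0;
    }
    s->count--;
    return 1;
}

/* 'burst' frames arrive back to back before the RX task runs.
 * Returns how many times the RX task is woken afterwards. */
unsigned wakeups_after_burst(unsigned max_count, unsigned burst)
{
    model_sem_t s = { 0u, max_count };
    unsigned i;
    unsigned wakeups = 0;

    for (i = 0; i < burst; i++) {
        (void)model_sem_give(&s);
    }
    while (model_sem_take(&s)) {
        wakeups++;
    }
    return wakeups;
}
```

In this model, wakeups_after_burst(1, 3) gives 1 (two signals lost), while wakeups_after_burst(4, 3) gives 3. A driver whose RX task drains all pending descriptors on each wake-up would be less sensitive to this, so whether frames are actually lost depends on how the input task is written.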

Sergio R. Caprile wrote
> - Frame 16: bad FCS on ARP response from MCU to PC, why ? 

Good one! I had an old version of Wireshark that did not report this.
I'm currently doing a bit of research, and it seems that my RX driver
calculates the total length of the pbuf incorrectly. Later on, the
packet is reused by etharp for the response. I guess this could lead to
the wrong CRC value. On the other hand, the CRC is calculated by HW, so it
should always be correct, right? What happens when lwIP or the ETH detects a
corrupt CRC on RX? Is the packet lost or dropped?
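
For reference, the length rule I am checking my driver against is that in a pbuf chain each tot_len equals the pbuf's own len plus the tot_len of the next pbuf (or just len for the last one). A minimal sketch in plain C, using a stand-in struct instead of lwIP's struct pbuf; model_pbuf, model_set_tot_len and model_chain_is_consistent are hypothetical names for this example only:

```c
#include <stddef.h>

/* Stand-in for lwIP's struct pbuf with only the fields used here (hypothetical). */
struct model_pbuf {
    struct model_pbuf *next;
    unsigned short tot_len;  /* this buffer plus all following buffers */
    unsigned short len;      /* this buffer only */
};

/* Set tot_len for every element of the chain, assuming each len is already correct. */
void model_set_tot_len(struct model_pbuf *head)
{
    unsigned total = 0;
    struct model_pbuf *q;

    for (q = head; q != NULL; q = q->next) {
        total += q->len;
    }
    for (q = head; q != NULL; q = q->next) {
        q->tot_len = (unsigned short)total;
        total -= q->len;
    }
}

/* Returns 1 if tot_len == len + next->tot_len holds along the whole chain, else 0. */
int model_chain_is_consistent(const struct model_pbuf *p)
{
    for (; p != NULL; p = p->next) {
        unsigned next_tot = (p->next != NULL) ? (unsigned)p->next->tot_len : 0u;
        unsigned expected = (unsigned)p->len + next_tot;

        if ((unsigned)p->tot_len != expected) {
            return 0;
        }
    }
    return 1;
}
```

A check like model_chain_is_consistent() could be dropped into the RX path as a debug assertion to catch a wrong length before the pbuf is handed to the stack.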

Sergio R. Caprile wrote
> - Your DHCP on UDP port 55555, turn it off, just in case, you don't seem
> to be using it

The UDP 55555 broadcast is a discovery broadcast of our application protocol
and is in use. Turning off DHCP made no difference.

Sergio R. Caprile wrote
> - Frame 2094: Yes, 2058's ACK has been seen, but 2057's not. Then, Seq#s
> jump at sometimes more than 1460, so some frames were lost, some not.
> - Frame 2162(3,5) ARP request is not seen by lwIP, frame loss
>   You are definitely having an event that triggers frame losses. Where is
> it, I can tell.
>   You said this is a custom board, I had once something like this where my
> driver went out of sync with the eth chip by incorrectly reading available
> bytes.
>   Please run known to work code first, this looks to me like an eth driver
> problem

I will have a closer look and possibly switch back to the original driver.

Sergio R. Caprile wrote
> - You say you are using tcp_poll() to enqueue data. Don't do that if you
> aim for performance, that is just to avoid state machines on connection
> closures and some other good stuff, not for streaming data.
>   You should start sending your data from your tcp_recv() parsing the
> request and then keep steady sending from your tcp_sent()

So, what is the best way to handle the following scenarios?
The device is a measurement device that is always acquiring data from external
ADCs at 5-20 kHz, normally 18 bytes per sample. The data is transferred by SPI
and DMA to a cache in an FMC-connected SD-RAM. The pointer handling must be done
at the highest-priority IRQ level, even higher than ETH.

a) The client (PC) requests a large file from the SD card. Currently, I pass
the request from tcp_recv() to the SD card gatekeeper task, which reads the
data page by page (64k). As soon as one page has been read, it queues a
structure with some commands and control variables as well as the
pointer to the memory. That command is read from the queue by tcp_poll() and
the data is written using tcp_write(). Eventually, the data is freed in
tcp_sent(). Also, if the send buffer was full, tcp_sent() enqueues
the remaining data. Between the initial request from the client and the end of
the transfer, I don't get any new client requests.

b) Nearly the same scenario, but the data comes not from the SD card but
directly from the SD-RAM of the current acquisition. As soon as a
big block of data is available, a command like the one in case a) is posted to
the same queue. Scenario b) can be started by a client request or by an
external trigger.
The responsible tasks run at a lower priority than the tcp thread.
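
As I understand the suggestion, the flow would be: queue as much of a page as fits into the send buffer, and refill from the sent callback each time acknowledged bytes free up space. Below is a rough sketch of that flow in plain C. It is only a model and does not call lwIP; model_stack, model_xfer, model_send_more and model_on_sent are hypothetical names, and the comments note where I assume the real tcp_sndbuf(), tcp_write(), tcp_output() and tcp_sent() callback would come in.

```c
#include <stddef.h>

/* Hypothetical state of one page being transferred. */
struct model_xfer {
    const unsigned char *data;  /* start of the page */
    size_t total;               /* page size in bytes */
    size_t queued;              /* bytes handed to the stack so far */
    size_t acked;               /* bytes acknowledged by the peer so far */
};

/* Stand-in for the stack's send buffer (what tcp_sndbuf() would report). */
struct model_stack {
    size_t sndbuf_free;
};

/* Queue as much as currently fits. With lwIP this is where tcp_write()
 * and tcp_output() would presumably be called. Returns bytes queued. */
size_t model_send_more(struct model_stack *st, struct model_xfer *x)
{
    size_t remaining = x->total - x->queued;
    size_t chunk = (remaining < st->sndbuf_free) ? remaining : st->sndbuf_free;

    st->sndbuf_free -= chunk;
    x->queued += chunk;
    return chunk;
}

/* Stand-in for the sent callback: 'len' bytes were acknowledged, so that
 * much send buffer space is free again. Returns 1 once the whole page is
 * acknowledged (the point where the page could be released), else 0. */
int model_on_sent(struct model_stack *st, struct model_xfer *x, size_t len)
{
    st->sndbuf_free += len;
    x->acked += len;
    if (x->acked == x->total) {
        return 1;
    }
    (void)model_send_more(st, x);
    return 0;
}
```

In this model the first model_send_more() call would be made when the page becomes available (from the request handling), and everything after that is driven by model_on_sent(), so the poll callback is not needed to keep the stream going.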

Sergio R. Caprile wrote
>> According to lwip_stats there is no memory leak and no packet drop
> Well, lwIP can only count *packet* drops, not *frame* loss.
> And memory leak is tricky, is it possible you are freeing a wrong pointer
> or in the wrong place ? Try sending and freeing at the same place, that is
> tcp_sent(), let tcp_poll() aside for now.
> Check the web server or smtp client sending functions.

Currently I don't believe there is a memory issue. The behavior is
independent of the amount of data used. However, I agree that frames are
being lost for some reason. I'll have a closer look at the drivers again, then
go back to the examples.

Thanks for your input!
