Date: Thu, 5 Nov 2015 06:51:15 -0700 (MST)
From: RAc <address@hidden>
To: address@hidden
Subject: [lwip-devel] Issue with Cortex Emac driver
Message-ID: <address@hidden>
Content-Type: text/plain; charset=us-ascii
Hi everybody,
I have come across this issue several times now and believe it may be of interest to some of you; if not, or if this should be a dupe, sorry for the noise...
When running lwIP on top of an RTOS, you may find that under heavy stress load (e.g. several concurrent instances of fast pings running for several minutes), your network response deteriorates, with your sniffer revealing that no packet is lost but that some are delayed significantly. For example, if your system is in that state and you cut the stress test down to one fast ping, n-1 out of every n pings may time out, but the sniffer trace will reveal that they are only processed well after the timeout expired.
As far as I can tell, this is simply a bug in the EMAC driver. Typically, the control flow is something like this (pseudo code, of course):
Ethernet ISR:

    if (receiver has caused an interrupt) signal Rx semaphore;

Ethernet input task:

    for(;;)
    {
        wait on Rx semaphore;
        if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
        {
            copy descriptor contents to allocated pbuf;
            current_descriptor->Status |= ETH_DMARXDESC_OWN;
            forward current_descriptor to next in chain;
            process packet or signal tcp thread to process the packet asynchronously;
        }
    }
(Of course, there is more work to do for possibly fragmented packets.)
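Fleshed out as a minimal C sketch of the input task's per-wakeup work (the ring size, descriptor layout, and function names here are placeholders of my own, not any particular vendor's HAL; ETH_DMARXDESC_OWN stands for the descriptor's OWN bit):

```c
#include <stdint.h>
#include <string.h>

#define RX_RING_SIZE      8
#define ETH_DMARXDESC_OWN 0x80000000u   /* set: descriptor is owned by the DMA */

typedef struct {
    uint32_t Status;            /* OWN bit plus frame status flags */
    uint32_t Length;            /* received frame length in bytes */
    uint8_t  Buffer[1536];      /* frame payload */
} rx_descriptor_t;

static rx_descriptor_t rx_ring[RX_RING_SIZE];
static unsigned rx_index;       /* trailing software descriptor pointer */

/* Called from the input task after the Rx semaphore fires.  Returns the
 * number of bytes copied into 'pbuf', or 0 if the current descriptor is
 * still owned by the DMA (i.e. nothing to process at our position). */
unsigned emac_rx_poll(uint8_t *pbuf, unsigned pbuf_len)
{
    rx_descriptor_t *d = &rx_ring[rx_index];

    if (d->Status & ETH_DMARXDESC_OWN)
        return 0;                            /* DMA still owns it: go back to sleep */

    unsigned len = d->Length < pbuf_len ? d->Length : pbuf_len;
    memcpy(pbuf, d->Buffer, len);            /* copy contents to the allocated pbuf */

    d->Status |= ETH_DMARXDESC_OWN;          /* hand the descriptor back to the DMA */
    rx_index = (rx_index + 1) % RX_RING_SIZE;  /* forward to next in chain */
    return len;
}
```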
All of this works fine as long as the (leading) hardware descriptor pointer and the (trailing) software descriptor pointer are in sync. However, if for some reason (race conditions or the like) the software descriptor pointer goes out of sync and points to a descriptor still owned by the DMA, the loop will simply go back to sleep and wait for the next interrupt to fire the semaphore - and that descriptor will only be processed once the DMA has completely filled up the ring buffer and wrapped around, making the current position of the trailing pointer valid again. This causes a sawtooth pattern in which one packet sort of pushes the entire chain of outstanding buffers through processing.
Interestingly enough, in networks with lots of traffic (e.g. those with lots of ARP broadcasts) there will be no visible adverse effects, as the chain of packets is retriggered frequently enough that all packets are eventually served more or less in a timely fashion. The problem only shows up in well behaved nets, in which your traffic appears to "hang" until retriggered.
Even though the best solution would be to fix the race condition that caused the ring buffer descriptor pointers to go out of sync, I found that the following addition appears to solve the problem alright:
Ethernet input task:

    for(;;)
    {
        wait on Rx semaphore;
        if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
        {
            copy descriptor contents to allocated pbuf;
            current_descriptor->Status |= ETH_DMARXDESC_OWN;
            forward current_descriptor to next in chain;
            process packet or signal tcp thread to process the packet asynchronously;
        }
        else // NEW!!!
            forward current_descriptor to next in chain until a descriptor
            no longer owned by the DMA is encountered;
    }
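The same sketch with the resynchronisation step added (again, names and layout are my own placeholders, not any vendor's HAL). The else path kicks in when we were woken up but our descriptor is still owned by the DMA, which is exactly the out-of-sync state described above; it skips ahead until it finds a descriptor the DMA has completed, giving up after one full lap so it cannot spin forever on an empty ring:

```c
#include <stdint.h>
#include <string.h>

#define RX_RING_SIZE      8
#define ETH_DMARXDESC_OWN 0x80000000u   /* set: descriptor is owned by the DMA */

typedef struct {
    uint32_t Status;
    uint32_t Length;
    uint8_t  Buffer[1536];
} rx_descriptor_t;

static rx_descriptor_t rx_ring[RX_RING_SIZE];
static unsigned rx_index;   /* trailing software descriptor pointer */

/* Returns bytes copied into 'pbuf', or 0 if no completed descriptor exists. */
unsigned emac_rx_poll(uint8_t *pbuf, unsigned pbuf_len)
{
    rx_descriptor_t *d = &rx_ring[rx_index];

    if (d->Status & ETH_DMARXDESC_OWN) {
        /* NEW: trailing pointer is out of sync -- forward it until we hit a
         * descriptor no longer owned by the DMA, at most one full lap. */
        unsigned lap;
        for (lap = 0; lap < RX_RING_SIZE; lap++) {
            rx_index = (rx_index + 1) % RX_RING_SIZE;
            d = &rx_ring[rx_index];
            if (!(d->Status & ETH_DMARXDESC_OWN))
                break;                   /* resynced onto a completed slot */
        }
        if (lap == RX_RING_SIZE)
            return 0;                    /* ring genuinely empty: back to sleep */
    }

    unsigned len = d->Length < pbuf_len ? d->Length : pbuf_len;
    memcpy(pbuf, d->Buffer, len);              /* copy contents to the pbuf */
    d->Status |= ETH_DMARXDESC_OWN;            /* hand back to the DMA */
    rx_index = (rx_index + 1) % RX_RING_SIZE;  /* forward to next in chain */
    return len;
}
```

Note that this only papers over the lost sync rather than preventing it, so it is a mitigation, not a root-cause fix.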
Any input or suggestions for better fixes welcome, thanks!