[lwip-devel] Issue with Cortex Emac driver
From: RAc
Subject: [lwip-devel] Issue with Cortex Emac driver
Date: Thu, 5 Nov 2015 06:51:15 -0700 (MST)
Hi everybody,
I have come across this issue several times now and believe it may be of
interest to some of you; if not or if this should be a dupe, sorry for the
noise...
When running lwIP on top of an RTOS, you may find that under heavy
stress load (e.g. several concurrent instances of fast pings running for
several minutes) your network response deteriorates, with your sniffer
revealing that no packet is lost but some are delayed significantly. For
example, if your system is in that state and you cut the stress test down to
one fast ping, n-1 out of every n pings may time out, but the sniffer trace
will reveal that they are in fact processed, only well after the timeout has
expired.
As far as I can tell, this is simply a bug in the emac driver. Typically,
the control flow is something like this (pseudo code of course):
Ethernet ISR:
    if (receiver has caused an interrupt) signal Rx semaphore;

Ethernet input task:
    for(;;)
    {
        wait on Rx semaphore;
        if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
        {
            copy descriptor contents to allocated pbuf;
            current_descriptor->Status |= ETH_DMARXDESC_OWN;
            forward current_descriptor to next in chain;
            process packet or signal tcp thread to process the packet
            asynchronously
        }
    }
(Of course, there is more work due to possibly fragmented packets.)
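For readers who prefer C to pseudo code, the input task body above might be sketched roughly as follows. This is only an illustration against a simulated ring: the descriptor layout, RX_RING_LEN, RX_BUF_SIZE, rx_ring_init(), sim_dma_write() and rx_poll_once() are hypothetical names and not taken from any particular driver, and a plain byte buffer stands in for the pbuf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RX_RING_LEN 4u                 /* hypothetical ring size */
#define RX_BUF_SIZE 64u                /* hypothetical per-descriptor buffer */
#define ETH_DMARXDESC_OWN 0x80000000u  /* set: descriptor belongs to the DMA */

typedef struct rx_desc {
    volatile uint32_t Status;  /* OWN bit; other status bits omitted */
    uint32_t length;           /* bytes the DMA stored in buf */
    uint8_t buf[RX_BUF_SIZE];
    struct rx_desc *next;      /* chained ring */
} rx_desc_t;

static rx_desc_t rx_ring[RX_RING_LEN];
static rx_desc_t *current_descriptor;

/* Hand every descriptor to the DMA and point the driver at the first one. */
void rx_ring_init(void)
{
    for (size_t i = 0; i < RX_RING_LEN; i++) {
        rx_ring[i].Status = ETH_DMARXDESC_OWN;
        rx_ring[i].length = 0;
        rx_ring[i].next = &rx_ring[(i + 1u) % RX_RING_LEN];
    }
    current_descriptor = &rx_ring[0];
}

/* Simulation stand-in for the DMA: store a frame, release the descriptor. */
void sim_dma_write(size_t idx, const uint8_t *data, uint32_t len)
{
    assert(idx < RX_RING_LEN && len <= RX_BUF_SIZE);
    memcpy(rx_ring[idx].buf, data, len);
    rx_ring[idx].length = len;
    rx_ring[idx].Status &= ~ETH_DMARXDESC_OWN;
}

/* Loop body after the Rx semaphore was taken. Copies one frame into out and
 * returns its length, or -1 if current_descriptor is still DMA-owned. */
int rx_poll_once(uint8_t *out, size_t out_size)
{
    rx_desc_t *d = current_descriptor;
    if (d->Status & ETH_DMARXDESC_OWN)
        return -1;                        /* nothing ready: back to sleep */
    size_t n = d->length < out_size ? d->length : out_size;
    memcpy(out, d->buf, n);               /* stands in for the pbuf copy */
    d->Status |= ETH_DMARXDESC_OWN;       /* give the descriptor back */
    current_descriptor = d->next;         /* advance the trailing pointer */
    return (int)n;
}
```

Note that a return value of -1 is the "go back to sleep" case: the loop does nothing at all until the next semaphore signal.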
All of this works fine as long as the (leading) hardware descriptor pointer
and the (trailing) software descriptor pointer are in sync. However, if for
some reason (race conditions or the like) the software descriptor pointer
goes out of sync and points to a descriptor still owned by DMA, the infinite
loop will simply go back to sleep and wait for the next interrupt to fire
the semaphore, and it will only process the packet once the DMA has
completely filled up the ring buffer, making the current position of the
trailing pointer valid. This causes a sawtooth pattern in which one packet
sort of pushes the entire chain of outstanding buffers to be processed.
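The stall can be reproduced with a toy model of the ring. The sketch below is an assumption-laden simplification (one wakeup per received frame, no fragmentation, a hypothetical RING_LEN of 4, and all names invented for illustration); it only counts how many frames must arrive before the driver processes anything when the software index starts sw_offset descriptors ahead of the hardware index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_LEN 4u  /* hypothetical ring size */

typedef struct {
    bool dma_owns[RING_LEN];  /* models the OWN bit of each descriptor */
    size_t hw;                /* next descriptor the DMA fills (leading) */
    size_t sw;                /* descriptor the driver inspects (trailing) */
} ring_model_t;

/* All descriptors start DMA-owned; sw_offset puts the driver out of sync. */
void model_init(ring_model_t *r, size_t sw_offset)
{
    assert(r != NULL);
    for (size_t i = 0; i < RING_LEN; i++)
        r->dma_owns[i] = true;
    r->hw = 0;
    r->sw = sw_offset % RING_LEN;
}

/* DMA stores one frame; returns false if no descriptor is available. */
bool model_dma_receive(ring_model_t *r)
{
    if (!r->dma_owns[r->hw])
        return false;
    r->dma_owns[r->hw] = false;
    r->hw = (r->hw + 1u) % RING_LEN;
    return true;
}

/* Original task logic: one wakeup handles at most the current descriptor. */
bool model_task_wake(ring_model_t *r)
{
    if (r->dma_owns[r->sw])
        return false;                 /* out of sync: back to sleep */
    r->dma_owns[r->sw] = true;        /* processed, returned to the DMA */
    r->sw = (r->sw + 1u) % RING_LEN;
    return true;
}

/* Frames that must arrive before the driver processes its first one
 * (0 if it never does within one lap of the ring). */
size_t frames_until_first_processed(size_t sw_offset)
{
    ring_model_t r;
    model_init(&r, sw_offset);
    for (size_t arrived = 1; arrived <= RING_LEN; arrived++) {
        if (!model_dma_receive(&r))
            break;
        if (model_task_wake(&r))
            return arrived;
    }
    return 0;
}
```

In this model an in-sync driver (offset 0) processes the first frame immediately, while an offset of RING_LEN - 1 means nothing is processed until the whole ring has been filled.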
Interestingly enough, in networks with lots of traffic (e.g. those with lots
of ARP broadcasts) there will be no visible adverse effects, as the chain of
packets will be retriggered frequently enough, eventually serving all
packets more or less in a timely fashion. It'll only show up in well-behaved
nets, in which your traffic appears to "hang" until retriggered.
Even though the best solution would be to fix the race condition that caused
the ring buffer descriptor pointers to go out of sync, I found that the
following addition appears to solve the problem alright:
Ethernet input task:
    for(;;)
    {
        wait on Rx semaphore;
        if (!(current_descriptor->Status & ETH_DMARXDESC_OWN))
        {
            copy descriptor contents to allocated pbuf;
            current_descriptor->Status |= ETH_DMARXDESC_OWN;
            forward current_descriptor to next in chain;
            process packet or signal tcp thread to process the packet
            asynchronously
        }
        else // NEW!!!
            forward current_descriptor to next in chain until a descriptor
            not owned by DMA is encountered;
    }
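A possible C rendering of the added else branch is sketched below. It is not the actual patch; it assumes that the scan should stop at the first descriptor whose OWN bit is clear, and it limits the search to one lap so the task cannot spin forever when the whole ring is still DMA-owned. The names rx_desc_t, RX_RING_LEN, rx_ring_link() and rx_resync() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_RING_LEN 4u                 /* hypothetical ring size */
#define ETH_DMARXDESC_OWN 0x80000000u  /* set: descriptor belongs to the DMA */

typedef struct rx_desc {
    volatile uint32_t Status;  /* OWN bit; other status bits omitted */
    struct rx_desc *next;      /* chained ring */
} rx_desc_t;

/* Chain RX_RING_LEN descriptors into a ring, all initially DMA-owned. */
void rx_ring_link(rx_desc_t *ring)
{
    assert(ring != NULL);
    for (size_t i = 0; i < RX_RING_LEN; i++) {
        ring[i].Status = ETH_DMARXDESC_OWN;
        ring[i].next = &ring[(i + 1u) % RX_RING_LEN];
    }
}

/* Starting at start, walk the chain for at most one lap and return the
 * first descriptor not owned by the DMA, or NULL if every descriptor is
 * still DMA-owned (nothing to resync to). */
rx_desc_t *rx_resync(rx_desc_t *start)
{
    assert(start != NULL);
    rx_desc_t *d = start;
    for (size_t i = 0; i < RX_RING_LEN; i++) {
        if (!(d->Status & ETH_DMARXDESC_OWN))
            return d;
        d = d->next;
    }
    return NULL;
}
```

In the input task this would be used roughly as "if the current descriptor is DMA-owned, call rx_resync(current_descriptor) and, if the result is not NULL, continue from there", leaving current_descriptor unchanged when NULL comes back.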
Any input or suggestions for better fixes are welcome, thanks!
--
View this message in context:
http://lwip.100.n7.nabble.com/Issue-with-Cortex-Emac-driver-tp25326.html
Sent from the lwip-devel mailing list archive at Nabble.com.