Re: [lwip-devel] What is the correct behavior for missing the reception of an ACK
Mon, 24 May 2021 15:43:18 +0000
Thanks for replying - especially so quickly!
>>Cool to hear you're still working with lwIP! You might be one of the few left
>>here from even before my start with lwIP :-)
Yeah, old timer here... I check in every once in a while and see you're still
going strong answering posts - a lot of them, I might add.
>>That sounds a bit strange: segments should not be held back here: TCP has the
>>sliding window of segments in flight.
>>Segments should only be held back if there's no space in the send window.
>>However, sending 1 segment per second and
>>aborting if there is no update in 2 seconds seems a bit hard: if you're
>>losing 2 ACKs like you lost the first, you're lost.
It's surely on my end. Is it true that a missed ACK - with no data - isn't
really a problem, because the next ACK carries a larger
acknowledgment number, and the sender concludes from that higher ACK that the
data covered by the missed, lower ACK must have arrived?
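To make sure I'm asking the right question, here is a minimal sketch of how I understand cumulative ACK handling on the sending side. It is not lwIP's code: `struct sender`, `seq_lt`, `seq_leq` and `on_ack` are names I made up for illustration, and it ignores windows, retransmission timers and duplicate-ACK counting.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sender state for illustration (not lwIP's struct tcp_pcb). */
struct sender {
    uint32_t snd_una; /* oldest sequence number not yet acknowledged */
    uint32_t snd_nxt; /* next sequence number to be sent */
};

/* Wrap-safe comparisons for 32-bit sequence numbers. */
static bool seq_lt(uint32_t a, uint32_t b)  { return (int32_t)(a - b) < 0; }
static bool seq_leq(uint32_t a, uint32_t b) { return (int32_t)(a - b) <= 0; }

/* Process one incoming ACK number.
 * Returns the number of bytes newly acknowledged; 0 means the ACK was a
 * duplicate, stale, or acknowledged data that was never sent. */
static uint32_t on_ack(struct sender *s, uint32_t ackno)
{
    if (seq_lt(s->snd_una, ackno) && seq_leq(ackno, s->snd_nxt)) {
        uint32_t newly = ackno - s->snd_una;
        s->snd_una = ackno; /* everything below ackno is now confirmed */
        return newly;
    }
    return 0;
}
```

With three 1460-byte segments in flight starting at sequence 1000, losing the ACK for 2460 and then receiving the ACK for 3920 acknowledges 2920 bytes in one step, so under this model the missed ACK costs nothing as long as a later one arrives.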
>>Anyway, can you provide a pcap trace of such a situation? It would be best if
>>it included the connection setup phase
>>as well (but not necessarily if it's too far apart).
I stopped capturing because, from the PC side, the capture all looks good:
the PC sees no error when it sends the ACK
that is then missed. I do see the stall. Sometimes there are DUPs sent from lwIP,
but mostly the stream continues as if nothing
went wrong. I'll start capturing again. I'm now running 4 or 5 devices at my
desk since that makes it occur sooner.
When I get this resolved - if possible, and I think it's hardware - I'll set up a
test where an ACK is forcibly dropped and see
if the behavior continues. Then I can say "If you do this you'll see the
stall". By the way, we'll never use SOPC and NIOS
again. In fact, we never invested in moving to QSYS, which may be why we're
now seeing problems. Old hardware and
old code. But another NIOS device with the same chip, driver and lwIP never
shows this problem.
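For the forced-drop test mentioned above, something along these lines is what I have in mind. It is only a sketch: `is_pure_tcp_ack`, `struct ack_dropper` and `should_drop` are hypothetical helpers, not lwIP or driver APIs, and I assume untagged Ethernet II frames carrying unfragmented IPv4. The driver's receive path would call `should_drop()` on each raw frame before handing it to the stack.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the frame is Ethernet II / IPv4 / TCP with ACK set, none of
 * SYN/FIN/RST set, and no TCP payload. The IP total length is used rather
 * than the frame length, so Ethernet padding on short frames is tolerated. */
static bool is_pure_tcp_ack(const uint8_t *f, size_t len)
{
    if (len < 14 + 20 + 20) return false;              /* minimum headers */
    if (f[12] != 0x08 || f[13] != 0x00) return false;  /* EtherType IPv4 */

    const uint8_t *ip = f + 14;
    if ((ip[0] >> 4) != 4) return false;               /* IP version */
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;           /* IP header bytes */
    if (ihl < 20 || ip[9] != 6) return false;          /* protocol 6 = TCP */

    size_t ip_total = ((size_t)ip[2] << 8) | ip[3];
    if (ip_total < ihl + 20 || 14 + ip_total > len) return false;

    const uint8_t *tcp = ip + ihl;
    size_t thl = (size_t)(tcp[12] >> 4) * 4;           /* TCP header bytes */
    if (thl < 20 || ihl + thl > ip_total) return false;

    uint8_t flags = tcp[13];
    if ((flags & 0x10) == 0) return false;             /* ACK must be set */
    if ((flags & 0x07) != 0) return false;             /* FIN, SYN or RST */

    return ip_total == ihl + thl;                      /* zero payload */
}

/* Drops every every_n-th pure ACK; every_n == 0 disables dropping. */
struct ack_dropper {
    uint32_t every_n;
    uint32_t seen;
};

static bool should_drop(struct ack_dropper *d, const uint8_t *f, size_t len)
{
    if (d->every_n == 0 || !is_pure_tcp_ack(f, len)) return false;
    if (++d->seen >= d->every_n) {
        d->seen = 0;
        return true;
    }
    return false;
}
```

Counting only pure ACKs keeps data segments and connection setup untouched, so any stall that shows up can be attributed to the lost ACK alone.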
>>Having pcaps of win7 and win10 to see what's different would be interesting
>>as well, but I don't think that's really the
My guess is that the timing changed and is breaking something in our code. We do
push the limit on the Cyclone III. We have
lots of code in onchip memory and we're using a hardware checksum (in the FPGA).
>>I'd probably move to more than 2 seconds anyway after we found the reason for
>>this, but (depending on your lwIP
>>configuration), it should probably still work, so I'd debug this first.
I'll continue - I don't think it's lwIP, and if it is, it's because of dropped
packets which shouldn't be occurring anyway.
>>What are your TCP-specific settings? Do you have any transmit queue or
>>transmit window limitations that would make the
>>stack wait for this single ACK before allowing to enqueue or send more data?
No limits. TCP_WND is 65534, TCP_MSS is 1460, TCP_SND_BUF is 32768 - the
remainder are the defaults. Lots of PBUF_RAW (3000 RX & 3000 TX).
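For reference, the three values above as they would appear in lwipopts.h, plus some rough arithmetic on how many full-size segments they allow. The two enum names are my own for illustration and are not lwIP options. Keep in mind that TCP_WND is our receive window, so what we can actually have in flight also depends on the peer's advertised window and the congestion window.

```c
/* lwipopts.h fragment: only the values stated above; all other TCP
 * options are assumed to be left at their opt.h defaults. */
#define TCP_MSS      1460
#define TCP_WND      65534
#define TCP_SND_BUF  32768

/* Illustrative arithmetic only (hypothetical names, not lwIP options). */
enum {
    FULL_SEGS_IN_SND_BUF = TCP_SND_BUF / TCP_MSS, /* 32768 / 1460 = 22 */
    FULL_SEGS_IN_RCV_WND = TCP_WND / TCP_MSS      /* 65534 / 1460 = 44 */
};
```

So the send buffer alone has room for 22 full segments, which is why I would not expect these settings to force the stack to wait on one particular ACK.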
Thank you Simon!