Re: [lwip-users] FTP-DATA exchange: TCP issues

From: Jim Gibbons
Subject: Re: [lwip-users] FTP-DATA exchange: TCP issues
Date: Fri, 04 Mar 2005 10:50:34 -0800
User-agent: Mozilla Thunderbird 1.0 (Windows/20041206)

Pardon me again for barging in.  Kieran's analysis, particularly regarding an unmotivated retransmit, sounded very familiar.  I had a problem like this at one of my clients.  We changed two things, and the problem went away.

First, we found and fixed a problem with tcp_tmr: it was running in the wrong task context.  It must run in the tcpip thread.  The usual way to arrange this is to make the initial call to sys_timeout from within the callback function that executes when tcpip initialization is done.
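
A minimal sketch of that arrangement follows. The lwIP names (tcpip_init, sys_timeout, tcp_tmr, TCP_TMR_INTERVAL) are replaced here by local stand-ins so the fragment compiles on its own; in a real build they come from the lwIP headers, and the signatures should be checked against the version in use. The names tcp_timer_cb, tcpip_init_done and start_stack are hypothetical. Depending on the lwIP version, tcpip.c may already arm this timer itself, in which case no extra code is needed.

```c
#include <stddef.h>

/* ---- Stand-ins for lwIP (assumptions, not the real implementation) ---- */
typedef void (*sys_timeout_handler)(void *arg);
#define TCP_TMR_INTERVAL 250UL          /* milliseconds; value assumed */

static sys_timeout_handler pending_handler; /* last handler armed */
static unsigned long pending_msecs;         /* delay it was armed with */
static int tcp_tmr_calls;                   /* how often tcp_tmr() ran */

/* Real lwIP queues the timeout on the calling thread's timeout list. */
static void sys_timeout(unsigned long msecs, sys_timeout_handler h, void *arg)
{
    (void)arg;
    pending_msecs = msecs;
    pending_handler = h;
}

static void tcp_tmr(void) { tcp_tmr_calls++; }

/* Real lwIP starts the tcpip thread and calls initfunc from inside it;
 * this stand-in simply calls it directly. */
static void tcpip_init(void (*initfunc)(void *), void *arg)
{
    if (initfunc != NULL) {
        initfunc(arg);
    }
}

/* ---- The pattern itself ---- */

/* Timer handler: does the TCP timer work, then re-arms itself. Because
 * it was first armed from the tcpip thread, it keeps running there. */
static void tcp_timer_cb(void *arg)
{
    (void)arg;
    tcp_tmr();
    sys_timeout(TCP_TMR_INTERVAL, tcp_timer_cb, NULL);
}

/* Init-done callback: the first sys_timeout call is made here, so the
 * timeout belongs to the tcpip thread and not to an application task. */
static void tcpip_init_done(void *arg)
{
    (void)arg;
    sys_timeout(TCP_TMR_INTERVAL, tcp_timer_cb, NULL);
}

/* Hypothetical application entry point. */
static void start_stack(void)
{
    tcpip_init(tcpip_init_done, NULL);
}
```

The point of the design is that tcp_tmr is never called from an application task or a separate timer task, so it cannot run concurrently with the rest of the TCP code.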

Second, we found that we weren't using the lightweight protection option that I mentioned to you earlier.
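
The option meant here is presumably SYS_LIGHTWEIGHT_PROT. As a rough sketch, and assuming a port where disabling interrupts is an acceptable critical section, enabling it means setting the option in lwipopts.h and supplying sys_arch_protect and sys_arch_unprotect in the port. The interrupt flag below is simulated with a plain variable so the fragment stands alone; a real port would use its CPU or RTOS primitive instead.

```c
/* lwipopts.h: turn on lightweight protection. */
#define SYS_LIGHTWEIGHT_PROT 1

/* Port layer. sys_prot_t is normally defined by the port's own headers;
 * the definition here is an assumption for illustration. */
typedef unsigned int sys_prot_t;

/* Simulated interrupt-enable flag (stand-in for real hardware state). */
static unsigned int irq_enabled = 1;

/* Enter a critical section and return the previous state, so that
 * nested protect/unprotect pairs restore correctly. */
sys_prot_t sys_arch_protect(void)
{
    sys_prot_t old = irq_enabled;
    irq_enabled = 0;
    return old;
}

/* Leave the critical section by restoring the saved state. */
void sys_arch_unprotect(sys_prot_t pval)
{
    irq_enabled = pval;
}
```

Returning and restoring the previous state, not unconditionally re-enabling, is what makes nested critical sections safe.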

I think it was actually the first thing that was causing the retransmit problem, but we never found out for sure.  It's really difficult to track down resource conflicts.  When the problem went away, we stopped working on it.

Tom C. Barker wrote:
Thanks for your analysis Kieran. Forgive my assessment of 
what ACKs are what: I was speaking of the multiple ACKs 
the client sends back. ".65", the problem node, is in fact 
the lwIP ftp server.

I have all my DEBUG statements on and find that I never get
a tcp_enqueue of the missing packet. It just skips over it.
My only priority is this issue right now, so if you or anyone
has any ideas of what I can watch for, I am open to them. Meanwhile
I'm crafting a bit-patterned file to help identify where the 
problem is occurring.
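
One possible form for such a file, offered only as a sketch: store in every aligned 4-byte word the byte offset of that word, big-endian. Any captured payload then says directly which part of the file it came from. The function names are hypothetical and the word layout is an assumption, not something specified in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill buf so that each aligned 4-byte word holds its own byte offset,
 * most significant byte first. A trailing partial word gets the leading
 * bytes of its offset. */
static void fill_offset_pattern(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint32_t word_offset = (uint32_t)(i & ~(size_t)3);
        unsigned shift = 24u - 8u * (unsigned)(i & 3u);
        buf[i] = (uint8_t)(word_offset >> shift);
    }
}

/* Read back the offset stored in the aligned word starting at p. */
static uint32_t read_offset_word(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

For a 400k file every offset fits comfortably in 32 bits, so each word is unique and a misplaced payload is unambiguous.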


-----Original Message-----
From: address@hidden
[mailto:address@hidden]On Behalf
Of Kieran Mansley
Sent: Friday, March 04, 2005 1:29 AM
To: Mailing list for lwIP users
Subject: Re: [lwip-users] FTP-DATA exchange: TCP issues

On Thu, 2005-03-03 at 09:54 -0800, Tom C. Barker wrote:

Maybe to short-circuit this issue, I am working with 
0.7.2 and am in the process of moving to 1.1.0 so if 
the following problem resembles a bug prior to 1.1.0,
please let me know.

In testing an ftp implementation where I will occasionally 
successfully transfer a 400k file, I have come across a
consistently reproducible issue where my lwIP ftp server 
seems to have dropped an ACK in that according to the 
attached (truncated-packets) ethereal file, the packet on 
line 249 should have ACK'd 264364, but instead ACKs 267284. 
The rest of the (doomed) transaction is spent trying to 
shoehorn in a few packets to the client's unacked queue. 

Your description doesn't seem to match the trace that you've attached.
There is no packet there that ACKs 267284.  

However, there is clearly something going wrong in that data transfer.
The problem seems to me to start with packet 245, which (i) is a
retransmission (of packet 242) when none seems necessary and (ii)
doesn't have the same payload as the earlier transmission of the same
data.  Looks to me like packet 245 has got the wrong sequence number on
it, and it is in fact the payload of the next in-order packet.

Something similar happens with packet 244 and 247: 247 is a
retransmission of 244, but would not seem to be necessary, and this time
they both have the same payload.

What's more worrying is that the ".65" node then fails to retransmit the
correct data when it should: it gets many duplicate acknowledgements for
264364, which should lead it to retransmit that packet, but it refuses.
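
For reference, the behaviour expected here is standard fast retransmit: after three duplicate ACKs for the same sequence number, the sender resends the segment starting at that number without waiting for the retransmission timer. The following is a deliberately simplified sketch of the counting logic, not lwIP's code; a real stack also checks that the ACK carries no data, that the window is unchanged and that data is actually outstanding. All names are hypothetical.

```c
#include <stdint.h>

#define DUPACK_THRESHOLD 3u  /* conventional fast-retransmit threshold */

struct sender_state {
    uint32_t last_ack;   /* highest ACK number seen so far */
    unsigned dupacks;    /* consecutive duplicates of last_ack */
};

/* Process one incoming ACK number. Returns 1 exactly when the third
 * duplicate arrives, meaning the segment starting at ackno should be
 * retransmitted now; returns 0 otherwise. */
static int on_ack(struct sender_state *s, uint32_t ackno)
{
    if (ackno == s->last_ack) {
        s->dupacks++;
        return s->dupacks == DUPACK_THRESHOLD;
    }
    /* New ACK number: remember it and reset the duplicate count. */
    s->last_ack = ackno;
    s->dupacks = 0;
    return 0;
}
```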

I can't explain this in full, but hopefully that will give you some
clues about what might be wrong.  You could compare the captured
payloads against the file that is being transferred to check my theory
about 245 having the wrong sequence number.
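
That comparison could be automated roughly as follows. This is a sketch under assumptions: the file and one captured payload are already in memory, and the sequence number of the first data byte of the transfer is known from the trace. The helper names are hypothetical. If the offset at which the payload is actually found differs from the offset implied by its sequence number, the segment was sent with the wrong sequence number.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the first offset in file[] at which payload[] occurs,
 * or -1 if it does not occur at all. */
static long find_payload(const uint8_t *file, size_t file_len,
                         const uint8_t *payload, size_t payload_len)
{
    if (payload_len == 0 || payload_len > file_len) {
        return -1;
    }
    for (size_t off = 0; off + payload_len <= file_len; off++) {
        if (memcmp(file + off, payload, payload_len) == 0) {
            return (long)off;
        }
    }
    return -1;
}

/* File offset implied by a segment's sequence number. Unsigned
 * subtraction gives the right answer even if the sequence space
 * wraps during the transfer. */
static uint32_t expected_offset(uint32_t seq, uint32_t first_data_seq)
{
    return seq - first_data_seq;
}
```

With an ordinary file the same bytes may occur at more than one offset, so a first-match search can mislead; a file with unique content at every position avoids that.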


lwip-users mailing list

Jim Gibbons
Gibbons and Associates, Inc.
900 Lafayette, Suite 704, Santa Clara, CA
TEL: (408) 984-1441
FAX: (408) 247-6395