Re: [lwip-devel] IP fragmentation

From: Thomas Taranowski
Subject: Re: [lwip-devel] IP fragmentation
Date: Thu, 10 May 2007 15:22:41 -0700

In general, I always try to design my software such that it respects the MTU of whatever network it's going to be operating on.  However, an application I want to support needs to receive 3k UDP datagrams from a remote target on a standard Ethernet network.  I think this is a perfectly reasonable and normal operation for lwIP to support, and is well within the bounds of the UDP protocol, which allows datagrams up to 65k in size (although very few stacks support a limit this high).

I like the idea of temporarily storing the fragments in their respective pbuf chains.  It makes sense to handle the incoming payloads as little as possible, so just keeping the fragments where they are rather than memcpying them into a reassembly buffer seems like a good idea.
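To make the idea concrete, here is a minimal standalone sketch of keeping each fragment in its own pbuf and linking it into a per-datagram chain sorted by fragment offset (plain C; the simplified `pbuf` struct and the function names are mine for illustration, not the actual lwIP code):

```c
#include <stddef.h>

/* Simplified stand-in for lwIP's struct pbuf -- illustrative only. */
struct pbuf {
    struct pbuf *next;   /* next fragment in the reassembly chain */
    unsigned offset;     /* fragment's byte offset within the datagram */
    unsigned len;        /* payload length of this fragment */
    char *payload;
};

/* Link a fragment into the chain in offset order, without copying
 * its payload anywhere.  Returns the (possibly new) chain head. */
static struct pbuf *reass_chain_insert(struct pbuf *head, struct pbuf *frag)
{
    struct pbuf **pp = &head;
    while (*pp != NULL && (*pp)->offset < frag->offset)
        pp = &(*pp)->next;
    frag->next = *pp;
    *pp = frag;
    return head;
}

/* The datagram is complete once the chain covers [0, total) with no holes. */
static int reass_chain_complete(const struct pbuf *head, unsigned total)
{
    unsigned expect = 0;
    for (const struct pbuf *p = head; p != NULL; p = p->next) {
        if (p->offset != expect)
            return 0;            /* hole before this fragment */
        expect += p->len;
    }
    return expect == total;
}
```

Fragments arriving out of order just slot into place, and nothing is memcpy'd until (at most) the final hand-off of the completed datagram.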

The only thing that bothers me about this solution is that a sizeable chunk of pbufs can sit in limbo waiting for the rest of their fragments to arrive.  If a large number of fragments are incomplete at any given moment, say because of a very unreliable link or greatly mismatched targets, the stack could exhaust its pool of pbufs and deadlock.  I would support a limit on the number of pbufs that can be used for fragmented packets at any one time, so that there are always pbufs in reserve to receive new packets.  Then again, maybe the reassembly timer is protection enough: every 3 seconds, or whatever it is set to, it flushes the stale fragments, releasing the previously bound pbufs.
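A sketch of the kind of limit I mean (the option name and counters here are made up for illustration, not existing lwIP code): a global count of pbufs tied up in reassembly, new fragments refused above a configured ceiling, and the count decremented when the timer flushes stale datagrams.

```c
/* Hypothetical config option: ceiling on pbufs held by reassembly. */
#define IP_REASS_MAX_PBUFS 10

static int reass_pbufs_used = 0;

/* Called before queueing a fragment: claim 'n' pbufs from the budget.
 * Returns 0 (and claims nothing) if the ceiling would be exceeded,
 * so pbufs always remain free for new, unfragmented traffic. */
static int reass_claim_pbufs(int n)
{
    if (reass_pbufs_used + n > IP_REASS_MAX_PBUFS)
        return 0;                /* caller should drop the fragment */
    reass_pbufs_used += n;
    return 1;
}

/* Called when the reassembly timer (e.g. every 3 s) frees a stale
 * datagram's fragments, or when a datagram completes. */
static void reass_release_pbufs(int n)
{
    reass_pbufs_used -= n;
}
```

The timer alone bounds how *long* pbufs can be stuck in limbo; the counter additionally bounds how *many* can be stuck at once, which is what prevents the deadlock.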

From reading the above, it sounds like we're trying to decide between a malloc/memcpy approach, which efficiently allocates heap space for fragments as they arrive, and the pbuf approach, which avoids the memcpy but isn't necessarily as efficient space-wise.  In some respects it depends on the implementation of malloc.  Many embedded OSes use a pool-based allocation scheme to handle fragmentation issues, while most general-purpose OSes use whatever suits their fancy.  That argues for some sort of configuration item to select between the two.  However, it seems to me that the malloc/memcpy implementation and the pbuf implementation would be largely divergent, leaving fairly large tracts of #ifdef'd code, which is something I'm not too keen on.  Maybe there is some sort of unified approach that could cleanly unite the two implementations and allow for the advantages of both.  If so, I opt for that approach.  If not, then I much prefer the pbuf approach.
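One way to keep the #ifdef'd tracts small would be a single shared entry point where only the storage step differs per configuration. A rough sketch (the config macro and function names are invented for illustration; this is not a proposed lwIP API):

```c
#include <string.h>

#define IP_REASS_COPY_TO_HEAP 1    /* hypothetical config switch */

struct frag { unsigned offset, len; const char *data; };

static char reassbuf[3000];        /* datagram reassembly area */

/* Single entry point shared by both strategies; only the storage
 * step sits behind the #if, so the divergent code stays contained. */
static void ip_reass_store(const struct frag *f)
{
#if IP_REASS_COPY_TO_HEAP
    /* malloc/memcpy flavour: copy the payload out of the pool pbuf
     * so it can be freed immediately. */
    memcpy(reassbuf + f->offset, f->data, f->len);
#else
    /* pbuf flavour: link the fragment's pbuf into a chain instead,
     * with no copy (chain bookkeeping elided in this sketch). */
    (void)f;
#endif
}
```

The rest of the reassembly logic (hole detection, timer flushing, hand-off to ip_input) would then be common to both configurations.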

On 5/9/07, Jonathan Larmour <address@hidden> wrote:
Goldschmidt Simon wrote:
>> The main problem I see, as mentioned in that task, is you
>> have to use up a whole PBUF_POOL_BUFSIZE (or even multiple)
>> worth of bytes from the pbuf pool.
>> Suppose PBUF_POOL_BUFSIZE is 256, and there is an MTU
>> somewhere between this host and the remote peer of 576 - not
>> uncommon given that's the internet standard. Then you may
>> waste 192 bytes per frag. Storing a 32K (for example) PDU
>> would take 57 pbufs, with about 11KiB wasted (ignoring pbuf
>> structure overheads).
> That is a general pool vs. heap issue. My proposal at least lets
> you use the pbuf pool without dead-locking the stack. To use it
> or not (e.g. 'wasting' the memory for it or not) would be the
> decision of the developer (and I chose to have mem_malloc() using
> pools, which also wastes a lot of RAM). The point is I would like
> to be able to choose. That way you can configure lwIP running slow
> on small targets or running fast on bigger targets (a marketing
> guy would probably say it scales?).

Personally I don't have trouble with things as config options.

>> Maybe we could copy them into PBUF_RAMs, rather than having a
>> separate buffer. I'm undecided whether that would be worth it.
> That would make the ip_frag code a little cleaner, but you could
> instead pass pbuf_refs pointing into ip_reassbuf to ip_input, so
> either way you would end up copying only once (from the received
> PBUF_POOL to ip_reassbuf or to PBUF_RAM), which is better than the
> current situation.

You still have to have the dedicated buffer with that approach. At least
memory for PBUF_RAMs can be shared with other parts of the system. There is
the overhead of the pbuf structure admittedly - probably 16 bytes +
possible alignment bytes. Hopefully that is small compared to the fragment
size which should be at a minimum 576 bytes. Arguably it could be a
footprint increase for some people, but I'm not sure it's enough to be
concerned about.
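For reference, the single-copy idea in the quoted text could look roughly like this (simplified types, not the real lwIP pbuf API): once the datagram is complete in the static reassembly buffer, hand ip_input a reference pbuf whose payload points straight into that buffer, rather than copying a second time.

```c
#include <stddef.h>

/* Simplified stand-ins; the real lwIP types differ. */
enum pbuf_kind { PBUF_REF_KIND };
struct pbuf {
    void *payload;
    unsigned len;
    enum pbuf_kind kind;
};

static char ip_reassbuf[3000];     /* static reassembly buffer */

/* Wrap the completed datagram in a reference pbuf: payload points
 * into ip_reassbuf, so no second memcpy happens before ip_input().
 * The buffer must stay untouched until the upper layers have
 * finished with the pbuf -- that is the cost of avoiding the copy. */
static void make_ref_pbuf(struct pbuf *p, unsigned datagram_len)
{
    p->payload = ip_reassbuf;
    p->len = datagram_len;
    p->kind = PBUF_REF_KIND;
}
```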

> At least you would allow easy switching between single-copy using
> PBUF_RAM and zero-copy using PBUF_POOL, which I think would be
> a good compromise.

If the config option can be cleanly done, then feel free :).

[path mtu discovery]
> Agree again. Do you want to implement it? ;-)

Heh. I wasn't proposing anyone implementing it right now (unless they want
to :-)). It was just a throwaway comment.

eCosCentric Limited      http://www.eCosCentric.com/     The eCos experts
Barnwell House, Barnwell Drive, Cambridge, UK.       Tel: +44 1223 245571
Registered in England and Wales: Reg No 4422071.
------["The best things in life aren't things."]------      Opinions==mine
