lwip-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lwip-devel] byte order, packing, optimizations


From: address@hidden
Subject: Re: [lwip-devel] byte order, packing, optimizations
Date: Mon, 15 Feb 2010 20:17:09 +0100
User-agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; de; rv:1.9.1.7) Gecko/20100111 Thunderbird/3.0.1

Bill Auerbach wrote:
I think (hope) we're in agreement that struct padding of 2 should be the
default (over 1) since there is benefit in doing so and there is no need to
ever pack by 1.
All the patches so far have the downside (to me) that they effectively require you to have incoming data aligned in a specific way. Up to now, we solved this the other way round by telling the compiler to load data from unaligned addresses.

I do think that pack(2) and asserting for packets to be aligned is good regarding speed, but worse regarding flexibility and portability of our code, as well as harder to track down bugs in the future (if we don't include appropriate checks for aligment).

I'm happy with adding defines to let stack work faster if you *know* your incoming packets are aligned, but I'm not happy with optimizing the actual source code for this. There are platforms which benefit from this while others don't. A good example is Stephane's proposed change from htonX + shift to accessing as u8_t pointer + offset: if your htonX-function is fast (e.g. a custom OP on the NIOS), the old code may be faster than loading 4 bytes (which can really be 4 loads when loading from a struct pointer).

All I'm saying here is that I'd like these performance-improvements to be tested on more platforms and making them overridable - this also implies that we can't have too many of them or anyone wanting to optimize fails to get an overview.
A comment might be nice stating padding of 2 or 4 is only
necessary if the processor has 16 or 32-bit alignment requirements.
Padding? Where?
The patch you attached does far more than that! And it's not clear to me
if/why we need all that. Also, it interferes with / task #10173
<http://savannah.nongnu.org/task/?10173>/ (Add IPCOPY and HWACOPY):
http://savannah.nongnu.org/task/?10173
I wondered if his patch in fact covers 10173.  And where does it leave us
with IPv6?
Personally, (and this is also a summary to the lines above) I prefer the macro patch over hard-coded u8_t-array access: if you know the packets will be aligned, this is faster, if you don't know what's best for your platform, the default version still "just works" (even if it is a bit faster than it could be). After all, there are people who don't want to spend time on increasing throughput!

Simon




reply via email to

[Prev in Thread] Current Thread [Next in Thread]