[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: [lwip-devel] Curious struct packing issue - is it GCC?
From: |
Bill Auerbach |
Subject: |
RE: [lwip-devel] Curious struct packing issue - is it GCC? |
Date: |
Mon, 27 Apr 2009 16:21:46 -0400 |
>Shouldn't the compiler do that inlining job for you? Isn't there an
>option for gcc for that?
I didn't check because it will do a byte-by-byte copy and using u16_t's will
be half the code and memory cycles.
>> I thought I would copy u16_t in half the copies since either
>> everything is u32_t aligned for my processor (NIOS II - GCC as
>> mentioned) or **should** be u16_t aligned for IP related items.
>>
>
>Why that? For which reason should stack-internal data be 16-bit aligned?
Things in headers will be u16_t aligned. The once place we SMEMCPY to a
header is the netif mac address which wasn't aligned.
>The struct netif is an lwIP-internal struct and thus does not need to be
>packed. The only structs that need packing are those used in protocol
>headers/data, since their layout is determined by the network/protocol.
>
>As such, I don't think you can rely on all data being u16_t aligned to
>achieve good performance: every struct member of an non-packed struct
>should be aligned as it is required - thus, a char array can always be
>non aligned! After all, you still have the alignment-problem if setting
>LWIP_NETIF_HOSTNAME to 1.
Thank you! That was my ignorance - I didn't realize a char array would
always be byte aligned. Makes sense. If it's u16_t or u32_t aligned, then
a more efficient inlined SMEMCPY can be used. Shouldn't we do what we can
so we're not using memcpy to copy 4, 6,or 14 bytes? Especially as often as
it's done sending packets?
>I'm not sure I understand you here: The packed struct members you are
>talking about are the IP-addresses, right? Is there anything unaligned
>after the 3 addresses? If so, that would be a bug, yes.
No, we're good. The netif mac address appeared packed. You got it - it
need not be since it's a char type. My bad.
>Did I understand you correctly that you want to remove the u8_t to get
>hwaddr aligned? That might work now, but I don't think every developer
>changing that struct will remember such a subtle alignment issue...
It would be nice if the hwaddr could be aligned (u16_t) if one wanted the
inlined SMEMCPY moving u16_t's.
>Aside from that, we explicitly *don't* use ETHERP_HWADDR_LEN throughout
>the stack, only in places that are ethernet-related (etharp.c,
>ethernetif.c, example port ethernet drivers). This is necessary to use
>the stack in non-ethernet environments!
Sorry - I didn't know that other environments had different sized hwaddr's.
It would be shorter? Or NETIF_MAX_HWADDR_LEN has to be changed to
accommodate it. Seems to me that hwaddr_len and a u8_t *hwaddr (pointer to)
should be filled in by the low level init code. Code which already stores a
MAC address can supply a pointer to it and it can be any size without a
#define to set the maximum size. I read (as many do) an EEPROM of the MAC
address into a RAM buffer which I also have to copy into the netif.
Wouldn't it be slightly better to store the pointer to the hwaddr?
>Hm, I guess that's up to the compiler: But since it doesn't know the
>alignemnt of src and dst, it probably won't... Is there any disadvantage
>in keeping the SMEMCPY?
If you want to inline as I did for the shorter copies. I'll look at the GCC
option. The memcpy call/return overhead far exceeds the time for 4-byte and
we copy IP addresses often.
>To sum it up, I think the idea of that copy define is quite handy, but
>it should by all means be optional and keep working even if some
>developers change structs to bad alignment.
I completely agree. I worked within the confines of SMEMCPY so others can
benefit from performance increases that don't add a ton of code. But that
SMEMCPY can inline only the small ones and leave a memcpy in place for the
28 byte copy, unless someone wants 100+ inline bytes of code to inline the
word copy.
Bill