lwip-users
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lwip-users] [PATCH] fix warning for gcc and possible unaligned acce


From: pedro alves
Subject: Re: [lwip-users] [PATCH] fix warning for gcc and possible unaligned access
Date: Tue, 25 Apr 2006 16:11:41 +0100
User-agent: Thunderbird 1.5.0.2 (Windows/20060308)

Hi all,

Timmy Brolin wrote:
This line:
printf ("tm.b = 0x%x\n", (unsigned long)tm.b);
Would most likely cause a crash.


No it won't. It is doing a conversion between a unsigned long with alignment of 1 to an unsigned long with natural alignment.

Look at this probably clearer example:

#include <stdio.h>

struct ip_addr
{
 unsigned long addr;
};

struct test_me
{
 unsigned char a;
 unsigned long b;
 unsigned char c;
} __attribute__ ((packed));

struct test_me2
{
 unsigned char a;
 struct ip_addr b;
 unsigned char c;
} __attribute__ ((packed));

/*
* extract unaligned field to show alignment is not a problem.
*/
struct ip_addr test(struct test_me2* tm)
{
 return tm->b;
}

/*
* extract unaligned field to show alignment is not a problem.
*/
unsigned long test2(struct test_me2* tm)
{
 return tm->b.addr;
}


int main(int argc, char** argv)
{
/*volatiles to hinder compiler optimizing loads and stores to the same variables. */
/*a char has an alignment of 1 */
 volatile char dummy_offset_alignment_for_next_var;
/*this probably gets an unaligned address */
 volatile struct test_me tm;
/*these are required to get aligned properly */
 volatile unsigned long i;
 volatile unsigned long i2;
 unsigned long tmp;

/*write to a and c first, and then to b, to show that writing to b doesn't corrupt neither a nor b*/
tm.a = 0xdd;
tm.c = 0xee;
/* write to packed field through unpacked variable */
i = 0x55ffaa33;
tm.b = i;
/* read from packed field into unpacked variable */
 i2 = tm.b;

/*just to confirm we are really packed, better would be to use offsetof */
printf("sizeof(struct test_me) = 0x%x\n", sizeof(struct test_me));

/* convert unsigned char to unsigned long, as the %x formatter expects an unsigned integer sized argument in its va_list */
tmp = tm.a;
printf ("tm.a = 0x%x\n", tmp);

/* convert packed unsigned long to unsigned long, as the %x formatter expects an ALIGNED unsigned integer sized argument in its va_list */
tmp = tm.b;
printf ("tm.b = 0x%x\n", tmp);

/* convert unsigned char to unsigned long, as the %x formatter expects an unsigned integer sized argument in its va_list */
tmp = tm.c;
printf ("tm.c = 0x%x\n", tmp);

printf ("i2 = 0x%x\n", i2);

return 0;
}



The compiler knows the types are the same, but have different alignments and copies them correctly.

The problem with alignments and casts show up when you do type punning like:

struct test_me tm;
unsigned long tmp = *(unsigned long*)&tm.b;

"Get address of member b, cast that address to unsigned long* type and then get the value pointed by it."

The problem here is that (unsigned long*) means a pointer to a unsigned long with natural alignment, which has in most architectures an alignment requirement > 1.
We lose packing information in this case.

But this works:
unsigned long tmp = tm.b;
because the compiler knows that b has an alignment of 1.

What else would packed structs be useful for, if you couldn't read and write to their unaligned members with operator= syntax?


To show how the compiler treats packed fields, I compiled this test with gcc-4.0 for h8s target with:
h8300-elf-gcc test_align.c -g -c -O3 -fomit-frame-pointer -o test_align_h8.o
and then disassembled it with:
h8300-elf-objdump -DS test_align_h8.o > test_align_h8.s

Here is the disassembling  of the test2 function:

unsigned long test2(struct test_me2* tm)
{
 18:   0d 02           mov.w   r0,r2
 1a:   6e 01 00 03     mov.b   @(0x3:16,r0),r1h
 1e:   6e 09 00 04     mov.b   @(0x4:16,r0),r1l
 22:   6e 20 00 01     mov.b   @(0x1:16,r2),r0h
 26:   6e 28 00 02     mov.b   @(0x2:16,r2),r0l
 return tm->b.addr;
}
2a: 54 70 rts Here is a the disassembling of the test function. Slightly more complicated, because the return address is a struct. Those are written to the address pointed to by er0, a hidden parameter. Parameter 0 is er1 in this case.

00000000 <_test>:
0: 6e 13 00 01 mov.b @(0x1:16,er1),r3h 4: 6e 1b 00 02 mov.b @(0x2:16,er1),r3l
  8:    0d 3b           mov.w    r3,e3
  a:    19 33           sub.w    r3,r3
  c:    6e 1a 00 03     mov.b    @(0x3:16,er1),r2l
 10:    14 a3           or.b    r2l,r3h
 12:    1a a2           sub.l    er2,er2
 14:    6e 1a 00 04     mov.b    @(0x4:16,er1),r2l
 18:    01 f0 64 32     or.l    er3,er2
 1c:    01 00 69 82     mov.l    er2,@er0
20: 54 70 rts
Should be easy to understand even for someone that never seen h8 assembly.
h8s has 32bit registers that are byte/word/long loadable/storable. ex: er0 -> e0+r0 -> r0h + r0l. It does NOT support unaligned accesses. Doing an unaligned access in this machine simply reads/writes to/from the wrong address. No exception is raised. Notice that the compiler reads the packed member a byte at a time starting at offset 0x1, and constructs a 32 value in er2, and finally returns it in
er0, the return register in h8s ABI.

Again,
If the packing is removed from struct ip_addr, struct ip_addr2 is removed, and hdr is an instance of a packed struct,
then there is no need to memcpy, the compiler does the right thing.

A simple:
sipaddr = hdr->sipaddr;
will do.

Cheers,
Pedro Alves

The problem is that if the address of tm.a is for example 0x1000, then the address of tm.b will be 0x1001. If you try to perform a 32bit read or write operation on tm.b then it will crash if the CPU does not support unaligned accesses.

If you leave out the packed attribute, then the compiler will insert three dummy bytes between tm.a and tm.b to solve the unaligned access problem. This is of course not an option in the lwip case since we use structs to describe packet headers.

Regards,
Timmy Brolin

Pedro Alves wrote:

Timmy Brolin wrote:

Curt McDowell wrote:

Pedro Alves wrote
Seems like the right thing to do.
If the packing is removed from struct ip_addr, and struct ip_addr2 is removed, then there is no need to memcpy anymore.

A simple:

sipaddr = hdr->sipaddr;

will do.


memcpy may still be needed because hdr->sipaddr is misaligned at +2 due to the format of the ARP packet. My theory is that struct ip_addr2 was originally invented because the 4-byte struct/uint32 copy was crashing on someone's 2-byte
aligned CPU!

Regards,
Curt McDowell
Broadcom Corp.

Either ip_addr2 or memcpy is needed.
The code "sipaddr = hdr->sipaddr;" would crash on any CPU which do not support unaligned accesses.


huh? Not if struct ip_addr is NOT packed, hdr is an instance of a packed struct, and sipaddr is a struct ip_addr, not struct ip_addr2.

Are you saying that:


#include <stdio.h>

struct test_me
{
  unsigned char a;
  unsigned long b;
  unsigned char c;
} __attribute__ ((packed));

int main(int argc, char** argv)
{
 volatile char dummy_offset_alignment_for_next_var;
 volatile struct test_me tm;
 volatile unsigned long i;
 volatile unsigned long i2;

 tm.a = 0xdd;
 tm.c = 0xee;
 i = 0x55ffaa33;
 tm.b = i;
 i2 = tm.b;

 printf("sizeof(struct test_me) = 0x%x\n", sizeof(struct test_me));
 printf ("tm.a = 0x%x\n", (unsigned long)tm.a);
 printf ("tm.b = 0x%x\n", (unsigned long)tm.b);
 printf ("tm.c = 0x%x\n", (unsigned long)tm.c);
 printf ("i2 = 0x%x\n", i2);

 return 0;
}

Won't work as expected? That would beat the whole purpose of __attribute__((packed)), no?

P.S.: No use testing in x86, as that supports unaligned accesses anyway.

Cheers,
Pedro Alves

The current ip_addr2 solution is technically more efficient than memcpy since it copies the ip_addr with just two 16bit reads+writes. The only solution which would be slightly more efficient is to copy hdr->sipaddr to a 32bit register using two 16bit accesses, and then write the 32bit register to sipaddr. But since this is not a critical code path, it is more important to keep the code as clean as possible rather than trying to save one single memory access.
The ip_addr2 solution makes the code nice and clean.

Regards,
Timmy Brolin





reply via email to

[Prev in Thread] Current Thread [Next in Thread]