avr-gcc-list
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [avr-gcc-list] Linker script patch with __flashN size checking. [Was


From: Georg-Johann Lay
Subject: Re: [avr-gcc-list] Linker script patch with __flashN size checking. [Was: Handling __flash1 and .trampolines]
Date: Sun, 16 Dec 2012 15:03:38 +0100
User-agent: Thunderbird 2.0.0.24 (Windows/20100228)

Erik Christiansen schrieb:
On 14.12.12 23:00, Georg-Johann Lay wrote:
The 64-bit ranges come from the ELPM instruction that takes the
_______^^^^^^

16-bit or 64KiB, of course

16-bits of Z-reg and concatenates RAMPZ as bits 16..23 to get a
24-bit address.

The common computer terms "paged" and "page aligned" seem to fit
perfectly here. AIUI, the Z-reg is a page register, providing base 2^16
page addresses, and RAMPZ provides an address within the page.
It is "page aligned", because any RAMPZ == 0 will always be 2^16 aligned
in flash.

The new linker script observes that constraint by starting each __flashN
aligned to 0x10000 page boundaries, i.e. on 0xN0000.

Attached you find my latest draft patch. Attached because it is easy to mess up a patch by copying a line too much or too less. Even if that line is empty or you miss one blank, patch will barf.

== Problems it solves ==

- .progmemN is located correctly or will raise an assertion. The assertion will only be thrown if .progmemN is non-empty.

- Sanity checking for: .lowtext, .trampolines, .jumptables

- There is .progmemx at the end of .text where __memx data will go.

- It seems to work with, say, -Ttext=0x20000 which is not uncommon with bootloaders.


== Caveats, new problems and problems it does not solve ==

- It is not possible to bump the location only if there is a need to do so, i.e. if .progmemN is non-empty. This wastes memory and blows the unconditionally blows binaries as it bumps data and text orphans to 0x30000 or higher, cf. [1].

- The .progmemN stuff is handled within .text so that there is no need to specify -j .flashN with objcopy.

- If the case of an overflow there is just a fixed text massage without an indication on how much the sections overflow. The user must trial-and-error change the source until she finds a configuration that passes all assertions, cf. [1] again.

- There is no test for progmem. I don't know how .progmem is used. If it's common practice to use GET_FAR_ADDRESS or similar kludge with progmem, there must be no assertion or otherwise it will break existing code. I guess Jan does know?

- size will report insanely high values because the location is bumped unconditionally, i.e. even for a trivial program the size will be at least 0x30000.

- It is no more possible to (with the new flash.sx)

     -Ttext=0x20000 -DBOOT=10 -Wl,--section-start=.bootloader=0x28000

  if one wants to place an orphan.  The linker will throw:

     ld.exe: section .bootloader loaded at [00028000,00028009]
     overlaps section .text loaded at [00020000,0002ffff]

  and similar without -Ttext=

- Diagnostic messages are printed twice.


The script moves a comment up that says ctors / dtors is needed in the lower 128Ki. I don't see a requirement for that. ctors goes like that:

__do_global_ctors reads stubs locations from .ctors, i.e. .ctors contains an array of gs() stubs. __do_global_ctors handles RAMPZ correctly while it reads from .ctors (by CALLing __tablejump_elpm__ so that the ctor will RET to __do_global_ctors. __tablejump_elpm__ performs an EIJMP that targets a stub in .trampolines which JUMPs to the ctor function.

Ditto for dtors.


== Conclusion ==

All in all, the current draft script is too intrusive and has too many negative side effects on existing code.

.progmemx will work smooth and can be added without problems, same for the more restrict .progmem pattern. avr-gcc can be extended to put __memx into .progmemx.data instead of in .progmem.data which will both work with the old and a new script (but with a new script it will work better because no low precious locations are consumed).

The linker script language appears to be too weak to express what's needed and without disturbing existing code in an inappropriate way.


I extended and attached the flash.sx torture source with the following defines:

PX      : Put data into .progmemx (prior to .progmem1)
BOOT    : Put data into some "ax" orphan section (.bootloader)
CTORS   : Add some constructors


Many thanks for the "stubs" explanation. I hadn't caught on to its
relationship to trampolines.

    .trampolines is subset of [EIND * 2^17, (1+EIND) * 2^17)

   .trampolines is subset of [EIND * 2^17, (1+EIND)]   ?

No, subset of

   [EIND * 2^17, (1+EIND) * 2^17)

which is the same as

   [EIND * 2^17, (1+EIND) * 2^17 -1]

The ) at the right matches the [ at the left and means that the rightmost value is omitted from the range / interval.

Most of the stuff I know (or believe to know) comes from
reverse engineering when I tried to clean up EIND usage in avr-gcc
(PR50820) and to add some notes and caveats to the docs.

Your knowledge is directly from the coalface, then - the most
hard-earned and real kind. I'll try to make the linker scripts dance, to
give us whatever you need to make that job easier.

NACK, the "real kind" of knowledge will not come from reverse engineering, it will come from proper documentation written by the original authors.

Reverse-engineering is very error prone and will lead to misunderstandings and will trick your intuition and experience to the wrong track. "real" knowledge is by design, not be digging in the dark.

file: avr6.patch

--- avr6.x-2.23 2012-12-13 13:42:18.000000000 +1100
+++ avr6.x-new  2012-12-15 21:55:31.000000000 +1100
@@ -70,13 +70,12 @@
   .rel.plt       : { *(.rel.plt)               }
   .rela.plt      : { *(.rela.plt)              }
   /* Internal text space or external memory.  */
-  .text   :
+  .lowtext :

Why that? I don't think we suppose people to objcopy -j .lowtext or otherwise their binaries will miss essential parts.

   {
     *(.vectors)
     KEEP(*(.vectors))
     /* For data that needs to reside in the lower 64k of progmem.  */
     *(.progmem.gcc*)
-    *(.progmem*)
     . = ALIGN(2);
      __trampolines_start = . ;
     /* The jump trampolines for the 16-bit limited relocs will reside here.  */
@@ -98,6 +97,28 @@
      __dtors_end = . ;
     KEEP(SORT(*)(.ctors))
     KEEP(SORT(*)(.dtors))
+    *(.progmem.data*)    /* Explicitly page 0 input sections */
+    _elowtext = . ;
+    x = ASSERT (. <= 0x10000, "Error: .lowtext (128KiB limit) overflow. Try 
shrinking .progmem?") ;

Message does not fit assertion.

Anyway, I really don't know if users abuse .progmem to drop their non-volatile data that exceeds 64Ki and use GET_FAR_ADDRESS on that.

morepgmspace.h says nothing about how to locate data and is not supported by avr-libc, see avr-libc!/include/avr which only brings pgmspace.h.

The GCC docs [3] promise that

 "This attribute [progmem] works similar to the section attribute
  but adds additional checking."

so that an additional sanity checking should be in order.


+  } > text
+
+  .flash1 0x10000 :

Again this needs objcopy -j .flash1. It's okay for me because it is a new feature and we don't have to be compatible with old stuff (Makefiles and alike). However, it's likely people will miss the -j .flash1 and complain because thy don't read the docs...

+  { *(.progmem1.data*)    /* Page 1 */
+    x = ASSERT (. <= 0x10000, "Error: __flash1 (64KiB limit) overflow. Need to 
shrink it.") ;
+  } > text
+
+  .flash2 0x20000 :
+  { *(.progmem2.data*)    /* Page 2 */
+    _eflash2 = . ;
+    x = ASSERT (_eflash2 <= 0x10000, "Error: __flash2 (64KiB limit) overflow. Need 
to shrink it.") ;
+  } > text
+
+  .flash3 0x30000 :
+  { *(.progmem3.data*)    /* Page 3 */
+  } > text
+
+  .hightext ALIGN(2) :

Again, need for -j .hightext breaks existing applications.

+  {
     /* From this point on, we don't bother about wether the insns are
        below or above the 16 bits boundary.  */
     *(.init0)  /* Start here after reset.  */
@@ -146,7 +167,8 @@
     KEEP (*(.fini0))
      _etext = . ;
   }  > text
-  .data          : AT (ADDR (.text) + SIZEOF (.text))
+
+  .data          : AT (_etext)

Why _etext?  This does not take into account orphans, or does it?


Johann

--

[1]
http://sourceware.org/ml/binutils/2012-12/msg00151.html

[2]
http://savannah.nongnu.org/patch/?6352#comment0

[3]
http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html#index-g_t_0040code_007bprogmem_007d-AVR-variable-attribute-2743

diff --git a/avr6.x b/avr6.x
index 3da58e9..57782e0 100644
--- a/avr6.x
+++ b/avr6.x
@@ -76,7 +76,8 @@ SECTIONS
     KEEP(*(.vectors))
     /* For data that needs to reside in the lower 64k of progmem.  */
     *(.progmem.gcc*)
-    *(.progmem*)
+    *(.progmem)
+    *(.progmem.*)
     . = ALIGN(2);
      __trampolines_start = . ;
     /* The jump trampolines for the 16-bit limited relocs will reside here.  */
@@ -85,11 +86,17 @@ SECTIONS
      __trampolines_end = . ;
     /* For future tablejump instruction arrays for 3 byte pc devices.
        We don't relax jump/call instructions within these sections.  */
+     __jumptables_start = . ;
     *(.jumptables)
     *(.jumptables*)
+     __jumptables_end = . ;
     /* For code that needs to reside in the lower 128k progmem.  */
+     __lowtext_start = . ;
     *(.lowtext)
     *(.lowtext*)
+     __lowtext_end = . ;
+    /* From this point on, we don't bother about wether the insns are
+       below or above the 16 bits boundary.  */
      __ctors_start = . ;
      *(.ctors)
      __ctors_end = . ;
@@ -98,8 +105,6 @@ SECTIONS
      __dtors_end = . ;
     KEEP(SORT(*)(.ctors))
     KEEP(SORT(*)(.dtors))
-    /* From this point on, we don't bother about wether the insns are
-       below or above the 16 bits boundary.  */
     *(.init0)  /* Start here after reset.  */
     KEEP (*(.init0))
     *(.init1)
@@ -144,6 +149,26 @@ SECTIONS
     KEEP (*(.fini1))
     *(.fini0)  /* Infinite loop after program termination.  */
     KEEP (*(.fini0))
+     __progmemx_start = . ;
+    *(.progmemx)
+    *(.progmemx.*)
+     __progmemx_end = . ;
+     . = MAX (1 << 16, ABSOLUTE(.)) ;
+     __progmem1_start = . ;
+    *(.progmem1)
+    *(.progmem1.*)
+     __progmem1_end = . ;
+     . = MAX (2 << 16, ABSOLUTE(.)) ;
+     __progmem2_start = . ;
+    *(.progmem2)
+    *(.progmem2.*)
+     __progmem2_end = . ;
+     . = MAX (3 << 16, ABSOLUTE(.)) ;
+     __progmem3_start = . ;
+    *(.progmem3)
+    *(.progmem3.*)
+     __progmem3_end = . ;
+    . = ALIGN(2);
      _etext = . ;
   }  > text
   .data          : AT (ADDR (.text) + SIZEOF (.text))
@@ -234,3 +259,47 @@ SECTIONS
   /* DWARF Extension.  */
   .debug_macro    0 : { *(.debug_macro) }
 }
+
+
+/* Some sanity checking: .trampolines, .lowtext and .jumptables must be
+   reachable with EIND = 0, i.e. must be located withing the first
+   64Ki word chunk of flash.  */
+
+__assert_jumptables
+  = ASSERT (__jumptables_start == __jumptables_end
+            || __jumptables_end <= 1 << 17,
+            "section overflow: .jumptables");
+
+__assert_lowtext
+  = ASSERT (__lowtext_start == __lowtext_end
+            || __lowtext_end <= 1 << 17,
+            "section overflow: .lowtext");
+
+/* Startup code from avr-libc initialized EIND with hh8(pm(.vectors))
+   thus the trampolines will work if they are located in such a way that
+   EIND = hh8(pm(.trampolines))  */
+
+__assert_stubs
+  = ASSERT (__trampolines_start == __trampolines_end
+            || (ADDR (.text) >> 17 == (__trampolines_end - 1) >> 17
+                && ADDR (.text) >> 17 == __trampolines_start >> 17),
+            "section overflow: .trampolines (linker stubs)");
+
+/* Sanity check the progmem stuff: .progmemN must be located within the
+   N-th 64Ki byte chunk.  The low end needs no checking because the MAX
+   with .progmemN sets the right start address.  */
+
+__assert_flash1
+  = ASSERT (__progmem1_start == __progmem1_end
+            || __progmem1_end <= 2 << 16,
+            "section overflow: .progmem1 (__flash1)");
+
+__assert_flash2
+  = ASSERT (__progmem2_start == __progmem2_end
+            || __progmem2_end <= 3 << 16,
+            "section overflow: .progmem2 (__flash2)");
+
+__assert_flash3
+  = ASSERT (__progmem3_start == __progmem3_end
+            || __progmem3_end <= 4 << 16,
+            "section overflow: .progmem3 (__flash3)");
#ifndef TEXT
#define TEXT 0
#endif

#ifndef LOWTEXT
#define LOWTEXT 0
#endif

#ifndef STUBS
#define STUBS 0
#endif

#ifndef CTORS
#define CTORS 0
#endif

#ifndef P0
#define P0 0
#endif

#ifndef P1
#define P1 0
#endif

#ifndef P2
#define P2 0
#endif

#ifndef P3
#define P3 0
#endif

#ifndef PX
#define PX 0
#endif

#ifndef BOOT
#define BOOT 0
#endif

.macro Flash n num
    .section .progmem\n\().data.foo\n\(),"a",@progbits
    .global flash\n\()_start
    .type flash\n\()_start,@object
    flash\n\()_start:
        .fill \num
    .size flash\n\()_start, .-flash\n\()_start
.endm


Flash  , P0
Flash 1, P1
Flash 2, P2
Flash 3, P3
Flash x, PX

.text
.global main
main:
.rept TEXT / 2
    nop
.endr
    
.rept STUBS
    0:
    ldi r16, lo8(gs(0b))
    ldi r17, hi8(gs(0b))
.endr
.size main, .-main


.section .lowtext,"ax",@progbits
.global lowfunc
lowfunc:
.rept LOWTEXT / 2
    nop
.endr
.size lowfunc, .-lowfunc

#if CTORS > 0
.global __do_global_ctors
.section .ctors,"a",@progbits
    .p2align 1
    .rept CTORS / 2
    .word       gs(main+16)
    .endr
#endif /* CTORS */

#if BOOT > 0
.section .bootloader,"ax",@progbits
.global bootloader
bootloader:
    .rept BOOT / 2
    nop
    .endr
#endif /* BOOT */

reply via email to

[Prev in Thread] Current Thread [Next in Thread]