Re: [avr-gcc-list] Handling __flash1 and .trampolines [Was: .trampolines location.]

avr-gcc-list


From:	Georg-Johann Lay
Subject:	Re: [avr-gcc-list] Handling __flash1 and .trampolines [Was: .trampolines location.]
Date:	Thu, 13 Dec 2012 12:24:12 +0100
User-agent:	Thunderbird 2.0.0.24 (Windows/20100228)

Erik Christiansen wrote:

Warning:  Alternative solutions are offered for some of the sub-problems.
          Choices are clearer at the end.
          The best choice depends mostly on whether one default linker
          script should handle all the use cases mooted here.
          The reply is a bit long. (Grab a coffee ;-)


On 11.12.12 17:47, Georg-Johann Lay wrote:

[Beautifully explanatory example of AVR memory paging snipped]

This puts x into input section .progmem1.data and expects that this section is
located appropriately, in particular that  &x div 2^16  =  1

This allows 16-bit addresses to be used to address 0x10000...0x1ffff,
and similarly for the other __flashN.
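
A minimal sketch of the arithmetic behind "&x div 2^16 = 1", written as
host-side C rather than AVR code. The function names flash_page and
flash_offset are made up for illustration; nothing here is part of
avr-gcc or avr-libc.

```c
#include <assert.h>
#include <stdint.h>

/* 64 KiB flash page that holds a byte address, i.e. addr div 2^16. */
uint8_t flash_page(uint32_t byte_addr)
{
    assert(byte_addr < 0x1000000UL);   /* must fit in 24 bits */
    return (uint8_t)(byte_addr >> 16);
}

/* Offset inside that page: the part a 16-bit address can carry. */
uint16_t flash_offset(uint32_t byte_addr)
{
    assert(byte_addr < 0x1000000UL);
    return (uint16_t)(byte_addr & 0xffffUL);
}
```

An object meant for __flash1 is placed correctly only if flash_page()
of every byte it occupies is 1; the 16-bit address then carries just
flash_offset().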

However, .progmem1.data is not handled by the default linker script, i.e. will
match .progmem*


The easiest part is to stop gobbling any .progmemN.data into the .text
output section. (__flash IIUC). Just change the line:

*(.progmem*)

to:

*(.progmem.data)     /* We only want page 0 stuff here. */

Also needs *(.progmem.data*) or at least *(.progmem.data.*) because with
avr-gcc 4.7 up progmem is sensitive to -fdata-sections. I found no easy
way to have something like -mprogmem-sections that is selective and only
affects progmem in a way similar to -fdata-sections.
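
To see which input section names each pattern would pick up, here is a
small host-side sketch using POSIX fnmatch(). This is an assumption in
two ways: ld's own wildcard matching is only approximated by fnmatch(),
and the per-object section name ".progmem.data.table" used in the usage
note is a made-up example of what -fdata-sections might produce.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if the input section name matches the glob, else 0. */
int section_matches(const char *glob, const char *section)
{
    assert(glob != NULL && section != NULL);
    return fnmatch(glob, section, 0) == 0;
}
```

Under these assumptions, ".progmem*" also matches ".progmem1.data" (the
gobbling we want to stop), plain ".progmem.data" misses
".progmem.data.table", and ".progmem.data*" catches the latter while
still leaving ".progmem1.data" alone.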

The GCC code, in particular varasm.c, is horrible. It has hooks but as
soon as you want one bit more than other targets need you are stuck...

There are several ways to place the .progmemN.data input sections where
we want them, one way is to just grow the "text" memory segment in the
linker MEMORY model:

MEMORY
{
  text   (rx)   : ORIGIN = 0, LENGTH = 190K        /* 3 x 64k pages */
  boot   (rx)   : ORIGIN = 62K, LENGTH = 2K
  data   (rw!x) : ORIGIN = 0x800100, LENGTH = 4K
  eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = 2K
}

Now we tweak the end of the .text output section with:

   *(.progmem1.data)    /* Page 1 */
   *(.progmem2.data)    /* Page 2 */
   *(.progmem3.data)    /* Page 3 */
   _etext = . ;

   __code_end = . ;
} > text


(Or put the higher pages before the destructors?)

There are also restrictions to ctors / dtors that come from the startup
code bits from libgcc.

If that is done, then we have to engineer all page overflow erroring
ourselves, perhaps as described later. But we are cheating ourselves of
ld's help. If we let ld in on the secret of the memory model, then it
can detect page overflow without any effort from us. (Skip this option
if the __memx use case has to be handled as well, by the one linker script.)

We just need a separate memory segment for each physical page,
e.g. text, flash1, flash2:

MEMORY
{
  text   (rx)   : ORIGIN = 0, LENGTH = 62K

I don't think that is helpful, unless we want a linker script for each
scenario, which basically means that the user has to write her script and
juggle with .text, .progmem, .progmemN, .lowtext, .trampolines, ctors,
dtors, bootloader, whatever.

The AVR memory model is complicated thanks to its harvardness, but still
the users expect that -mmcu=mydevice produces a perfect executable with
however much program code or however much progmem data they stuff into
their sources.

If you propose to accommodate a specific setup by means of a custom
linker script, you will immediately get the response (e.g. on avrfreaks)
"linker script is too complicated for the user, don't propose that,
everything must run out of the box."

I understand that out-of-the-box solutions are convenient, but as the
number of kludges increases towards infinity and the number of supporters
decreases towards 0, it will lead to frustration sooner or later.

For example, we have around 200 -mmcu= variants in the compiler and in
the binutils and in the libc, and nobody ever dared to work out an
alternative solution to the insane-number-of-mmcu scheme. End of rant ;-)

  boot   (rx)   : ORIGIN = 62K, LENGTH = 2K
  flash1 (rx)   : ORIGIN = 64K, LENGTH = 64K
  flash2 (rx)   : ORIGIN = 128K, LENGTH = 64K

  data   (rw!x) : ORIGIN = 0x800100, LENGTH = 4K
  eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = 2K

}

(I would though, name these physical-world entities page0, page1, page2


NACK, we will see similar things for RAM, i.e. it is paged, too.

for user readability. There is no good reason why the .text output
section name has to infect the memory model. ;-) Then we would have:

MEMORY
{
  page0  (rx)   : ORIGIN = 0, LENGTH = 62K
  boot   (rx)   : ORIGIN = 62K, LENGTH = 2K

What about different bootloader layouts? Bootloader at end of flash? No
bootloader at all?

  page1  (rx)   : ORIGIN = 64K, LENGTH = 64K
  page2  (rx)   : ORIGIN = 128K, LENGTH = 64K

  data   (rw!x) : ORIGIN = 0x800100, LENGTH = 4K
  eeprom (rw!x) : ORIGIN = 0x810000, LENGTH = 2K

}

Now, instead of the tweak at the end of the .text output section, we
add new output sections after the end of that section:

.flash1 :
{  *(.progmem1.data)    /* Page 1 */
} > page1

.flash2 :
{  *(.progmem2.data)    /* Page 2 */
} > page2

Now ld will automatically detect page overflows. (But as we discover
later, there's a contrary use case which doesn't want that.)

The preferred behavior is:

- Locate .progmem1.data at 0x10000


A third way that can be done is by setting the VMA in an output section
used with the original memory model, e.g.:

.flash1 0x10000 :
{  *(.progmem1.data)    /* Page 1 */
} > text


Basically anything works where:

o  .progmemN.data is empty    --> don't care for overlaps
o  .progmemN.data is nonempty --> must be subset of [0xN0000..0xNffff]

For example, .text may overlap .progmemN.data provided it still satisfies
the subset constraint. These are just the requirements, independent of
whether it is possible to describe them in the script.
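
The two requirements above can be written down as a predicate. This is a
host-side sketch only; the function name progmem_n_ok and its
parameters are made up for illustration, and it says nothing about
whether ld can express the same check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* .progmemN.data occupies the byte range [start, start + size).
   Empty:     always fine, overlaps do not matter.
   Non-empty: must be a subset of [0xN0000 .. 0xNffff]. */
bool progmem_n_ok(unsigned n, uint32_t start, uint32_t size)
{
    assert(n >= 1 && n < 0x80);               /* page number of __flashN */
    if (size == 0)
        return true;

    uint32_t page_first = (uint32_t)n << 16;  /* 0xN0000 */
    uint32_t page_last  = page_first + 0xffffUL;
    uint32_t last       = start + size - 1;   /* last byte occupied */

    return start >= page_first && last <= page_last;
}
```

For instance, a full page starting at 0x10000 passes for N = 1, one
byte more fails, and an empty section passes wherever it ends up.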


And again, notice the effect of -fdata-sections on __flashN section names.

It is the __memx use case which constrains our choice of method for
solving this easy matter.

Currently, .progmem is at the low end and .text atop because it is easy
to run code at the high addresses thanks to the linker stubs.

The only thing that is needed is that the stubs are in the first text
segment of 16-bit words, i.e. byte address 0x0..0x1ffff, i.e. EIND = 0.
It works with .trampolines in other word segments, but then that
location must be made explicit in the script and EIND set appropriately
in the startup code. AFAIR what doesn't work is shifting .trampolines out
of place because .progmem is too big.
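
As a sketch of that condition: a segment of 64 Ki 16-bit words covers
128 KiB of byte addresses, so the segment number is the byte address
shifted right by 17. The helper names below are made up, and the code is
a host-side model, not startup code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Segment of 64 Ki words (128 KiB) that holds a byte address.
   Segment 0 is byte address 0x0..0x1ffff. */
unsigned word_segment(uint32_t byte_addr)
{
    assert(byte_addr < 0x800000UL);    /* flash byte addresses only */
    return (unsigned)(byte_addr >> 17);
}

/* True if .trampolines, occupying [start, start + size), lies entirely
   in segment 0, so the stubs work with EIND = 0. Empty is always fine. */
bool trampolines_in_segment0(uint32_t start, uint32_t size)
{
    if (size == 0)
        return true;
    return word_segment(start) == 0 &&
           word_segment(start + size - 1) == 0;
}
```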

- Complain if data exceeds .progmem1.data and enters .progmem2.data


If the __memx use case does blithely overflow page boundaries without
error, so that we can't use separate memory segments and automatic
overflow detection, then the simplest linker script syntax (that I know
of) which achieves that is this assertion placed at the end of the
flash1 output section:

__memx was designed that way (after strong protest from Jan when I
objected that __memx would produce too bloaty code). If __memx is
needed, very likely code size is no issue. I cannot tell to what degree
speed is an issue. Notice __memx can also be used to access RAM.

flash1_test = ASSERT( . < 0x1ffff, "2nd 64k page (__flash1) overflow!");

Even if nothing else is changed to support .progmemN, such assertions
are strongly indicated and we should have them, IMHO.

Currently, nobody uses __flashN. *If* someone used __flashN, he would
get broken code (users typically don't read the docs and won't write
their ld scripts). Because nobody ever complained, the conclusion is
nobody ever used __flashN ;-)

Alternatively, the following should do the same, is more explicit, and
can be placed anywhere after the section:

   flash1_size_test = ASSERT( SIZEOF(.progmem1.data) < 0x10000, \
                      "2nd 64k page (__flash1) overflow!");

We just need to suppress these errors if __memx is used, AIUI.


The assertion to .progmemN still applies even if __memx is used.

I am not sure if all usage scenarios can be supported without raising
unresolvable conflicts. The compiler just drops data as the user tells
it; one reason is my limited knowledge of ld script capabilities.

Maybe it's even the case that we have to make __flashN and __memx
mutually exclusive. Maybe we can have something like -mmodel=foo to get
proper checks and code for a specific layout in the case where there is
no one-fits-all script and the models are reasonably common.

Problem is that the tool dependency turn-around cycle is slow and will
take 1 year or even more...

- Similar for .text


Ah, yes: .text , "1st 64k page ..."


No no.  .text should not be limited.

- Print diagnostics that are comprehensible, i.e. mention the input
sections, that they overlap, and the symbol and object file that
trigger the overlap.


By the time ld is locating, the input sections are not even a dim
memory, IIRC. We would be told of the overlap, and can expect the memory
segment to be named in automatic overlap detection.

Sounds reasonable and helpful. Ditto for trampolines and maybe also
ctors / dtors.

In my experience the symbol and file can usually only be inferred from
examining the memory map for an input section which has suddenly grown.
The guff which falls off the end is usually the victim of shoving from
behind. Who but the designer can say which lump of the program is taking
too much room? I look at the map file in these cases, to find what has
suddenly grown obese.

- If .text spans from 0x0 to, say, 0x2aaaa, that's fine provided
.progmem1.data and .progmem2.data are empty.


Oh. That's quite a contrary use case, compared to what precedes. It
constrains our freedom a bit, if we want it all in one default script.
We have to _not_ use a memory segment per page, just expand "text", in
order to stop ld from automatically complaining about page overflow.
Then we can perhaps formulate such a test as:

   with_the_lot = ASSERT( SIZEOF(.text) <= 0x30000 && \
                  SIZEOF(.progmem1.data) == 0 && \
                  SIZEOF(.progmem2.data) == 0,
                  "text segment (__memx) overflow!");


If there is such logic, we can express anything we want :-)

First step is to barf with comprehensible diagnostic if any constraint
is violated.


Second step is to reduce barfs to a minimum.

That is compatible with separate size assertions for .progmemN.data, but
how to choose whether to allow .progmem.data to silently grow?
(__flash, __flash1 vs __memx) Use an avr-gcc commandline define (-D) to
manually select? Or use non-zero __flash1 to decide?

The __flashN spaces were introduced with __flash and because they are not much
more work than __flash.  I don't know if anybody is using that.


If there were no use case allowing several physical pages to be treated
as one, then all __flashN could be handled as easily as __flash1. It is
only the __memx case, which constrains our choice of linker script
design, IIUC.

Alternative is __memx which implements 24-bit pointers.  Notice that
address-arithmetic is still 16-bit arithmetic.  Using __memx with the code
above, e.g.


... [example elided]

Notice x is in .progmem.data now and an access function is used because hlo8(x)
is not known at compile time.

This means __memx can be regarded as extension of __flash.


Yes, that's the use case which militates against expressing the flash
pages as separate memory segments. With 24-bit pointers, it is legal to
overflow physical pages? If so, the examples above which use a common

Yes. Goal is to have objects that overlap page boundaries and the
compiler generates code that can read across a boundary. I.e. you can
have a float located at 0xffff..0x10002 and reading shall work out of
the box.

text memory segment, but separate flashN output sections, look more
attractive.
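
To make the difference concrete for the float at 0xffff..0x10002, here
is a host-side model with flash as a plain byte array. It is a sketch
under stated assumptions, not the code avr-gcc generates: FLASH_BYTES
(three 64 KiB pages) and both function names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLASH_BYTES 0x30000UL   /* assumed: three 64 KiB pages */

/* Read with a full 24-bit address: the address keeps counting across
   a 64 KiB boundary. */
void read_linear(const uint8_t *flash, uint32_t addr,
                 uint8_t *out, size_t len)
{
    assert(addr + len <= FLASH_BYTES);
    for (size_t i = 0; i < len; i++)
        out[i] = flash[addr + i];
}

/* Read with a fixed page and a 16-bit offset: the offset wraps inside
   the page instead of moving on to the next page. */
void read_paged(const uint8_t *flash, uint8_t page, uint16_t offset,
                uint8_t *out, size_t len)
{
    uint32_t base = (uint32_t)page << 16;
    assert(base + 0xffffUL < FLASH_BYTES);
    for (size_t i = 0; i < len; i++) {
        uint16_t off = (uint16_t)(offset + i);   /* modulo 2^16 */
        out[i] = flash[base | off];
    }
}
```

Reading four bytes from 0xffff with read_linear() yields the bytes at
0xffff, 0x10000, 0x10001 and 0x10002. read_paged() with page 0 and
offset 0xffff yields 0xffff followed by 0x0000, 0x0001 and 0x0002,
which is why an object that straddles a page boundary needs the 24-bit
view.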

Again, it is reasonable to locate .trampolines early.  Because there
is no limitation for .progmem.data except that it must not exceed
0x7fffff because higher addresses are taken as RAM or I/O locations.
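
The 0x7fffff limit amounts to a test on the top bit of the 24-bit
address. A minimal host-side sketch; the enum and function name are
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { MEMX_FLASH, MEMX_RAM_OR_IO } memx_target;

/* Classify a 24-bit address: anything above 0x7fffff is taken as a
   RAM or I/O location, everything else as flash. */
memx_target memx_classify(uint32_t addr24)
{
    assert(addr24 <= 0xffffffUL);      /* must fit in 24 bits */
    return (addr24 > 0x7fffffUL) ? MEMX_RAM_OR_IO : MEMX_FLASH;
}
```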


So long as early trampolines are reachable by all their users, such
placement can avoid later surprises when code grows. (Having them move
hither and yon isn't my cup of tea.)

As said, it's already helpful to have comprehensible diagnostics if
.trampolines is pushed across a 17-bit boundary.

I would enjoy formulating a linker script to handle the various use
cases. I have not paused tonight to update my avr-gcc, but will do so.
Then I can tweak the latest default script, and test any offering before
inflicting it on the unsuspecting. Some iteration is to be expected, and
correction of any misapprehension on my part would be gratefully
received.


You intend to contribute the script to binutils?

Johann
