[Gcl-devel] Re: mips64 assembler


From: Camm Maguire
Subject: [Gcl-devel] Re: mips64 assembler
Date: Mon, 20 Sep 2010 15:44:02 -0400
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

David Daney <address@hidden> writes:

> On 09/17/2010 01:44 PM, Camm Maguire wrote:
>> Greetings!
>>
>> David Daney <address@hidden> writes:
>>
>>> On 09/17/2010 07:16 AM, Camm Maguire wrote:
>>>> Greetings!  Is there any way to load a known 64-bit number into a
>>>> given register in two instructions?
>>>
>>> Not in the general case where the value of the 64-bit number is
>>> unconstrained...
>>>
>>>> Said number is guaranteed to be within
>>>> 32bits of the current value of another register.
>>>
>>> In other words, you want to add an arbitrary 32-bit constant to the
>>> value in a register.  You would need three instructions to do this.
>>> Two to generate the 32-bit constant and another to do the addition.
>>>
>>> David Daney.
>>>
>>
>> Alas, this was as I had expected.  Perhaps you can suggest a course of
>> action.
>>
>> On mips only, there is no plt support -- executables instead have
>> .MIPS.stubs entries for lazy relocations to external symbols.  Problem
>> is, these are only callable if the gp register is left at its
>> canonical position.  I need to load, relocate, and execute code which
>> might call these functions, which I currently redirect to the stub.
>> This means that any .got references to addresses in the code to be
>> relocated, which will of course not be in the global .got table, have
>> to be patched to immediate addresses, which on mips32 is easy
>> enough -- lw v0,oooo(gp) ->  lui v0,hhhh.  This won't work on mips64.
>>
>
> PLT support works with the n32 ABI (with new toolchains).  Can you use that?

-mabi=n32 -mplt still seems to generate a .MIPS.stubs section
requiring the canonical gp register setting (gcc 4.4.5).  Am I missing
something?

>
> I am missing part of the puzzle.  ld.so handles all of this, why can't
> you let it do its job?
>

The general setting is that there is a fully linked executable which,
when run, has the ability to load, relocate, and execute new code in
.o files.  Furthermore, the running program can be saved to disk via
unexec and reexecuted later, possibly on a different machine.  Calls in
the .o files to be loaded to symbols in shared libraries cannot be set
to the current address of the symbol, as this might not be persistent
across image saves and reexecution.  Relocating instead to a
preexisting stub in the base executable takes advantage of ld.so's
lazy relocation on first execution, and, as the target address lies in
the image itself, is persistent across image saves.


>
>> This seems to indicate to me that I will need to craft my own lazy
>> relocation stub for each call to a shared lib symbol at the end of
>> each loaded block of code.  Then I can move the gp pointer to a local
>> .got table as well.  This is unfortunate, but can be done.  Two
>> questions remain:
>>
>> 1) Is there an alternative, e.g. some flag like -mplt to generate a
>> genuine .plt section in the base executable, or other way out?
>>
>
> You haven't specified at a high level what problem you are trying to solve.
>

1) If I am to make use of the base executable stub for, say, _setjmp, I
have to leave the gp pointer in its canonical position in the newly
loaded code, because the format of the .MIPS.stubs entry (in contrast
to the .plt stub elsewhere) requires this.

2) Therefore all .got references in the newly loaded code have to
exist in the .got table of the base executable, thereby excluding
addresses in the newly loaded code.

3) On mips64, in contrast to mips32, I cannot overwrite .got
references to addresses in the newly loaded code to be immediate
address references instead, as it takes too many instructions.

4) It appears that I have three broad options:

   a) Make my own .got table at the end of the newly loaded code, and
   append my own lazy stub when necessary.  For example, on
   alpha, we create our own .got in this manner due to the 64bit
   issue, but we don't have to make our own stub as the alpha has a
   callable .plt stub making no gp register value assumptions.

   b) Do a) above but get a working .plt with some compiler flag
   settings, obviating the need for a local stub.

   c) Find some other way, perhaps with compiler flags, to eliminate
   .got references to local addresses in the newly loaded code.  In
   other words, if I could instruct gcc to write accesses to the .data
   section of the newly loaded code as a 32bit offset from the .text
   section address, instead of a .got load and offset, I'd be set.

[ e.g.

0000000000000000 <init_code>:
   0:   67bdffe0        daddiu  sp,sp,-32  
   4:   ffbf0010        sd      ra,16(sp)
   8:   ffbe0008        sd      s8,8(sp)
   c:   ffbc0000        sd      gp,0(sp)
  10:   03a0f02d        move    s8,sp
  14:   3c1c0000        lui     gp,0x0
  18:   0399e02d        daddu   gp,gp,t9
  1c:   679c0000        daddiu  gp,gp,0
  20:   df820000        ld      v0,0(gp)    <-- data address page load, cannot be written as lui on 64bit
  24:   64420000        daddiu  v0,v0,0     <-- data address offset
  28:   0040202d        move    a0,v0
  2c:   df990000        ld      t9,0(gp)
  30:   0320f809        jalr    t9
  34:   00000000        nop
  38:   03c0e82d        move    sp,s8
  3c:   dfbf0010        ld      ra,16(sp)
  40:   dfbe0008        ld      s8,8(sp)
  44:   dfbc0000        ld      gp,0(sp)
  48:   67bd0020        daddiu  sp,sp,32
  4c:   03e00008        jr      ra
  50:   00000000        nop

]
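
To make point 3 concrete, here is a minimal sketch in Python (not part
of the loader) of the arithmetic involved.  The helper names hi_lo and
add_const32, and the choice of "at" as scratch register, are
assumptions made purely for illustration.  hi_lo shows the split that
lets a single lui stand in for the page load on mips32, with the
following addiu supplying the low half; add_const32 shows the
three-instruction sequence needed on mips64 to add a signed 32-bit
constant to a base register.

```python
def sext(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def hi_lo(addr32):
    """Split a 32-bit address into the halves used by a lui/addiu pair."""
    lo = addr32 & 0xFFFF
    # addiu sign-extends its immediate, so round the high half up
    # whenever the low half will be read as negative.
    hi = ((addr32 + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def add_const32(dst, base, delta, tmp="at"):
    """Return instruction text computing dst = base + delta (signed 32-bit delta)."""
    if not -(1 << 31) <= delta < (1 << 31):
        raise ValueError("delta does not fit in 32 bits")
    bits = delta & 0xFFFFFFFF
    return [
        f"lui   {tmp},0x{bits >> 16:04x}",              # upper 16 bits, sign-extended to 64
        f"ori   {tmp},{tmp},0x{bits & 0xFFFF:04x}",     # lower 16 bits, zero-extended
        f"daddu {dst},{base},{tmp}",                    # 64-bit add to the base register
    ]


if __name__ == "__main__":
    hi, lo = hi_lo(0x12348765)
    print(f"lui v0,0x{hi:04x} ; addiu v0,v0,{sext(lo, 16)}")
    for line in add_const32("v0", "t9", -4):
        print(line)
```

The mips32 patch is one instruction for one instruction, which is why
it can be done in place; the mips64 sequence is three instructions
where the original code has a single ld, so it cannot be.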


It looks like a) is the best, though it will require mips-only
modifications to the generic elf loading code, which is very
unfortunate.
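
As a rough illustration of what a) involves, the following Python
sketch lays out a private .got just past a loaded block and computes
the gp-relative offsets the patched loads would use.  The function name
build_local_got and its arguments are hypothetical; the 0x7ff0 bias is
the conventional MIPS placement of gp relative to the start of the
.got, assumed here, and 8-byte slots are assumed for mips64.

```python
GP_BIAS = 0x7FF0   # assumed: gp conventionally points this far past the .got start
SLOT = 8           # assumed: one 64-bit address per .got entry


def build_local_got(code_end, targets):
    """Lay out a local .got after the loaded code.

    code_end -- first free address after the loaded block
    targets  -- addresses referenced through the .got, duplicates allowed
    Returns (got_base, gp, offsets) where offsets maps each target to the
    signed 16-bit displacement used in `ld reg,offset(gp)`.
    """
    got_base = (code_end + SLOT - 1) & ~(SLOT - 1)   # align to a slot boundary
    slots = {}
    for target in targets:
        if target not in slots:                      # one slot per distinct target
            slots[target] = len(slots)
    gp = got_base + GP_BIAS
    offsets = {}
    for target, index in slots.items():
        offset = got_base + index * SLOT - gp
        if not -0x8000 <= offset <= 0x7FFF:
            raise OverflowError("local .got too large for 16-bit gp offsets")
        offsets[target] = offset
    return got_base, gp, offsets


if __name__ == "__main__":
    base, gp, offsets = build_local_got(0x1004, [0x2000, 0x3000, 0x2000])
    print(hex(base), hex(gp), offsets)
```

With gp moved to this table the base executable's stubs are no longer
usable, which is why a) also needs a local lazy stub.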


>> 2) I don't completely understand the stub:
>>
>> ->    12010e090:     df998010        ld      t9,-32752(gp)
>>       12010e094:     03e0782d        move    t3,ra
>>       12010e098:     0320f809        jalr    t9
>>       12010e09c:     641807c6        daddiu  t8,zero,1990
>> ->    12010e0a0:     df998010        ld      t9,-32752(gp)
>>       12010e0a4:     03e0782d        move    t3,ra
>>       12010e0a8:     0320f809        jalr    t9
>>       12010e0ac:     641807c5        daddiu  t8,zero,1989
>>
>> ->  denotes stub entry points.  How does the add ever get called?  This
>> add contains the only reference to the .got entry of the external
>> symbol.  It appears that it should be called before the jump.
>
> On MIPS the instruction after a branch or jump is executed as part of
> the control transfer instruction.  This is called the delay slot.
>
> t9 is loaded with the address of the lazy resolver, the return address
> is saved into t3, and the symbol index is loaded into t8 in the delay
> slot as the call to the lazy resolver is made via t9 ...
>
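
For reference, the stub words above can be decoded by hand.  The
following is a small Python sketch, with decode_itype as a hypothetical
helper that handles only the two I-type opcodes appearing in the stub;
register names follow the n64 convention used in the disassembly.

```python
# n64 register names, indexed by register number.
REGS = ["zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
        "a4", "a5", "a6", "a7", "t0", "t1", "t2", "t3",
        "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
        "t8", "t9", "k0", "k1", "gp", "sp", "s8", "ra"]

# Only the opcodes needed for the stub; not a full decoder.
OPCODES = {0x19: "daddiu", 0x37: "ld"}


def decode_itype(word):
    """Decode a MIPS I-type word into (mnemonic, rt, rs, signed immediate)."""
    opcode = word >> 26
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    if imm & 0x8000:            # the 16-bit immediate is sign-extended
        imm -= 0x10000
    return OPCODES[opcode], REGS[rt], REGS[rs], imm


if __name__ == "__main__":
    for word in (0xDF998010, 0x641807C6, 0x641807C5):
        print(f"{word:08x}", decode_itype(word))
```

The ld reads from -32752(gp), and -32752 is -0x7ff0, so the same slot
is loaded into t9 by every stub.  The daddiu in each delay slot is the
only instruction that differs between stubs, carrying the symbol index
in t8.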

Thank you!  This was especially helpful!

Take care,

>
>>
>> Thanks so much.

-- 
Camm Maguire                                        address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah


