[Gcl-devel] Re: mips64 assembler
From: Camm Maguire
Subject: [Gcl-devel] Re: mips64 assembler
Date: Mon, 20 Sep 2010 15:44:02 -0400
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (gnu/linux)

David Daney <address@hidden> writes:
> On 09/17/2010 01:44 PM, Camm Maguire wrote:
>> Greetings!
>>
>> David Daney<address@hidden> writes:
>>
>>> On 09/17/2010 07:16 AM, Camm Maguire wrote:
>>>> Greetings! Is there any way to load a known 64-bit number into a given
>>>> register in two instructions?
>>>
>>> Not in the general case where the value of the 64-bit number is
>>> unconstrained...
>>>
>>>> Said number is guaranteed to be within
>>>> 32 bits of the current value of another register.
>>>
>>> In other words, you want to add an arbitrary 32-bit constant to the
>>> value in a register. You would need three instructions to do this.
>>> Two to generate the 32-bit constant and another to do the addition.
>>>
>>> David Daney.
>>>
>>
>> Alas, this was as I had expected. Perhaps you can suggest a course of
>> action.
>>
>> On mips only, there is no plt support -- executables instead have
>> .MIPS.stubs entries for lazy relocations to external symbols. Problem
>> is, these are only callable if the gp register is left at its
>> canonical position. I need to load, relocate, and execute code which
>> might call these functions, which I currently redirect to the stub.
>> This means that any .got references to addresses in the code to be
>> relocated, which will of course not be in the global .got table, have
>> to be patched to immediate addresses, which on mips32 is easy
>> enough -- ld v0,oooo(gp) -> lui v0,hhhh. This won't work on mips64.
>>
>
> PLT support works with the n32 ABI (with new toolchains). Can you use that?
-mabi=n32 -mplt still seems to generate a .MIPS.stubs section
requiring canonical gp register setting (gcc 4.4.5). Am I missing
something?
>
> I am missing part of the puzzle. ld.so handles all of this, why can't
> you let it do its job?
>
The general setting is that there is a fully linked executable which,
when run, has the ability to load, relocate, and execute new code in
.o files. Furthermore, the running program can be saved to disk via
unexec and reexecuted later, possibly on a different machine. Calls in
the .o files to be loaded to symbols in shared libraries cannot be set
to the current address of the symbol, as this might not be persistent
across image saves and reexecution. Relocating instead to a
preexisting stub in the base executable takes advantage of ld.so's
lazy relocation on first execution, and, as the target address lies in
the image itself, is persistent across image saves.
>
>> This seems to indicate to me that I will need to craft my own lazy
>> relocation stub for each call to a shared lib symbol at the end of
>> each loaded block of code. Then I can move the gp pointer to a local
>> .got table as well. This is unfortunate, but can be done. Two
>> questions remain:
>>
>> 1) Is there an alternative, e.g. some flag like -mplt to generate a
>> genuine .plt section in the base executable, or other way out?
>>
>
> You haven't specified at a high level what problem you are trying to solve.
>
1) If I am to make use of the base executable stub to, say, _setjmp, I
have to leave the gp pointer in its canonical position in the newly
loaded code, because the format of the .MIPS.stubs entry (in contrast
to the .plt stub elsewhere) requires this.
2) Therefore all .got references in the newly loaded code have to
exist in the .got table of the base executable, thereby excluding
addresses in the newly loaded code.
3) On mips64, in contrast to mips32, I cannot overwrite .got
references to addresses in the newly loaded code to be immediate
address references instead, as it takes too many instructions.
4) It appears that I have three broad options:
a) Make my own .got table at the end of the newly loaded code, and
append with my own lazy stub when necessary. For example, on
alpha, we create our own .got in this manner due to the 64bit
issue, but we don't have to make our own stub as the alpha has a
callable .plt stub making no gp register value assumptions.
b) Do a) above but get a working .plt with some compiler flag
settings, obviating the need for a local stub.
c) find some other way, perhaps with compiler flags, to eliminate
.got references to local addresses in the newly loaded code. In
other words, if I could instruct gcc to write accesses to the .data
section of the newly loaded code as a 32bit offset from the .text
section address, instead of a .got load and offset, I'd be set.
[ e.g.
0000000000000000 <init_code>:
0: 67bdffe0 daddiu sp,sp,-32
4: ffbf0010 sd ra,16(sp)
8: ffbe0008 sd s8,8(sp)
c: ffbc0000 sd gp,0(sp)
10: 03a0f02d move s8,sp
14: 3c1c0000 lui gp,0x0
18: 0399e02d daddu gp,gp,t9
1c: 679c0000 daddiu gp,gp,0
20: df820000 ld v0,0(gp) <-- data address page load, cannot
be written as lui on 64bit
24: 64420000 daddiu v0,v0,0 <-- data address offset
28: 0040202d move a0,v0
2c: df990000 ld t9,0(gp)
30: 0320f809 jalr t9
34: 00000000 nop
38: 03c0e82d move sp,s8
3c: dfbf0010 ld ra,16(sp)
40: dfbe0008 ld s8,8(sp)
44: dfbc0000 ld gp,0(sp)
48: 67bd0020 daddiu sp,sp,32
4c: 03e00008 jr ra
50: 00000000 nop
]
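To make point 3 concrete, here is a minimal C sketch, assuming only the
standard MIPS encodings for lui, ori and daddu (they agree with the dump
above, e.g. 3c1c0000 and 0399e02d). The helper names and the use of a
scratch register are hypothetical and are not taken from the gcl loader.
On mips32 the .got load can be overwritten by a single lui, because the
following add already supplies the low 16 bits. On mips64, reaching an
address within 32 bits of a base register appears to need three
instructions, one more than the two slots available.

```c
#include <assert.h>
#include <stdint.h>

/* lui rt, imm16 */
static uint32_t encode_lui(unsigned rt, uint16_t imm)
{
    assert(rt < 32);
    return 0x3c000000u | (rt << 16) | imm;
}

/* ori rt, rs, imm16 */
static uint32_t encode_ori(unsigned rt, unsigned rs, uint16_t imm)
{
    assert(rt < 32 && rs < 32);
    return 0x34000000u | (rs << 21) | (rt << 16) | imm;
}

/* daddu rd, rs, rt */
static uint32_t encode_daddu(unsigned rd, unsigned rs, unsigned rt)
{
    assert(rd < 32 && rs < 32 && rt < 32);
    return (rs << 21) | (rt << 16) | (rd << 11) | 0x2du;
}

/* mips32 case: high half for lui when the low half is added afterwards
   by an instruction that sign-extends its 16-bit immediate. */
static uint16_t hi16_adjusted(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

static uint16_t lo16(uint32_t addr)
{
    return (uint16_t)(addr & 0xffffu);
}

/* mips64 case: rd = rs + delta for a signed 32-bit delta, using the
   (hypothetical) scratch register tmp. Writes three instructions. */
static void emit_add_const32(uint32_t out[3], unsigned rd, unsigned rs,
                             unsigned tmp, int32_t delta)
{
    uint32_t u = (uint32_t)delta;
    out[0] = encode_lui(tmp, (uint16_t)(u >> 16)); /* tmp = sign-extended hi << 16 */
    out[1] = encode_ori(tmp, tmp, (uint16_t)u);    /* tmp |= zero-extended lo */
    out[2] = encode_daddu(rd, rs, tmp);            /* rd = rs + tmp */
}
```

The ld/daddiu pair in the dump offers only two instruction slots, so the
three-word sequence from emit_add_const32 cannot be patched in place.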
It looks like a) is the best, though it will require mips-only
modifications to the generic elf loading code, which is very
unfortunate.
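
A rough sketch of what option a) might look like in C. Everything here
is hypothetical (the struct, the function names, the fixed capacity);
the only facts taken from this thread are that gp-relative loads carry a
signed 16-bit offset and that the stub quoted below reaches its .got
entry at -32752(gp), i.e. gp sits 0x7ff0 bytes past the start of the
table. The idea is to collect each distinct address into a private table
placed after the loaded code, and to hand back the gp-relative offset to
patch into the corresponding ld.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_GOT_MAX 64   /* hypothetical capacity, for illustration */
#define GP_BIAS 0x7ff0     /* assumes gp = table start + 0x7ff0 */

struct local_got {
    uint64_t slot[LOCAL_GOT_MAX];
    size_t used;
};

/* Find or add addr in the local table and store the gp-relative byte
   offset of its slot in *off. Returns 0 on success, -1 if the table
   is full. */
static int local_got_offset(struct local_got *g, uint64_t addr, int16_t *off)
{
    size_t i;
    long rel;

    for (i = 0; i < g->used; i++)
        if (g->slot[i] == addr)
            break;
    if (i == g->used) {
        if (g->used == LOCAL_GOT_MAX)
            return -1;
        g->slot[g->used++] = addr;
    }
    rel = (long)(i * sizeof g->slot[0]) - GP_BIAS;
    assert(rel >= INT16_MIN && rel <= INT16_MAX);
    *off = (int16_t)rel;
    return 0;
}

/* Rewrite the 16-bit immediate of a gp-relative load such as
   "ld rt,0(gp)", keeping the opcode and register fields. */
static uint32_t patch_imm16(uint32_t insn, int16_t off)
{
    return (insn & 0xffff0000u) | (uint16_t)off;
}
```

With gp moved to this table, the unrelocated "ld t9,0(gp)" (df990000)
patched for the first slot comes out as df998010, the same form as the
load in the stub.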
>> 2) I don't completely understand the stub:
>>
>> -> 12010e090: df998010 ld t9,-32752(gp)
>> 12010e094: 03e0782d move t3,ra
>> 12010e098: 0320f809 jalr t9
>> 12010e09c: 641807c6 daddiu t8,zero,1990
>> -> 12010e0a0: df998010 ld t9,-32752(gp)
>> 12010e0a4: 03e0782d move t3,ra
>> 12010e0a8: 0320f809 jalr t9
>> 12010e0ac: 641807c5 daddiu t8,zero,1989
>>
>> -> denotes stub entry points. How does the add ever get called? This
>> add contains the only reference to the .got entry of the external
>> symbol. It appears that it should be called before the jump.
>
> On MIPS the instruction after a branch or jump is executed as part of
> the control transfer instruction. This is called the Delay Slot.
>
> t9 is loaded with the address of the lazy resolver. Return address
> saved into t3, symbol index loaded into t8, make the call to the lazy
> resolver via t9 ...
>
Thank you! This was especially helpful!
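
For the record, the delay-slot reading of the stub can be checked with a
small C sketch. It assumes the four-word stub layout shown above and the
standard I-type field positions; the function names are hypothetical.
The fourth word, although placed after the jalr, executes before control
reaches the resolver, so its immediate is the symbol index seen in t8.

```c
#include <stdint.h>

#define JALR_T9       0x0320f809u  /* jalr t9, as it appears in the stub */
#define OPCODE_DADDIU 0x19u
#define REG_ZERO      0u
#define REG_T8        24u

/* I-type instruction fields */
static unsigned insn_opcode(uint32_t w) { return w >> 26; }
static unsigned insn_rs(uint32_t w)     { return (w >> 21) & 31u; }
static unsigned insn_rt(uint32_t w)     { return (w >> 16) & 31u; }
static int16_t  insn_imm(uint32_t w)    { return (int16_t)(w & 0xffffu); }

/* If the four words look like the lazy stub above, store the symbol
   index loaded in the delay slot in *index and return 1. Otherwise
   return 0 and leave *index untouched. */
static int stub_symbol_index(const uint32_t stub[4], long *index)
{
    uint32_t slot = stub[3];  /* delay slot of the jalr at stub[2] */

    if (stub[2] != JALR_T9)
        return 0;
    if (insn_opcode(slot) != OPCODE_DADDIU ||
        insn_rs(slot) != REG_ZERO || insn_rt(slot) != REG_T8)
        return 0;
    *index = insn_imm(slot);
    return 1;
}
```

Fed the two stubs quoted above, this yields indices 1990 and 1989.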
Take care,
>
>>
>> Thanks so much.
--
Camm Maguire address@hidden
==========================================================================
"The earth is but one country, and mankind its citizens." -- Baha'u'llah