[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] target-tilegx: Execute _start and reach to __li
From: |
Chen Gang S |
Subject: |
Re: [Qemu-devel] [PATCH] target-tilegx: Execute _start and reach to __libc_start_main successfully |
Date: |
Wed, 25 Feb 2015 01:25:01 +0800 |
User-agent: |
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 |
On 2/25/15 00:46, Chris Metcalf wrote:
> On 2/24/2015 11:31 AM, Chen Gang S wrote:
>> - When dst is zero, in most cases, we can just skip the insn, but in
>> some cases, it may cause exception in user mode (e.g st zero r0).
>
> Be careful of skipping instructions in general, since for example a "ld zero,
> r1" may be targeting an MMIO address that we are reading to trigger some
> device operation, but don't need the result from.
OK, thanks, I shall notice about it, next.
>
>> - For add/addi operation, it will change to mov/movi operation.
>>
>> - For mov operation, it will change to movi operation.
>>
>> - For shl3add, if srcb is zero register, it will change to shli
>> operation.
>
> I assume you are referring to missed performance optimization opportunities
> if you don't treat "zero" specially? You certainly could treat "add r1, r2,
> zero" as just an "add" instruction anyway, just with a zero addend. You
> don't have to convert it to a move instruction. Likewise with the others.
>
OK, thanks. Next, I shall not change the instruction, and will only use
tcg temporary variable instead of zero register, when I send patch v2.
>> OK, thanks. originally I wanted to reuse them, but after think of, I
>> prefer the 64-bit immediate number instead of.
>>
>> - The decoding function may be very long, but it is still a simple
>> function, I guess, it is still simple enough for readers.
>>
>> - 64-bit immediate number has better performance under 64-bit machine
>> (e.g x86-64).
>
> What I mean is you should just directly use all those accessor functions.
> Then your code would look like
>
> switch (get_Opcode_X1(bundle)) { // this is a 59-bit shift and mask by 0x7
> case SHL16INSLI_OPCODE_X1: // this is the constant 7
> [...]
> }
>
> Typically dealing with smaller integers is higher-performance on any
> platform, I suspect, even on x86 when you can have large inline constants in
> the code. The compiler might be smart enough to do this for you, to be fair.
> In any case any performance difference of this switch is almost certainly
> lost in the noise.
>
Hmm... maybe what you said is correct, but I am not quite sure.
> More to the point, named constants are almost always better for code
> maintainability than raw integers in code.
>
For me, if the raw integer is only used once, we needn't define a macro
for it (instead of, we can give a comment for it).
> Also, my point is that you should just include a verbatim copy of the kernel
> header into qemu, and then use those names. This is helpful to people who
> are already familiar with the <arch/opcode.h> naming, and reduces the amount
> of code needed to review qemu in any case.
>
OK, thanks. What you said above sounds reasonable to me, for compatible
reason, I shall use "arch/opcode_tilegx.h" totally -- it will be helpful
for the readers who are already familiar with "arch/opcode_tilegx.h".
Thanks.
--
Chen Gang
Open, share, and attitude like air, water, and life which God blessed
Re: [Qemu-devel] [PATCH] target-tilegx: Execute _start and reach to __libc_start_main successfully, Richard Henderson, 2015/02/24