Re: [Qemu-devel] [PATCH RFC 00/11] AREG0 elimination


From: Blue Swirl
Subject: Re: [Qemu-devel] [PATCH RFC 00/11] AREG0 elimination
Date: Sun, 15 May 2011 14:03:40 +0300

On Sun, May 15, 2011 at 12:27 PM, Laurent Desnogues
<address@hidden> wrote:
> On Sun, May 15, 2011 at 9:15 AM, Blue Swirl <address@hidden> wrote:
>> On Sun, May 15, 2011 at 1:04 AM, Aurelien Jarno <address@hidden> wrote:
>>> On Sun, May 15, 2011 at 12:52:35AM +0300, Blue Swirl wrote:
>>>> On Sun, May 15, 2011 at 12:16 AM, Aurelien Jarno <address@hidden> wrote:
>>>> > On Sat, May 14, 2011 at 10:35:20PM +0300, Blue Swirl wrote:
> [...]
>>>> > The env register is used very often (basically for every load/store, but
>>>> > also a lot of helpers), so it makes sense to reserve a register for it.
>>>> >
>>>> > From what I understand of your patch series, you prefer to pass this
>>>> > register explicitly to TCG functions. This basically means this TCG
>>>> > global will be loaded into a host register as soon as it is used, but also
>>>> > regularly, as globals are saved back to their canonical location before
>>>> > a helper or a load/store.
>>>> >
>>>> > So it seems that this patch series will just allow the "env register"
>>>> > to change over time, though it will not spare one more register for the
>>>> > TCG code, and it will emit longer TCG code to regularly reload the env
>>>> > global into a host register.
>>>>
>>>> But there will be one more register available in some cases. In other
>>>
>>> Inside the TCG code, it will basically happen very rarely, given
>>> load/store are really the most used instructions, and they need to load
>>> the env register.
>>
>> Not exactly, from a sample run with -d op_opt:
>> $ egrep -v -e '^$' -v -e 'OP after' -v -e ' end' -v -e 'Search PC' \
>>     /tmp/qemu.log | awk '{print $1}' | sort | uniq -c | sort -rn
>> 1673966 movi_i32
>>  653931 ld_i32
>>  607432 mov_i32
>>  428684 st_i32
>>  326878 movi_i64
>>  308626 add_i32
>>  283186 call
>>  256817 exit_tb
>>  207232 nopn
>>  189388 goto_tb
>>  122398 and_i32
>>  117997 shr_i32
>>   89107 qemu_ld32
>>   82926 set_label
>>   82713 brcond_i32
>>   67169 qemu_st32
>>   55109 or_i32
>>   46536 ext32u_i64
>>   44288 xor_i32
>>   38103 sub_i32
>>   26361 shl_i32
>>   23218 shl_i64
>>   23218 qemu_st64
>>   23218 or_i64
>>   20474 shr_i64
>>   20445 qemu_ld64
>>   11161 qemu_ld8u
>>   10409 qemu_st8
>>    5013 qemu_ld16u
>>    3795 qemu_st16
>>    2776 qemu_ld8s
>>    1915 sar_i32
>>    1414 qemu_ld16s
>>     839 not_i32
>>     579 setcond_i32
>>     213 br
>>      42 ext32s_i64
>>      30 mul_i64
>
> Unless I missed something, this doesn't show the usage of
> ld/st per TB, which is what Aurélien was looking for if I
> understood correctly.  All I can say is that you had at
> most 256817 TB's and 234507 qemu_ld/st, so about one per
> TB.

The question was the ratio of loads/stores to other instructions. The
statistics are not per TB. There were about 174880 TBs.
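
As a rough cross-check of that ratio, here is a minimal Python sketch that
recomputes the load/store share from the histogram quoted above. The counts
are copied from that listing and the TB count from this reply; the names
OP_COUNTS, TB_COUNT and count_matching are made up for illustration, and
the grouping into qemu_ld/st versus ld_i32/st_i32 is only one possible way
to slice the numbers.

```python
# Op counts copied from the `-d op_opt` histogram quoted in this thread.
OP_COUNTS = {
    "movi_i32": 1673966, "ld_i32": 653931, "mov_i32": 607432,
    "st_i32": 428684, "movi_i64": 326878, "add_i32": 308626,
    "call": 283186, "exit_tb": 256817, "nopn": 207232,
    "goto_tb": 189388, "and_i32": 122398, "shr_i32": 117997,
    "qemu_ld32": 89107, "set_label": 82926, "brcond_i32": 82713,
    "qemu_st32": 67169, "or_i32": 55109, "ext32u_i64": 46536,
    "xor_i32": 44288, "sub_i32": 38103, "shl_i32": 26361,
    "shl_i64": 23218, "qemu_st64": 23218, "or_i64": 23218,
    "shr_i64": 20474, "qemu_ld64": 20445, "qemu_ld8u": 11161,
    "qemu_st8": 10409, "qemu_ld16u": 5013, "qemu_st16": 3795,
    "qemu_ld8s": 2776, "sar_i32": 1915, "qemu_ld16s": 1414,
    "not_i32": 839, "setcond_i32": 579, "br": 213,
    "ext32s_i64": 42, "mul_i64": 30,
}

# Approximate number of TBs, as stated in this reply.
TB_COUNT = 174880


def count_matching(counts, predicate):
    """Sum the counts of all ops whose name satisfies `predicate`."""
    return sum(n for op, n in counts.items() if predicate(op))


total = sum(OP_COUNTS.values())

# Guest memory accesses: every qemu_ld*/qemu_st* variant.
guest_hits = count_matching(
    OP_COUNTS, lambda op: op.startswith(("qemu_ld", "qemu_st"))
)

# Plain TCG loads/stores relative to a host pointer.
host_hits = count_matching(OP_COUNTS, lambda op: op in ("ld_i32", "st_i32"))

print(f"total ops:         {total}")
print(f"qemu_ld/st:        {guest_hits} ({guest_hits / total:.1%} of all ops)")
print(f"ld_i32/st_i32:     {host_hits} ({host_hits / total:.1%} of all ops)")
print(f"qemu_ld/st per TB: {guest_hits / TB_COUNT:.2f}")
```

The qemu_ld/st sum reproduces the 234507 figure quoted above, so the
histogram and that number are at least consistent with each other.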

> Anyway I must be thick, because I fail to see how
> generated code could access guest CPU registers without a
> pointer to the CPU env :-)
>
> IIUC the SPARC translator uses ld_i32/st_i32 mainly for
> accessing the guest CPU registers, which due to register
> windows is held in a dedicated global temp.  Is that
> correct?  If so this is kind of hiding accesses to the
> CPU env;  all other targets read/write registers by using
> CPU env (through the use of global temps in most cases).
>
> So I think most (if not almost all) TB will need a pointer
> to CPU env, which is why I think Aurélien's proposal to
> keep a dedicated register that'd be loaded in the prologue
> is the only way to not degrade performance of the
> generated code (I'd add that this dedicated register
> should be the one defined by the ABI as holding the first
> parameter value, if that's possible;  I'm afraid this is
> not necessarily a good idea).

CPU env will be used, but the register could be made available for
other uses too.


