Re: [Qemu-devel] TCG: AREG0 removal planning


From: Richard Henderson
Subject: Re: [Qemu-devel] TCG: AREG0 removal planning
Date: Tue, 10 May 2011 14:58:51 -0700
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc14 Thunderbird/3.1.10

On 05/10/2011 01:54 PM, Blue Swirl wrote:
> TCG the generator backend
> -AREG0 is used for qemu_ld/st ops for TLB access. It should be
> possible for the translators to pass instead a pointer to either
> CPUState or directly to the TLB.

I believe that AREG0 should continue to be present in the generated
code.  There are simply too many references to it throughout the
translated code for allocating this dynamically to be a win.

What should change, however, is the removal of AREG0 outside the
generated code.  The cpu-state pointer should be passed as a regular
parameter wherever it is required.  This includes tcg_qemu_tb_exec,
which means that the generated prologue would change, setting up
AREG0 in the process.
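
To make the calling convention concrete, here is a minimal sketch in plain
C.  It is only a model: every toy_* name is hypothetical, and a local
variable stands in for the fixed host register that a real prologue would
load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU state structure. */
struct toy_cpu_state {
    uint32_t regs[4];
};

/* A "translated block" in this model: it sees the state only via areg0. */
typedef uintptr_t (*toy_tb_fn)(struct toy_cpu_state *areg0);

/* Models the entry point: env arrives as an ordinary parameter, and the
   "prologue" copies it into the register reserved for generated code. */
static uintptr_t toy_tb_exec(struct toy_cpu_state *env, toy_tb_fn tb)
{
    assert(env != NULL);
    struct toy_cpu_state *areg0 = env;   /* prologue: set up AREG0 */
    return tb(areg0);                    /* jump into generated code */
}

/* Example block body: increments guest register 0 through areg0. */
static uintptr_t toy_tb_inc_r0(struct toy_cpu_state *areg0)
{
    areg0->regs[0] += 1;
    return 0;
}
```

Nothing outside toy_tb_exec and the block body needs a global register
variable in this model; helpers would receive env as an explicit argument.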

> New qemu_ld/st ops are needed for all TCG targets.

Yes, qemu_ld/st would have to change to accommodate the new parameter
being passed.

While we're at it, let us change things a bit further to allow guest
byte-swap load/store insns to be implemented more efficiently.  For
instance, currently a sparc load_asr (little-endian), as emulated on
an x86 host, does the byte swap twice.

There is currently a const int parameter to qemu_ld/st that encodes
the size of the access.  Behind the scenes, almost all TCG backends
extend this parameter with a bit indicating that a byte swap is needed.
Let us formalize this, and allow the bit to be set in the original TCG
op, with appropriate new inlines in tcg-op.h to access it from the
translators.
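
As an illustration only, such an encoding could look like the sketch
below.  The bit layout and all names (MEMOP_*, memop_*) are assumptions
made for the example, not an existing interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low two bits hold log2 of the access size,
   and one further bit requests a byte swap. */
enum {
    MEMOP_SIZE_MASK = 3,
    MEMOP_BSWAP     = 4,
};

/* Build the constant argument a translator would attach to the op. */
static inline int memop_make(int size_log2, int bswap)
{
    assert(size_log2 >= 0 && size_log2 <= 3);
    return size_log2 | (bswap ? MEMOP_BSWAP : 0);
}

/* Accessors a backend could use when emitting the access. */
static inline unsigned memop_bytes(int memop)
{
    return 1u << (memop & MEMOP_SIZE_MASK);
}

static inline int memop_needs_bswap(int memop)
{
    return (memop & MEMOP_BSWAP) != 0;
}
```

A translator for a little-endian guest load on a big-endian target would
then pass something like memop_make(2, 1) for a swapped 32-bit access.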

We can also make things easier for the backends by allowing them
to declare that they do or do not have byte swap load/store insns.
If such insns are not available, a separate bswap opcode is emitted
right from tcg_gen_qemu_st32 et al.
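
A toy model of that decision, assuming a per-backend capability flag.
The op names, the toy_stream type and toy_gen_st32 are all hypothetical;
a real implementation would emit TCG ops and swap into a temporary so
that the input value is not clobbered.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for the sketch. */
enum toy_op { OP_BSWAP32, OP_QEMU_ST32, OP_QEMU_ST32_SWAPPED };

/* A tiny op stream standing in for the TCG op buffer. */
struct toy_stream {
    enum toy_op ops[8];
    size_t n;
};

static void toy_emit(struct toy_stream *s, enum toy_op op)
{
    assert(s->n < sizeof(s->ops) / sizeof(s->ops[0]));
    s->ops[s->n++] = op;
}

/* Front-end helper: choose the op sequence for a 32-bit guest store.
   backend_has_bswap_st models the backend's capability declaration. */
static void toy_gen_st32(struct toy_stream *s, int want_bswap,
                         int backend_has_bswap_st)
{
    if (!want_bswap) {
        toy_emit(s, OP_QEMU_ST32);
    } else if (backend_has_bswap_st) {
        toy_emit(s, OP_QEMU_ST32_SWAPPED);   /* one op, swap folded in */
    } else {
        toy_emit(s, OP_BSWAP32);             /* explicit swap first */
        toy_emit(s, OP_QEMU_ST32);           /* then a plain store */
    }
}
```

With this split the backend never sees a swapped store it cannot handle,
and the register allocator treats the explicit bswap like any other op.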

This would allow a nice cleanup for i386, which currently has a small
register allocation problem in the store path, what with needing to
not clobber the input register while byte swapping.  (This problem is
solved by restricting the set of input registers for qemu_ld/st.)

All of this does require the slow path to change as well.
In particular, if byte-swap memory ops are available, we need slow
path functions that also byte swap.  Indeed, I'd expect them to use
the byte-swap memory ops themselves.  Further, if byte-swap memory
ops are not available, the slow path should always return memory in
the host byte order, because a separate bswap operation will be done
on behalf of the fast path.
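
Roughly, the two kinds of slow-path helper might look like this.  The
toy_* names are hypothetical, and a portable shift-based swap stands in
for the host's byte-swap memory insn.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable 32-bit byte swap, standing in for a host bswap insn. */
static uint32_t toy_bswap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Slow-path load returning the value in host byte order.  This is the
   only variant needed when the backend has no byte-swap memory ops,
   since a separate bswap op follows in the generated code. */
static uint32_t toy_slow_ld32_host(const void *p)
{
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Slow-path load that also swaps, for backends whose fast path uses a
   byte-swap load and therefore emits no separate bswap op. */
static uint32_t toy_slow_ld32_swapped(const void *p)
{
    return toy_bswap32(toy_slow_ld32_host(p));
}
```

Either way the fast path and the slow path agree on the byte order of
the value they hand back, so the swap is never applied twice.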

> -TCG temps are stored in CPUState field temp_buf[], accessed via
> AREG0. Maybe a regular stack frame should be allocated instead?

Probably.  Most of the backends manage a stack frame anyway, to
handle registers saved in the prologue.  All that would be needed
is a define from TCG to tell the backends how much memory is required,
and some value passed from the backends to tell TCG what the offset
of that area is from the stack pointer.
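
A sketch of that handshake, under assumed names: TOY_TCG_TEMP_BUF_BYTES
plays the role of the define exported by TCG, and toy_layout_frame is a
hypothetical backend routine that reports where the temp area sits.  The
layout chosen here (outgoing call arguments lowest, then temps, then
saved registers) is just one possibility.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: how much spill space TCG asks each backend to provide. */
#define TOY_TCG_TEMP_BUF_BYTES 128

struct toy_frame_layout {
    size_t temp_offset;   /* offset of the temp area from the stack pointer */
    size_t frame_size;    /* total bytes the prologue subtracts from sp */
};

/* Round x up to a multiple of align, which must be a power of two. */
static size_t toy_round_up(size_t x, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Backend side: place the temp area above the outgoing call arguments
   and below the registers saved by the prologue. */
static struct toy_frame_layout toy_layout_frame(size_t call_args_bytes,
                                                size_t saved_reg_bytes,
                                                size_t align)
{
    struct toy_frame_layout l;
    l.temp_offset = toy_round_up(call_args_bytes, align);
    l.frame_size = toy_round_up(l.temp_offset + TOY_TCG_TEMP_BUF_BYTES +
                                saved_reg_bytes, align);
    return l;
}
```

TCG would then address temps as stack pointer plus temp_offset instead
of going through AREG0 and temp_buf[].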


r~


