[Qemu-devel] Prototype hand written code generator
From: Paul Brook
Subject: [Qemu-devel] Prototype hand written code generator
Date: Sat, 21 May 2005 01:14:36 +0100
User-agent: KMail/1.7.2
The attached patch is an initial implementation of a hand-written code
generation backend for qemu. Salient features are:
The basic principle is that (as before) guest code is broken down into simple
operations ("qops"). These qops use a simple three-argument model, with the
goal of being very easy to convert to native code.
Each operand is a "qreg". There are five types of qreg:
- Temporary values. An arbitrary number of these can be created. These will be
turned into either host regs or stack slots before we generate host code.
- Host "hard" registers.
- Guest state. These are areas in the CPUenv structure.
- Stack slots.
- Integer constants.
Each guest defines a number of qregs to hold guest CPU state (qregs.def).
These can either be locations in the CPUenv structure, or host registers
(aka. AREGN).
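
As a rough illustration of the operand model described above, the qreg kinds and a three-argument qop could be represented along these lines. All type, field and function names here are hypothetical, chosen for this sketch; the definitions in the attached patch may look quite different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not taken from the patch. */
typedef enum {
    QREG_TEMP,   /* temporary, later mapped to a host reg or stack slot */
    QREG_HOST,   /* host "hard" register */
    QREG_GUEST,  /* guest state: an offset into the CPUenv structure */
    QREG_STACK,  /* stack slot */
    QREG_CONST   /* integer constant */
} QRegKind;

typedef struct {
    QRegKind kind;
    int32_t  val;  /* temp index, host reg number, env offset, slot or constant */
} QReg;

typedef enum { QOP_MOV, QOP_ADD, QOP_SUB } QOpcode;

/* Three-argument form: dst = src1 <op> src2 (src2 is unused for QOP_MOV). */
typedef struct {
    QOpcode op;
    QReg dst, src1, src2;
} QOp;

static QReg qreg_make(QRegKind kind, int32_t val)
{
    QReg r = { kind, val };
    return r;
}

/* A qop can only write to an operand that has storage behind it. */
static int qreg_is_writable(QReg r)
{
    return r.kind != QREG_CONST;
}

static QOp qop_make(QOpcode op, QReg dst, QReg src1, QReg src2)
{
    assert(qreg_is_writable(dst));
    QOp q = { op, dst, src1, src2 };
    return q;
}
```

With this layout, "temp0 = guest state at offset 8 plus 4" would be qop_make(QOP_ADD, qreg_make(QREG_TEMP, 0), qreg_make(QREG_GUEST, 8), qreg_make(QREG_CONST, 4)).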
Existing (dyngen) ops are left untouched. It is assumed that they require guest
state to be consistent, and that they clobber all other host registers.
Guest instruction locations are encoded explicitly into the qop stream. This
allows optimization of the qop stream while retaining precise guest CPU
exception handling.
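
One way to picture explicit guest instruction locations is as marker ops in the stream that carry the guest PC. The sketch below assumes such a marker op (called OP_INSN_START here, a made-up name) and shows how the guest instruction for any qop could be recovered after the stream has been rewritten by an optimizer. It is only an illustration of the idea, not the encoding used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding: a marker op precedes the qops of each guest insn. */
typedef enum { OP_INSN_START, OP_OTHER } Opcode;

typedef struct {
    Opcode   op;
    uint32_t guest_pc;  /* only meaningful for OP_INSN_START */
} Op;

/* Return the guest PC of the instruction that qop number idx belongs to,
   by walking back to the nearest marker. */
static uint32_t guest_pc_for(const Op *ops, size_t idx)
{
    /* Precondition: the stream starts with a marker. */
    assert(ops[0].op == OP_INSN_START);
    for (size_t i = idx; i > 0; i--) {
        if (ops[i].op == OP_INSN_START)
            return ops[i].guest_pc;
    }
    return ops[0].guest_pc;
}
```

Because the markers travel with the stream, an optimization pass can delete or rewrite the ops between them without losing the mapping needed to report a precise guest PC on an exception.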
Long-term I hope that this will totally replace dyngen. The more complicated
ops can either be broken down into simple ops, or moved into helper
functions. Most of the complicated ops are already helper functions anyway.
All references to guest CPU state from the new qops are explicit. This allows
us to do inter-op optimization and register allocation.
In particular it allows the code generator to choose where a guest register
value should be stored at any one time. This is a big win on RISC guests with
large register files because it means we can dynamically choose which subset
of guest state is held in host registers.
I've implemented a copy/constant propagation pass, which is tuned to have this
effect. This is currently the only real optimization done. I currently run
this optimization all the time. If more expensive optimizations are written
we probably want to instrument the generated code and use profile feedback to
decide when to run these optimizations.
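
For readers unfamiliar with the technique, here is a minimal sketch of a forward copy/constant propagation pass over a straight-line op array. It works only on temporaries and a single foldable opcode, and it does not attempt the tuning described above for keeping guest state in host registers. The types and names are invented for this example and the array representation is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_TEMPS = 16 };

typedef enum { K_TEMP, K_CONST } Kind;
typedef struct { Kind kind; int val; } Operand;  /* temp index or constant */
typedef enum { OP_MOV, OP_ADD } Opcode;
/* dst is a temp index; b is unused for OP_MOV. */
typedef struct { Opcode op; int dst; Operand a, b; } Op;

/* Replace a temp by the value it is known to be a copy of, if any. */
static Operand subst(const bool known[], const Operand value[], Operand o)
{
    if (o.kind == K_TEMP && known[o.val])
        return value[o.val];
    return o;
}

static void propagate(Op *ops, size_t n)
{
    bool known[MAX_TEMPS] = { false };
    Operand value[MAX_TEMPS] = { { K_TEMP, 0 } };

    for (size_t i = 0; i < n; i++) {
        Op *q = &ops[i];
        assert(q->dst >= 0 && q->dst < MAX_TEMPS);

        /* Rewrite the sources using what is known so far. */
        q->a = subst(known, value, q->a);
        if (q->op != OP_MOV)
            q->b = subst(known, value, q->b);

        /* Fold an add of two constants into a mov of their sum. */
        if (q->op == OP_ADD && q->a.kind == K_CONST && q->b.kind == K_CONST) {
            q->a.val += q->b.val;
            q->op = OP_MOV;
        }

        /* Writing dst invalidates its old value and any copies of it. */
        known[q->dst] = false;
        for (int t = 0; t < MAX_TEMPS; t++) {
            if (known[t] && value[t].kind == K_TEMP && value[t].val == q->dst)
                known[t] = false;
        }

        /* A mov makes dst a known copy of its source (unless it is a self-move). */
        if (q->op == OP_MOV && !(q->a.kind == K_TEMP && q->a.val == q->dst)) {
            known[q->dst] = true;
            value[q->dst] = q->a;
        }
    }
}
```

The pass is a single linear walk, which is why running it on every translation is cheap compared with optimizations that would need profile feedback to justify.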
After optimization the qop stream is transformed into two-argument form (this
is not necessary on hosts that have a three-argument instruction set). Then
register allocation is performed, assigning host registers to qregs, and
making sure all qops have no more than one non-register operand. These
constraints make it very easy to transform this into host code.
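
The three-argument to two-argument step can be sketched as follows for a host like x86, where an arithmetic instruction overwrites its first operand. Operands are plain register ids here, and the use of a caller-supplied scratch register for the awkward non-commutative case is an assumption of this sketch. The patch may resolve that case differently.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_MOV, OP_ADD, OP_SUB } Opcode;
typedef struct { Opcode op; int dst, a, b; } Op3;  /* dst = a op b */
typedef struct { Opcode op; int dst, src; } Op2;   /* dst = dst op src; MOV: dst = src */

/* Lower one three-argument op into at most three two-argument ops.
   Returns the number of ops written to out. */
static size_t lower(Op3 in, int scratch, Op2 out[3])
{
    size_t n = 0;

    if (in.op == OP_MOV) {
        out[n++] = (Op2){ OP_MOV, in.dst, in.a };
        return n;
    }
    if (in.dst == in.a) {
        /* Already in two-argument shape: dst = dst op b. */
        out[n++] = (Op2){ in.op, in.dst, in.b };
        return n;
    }
    if (in.dst == in.b) {
        if (in.op == OP_ADD) {
            /* Commutative: dst = a + dst is the same as dst = dst + a. */
            out[n++] = (Op2){ OP_ADD, in.dst, in.a };
            return n;
        }
        /* Non-commutative (dst = a - dst): compute in a scratch register. */
        assert(scratch != in.a && scratch != in.b);
        out[n++] = (Op2){ OP_MOV, scratch, in.a };
        out[n++] = (Op2){ in.op, scratch, in.b };
        out[n++] = (Op2){ OP_MOV, in.dst, scratch };
        return n;
    }
    /* General case: copy the first source, then operate in place. */
    out[n++] = (Op2){ OP_MOV, in.dst, in.a };
    out[n++] = (Op2){ in.op, in.dst, in.b };
    return n;
}
```

Running this before register allocation means the allocator sees the extra moves and can often assign dst and the first source to the same host register, making the mov redundant.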
So far I've only implemented x86 host and arm guest support.
As a demonstration I've converted the arm target to use the new qops for
simple operations (mostly they are a drop-in replacement for the existing
ops). This has allowed removal of the "T2" global register variable, freeing
another host register for general use.
This patch gives approximately 30% speed increase on the nbench benchmark.
On other, less repetitive tasks (e.g. running "gcc dyngen.c") it appears to be
performance-neutral. I'm fairly sure there is still a lot of low-hanging
fruit to be had.
I'm posting this patch for information and comment. It is not suitable for
inclusion in its current state.
- It's missing proper makefile dependencies.
- Some of the source/header files are a bit mixed up. I've put things wherever
was most convenient, rather than breaking them up logically.
- qops use a fairly high-overhead doubly-linked-list representation. This is
mainly to simplify development. I think it should be possible to use a lower
overhead representation (singly-linked list or array).
- Other hosts and guests need implementing.
- Calling helper functions and accessing guest memory are still done via
dyngen. I think all the infrastructure is there to support this, it's just
not used yet.
All the above are mainly just a matter of coding. However:
- It doesn't handle 64-bit hosts or guests. I'm not sure how to handle this.
One possible solution is to give qregs a size/mode. On 32-bit hosts we can then
lie about having 64-bit registers, and hack around it in the final code
generation stages. Performance of 64-bit guests on 32-bit hosts would suck,
but that's already true with the current scheme.
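
To make the "lie about 64-bit registers" idea concrete, here is what a final-stage expansion of a 64-bit add into two 32-bit operations computes, written as plain C rather than as emitted host code. The Pair32 type and function name are made up for this sketch. On x86 the two steps would correspond to an add followed by an add-with-carry.

```c
#include <stdint.h>

/* A 64-bit qreg as seen by a 32-bit host: two 32-bit halves. */
typedef struct { uint32_t lo, hi; } Pair32;

static Pair32 add64_via_32(Pair32 a, Pair32 b)
{
    Pair32 r;
    r.lo = a.lo + b.lo;               /* low halves: plain 32-bit add */
    uint32_t carry = r.lo < a.lo;     /* unsigned wraparound means a carry out */
    r.hi = a.hi + b.hi + carry;       /* high halves: add with carry */
    return r;
}
```

Every 64-bit qop would need a similar multi-instruction expansion and twice the host registers, which is where the expected performance cost comes from.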
Suggestions are welcome.
Paul
patch.qemu_qop.gz
Description: GNU Zip compressed data