[Qemu-devel] Prototype hand written code generator

From: Paul Brook
Subject: [Qemu-devel] Prototype hand written code generator
Date: Sat, 21 May 2005 01:14:36 +0100
User-agent: KMail/1.7.2

The attached patch is an initial implementation of a hand-written code 
generation backend for qemu. Salient features are:

The basic principle is that (as before) guest code is broken down into simple 
operations ("qOPs"). These qops use a simple three-argument model, with the 
goal of being very easy to convert to native code.

Each operand is a "qreg". There are five types of qreg:
- Temporary values. An arbitrary number of these can be created. These will be 
turned into either host regs or stack slots before we generate host code.
- Host "hard" registers.
- Guest state. These are areas in the CPUenv structure.
- Stack slots.
- Integer constants.
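
To make the operand model above concrete, here is a minimal sketch of how a
qreg and a three-argument qop could be represented. All names here
(qreg_kind, qreg_make, qop_make and so on) are hypothetical and chosen for
illustration; they are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; the real patch may differ. */
enum qreg_kind {
    QREG_TEMP,   /* temporary; later becomes a host reg or stack slot */
    QREG_HOST,   /* host "hard" register */
    QREG_GUEST,  /* area in the CPUenv structure */
    QREG_STACK,  /* stack slot */
    QREG_CONST   /* integer constant */
};

struct qreg {
    enum qreg_kind kind;
    /* temp id, host reg number, env offset, slot index or constant */
    int32_t value;
};

enum qop_code { QOP_MOV, QOP_ADD, QOP_SUB };

struct qop {
    enum qop_code code;
    struct qreg dest, src1, src2;  /* dest = src1 <code> src2 */
    struct qop *prev, *next;       /* doubly linked qop stream */
};

static struct qreg qreg_make(enum qreg_kind kind, int32_t value)
{
    struct qreg r;
    r.kind = kind;
    r.value = value;
    return r;
}

/* Build an unlinked qop "dest = src1 <code> src2". */
static struct qop qop_make(enum qop_code code, struct qreg dest,
                           struct qreg src1, struct qreg src2)
{
    struct qop op;
    assert(dest.kind != QREG_CONST);  /* a constant cannot be written */
    op.code = code;
    op.dest = dest;
    op.src1 = src1;
    op.src2 = src2;
    op.prev = NULL;
    op.next = NULL;
    return op;
}
```

With this layout, "t0 = guest[4] + 1" would be built as
qop_make(QOP_ADD, qreg_make(QREG_TEMP, 0), qreg_make(QREG_GUEST, 4),
qreg_make(QREG_CONST, 1)).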

Each guest defines a number of qregs to hold guest CPU state (qregs.def). 
These can either be locations in the CPUenv structure, or host registers 
(aka. AREGN).

Existing (dyngen) ops are left untouched. It's assumed that they require guest 
state to be consistent, and that they clobber all other host registers.

Guest instruction locations are encoded explicitly into the qop stream. This 
allows optimization of the qop stream while retaining precise guest CPU 
exception handling.
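
One way to picture this is a marker op that carries the guest PC, so that
the guest instruction behind any later op can be found by scanning
backwards. The sketch below works on indices into a flat array of ops,
which is an assumption made for brevity; OP_INSN_START and guest_pc_for_op
are hypothetical names, not the patch's.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical op set: OP_INSN_START marks the start of a guest insn. */
enum opcode { OP_INSN_START, OP_MOV, OP_ADD };

struct op {
    enum opcode code;
    uint32_t arg;  /* guest PC for OP_INSN_START, unused otherwise here */
};

/*
 * Find the guest PC of the instruction that produced ops[idx] by scanning
 * backwards to the nearest marker. Returns 0 and stores the PC on success,
 * or -1 if no marker precedes idx.
 */
static int guest_pc_for_op(const struct op *ops, int idx, uint32_t *pc)
{
    assert(idx >= 0);
    for (; idx >= 0; idx--) {
        if (ops[idx].code == OP_INSN_START) {
            *pc = ops[idx].arg;
            return 0;
        }
    }
    return -1;
}
```

Because the markers travel with the stream, an optimization pass can delete
or rewrite ordinary ops between two markers without losing the mapping back
to the guest instruction.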

Long-term I hope that this will totally replace dyngen. The more complicated 
ops can either be broken down into simple ops, or moved into helper 
functions. Most of the complicated ops are already helper functions anyway.

All references to guest CPU state from the new qops are explicit. This allows 
us to do inter-op optimization and register allocation.

In particular it allows the code generator to choose where a guest register 
value should be stored at any one time. This is a big win on RISC guests with 
large register files because it means we can dynamically choose which subset 
of guest state is held in host registers.

I've implemented a copy/constant propagation pass, which is tuned to have this 
effect. This is currently the only real optimization done. I currently run 
this optimization all the time. If more expensive optimizations are written 
we probably want to instrument the generated code and use profile feedback to 
decide when to run these optimizations.
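
As a rough illustration of the kind of pass described above, here is a
sketch of forward constant propagation with folding over a straight-line
sequence of ops. It covers only the constant half (copies between
temporaries are not tracked), uses a flat array rather than a linked list,
and every name in it is hypothetical rather than taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TEMPS 16

enum kind { K_TEMP, K_CONST };
struct operand { enum kind kind; int32_t val; };  /* temp id or constant */

enum opcode { OP_MOV, OP_ADD };
/* MOV: dest = a (b ignored). ADD: dest = a + b. dest is always a temp. */
struct op { enum opcode code; struct operand dest, a, b; };

/* Replace a temp operand with its constant value if one is known. */
static struct operand subst(struct operand o, const int *known,
                            const int32_t *value)
{
    if (o.kind == K_TEMP) {
        assert(o.val >= 0 && o.val < MAX_TEMPS);
        if (known[o.val]) {
            o.kind = K_CONST;
            o.val = value[o.val];
        }
    }
    return o;
}

/* Forward constant propagation and folding over n straight-line ops. */
static void propagate(struct op *ops, int n)
{
    int known[MAX_TEMPS] = {0};
    int32_t value[MAX_TEMPS] = {0};
    int i;

    for (i = 0; i < n; i++) {
        struct op *op = &ops[i];
        int d = op->dest.val;

        assert(op->dest.kind == K_TEMP && d >= 0 && d < MAX_TEMPS);
        op->a = subst(op->a, known, value);
        if (op->code == OP_ADD) {
            op->b = subst(op->b, known, value);
            if (op->a.kind == K_CONST && op->b.kind == K_CONST) {
                /* Fold with wrapping 32-bit arithmetic; becomes a MOV. */
                op->a.val = (int32_t)((uint32_t)op->a.val +
                                      (uint32_t)op->b.val);
                op->code = OP_MOV;
            }
        }
        /* Record (or forget) what we know about the destination. */
        known[d] = (op->code == OP_MOV && op->a.kind == K_CONST);
        if (known[d])
            value[d] = op->a.val;
    }
}
```

For example, the sequence "t0 = 5; t1 = t0 + 3; t2 = t1 + t1" is rewritten
so that the second and third ops become plain moves of constants.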

After optimization the qop stream is transformed into two-argument form (this 
is not necessary on hosts that have a three-argument instruction set). Then 
register allocation is performed, assigning host registers to qregs, and 
making sure all qops have no more than one non-register operand. These 
constraints make it very easy to transform this into host code.
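
The three-argument to two-argument step can be sketched as follows. This
is an illustration under simplifying assumptions, not the patch's code:
operands are plain integer register ids, only ADD and SUB exist, and the
caller supplies a free scratch register for the awkward case where the
destination aliases the second source of a non-commutative op. The names
op3, op2 and lower are hypothetical.

```c
#include <assert.h>

enum opcode { OP_MOV, OP_ADD, OP_SUB };

struct op3 { enum opcode code; int dest, a, b; };  /* dest = a <code> b */
struct op2 { enum opcode code; int dest, src; };   /* dest <code>= src;
                                                      MOV: dest = src */

static struct op2 mk2(enum opcode code, int dest, int src)
{
    struct op2 o;
    o.code = code;
    o.dest = dest;
    o.src = src;
    return o;
}

/*
 * Lower one three-argument op into at most three two-argument ops.
 * Returns the number of ops written to out[].
 */
static int lower(struct op3 in, int scratch, struct op2 out[3])
{
    if (in.code == OP_MOV) {
        out[0] = mk2(OP_MOV, in.dest, in.a);
        return 1;
    }
    if (in.dest == in.a) {
        /* Already in two-argument shape: d = d op b. */
        out[0] = mk2(in.code, in.dest, in.b);
        return 1;
    }
    if (in.dest == in.b) {
        if (in.code == OP_ADD) {
            /* Commutative: d = a + d is the same as d += a. */
            out[0] = mk2(OP_ADD, in.dest, in.a);
            return 1;
        }
        /* Non-commutative: copying a into d would destroy b. */
        assert(scratch != in.a && scratch != in.b);
        out[0] = mk2(OP_MOV, scratch, in.a);
        out[1] = mk2(in.code, scratch, in.b);
        out[2] = mk2(OP_MOV, in.dest, scratch);
        return 3;
    }
    /* General case: d = a; d op= b. */
    out[0] = mk2(OP_MOV, in.dest, in.a);
    out[1] = mk2(in.code, in.dest, in.b);
    return 2;
}
```

Running the conversion before register allocation, as described above,
means the extra moves it introduces are visible to the allocator, which
can then try to place the destination and first source in the same host
register.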

So far I've only implemented x86 host and arm guest support. 
As a demonstration I've converted the arm target to use the new qops for 
simple operations (mostly they are a drop-in replacement for the existing 
ops). This has allowed removal of the "T2" global register variable, freeing 
another host register for general use.

This patch gives approximately 30% speed increase on the nbench benchmark. 
On other, less repetitive tasks (e.g. running "gcc dyngen.c") it appears to be 
performance-neutral. I'm fairly sure there is still a lot of low-hanging 
fruit to be had.

I'm posting this patch for information and comment. It is not suitable for 
inclusion in its current state.

- It's missing proper makefile dependencies.
- Some of the source/header files are a bit mixed up. I've put things wherever 
was most convenient, rather than breaking them up logically.
- qops use a fairly high-overhead doubly-linked-list representation. This is 
mainly to simplify development. I think it should be possible to use a lower 
overhead representation (single-linked-list or array).
- Other hosts and guests need implementing.
- Calling helper functions and accessing guest memory are still done via 
dyngen. I think all the infrastructure is there to support this, it's just 
not used yet.

All the above are mainly just a matter of coding. However:

- It doesn't handle 64-bit hosts or guests.  I'm not sure how to handle this.

One possible solution is to give qregs a size/mode. On 32-bit hosts we can then 
lie about having 64-bit registers, and hack around it in the final code 
generation stages. Performance of 64-bit guests on 32-bit hosts would suck, 
but that's already true with the current scheme.

Suggestions are welcome.


Attachment: patch.qemu_qop.gz
Description: GNU Zip compressed data
