qemu-devel

Re: [Qemu-devel] TCG: AREG0 removal planning


From: Stefan Weil
Subject: Re: [Qemu-devel] TCG: AREG0 removal planning
Date: Tue, 10 May 2011 23:31:20 +0200
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20110307 Iceowl/1.0b1 Icedove/3.0.11

On 10.05.2011 22:54, Blue Swirl wrote:
Hi,

TCG uses a fixed global register (AREG0) which points to the currently
used CPUState, also known as 'env'. Using a fixed register has the
downsides that the register must be reserved by TCG for generated code
and by the compiler for compiling a few critical files (op_helper.c
etc.). The latter also means that any calls to the C library may be
unsafe from those files.

Here are my sketches for a transition to an AREG0-less world.

TCG the generator backend
-AREG0 is used for qemu_ld/st ops for TLB access. It should be
possible for the translators to pass instead a pointer to either
CPUState or directly to the TLB. New qemu_ld/st ops are needed for all
TCG targets.
-TCG temps are stored in CPUState field temp_buf[], accessed via
AREG0. Maybe a regular stack frame should be allocated instead?
-A generator can support translators with and without AREG0 without
performance impact and with only minor #ifdeffery.
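
To make the first point above concrete, here is a minimal sketch of a load
that gets an explicit pointer to the TLB instead of reaching it through
AREG0. All names (DemoTLB, demo_ld8, the page and table sizes) are made up
for illustration and are not QEMU's actual structures or ops; a real
qemu_ld would be emitted by the TCG backend and fall back to a slow path
on a miss.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameters, chosen only for this sketch. */
#define DEMO_PAGE_BITS 12
#define DEMO_PAGE_MASK (~((UINT32_C(1) << DEMO_PAGE_BITS) - 1))
#define DEMO_TLB_SIZE  256

typedef struct DemoTLBEntry {
    uint32_t tag;        /* guest page address mapped by this entry */
    uint8_t *host_page;  /* host memory backing that guest page */
} DemoTLBEntry;

typedef struct DemoTLB {
    DemoTLBEntry table[DEMO_TLB_SIZE];
} DemoTLB;

static size_t demo_tlb_index(uint32_t guest_addr)
{
    return (guest_addr >> DEMO_PAGE_BITS) & (DEMO_TLB_SIZE - 1);
}

/* Invalidate all entries. UINT32_MAX is not page aligned, so it can
 * never compare equal to a masked guest address. */
void demo_tlb_flush(DemoTLB *tlb)
{
    for (size_t i = 0; i < DEMO_TLB_SIZE; i++) {
        tlb->table[i].tag = UINT32_MAX;
        tlb->table[i].host_page = NULL;
    }
}

/* Install a mapping for the page containing guest_addr. */
void demo_tlb_fill(DemoTLB *tlb, uint32_t guest_addr, uint8_t *host_page)
{
    DemoTLBEntry *e = &tlb->table[demo_tlb_index(guest_addr)];
    e->tag = guest_addr & DEMO_PAGE_MASK;
    e->host_page = host_page;
}

/* Byte load with the TLB passed as an argument: no global 'env' and
 * no reserved register are involved. Returns 1 on a hit, 0 on a miss. */
int demo_ld8(const DemoTLB *tlb, uint32_t guest_addr, uint8_t *val)
{
    assert(tlb != NULL && val != NULL);
    const DemoTLBEntry *e = &tlb->table[demo_tlb_index(guest_addr)];
    if (e->tag != (guest_addr & DEMO_PAGE_MASK)) {
        return 0;  /* miss: a real implementation would refill here */
    }
    *val = e->host_page[guest_addr & ~DEMO_PAGE_MASK];
    return 1;
}
```

Whether the translator should pass the whole CPUState or only the TLB is
left open above; the sketch picks the TLB pointer simply because it keeps
the example small.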

Translators/op helpers
-Op helpers should not use global 'env' but take a CPUState or more
specific pointer instead. The converted helpers should be moved from
op_helper.c to helper.c. As Paul suggested, a new TCG_CALL_TYPE could
be added.
-cpu_env: still needed. Maybe a new TCG temp type should be added?
Special magic temp set up by prologue?
-New qemu_ld/st ops must be used. This breaks the translator when used
with the TCG backends that aren't yet converted, except with heavy
#ifdeffery.
-When a translator is completely AREG0 free, the register can be freed
for normal allocations.
-Performance will degrade until AREG0 usage is eliminated, then
performance should be better than now.
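
As an illustration of the helper conversion described in the first point
above, the following sketch shows the same helper twice: once reading a
global 'env', once taking the state as an argument. CPUDemoState and both
helper names are hypothetical. In QEMU the global would be a register
variable bound to AREG0; a plain global pointer stands in for it here so
that the sketch compiles on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-in for a target's CPUState. */
typedef struct CPUDemoState {
    uint32_t regs[8];
    uint32_t cc_src;   /* carry flag produced by the last add */
} CPUDemoState;

/* Old style: implicit state. A plain global replaces the AREG0-bound
 * register variable for the purposes of this sketch. */
CPUDemoState *env;

uint32_t helper_add_cc_global(uint32_t a, uint32_t b)
{
    uint32_t res = a + b;
    env->cc_src = res < a;   /* unsigned wrap-around means carry out */
    return res;
}

/* Converted style: the state is an ordinary parameter, so the file
 * containing this helper needs no special compiler flags. */
uint32_t helper_add_cc(CPUDemoState *s, uint32_t a, uint32_t b)
{
    assert(s != NULL);
    uint32_t res = a + b;
    s->cc_src = res < a;
    return res;
}
```

The converted form depends only on its arguments, which is the property
that would let such helpers move out of the specially compiled
op_helper.c.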

Other: TCG execution loop (cpu-exec.c)
-Convert global 'env' use.
-It may be a bit tricky (some #ifdeffery) to support simultaneously
targets that use AREG0 and those that don't.
-Performance may degrade for the QEMU targets that still use AREG0.

Staging
-Ideally generators should be converted first to avoid breakages.
-In practice, breakages and performance regressions will be hard to avoid.

Cleanups
-HELPER_CFLAGS will be eliminated.
-All of QEMU will be plain ordinary C code for easier portability.
-Buggy C libraries (Sparc glibc longjmp destroying %g registers...)
won't be able to mess with global registers, so no restrictions on C
library calls from op helpers are needed.
-dyngen-exec.h can finally be eliminated.
-Since qemu_ld/st uses need fixing anyway, the ops can be refactored
as well, for example taking Aurelien's constant patches into account.
-Generic softfloat and other common ops could be added, called
directly without a helper.

History
-In dyngen times, there used to be three global registers, AREG0 to
AREG2 (even more were defined, but not used).
-AREG1 and AREG2 were known as T1 and T2. Dyngen ops used these directly.

Comments? There are a few blank spots too.

Just for information:

TCI (the TCG interpreter) uses a global variable (not a register,
because it is designed to run on any architecture).

Code extracts:

dyngen-exec.h:
#if defined(AREG0)
# define DECLARE_QEMU_ENV(s) register struct s *env asm(AREG0)
#else
# define DECLARE_QEMU_ENV(s) extern struct s *env
#endif

target-i386/exec.h (similar for other targets):
DECLARE_QEMU_ENV(CPUX86State);

Your plan might allow co-existence of the normal TCG and TCI in
the same binary, so users could select TCI via a command line
switch (there are a few use cases where TCI has advantages).

It is also a small step towards multicore emulation that makes
full use of real multicore CPUs.

I have some regression tests here (mainly Debian Linux guests
with different architectures, but also an unspeakable host OS
which some people don't like) and will certainly run these tests
when required.

And yes, you will discover more blank spots when you do the work.

Regards,
Stefan



