From: Alex Bennée
Subject: Re: [Qemu-devel] [GSOC] Improving Measurement of Tiny Code Generation Quality
Date: Wed, 27 Mar 2019 12:18:05 +0000
User-agent: mu4e 1.1.0; emacs 26.1

Vanderson Martins do Rosario <address@hidden> writes:
> Hi everyone,
>
> I’m sending this email to introduce myself and a project that I’m planning
> to submit to QEMU/GSoC this year. As I believe this project could result in
> useful improvements and tools for the community, I would love to have
> feedback from the community and perhaps make it even more useful.
>
<snip>
>
> Project: Improving Measurement of Tiny Code Generation Quality.
> Mentor: Alex Bennée
>
> I/ Introduction
>
> In most applications, the majority of the execution time is spent in a very
> small portion of the code. Regions of code that execute with high frequency
> are called hot, while all other regions are called cold. As a direct
> consequence, emulators also spend most of their execution time emulating
> these hot regions, so dynamic compilers and translators need to pay extra
> attention to them. Guaranteeing that these hot regions are
> compiled/translated into high-quality code is fundamental to achieving
> high-performance emulation. Thus, one of the most important steps in tuning
> an emulator's performance is to identify the hot regions and to measure
> their translation quality.
>
> QEMU is no different: it offers the ‘-d’ option, which can dump the
> input assembly (guest binary) with ‘-d in_asm’, the generated TCG
> Intermediate Representation (IR, TCG ops) with ‘-d op_opt’, and the final
> host assembly (target binary) with ‘-d out_asm’.

We actually have two stages:

  -d op

    These are the TCG ops as generated while decoding instructions. Some of
    the higher level generators may generate different sequences if they
    can detect a simple optimisation. For example, tcg_gen_addi_i32 will
    fold an addi reg,#0 to a simple move.

  -d op_opt

    These are the TCG ops after the optimisation pass (tcg_optimize). This
    is where the code generator will attempt to improve the code for the
    whole block by propagating constants and copies and folding constant
    expressions.
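
To make the difference between the two stages concrete, here is a small toy model in C. It is only a sketch and is not QEMU code: the ToyOp/ToyBlock types and the toy_gen_addi() and toy_optimize() helpers are hypothetical names invented for illustration. toy_gen_addi() mimics the kind of generator-level fold described for tcg_gen_addi_i32 (what "-d op" would already show), while toy_optimize() mimics a block-wide pass that propagates constants and folds constant adds (what "-d op_opt" would show). Copy propagation is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy IR for illustration only; these are NOT QEMU's real structures. */
typedef enum { OP_MOVI, OP_MOV, OP_ADD } ToyOpc;

typedef struct {
    ToyOpc opc;
    int dst;      /* destination temp index */
    int src1;     /* first source temp (OP_MOV, OP_ADD), else -1 */
    int src2;     /* second source temp (OP_ADD), else -1 */
    int32_t imm;  /* immediate value (OP_MOVI) */
} ToyOp;

#define MAX_OPS   32
#define MAX_TEMPS 16

typedef struct {
    ToyOp ops[MAX_OPS];
    size_t n;       /* number of ops emitted so far */
    int next_temp;  /* next free scratch temp index */
} ToyBlock;

static void emit(ToyBlock *b, ToyOpc opc, int dst, int src1, int src2,
                 int32_t imm)
{
    assert(b->n < MAX_OPS);
    assert(dst >= 0 && dst < MAX_TEMPS);
    b->ops[b->n++] = (ToyOp){ opc, dst, src1, src2, imm };
}

/* Stage 1: fold at generation time. Adding 0 becomes a plain move. */
static void toy_gen_addi(ToyBlock *b, int dst, int src, int32_t imm)
{
    if (imm == 0) {
        emit(b, OP_MOV, dst, src, -1, 0);
    } else {
        int t = b->next_temp++;           /* scratch temp for the constant */
        emit(b, OP_MOVI, t, -1, -1, imm);
        emit(b, OP_ADD, dst, src, t, 0);
    }
}

/* Stage 2: block-wide pass. Track which temps hold known constants and
 * rewrite moves and adds of known constants into OP_MOVI. */
static void toy_optimize(ToyBlock *b)
{
    bool is_const[MAX_TEMPS] = { false };
    int32_t val[MAX_TEMPS] = { 0 };

    for (size_t i = 0; i < b->n; i++) {
        ToyOp *op = &b->ops[i];
        bool known = false;
        int32_t v = 0;

        if (op->opc == OP_MOVI) {
            known = true;
            v = op->imm;
        } else if (op->opc == OP_MOV && is_const[op->src1]) {
            known = true;
            v = val[op->src1];
        } else if (op->opc == OP_ADD &&
                   is_const[op->src1] && is_const[op->src2]) {
            known = true;
            /* wrap like 32-bit hardware; avoids signed-overflow UB */
            v = (int32_t)((uint32_t)val[op->src1] + (uint32_t)val[op->src2]);
        }

        if (known) {
            *op = (ToyOp){ OP_MOVI, op->dst, -1, -1, v };
        }
        is_const[op->dst] = known;
        val[op->dst] = v;
    }
}
```

In this model, toy_gen_addi(&b, 1, 0, 0) appends a single OP_MOV, so the fold is visible before any optimisation pass runs. By contrast, an OP_MOVI into temp 0 followed by toy_gen_addi(&b, 1, 0, 2) still yields an OP_ADD at generation time; it only becomes an OP_MOVI once toy_optimize() has walked the whole block.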
--
Alex Bennée