Re: [Qemu-devel] profiling qemu
From: Lluís Vilanova
Subject: Re: [Qemu-devel] profiling qemu
Date: Tue, 14 Feb 2012 14:53:58 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.93 (gnu/linux)
Artyom Tarasenko writes:
[...]
> QEMU 1.0.50 monitor - type 'help' for more information
> (qemu) profile
> unknown command: 'profile'
> (qemu) info profile
> async time 38505498320 (38.505)
> qemu time 35947093161 (35.947)
> Is there a way to find out more?
The "info jit" command also reports some extra information when QEMU is
compiled with profiling support.
Search for CONFIG_PROFILER to see which code is activated during profiling.
> Next I tried gprof:
> build-prof $ gprof sparc64-softmmu/qemu-system-sparc64 gmon.out
> Flat profile:
> Each sample counts as 0.01 seconds.
> % cumulative self self total
> time seconds seconds calls Ts/call Ts/call name
> 100.00 5.06 5.06 main
> Hmm. Not very informative. Is there a way to find out more details?
Did you run QEMU for a reasonable amount of time? gprof uses sampling to collect
its execution time statistics, so a short run of QEMU will not yield any
meaningful information.
[...]
> Here it looks like "compute_all_sub" and "compute_all_sub_xcc" are
> good candidates for optimizing: together they take the same amount of
> time as cpu_sparc_exec. I guess both operations would be trivial in
> the x86_64 assembler. What would be the best strategy to make TCG take
> the advantage of running on a x86_64 host?
A quick look into the code reveals that these two are called from a TCG helper
(helper_compute_psr), so I see two approaches here, applicable to the most
frequently used "sub-operations" in helper_compute_psr:
* Define new, simpler helpers for those sub-operations that can be declared with
  TCG_CALL_CONST and generate the new psr/xcc values in temporary registers. You
  must make sure any other code will still be able to use the new psr/xcc
  values.
* Reimplement these sub-operations in pure TCG code.
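As a rough illustration of the first approach, such a helper could be a function
whose result depends only on its arguments and which touches no global state.
The sketch below computes the condition-code flags of a 32-bit subtraction in
plain C. It is not QEMU code: the name sketch_flags_sub32 and the FLAG_* bit
positions are made up for this example, and a real helper would have to produce
the bits in the layout that the psr/xcc fields expect.

```c
#include <stdint.h>

/* Hypothetical flag bits, chosen for this sketch only.
 * They are NOT QEMU's PSR/XCC bit definitions. */
#define FLAG_C (1u << 0)  /* carry, i.e. borrow on subtraction */
#define FLAG_V (1u << 1)  /* signed overflow */
#define FLAG_Z (1u << 2)  /* result is zero */
#define FLAG_N (1u << 3)  /* result is negative (bit 31 set) */

/* Hypothetical helper: returns the flags of (src1 - src2).
 * The result depends only on the two arguments; no global state
 * is read or written. */
uint32_t sketch_flags_sub32(uint32_t src1, uint32_t src2)
{
    uint32_t dst = src1 - src2;   /* wraps modulo 2^32 */
    uint32_t flags = 0;

    if (dst == 0) {
        flags |= FLAG_Z;
    }
    if (dst & 0x80000000u) {
        flags |= FLAG_N;
    }
    /* An unsigned borrow happens exactly when src1 < src2. */
    if (src1 < src2) {
        flags |= FLAG_C;
    }
    /* Signed overflow: the operands have different signs and the
     * sign of the result differs from the sign of src1. */
    if (((src1 ^ src2) & (src1 ^ dst)) & 0x80000000u) {
        flags |= FLAG_V;
    }
    return flags;
}
```

For example, sketch_flags_sub32(0, 1) sets the negative and carry flags, while
sketch_flags_sub32(0x80000000, 1) sets only the overflow flag. On an x86_64
host a compiler can reduce each of these checks to a few instructions, which is
the kind of saving the question was after.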
But first, make sure you run a proper benchmark to establish where the hotspots
are in the sparc code for QEMU. The problem here is to establish what a
proper benchmark is :)
Lluis
--
"And it's much the same thing with knowledge, for whenever you learn
something new, the whole world becomes that much richer."
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom
Tollbooth