[Qemu-devel] profiling qemu


From: Artyom Tarasenko
Subject: [Qemu-devel] profiling qemu
Date: Tue, 14 Feb 2012 11:33:18 +0100

On an x86_64 host, sparc64 emulation feels considerably slower than sparc32.
I tried to find out what could be optimized, and I have some questions.
First of all, it's not clear how to enable profiling in the current git:

build-prof $ ../qemu/configure --target-list=sparc64-softmmu --enable-gprof --enable-profiler
[...]
host CPU          x86_64
host big endian   no
target list       sparc64-softmmu
tcg debug enabled no
Mon debug enabled no
gprof enabled     yes
sparse enabled    no
strip binaries    yes
profiler          yes
[...]

build-prof $ sparc64-softmmu/qemu-system-sparc64 -nographic -profile
-profile: invalid option

If I launch qemu without the -profile option, it starts, but the monitor
doesn't offer much:

QEMU 1.0.50 monitor - type 'help' for more information
(qemu) profile
unknown command: 'profile'
(qemu) info profile
async time  38505498320 (38.505)
qemu time   35947093161 (35.947)

Is there a way to find out more?

Next I tried gprof:

build-prof $  gprof sparc64-softmmu/qemu-system-sparc64 gmon.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
100.00      5.06     5.06                             main

Hmm. Not very informative. Is there a way to find out more details?


A pre-glib version used to give more information:

$ gprof sparc64-softmmu/qemu-system-sparc64 gmon.out
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 14.78     24.68    24.68                             cpu_sparc_exec
  7.84     37.76    13.08                             compute_all_sub_xcc
  7.56     50.38    12.62                             compute_all_sub
  7.44     62.80    12.42                             helper_compute_psr
  6.41     73.50    10.70                             get_physical_address
  5.09     82.00     8.50                             compute_all_logic_xcc
  3.64     88.07     6.07                             tcg_optimize
  3.27     93.53     5.46                             temp_save
  2.43     97.59     4.06                             tcg_reg_alloc_op
  2.37    101.54     3.95                             compute_all_taddtv
  2.24    105.27     3.74                             compute_C_sub_xcc
  2.22    108.98     3.71                             tcg_liveness_analysis
  2.00    112.32     3.34                             compute_all_flags
  1.68    115.13     2.81                             tlb_flush

Here "compute_all_sub" and "compute_all_sub_xcc" look like good
candidates for optimization: together they take about as much time as
cpu_sparc_exec (15.40% vs. 14.78%). I guess both operations would be
trivial in x86_64 assembler. What would be the best strategy to make TCG
take advantage of running on an x86_64 host?
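
To make concrete what "trivial" means here, this is a minimal sketch of the work a subtract's condition codes amount to: the N, Z, V and C flags for the 64-bit (xcc) and 32-bit (icc) results. The function names and the FLAG_* bit values are made up for this illustration and are not QEMU's actual helpers or constants. An x86_64 SUB leaves the equivalent information in EFLAGS (SF, ZF, OF, CF) as a side effect.

```c
#include <stdint.h>

/* Hypothetical flag bits for this sketch only; not QEMU's constants. */
enum { FLAG_C = 1 << 0, FLAG_V = 1 << 1, FLAG_Z = 1 << 2, FLAG_N = 1 << 3 };

/* N, Z, V, C for the 64-bit subtraction dst = src1 - src2 (xcc). */
unsigned sub_flags_xcc(uint64_t src1, uint64_t src2)
{
    uint64_t dst = src1 - src2;
    unsigned flags = 0;

    if (dst == 0)
        flags |= FLAG_Z;
    if (dst >> 63)
        flags |= FLAG_N;                 /* result is negative */
    if (src1 < src2)
        flags |= FLAG_C;                 /* unsigned borrow */
    if (((src1 ^ src2) & (src1 ^ dst)) >> 63)
        flags |= FLAG_V;                 /* signed overflow */
    return flags;
}

/* Same computation on the low 32 bits of the operands (icc). */
unsigned sub_flags_icc(uint64_t src1, uint64_t src2)
{
    uint32_t a = (uint32_t)src1, b = (uint32_t)src2;
    uint32_t dst = a - b;
    unsigned flags = 0;

    if (dst == 0)
        flags |= FLAG_Z;
    if (dst >> 31)
        flags |= FLAG_N;
    if (a < b)
        flags |= FLAG_C;
    if (((a ^ b) & (a ^ dst)) >> 31)
        flags |= FLAG_V;
    return flags;
}
```

For example, sub_flags_xcc(1, 2) sets N and C (negative result, borrow), while sub_flags_xcc(5, 5) sets only Z.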

-- 
Regards,
Artyom Tarasenko

solaris/sparc under qemu blog: http://tyom.blogspot.com/search/label/qemu


