Re: [Qemu-devel] [PATCH v2 0/5] AArch64 TLB performance improvements

From:	Alex Bennée
Subject:	Re: [Qemu-devel] [PATCH v2 0/5] AArch64 TLB performance improvements
Date:	Mon, 04 Aug 2014 12:34:22 +0100

Alex Bennée writes:

> Paolo Bonzini writes:
>
>> Il 30/07/2014 17:20, Alex Bennée ha scritto:
>>> Hi,
>>> 
> <snip>
>>> The most important thing is I've measured a 25-30% improvement in
>>> kernel and android boot time.
>>> 
> <snip>
>> Hi Alex, have you seen this patch?  Perhaps you're interested in
>> reviving it.
>>
>> http://article.gmane.org/gmane.comp.emulators.qemu/253864
>
> I saw it when it first came out but I didn't quite follow what it was
> doing as I hadn't looked at the TLB code. I'll have another look and see
> what difference it can make.

A quick and dirty benchmark:

**** Comparing 10bit/12bit tables with and without [[http://article.gmane.org/gmane.comp.emulators.qemu/253864][victim cache]]

#+BEGIN_NOTES
Times are in seconds; smaller is better.
Each percentage is the average time relative to the column to its left.
#+END_NOTES

| Code  |   10 bit | 10 bit + victim |    12 bit | 12 bit + victim |
|-------+----------+-----------------+-----------+-----------------|
|       |   12.783 |          11.664 |    10.348 |           9.527 |
| Runs  |   13.046 |          11.971 |    10.123 |           9.326 |
|       |   12.929 |          11.673 |    11.130 |           9.858 |
|       |   12.981 |          11.941 |    10.223 |           9.673 |
|-------+----------+-----------------+-----------+-----------------|
| Avgs  | 12.93475 |        11.81225 |    10.456 |           9.596 |
|-------+----------+-----------------+-----------+-----------------|
| %prev |     100% |       91.321827 | 88.518276 |       91.775057 |
#+TBLFM: $2=vmean(@I..II)::$3=(@II$3/@II$2)*100::$4=vmean(@I..II)::$5=vmean(@I..II)
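
For anyone who wants to check the arithmetic outside of org-mode, the averages and the %prev row can be recomputed with a few lines of Python. This is only a sketch: the run times are copied from the table above, and the variable names are made up for illustration.

```python
from statistics import mean

# Run times in seconds, copied from the table (one list per column).
runs = {
    "10 bit": [12.783, 13.046, 12.929, 12.981],
    "10 bit + victim": [11.664, 11.971, 11.673, 11.941],
    "12 bit": [10.348, 10.123, 11.130, 10.223],
    "12 bit + victim": [9.527, 9.326, 9.858, 9.673],
}

# "Avgs" row: mean of the four runs in each column.
averages = {name: mean(times) for name, times in runs.items()}

# "%prev" row: each average as a percentage of the column to its left.
names = list(runs)
percent_of_prev = {
    cur: averages[cur] / averages[prev] * 100
    for prev, cur in zip(names, names[1:])
}

for name in names:
    pct = percent_of_prev.get(name)
    suffix = "" if pct is None else f" ({pct:.1f}% of previous column)"
    print(f"{name:>16}: {averages[name]:.5f}s{suffix}")
```

The first column has no column to its left, so it has no entry in `percent_of_prev`; the table shows it as 100%.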

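As you would expect, this shows that the table size gives the greater
improvement: going from 10 to 12 bits cuts the average run time by about
19% (12.93s to 10.46s). The victim cache then takes a further 8-9% off
the run time on top of either table size.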

I say "as you would expect" because any time you need to exit translated
code there is a bunch of overhead in doing so, so anything that avoids
those exits shows up directly in the run time.
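
To illustrate why both changes cut down on those exits, here is a toy model in Python. It is not QEMU code and not the linked patch: it assumes a direct-mapped table indexed by the low bits of the page number, 4K pages, and a small victim list that catches recently evicted entries. `ToyTLB`, `count_exits` and every parameter name are hypothetical.

```python
from collections import OrderedDict

PAGE_BITS = 12  # assumption: 4K pages


class ToyTLB:
    """Direct-mapped lookup table with an optional victim list (toy model)."""

    def __init__(self, index_bits, victim_entries=0):
        self.mask = (1 << index_bits) - 1
        self.table = [None] * (1 << index_bits)
        self.victim_entries = victim_entries
        self.victims = OrderedDict()  # evicted pages, oldest first
        self.slow_path_exits = 0      # stands in for leaving translated code

    def access(self, addr):
        page = addr >> PAGE_BITS
        slot = page & self.mask
        if self.table[slot] == page:
            return "hit"
        evicted = self.table[slot]
        if page in self.victims:
            # Recently evicted entry: refill from the victim list.
            del self.victims[page]
            result = "victim hit"
        else:
            # Nothing cached: this is the expensive case.
            self.slow_path_exits += 1
            result = "miss"
        self.table[slot] = page
        if evicted is not None and self.victim_entries:
            self.victims[evicted] = True
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)  # drop the oldest victim
        return result


def count_exits(tlb, addrs):
    """Replay an address trace and return the number of slow-path exits."""
    for addr in addrs:
        tlb.access(addr)
    return tlb.slow_path_exits


# Two pages that land in the same slot of a 10-bit table, accessed alternately.
pattern = [0, 1 << (PAGE_BITS + 10)] * 100

exits_10bit = count_exits(ToyTLB(10), pattern)
exits_10bit_victim = count_exits(ToyTLB(10, victim_entries=8), pattern)
exits_12bit = count_exits(ToyTLB(12), pattern)
```

With this deliberately hostile access pattern the plain 10-bit table takes the slow path on every access, while either the victim list or the 12-bit table reduces that to the two initial fills. Real workloads are nowhere near this clean, which is why the measured gains above are much more modest.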

-- 
Alex Bennée
