Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL
From: Gonglei (Arei)
Subject: Re: [Qemu-devel] [PATCH v2] rtc: placing RTC memory region outside BQL
Date: Fri, 9 Feb 2018 10:05:03 +0000
>
> > >
> > > $ cat strace_c.sh
> > > strace -tt -p $1 -c -o result_$1.log &
> > > sleep $2
> > > pid=$(pidof strace)
> > > kill $pid
> > > cat result_$1.log
> > >
> > > Before applying this change:
> > > $ ./strace_c.sh 10528 30
> > > % time     seconds  usecs/call     calls    errors syscall
> > > ------ ----------- ----------- --------- --------- ----------------
> > >  93.87    0.119070          30      4000           ppoll
> > >   3.27    0.004148           2      2038           ioctl
> > >   2.66    0.003370           2      2014           futex
> > >   0.09    0.000113           1       106           read
> > >   0.09    0.000109           1       104           io_getevents
> > >   0.02    0.000029           1        30           poll
> > >   0.00    0.000000           0         1           write
> > > ------ ----------- ----------- --------- --------- ----------------
> > > 100.00    0.126839                  8293           total
> > >
> > > After applying the change:
> > > $ ./strace_c.sh 23829 30
> > > % time     seconds  usecs/call     calls    errors syscall
> > > ------ ----------- ----------- --------- --------- ----------------
> > >  92.86    0.067441          16      4094           ppoll
> > >   4.85    0.003522           2      2136           ioctl
> > >   1.17    0.000850           4       189           futex
> > >   0.54    0.000395           2       202           read
> > >   0.52    0.000379           2       202           io_getevents
> > >   0.05    0.000037           1        30           poll
> > > ------ ----------- ----------- --------- --------- ----------------
> > > 100.00    0.072624                  6853           total
> > >
> > > The number of futex calls decreases by ~90.6% on an idle Windows 7 guest.
> >
> > These are the same figures as from v1 -- it would be interesting
> > to check whether the additional locking that v2 adds has affected
> > the results.
> >
> Oh, yes. With v2 the futex count does not decline nearly as much as it did
> with v1, because v2 now takes the BQL before raising the outbound IRQ line.
>
> Before applying v2:
> # ./strace_c.sh 8776 30
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.01    0.164188          26      6436           ppoll
>   8.39    0.017650           5      3700        39 futex
>   7.68    0.016157           6      2758           ioctl
>   5.48    0.011530           3      4586      1113 read
>   0.30    0.000640          20        32           io_submit
>   0.15    0.000317           4        89           write
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.210482                 17601      1152 total
>
> After applying v2:
> # ./strace_c.sh 15968 30
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.28    0.171117          27      6272           ppoll
>   8.50    0.018571           5      3663        21 futex
>   7.76    0.016973           6      2732           ioctl
>   4.85    0.010597           3      4115       853 read
>   0.31    0.000672          11        63           io_submit
>   0.30    0.000659           4       180           write
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.218589                 17025       874 total
>
> > Does the patch improve performance in a more interesting use
> > case than "the guest is just idle" ?
> >
> I think so; after all, the scope of the locking is reduced.
> Besides this, can we optimize the RTC timer so that it avoids holding the
> BQL, for example by using separate threads?
>
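For anyone who wants to redo the comparison from the quoted strace tables, here is a minimal sketch. The helper names `calls_for` and `pct_change` are made up for illustration and are not part of any script in this thread. It assumes the usual `strace -c` summary layout, where "calls" is the fourth column and the syscall name is the last one.

```shell
#!/bin/sh
# Hypothetical helper: print the "calls" value for one syscall from an
# `strace -c` summary file. "calls" is the 4th column; the syscall name is
# always the last column, whether or not the "errors" column is filled in.
calls_for() {
    awk -v name="$2" '$NF == name { print $4 }' "$1"
}

# Hypothetical helper: percentage change from $1 (before) to $2 (after).
pct_change() {
    awk -v b="$1" -v a="$2" 'BEGIN { printf "%.1f\n", (a - b) * 100 / b }'
}

# futex call counts quoted in this thread (idle guest, 30 s samples):
pct_change 2014 189     # v1 figures: prints -90.6
pct_change 3700 3663    # v2 figures: prints -1.0
```

With the result files from strace_c.sh at hand, the two steps can be chained, e.g. `pct_change "$(calls_for result_8776.log futex)" "$(calls_for result_15968.log futex)"`.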
Hi Peter, Paolo,

I tested PCMark 8 (https://www.futuremark.com/benchmarks/pcmark)
in a Win7 guest and got the results below:

Guest: 2U2G
Before applying v2:
  Your Work 2.0 score:                    2000
  Web Browsing - JunglePin                0.334s
  Web Browsing - Amazonia                 0.132s
  Writing                                 3.59s
  Spreadsheet                             70.13s
  Video Chat v2/Video Chat playback 1 v2  22.8 fps
  Video Chat v2/Video Chat encoding v2    307.0 ms
  Benchmark duration                      1h 35min 46s

After applying v2:
  Your Work 2.0 score:                    2040
  Web Browsing - JunglePin                0.345s
  Web Browsing - Amazonia                 0.132s
  Writing                                 3.56s
  Spreadsheet                             67.83s
  Video Chat v2/Video Chat playback 1 v2  28.7 fps
  Video Chat v2/Video Chat encoding v2    324.7 ms
  Benchmark duration                      1h 32min 5s

The test results show that the optimization is also effective under load:
the Work 2.0 score rises from 2000 to 2040, and the benchmark run finishes
3min 41s sooner.
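
To put the PCMark 8 figures side by side, the relative change per metric can be worked out with a short awk sketch. The function name `pcmark_deltas` and the shortened metric labels are illustrative only. The values are copied from the results in this mail, with the benchmark durations converted to seconds (1h 35min 46s = 5746 s, 1h 32min 5s = 5525 s).

```shell
#!/bin/sh
# Hypothetical helper: relative change of each PCMark 8 figure from
# "before v2" to "after v2". Columns: metric | before | after.
pcmark_deltas() {
    awk -F'|' '{ printf "%-22s %+.1f%%\n", $1, ($3 - $2) * 100 / $2 }' <<'EOF'
Work 2.0 score|2000|2040
JunglePin (s)|0.334|0.345
Amazonia (s)|0.132|0.132
Writing (s)|3.59|3.56
Spreadsheet (s)|70.13|67.83
Playback (fps)|22.8|28.7
Encoding (ms)|307.0|324.7
Duration (s)|5746|5525
EOF
}

pcmark_deltas
```

The sign is the raw change only. Whether a positive value is an improvement depends on the metric: higher is better for the score and the playback frame rate, while lower is better for the timings.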
Thanks,
-Gonglei