Proposal for a regular upstream performance testing


From: Lukáš Doktor
Subject: Proposal for a regular upstream performance testing
Date: Thu, 26 Nov 2020 09:10:14 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1

Hello guys,

I have been around qemu on the Avocado-vt side for quite some time, and a
while ago I shifted my focus to performance testing. Currently I am not aware
of any upstream CI that continuously monitors upstream qemu performance, and
I'd like to change that. There is a lot to cover, so please bear with me.

Goal
====

The goal of this initiative is to detect system-wide performance regressions
as well as improvements early, ideally pinpoint the individual commits, and
notify people so they can fix things. All of this upstream, and ideally with
as little human interaction as possible.

Unlike Ahmed Karaman's recent TCG Continuous Benchmarking work
(https://ahmedkrmn.github.io/TCG-Continuous-Benchmarking/), my aim is at
system-wide performance inside the guest, measured by workloads like fio,
uperf, and so on.
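
To make that concrete, here is a minimal sketch (in Python) of the kind of
in-guest measurement I mean: run a short fio job inside the guest with
machine-readable output and pull out a headline number. The job parameters
are illustrative, not the actual scenario definition:

    import json
    import subprocess

    # Run a short random-read fio job inside the guest;
    # --output-format=json makes the result machine readable.
    result = subprocess.run(
        ["fio", "--name=randread", "--rw=randread", "--bs=4k",
         "--size=256M", "--runtime=30", "--time_based",
         "--output-format=json"],
        capture_output=True, text=True, check=True)

    data = json.loads(result.stdout)
    # fio reports per-direction statistics per job; IOPS is one headline
    # metric a pipeline could track over time.
    print("read IOPS:", data["jobs"][0]["read"]["iops"])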

Tools
=====

In house we have several different tools used by various teams and I bet
there are tons of other tools out there that can do this. I cannot speak for
all teams, but over time many teams at Red Hat have come to like pbench
(https://distributed-system-analysis.github.io/pbench/) to run the tests and
produce machine-readable results, and use other tools (Ansible, scripts, ...)
to provision the systems and to generate the comparisons.
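
For illustration, wrapping a workload in pbench could look roughly like the
sketch below. pbench-user-benchmark is pbench's generic runner, but the exact
flag spelling here is an assumption on my side and may differ between pbench
versions; run-guest-fio.sh is a placeholder for whatever actually drives the
guest:

    import subprocess

    # pbench-user-benchmark runs an arbitrary command under pbench's
    # collection tooling; --config labels the run. Flag spelling is an
    # assumption and may differ between pbench versions.
    subprocess.run(
        ["pbench-user-benchmark", "--config=qemu-master-fio", "--",
         "./run-guest-fio.sh"],  # placeholder driver script
        check=True)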

As for myself, I used Python for a PoC and over the last year I pushed hard
to turn it into a usable and sensible tool which I'd like to offer:
https://run-perf.readthedocs.io/en/latest/ Anyway, I am open to suggestions
and comparisons. As I am using it downstream to watch for regressions, I do
plan to keep developing the tool as well as the pipelines (unless a better
tool is found that would replace it or parts of it).
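
At its core, the comparison step any such pipeline needs is mechanical:
check new results against a reference with some tolerance. A toy sketch of
that step (the 3% threshold and the data layout are made up for illustration;
run-perf's actual comparison accounts for variance across multiple runs):

    TOLERANCE = 0.03  # illustrative; real pipelines derive this from variance

    reference = {"fio:randread:iops": 51200.0, "uperf:tcp-stream:Gb_s": 9.4}
    current = {"fio:randread:iops": 48000.0, "uperf:tcp-stream:Gb_s": 9.5}

    for test, ref in reference.items():
        change = (current[test] - ref) / ref
        if change < -TOLERANCE:
            print(f"REGRESSION  {test}: {change:+.1%}")
        elif change > TOLERANCE:
            print(f"improvement {test}: {change:+.1%}")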

How
===

This is a tough question. Ideally this should be a standalone service that
would simply notify the author of the patch that caused the change, along
with a bunch of useful data, so they can either address the issue or just
acknowledge the change and mark it as expected.
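
A minimal sketch of that notification step, assuming the offending commit has
already been identified (the repository path and commit hash are
placeholders, and how the mail actually gets sent is an open question):

    import subprocess

    def commit_author(repo, commit):
        """Author e-mail of a commit, via `git show -s --format=%ae`."""
        return subprocess.run(
            ["git", "-C", repo, "show", "-s", "--format=%ae", commit],
            capture_output=True, text=True, check=True).stdout.strip()

    # The service would attach the comparison report here.
    author = commit_author("/path/to/qemu", "abc1234")  # placeholders
    print(f"would notify {author} about a suspected performance change")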

Ideally the community should also have a way to submit their custom builds in
order to verify their patches, so they can debug and address issues earlier
than after the commit lands in qemu-master.

The problem with those is that we cannot simply use travis/gitlab/...
machines to run these tests, because we are measuring actual in-guest
performance. We can't just stop the clock when the host decides to schedule
another container/VM. I briefly checked public bare-metal offerings like
Rackspace, but these are most probably not sufficient either, because (unless
I'm wrong) they only give you a machine; it is not guaranteed to be the same
machine the next time. If we are to compare results, we don't just need the
same model, we really need the very same machine. Any change to the machine
(disk replacement, even a firmware update...) might lead to a significant
difference.

Solution 1
----------

Since I am already doing this for downstream builds, I can start doing it for
upstream as well. At this point I can offer a single pipeline, on a single
x86_64 machine, watching only changes in qemu (downstream we check
distro/kernel changes as well, but that would require too much time at this
point). I cannot offer public access to the testing machine, nor checking of
custom builds (unless someone provides publicly available machine(s) that I
could use for this). What I can offer is running the checks on the latest
qemu master, publishing the reports, bisecting issues, and notifying people
about the changes. An example of a report can be found here:
https://drive.google.com/file/d/1V2w7QpSuybNusUaGxnyT5zTUvtZDOfsb/view?usp=sharing
and documentation of the format is here:
https://run-perf.readthedocs.io/en/latest/scripts.html#html-results I can
also attach the raw pbench results if needed (as well as details about the
tests that were executed, their parameters, and so on).
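
The bisecting part can be automated with `git bisect run`, which interprets a
driver's exit code: 0 means good, 1 means bad, 125 means skip. A sketch of
such a driver (build_qemu and run-guest-fio.sh are placeholders for the real
pipeline steps; the baseline and tolerance would come from the reference
run):

    #!/usr/bin/env python3
    # Driver for `git bisect run`: exit 0 = good, 1 = bad, 125 = skip.
    import subprocess
    import sys

    BASELINE = 50000.0   # IOPS from the last known-good build (placeholder)
    TOLERANCE = 0.03

    def build_qemu():
        return subprocess.run(["make", "-j8"], cwd="build").returncode == 0

    def measure_iops():
        # Placeholder: boot a guest on the fresh build, run fio in it,
        # print a single IOPS number on stdout.
        out = subprocess.run(["./run-guest-fio.sh"], capture_output=True,
                             text=True, check=True).stdout
        return float(out.strip())

    if not build_qemu():
        sys.exit(125)  # unbuildable commit: tell bisect to skip it
    sys.exit(0 if measure_iops() >= BASELINE * (1 - TOLERANCE) else 1)

This would be invoked as `git bisect start <bad> <good>` followed by
`git bisect run ./bisect-check.py`.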

Currently the covered scenarios would be a default libvirt machine with qcow2
storage and a tuned libvirt machine (pinned CPUs, hugepages, NUMA, raw
disk...), running fio, uperf, and Linpack on the latest GA RHEL. In the
future I can add/tweak the scenarios as well as the test selection based on
your feedback.
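
To give an idea of what "tuned" refers to, here is a sketch of the relevant
libvirt domain XML elements (all values are made up for illustration, and the
rest of the domain definition is elided):

    # Illustrative fragment of the tuned scenario: hugepage backing,
    # pinned vCPUs, an explicit NUMA cell and a raw block disk. Not a
    # complete libvirt domain definition.
    TUNED_FRAGMENT = """
    <memoryBacking><hugepages/></memoryBacking>
    <vcpu placement='static'>4</vcpu>
    <cputune>
      <vcpupin vcpu='0' cpuset='2'/>
      <vcpupin vcpu='1' cpuset='3'/>
    </cputune>
    <cpu mode='host-passthrough'>
      <numa><cell id='0' cpus='0-3' memory='8' unit='GiB'/></numa>
    </cpu>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/sdb'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    """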

Solution 2
----------

I can offer documentation:
https://run-perf.readthedocs.io/en/latest/jenkins.html and someone can fork
it or take inspiration from it, set up the pipelines on their own system,
make them available to the outside world, and add custom scenarios and
variants. Note that the setup does not require Jenkins; it's just an example
and could easily be turned into a cron job or whatever you choose.

Solution 3
----------

You name it. I bet there are many other ways to perform system-wide performance 
testing.

Regards,
Lukáš



