Re: Proposal for a regular upstream performance testing
From: Stefan Hajnoczi
Subject: Re: Proposal for a regular upstream performance testing
Date: Tue, 22 Mar 2022 15:05:19 +0000

On Mon, Mar 21, 2022 at 11:29:42AM +0100, Lukáš Doktor wrote:
> Hello Stefan,
>
> On 21. 03. 22 at 10:42, Stefan Hajnoczi wrote:
> > On Mon, Mar 21, 2022 at 09:46:12AM +0100, Lukáš Doktor wrote:
> >> Dear qemu developers,
> >>
> >> you might remember the "replied to" email from a bit over a year ago to
> >> raise a discussion about a qemu performance regression CI. On KVM forum I
> >> presented
> >> https://www.youtube.com/watch?v=Cbm3o4ACE3Y&list=PLbzoR-pLrL6q4ZzA4VRpy42Ua4-D2xHUR&index=9
> >> some details about my testing pipeline. I think it's stable enough to
> >> become part of the official CI so people can consume, rely on it and
> >> hopefully even suggest configuration changes.
> >>
> >> The CI consists of:
> >>
> >> 1. Jenkins pipeline(s) - internal, not available to developers, running
> >> daily builds of the latest available commit
> >> 2. Publicly available anonymized results:
> >> https://ldoktor.github.io/tmp/RedHat-Perf-worker1/
> >
> > This link is 404.
> >
>
> My mistake, it works well without the trailing slash:
> https://ldoktor.github.io/tmp/RedHat-Perf-worker1
>
> >> 3. (optional) a manual gitlab pulling job which is triggered by the Jenkins
> >> pipeline when that particular commit is checked
> >>
> >> The (1) is described here:
> >> https://run-perf.readthedocs.io/en/latest/jenkins.html and can be
> >> replicated on other premises and the individual jobs can be executed
> >> directly https://run-perf.readthedocs.io on any linux box using Fedora
> >> guests (via pip or container
> >> https://run-perf.readthedocs.io/en/latest/container.html ).
> >>
> >> As for the (3) I made a testing pipeline available here:
> >> https://gitlab.com/ldoktor/qemu/-/pipelines with one always-passing test
> >> and one allow-to-fail actual testing job. If you think such integration
> >> would be useful, I can add it as another job to the official qemu repo.
> >> Note the integration is a bit hacky as, due to limited resources, we cannot
> >> test all commits but rather test on a daily basis, which is not officially
> >> supported by gitlab.
> >>
> >> Note the aim of this project is to ensure some very basic system-level
> >> workflow performance stays the same or that the differences are described
> >> and ideally pinned to individual commits. It should not replace thorough
> >> release testing or low-level performance tests.
> >
> > If I understand correctly the GitLab CI integration you described
> > follows the "push" model where Jenkins (running on your own machine)
> > triggers a manual job in GitLab CI simply to indicate the status of the
> > nightly performance regression test?
> >
> > What process should QEMU follow to handle performance regressions
> > identified by your job? In other words, which stakeholders need to
> > triage, notify, debug, etc when a regression is identified?
> >
> > My guess is:
> > - Someone (you or the qemu.git committer) needs to watch the job status and
> > triage failures.
> > - That person then notifies likely authors of suspected commits so they can
> > investigate.
> > - The authors need a way to reproduce the issue - either locally or by
> > pushing commits to GitLab and waiting for test results.
> > - Fixes will be merged as additional qemu.git commits since commit history
> > cannot be rewritten.
> > - If necessary a git-revert(1) commit can be merged to temporarily undo a
> > commit that caused issues.
> >
> > Who will watch the job status and triage failures?
> >
> > Stefan
>
> This is exactly the main question I'd like to resolve as part of
> considering-this-to-be-official-part-of-the-upstream-qemu-testing. At this
> point our team is offering its service to maintain this single worker for
> daily jobs, monitoring the status and pinging people in case of bisectable
> results.

That's great! The main hurdle is finding someone to triage regressions
and if you are volunteering to do that then these regression tests would
be helpful to QEMU.

> From the upstream qemu community we are mainly looking for feedback:
>
> * whether they'd want to be notified of such issues (and via what means)

I have CCed Kevin Wolf in case he has any questions regarding how fio
regressions will be handled.

I'm happy to be contacted when a regression bisects to a commit I
authored.

> * whether the current approach seems to be actually performing useful tasks
> * whether the reports are understandable

Reports aren't something I would look at as a developer. Although the
history and current status may be useful to some maintainers, that
information isn't critical. Developers simply need to know which commit
introduced a regression and the details of how to run the regression.
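
To illustrate what "which commit introduced a regression" means in practice, here is a minimal sketch of bisecting an ordered commit range against a benchmark threshold. The commit names, throughput numbers, and the 10% threshold are invented for illustration; this is not a description of how run-perf or the Jenkins pipeline actually bisects.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumptions: commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and the regression persists once it
    appears (the usual precondition for bisection).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression already present at mid
        else:
            lo = mid  # mid still performs like the baseline
    return commits[hi]


# Hypothetical benchmark results (higher is better), keyed by commit name.
throughput = {"c0": 100.0, "c1": 101.0, "c2": 99.5, "c3": 80.0, "c4": 79.0, "c5": 81.0}
commits = list(throughput)
baseline = throughput[commits[0]]


def is_bad(commit: str) -> bool:
    # Hypothetical rule: more than a 10% drop from the baseline is a regression.
    return throughput[commit] < 0.9 * baseline


culprit = first_bad_commit(commits, is_bad)
print(culprit)
```

In a real setup is_bad() would build and benchmark the commit instead of looking up a stored number, which is why results that are noisy or not clearly bisectable need a human to triage.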
> * whether the reports should be regularly pushed into publicly available
> place (or just on regression/improvement)
> * whether there are any volunteers interested in non-clearly-bisectable
> issues (probably by-topic)

One option is to notify maintainers, but when I'm in this position
myself I usually only investigate critical issues due to limited time.

Regarding how to contact people, I suggest emailing them and CCing
qemu-devel so others are aware.
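
As a rough sketch of that notification, the snippet below composes (but does not send) an email to the commit author with the list on CC, using Python's standard email package. The sender address, author address, commit hash, and message wording are all placeholders, and the function name is hypothetical.

```python
from email.message import EmailMessage


def build_regression_notice(
    author_addr: str,
    commit: str,
    summary: str,
    sender: str = "perf-ci@example.org",      # placeholder sender
    list_addr: str = "qemu-devel@nongnu.org",  # CC so others are aware
) -> EmailMessage:
    """Compose a regression notice for the author of the suspected commit."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = author_addr
    msg["Cc"] = list_addr
    msg["Subject"] = f"Performance regression bisected to {commit[:12]}"
    msg.set_content(
        f"The daily performance job bisected a regression to commit {commit}.\n\n"
        f"{summary}\n"
    )
    return msg


# Example with made-up values; nothing is sent here.
msg = build_regression_notice(
    author_addr="author@example.org",
    commit="0123456789abcdef0123456789abcdef01234567",
    summary="Hypothetical summary: fio read throughput dropped by about 12%.",
)
print(msg)
```

The message body is where the details of how to reproduce the regression would go, since that is the part developers need most.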
Thanks,
Stefan