qemu-devel


From: Daniel P . Berrangé
Subject: Re: starting to look at qemu savevm performance, a first regression detected
Date: Mon, 7 Mar 2022 12:20:17 +0000
User-agent: Mutt/2.1.5 (2021-12-30)

On Mon, Mar 07, 2022 at 01:09:55PM +0100, Claudio Fontana wrote:
> On 3/7/22 1:00 PM, Daniel P. Berrangé wrote:
> > On Mon, Mar 07, 2022 at 12:19:22PM +0100, Claudio Fontana wrote:
> >> On 3/7/22 10:51 AM, Daniel P. Berrangé wrote:
> >>> On Mon, Mar 07, 2022 at 10:44:56AM +0100, Claudio Fontana wrote:
> >>>> Hello Daniel,
> >>>>
> >>>> On 3/7/22 10:27 AM, Daniel P. Berrangé wrote:
> >>>>> On Sat, Mar 05, 2022 at 02:19:39PM +0100, Claudio Fontana wrote:
> >>>>>>
> >>>>>> Hello all,
> >>>>>>
> >>>>>> I have been looking at some reports of bad qemu savevm performance in 
> >>>>>> large VMs (around 20+ Gb),
> >>>>>> when used in libvirt commands like:
> >>>>>>
> >>>>>>
> >>>>>> virsh save domain /dev/null
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> I have written a simple test to run in a Linux centos7-minimal-2009 
> >>>>>> guest, which allocates and touches 20G mem.
> >>>>>>
> >>>>>> With any qemu version since around 2020, I am not seeing more than 580 
> >>>>>> Mb/Sec even in the most ideal of situations.
> >>>>>>
> >>>>>> This drops to around 122 Mb/sec after commit: 
> >>>>>> cbde7be900d2a2279cbc4becb91d1ddd6a014def .
> >>>>>>
> >>>>>> Here is the bisection for this particular drop in throughput:
> >>>>>>
> >>>>>> commit cbde7be900d2a2279cbc4becb91d1ddd6a014def (HEAD, refs/bisect/bad)
> >>>>>> Author: Daniel P. Berrangé <berrange@redhat.com>
> >>>>>> Date:   Fri Feb 19 18:40:12 2021 +0000
> >>>>>>
> >>>>>>     migrate: remove QMP/HMP commands for speed, downtime and cache size
> >>>>>>     
> >>>>>>     The generic 'migrate_set_parameters' command handle all types of param.
> >>>>>>     
> >>>>>>     Only the QMP commands were documented in the deprecations page, but the
> >>>>>>     rationale for deprecating applies equally to HMP, and the replacements
> >>>>>>     exist. Furthermore the HMP commands are just shims to the QMP commands,
> >>>>>>     so removing the latter breaks the former unless they get re-implemented.
> >>>>>>     
> >>>>>>     Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >>>>>>     Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> >>>>>
> >>>>> That doesn't make a whole lot of sense as a bisect result.
> >>>>> How reliable is that bisect end point ? Have you bisected
> >>>>> to that point more than once ?
> >>>>
> >>>> I did run through the bisect itself only once, so I'll double check that.
> >>>> The results seem to be reproducible almost to the second though, a 
> >>>> savevm that took 35 seconds before the commit takes 2m 48 seconds after.
> >>>>
> >>>> For this test I am using libvirt v6.0.0.
> > 
> > I've just noticed this.  That version of libvirt is 2 years old and
> > doesn't have full support for migrate_set_parameters.
> > 
> > 
> >> 2022-03-07 10:47:20.145+0000: 134386: info : qemuMonitorIOWrite:452 : QEMU_MONITOR_IO_WRITE: mon=0x7fa4380028a0 buf={"execute":"migrate_set_speed","arguments":{"value":9223372036853727232},"id":"libvirt-19"}^M
> >>  len=93 ret=93 errno=0
> >> 2022-03-07 10:47:20.146+0000: 134386: info : qemuMonitorJSONIOProcessLine:240 : QEMU_MONITOR_RECV_REPLY: mon=0x7fa4380028a0 reply={"id": "libvirt-19", "error": {"class": "CommandNotFound", "desc": "The command migrate_set_speed has not been found"}}
> >> 2022-03-07 10:47:20.147+0000: 134391: error : qemuMonitorJSONCheckError:412 : internal error: unable to execute QEMU command 'migrate_set_speed': The command migrate_set_speed has not been found
> > 
> > We see the migrate_set_speed failing and libvirt obviously ignores that
> > failure.
> > 
> > In current libvirt migrate_set_speed is not used as it properly
> > handles migrate_set_parameters AFAICT.
> > 
> > I think you just need to upgrade libvirt if you want to use this
> > newer QEMU version
> > 
> > Regards,
> > Daniel
> > 
> 
> Got it, this explains it, sorry for the noise on this.
> 
> I'll continue to investigate the general issue of low throughput with virsh 
> save / qemu savevm .
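
For reference, the QMP request that replaces the failing migrate_set_speed
call in the log quoted above would look roughly like the output of the
sketch below. This is only an illustration, not what libvirt literally
sends: the helper name and the "id" value are made up, and reusing the
same bandwidth value from the log is an assumption. The QMP command
'migrate-set-parameters' and its 'max-bandwidth' argument (bytes per
second) are the real names.

```python
import json


def migrate_set_parameters(max_bandwidth, cmd_id="example-1"):
    """Build a QMP 'migrate-set-parameters' request as a JSON string.

    max_bandwidth is in bytes per second. cmd_id is an arbitrary,
    hypothetical request id used only for this illustration.
    """
    msg = {
        "execute": "migrate-set-parameters",
        "arguments": {"max-bandwidth": max_bandwidth},
        "id": cmd_id,
    }
    return json.dumps(msg)


if __name__ == "__main__":
    # Same value libvirt v6.0.0 tried to pass to migrate_set_speed.
    print(migrate_set_parameters(9223372036853727232))
```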

BTW, consider measuring with the --bypass-cache flag to virsh save.
This causes libvirt to use an I/O helper that opens the image with
O_DIRECT when saving. This should give more predictable results by
avoiding the influence of the host I/O cache, which can be in a
different state of usage each time you measure. It was also intended
that, by bypassing the cache, saving the memory image of a large VM
will not push other useful data out of the host I/O cache, which can
negatively impact other running VMs.
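
A minimal sketch of such a measurement, assuming a domain called
"domain", a scratch path of /var/tmp/domain.save and a guest with 20G
of touched memory (all three are placeholders, and the two helper
functions are made up for this example). Note that virsh save stops
the guest once the image is written.

```python
import shlex
import subprocess
import time


def build_save_cmd(domain, path, bypass_cache=True):
    """Return the argv list for `virsh save`, optionally with --bypass-cache."""
    cmd = ["virsh", "save"]
    if bypass_cache:
        cmd.append("--bypass-cache")
    cmd += [domain, path]
    return cmd


def throughput_mib_s(guest_ram_gib, seconds):
    """Approximate throughput in MiB/s, assuming all guest RAM is written."""
    return guest_ram_gib * 1024 / seconds


if __name__ == "__main__":
    # Hypothetical domain name and image path; adjust to your setup.
    cmd = build_save_cmd("domain", "/var/tmp/domain.save")
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    rate = throughput_mib_s(20, elapsed)
    print(f"{shlex.join(cmd)}: {elapsed:.1f}s, ~{rate:.0f} MiB/s")
```

Applied to the timings quoted earlier in the thread (20G saved in 35
seconds versus 2m 48s), throughput_mib_s() gives roughly 585 and
122 MiB/s, which is in line with the figures reported.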

Also it is possible to configure compression on the libvirt side,
which may be useful if you have spare CPU cycles but your storage
is slow. See 'save_image_format' in /etc/libvirt/qemu.conf.
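
As a rough example, assuming the lzop tool is installed on the host
(the choice of lzop here is arbitrary; the comments in qemu.conf list
the formats your libvirt build accepts):

```
# /etc/libvirt/qemu.conf
# Compress images written by "virsh save"; the default is "raw".
save_image_format = "lzop"
```

The libvirt daemon needs to be restarted for the change to take effect.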

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



