From: Daniel P. Berrangé
Subject: Re: [Qemu-devel] [PATCH v2] numa: warn if numa 'mem' option or default RAM splitting between nodes is used.
Date: Wed, 20 Mar 2019 11:51:51 +0000
User-agent: Mutt/1.11.3 (2019-02-01)

On Wed, Mar 20, 2019 at 11:26:34AM +0100, Igor Mammedov wrote:
> On Tue, 19 Mar 2019 14:51:07 +0000
> Daniel P. Berrangé <address@hidden> wrote:
> 
> > On Tue, Mar 19, 2019 at 02:08:01PM +0100, Igor Mammedov wrote:
> > > On Thu, 7 Mar 2019 10:07:05 +0000
> > > Daniel P. Berrangé <address@hidden> wrote:
> > >   
> > > > On Wed, Mar 06, 2019 at 07:54:17PM +0100, Igor Mammedov wrote:  
> > > > > On Wed, 6 Mar 2019 18:16:08 +0000
> > > > > Daniel P. Berrangé <address@hidden> wrote:
> > > > >     
> > > > > > On Wed, Mar 06, 2019 at 06:33:25PM +0100, Igor Mammedov wrote:    
> > > > > > > Amend the -numa option docs and print warnings if the 'mem' option
> > > > > > > or default RAM splitting between nodes is used. It's intended to
> > > > > > > discourage users from using a configuration that only allows faking
> > > > > > > NUMA on the guest side while leading to reduced guest performance
> > > > > > > due to the inability to properly configure the VM's RAM on the host.
> > > > > > > 
> > > > > > > In the NUMA case, it's recommended to always explicitly configure
> > > > > > > guest RAM using the -numa node,memdev={backend-id} option.
> > > > > > > 
> > > > > > > Signed-off-by: Igor Mammedov <address@hidden>
> > > > > > > ---
> > > > > > >  numa.c          |  5 +++++
> > > > > > >  qemu-options.hx | 12 ++++++++----
> > > > > > >  2 files changed, 13 insertions(+), 4 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/numa.c b/numa.c
> > > > > > > index 3875e1e..42838f9 100644
> > > > > > > --- a/numa.c
> > > > > > > +++ b/numa.c
> > > > > > > @@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > > > > >  
> > > > > > >      if (node->has_mem) {
> > > > > > >          numa_info[nodenr].node_mem = node->mem;
> > > > > > > +        warn_report("Parameter -numa node,mem is obsolete,"
> > > > > > > +                    " use -numa node,memdev instead");    
> > > > > > 
> > > > > > My comments from v1 still apply. We must not do this as long as
> > > > > > libvirt has no choice but to continue using this feature.    
> > > > > It has a choice to use 'memdev' whenever creating a new VM and
> > > > > continue using 'mem' with existing VMs.
> > > > 
> > > > Unfortunately we don't have such a choice. Libvirt has no concept of the
> > > > distinction between an 'existing' and 'new' VM. It just receives an XML
> > > > file from the mgmt application and with transient guests, we have no
> > > > persistent configuration record of the VM. So we've no way of knowing
> > > > whether this VM was previously running on this same host, or another
> > > > host, or is completely new.  
> > > In case of a transient VM, libvirt might be able to use the machine
> > > version to decide which option to use (memdev has been around for more
> > > than 4 years, since 2.1), or QEMU could provide introspection into what
> > > a machine version does (not) support, as was discussed before.
> > > 
> > > As discussed elsewhere (v1 thread|IRC), there are users (mainly CI) for
> > > which fake NUMA is sufficient and who do not ask for explicit pinning,
> > > so libvirt defaults to the legacy -numa node,mem option.
> > > Those users do not care nor are they aware that they should use memdev
> > > instead (I'm not sure if they are able to ask libvirt for non-pinned
> > > NUMA memory which results in memdev being used).
> > > This patch doesn't obsolete anything yet; it serves to inform users that
> > > they are using the legacy option and advises the replacement option, so
> > > that users would know what they should adapt to.
> > > 
> > > Once we deprecate and then remove 'mem' for new machines only (while
> > > keeping 'mem' working on old machine versions), neither new nor old
> > > libvirt will be able to start a new machine type with the 'mem' option
> > > and will have to use the memdev variant, so we don't have migration
> > > issues with new machines, and old ones continue working with 'mem'.
> > 
> > I'm not seeing what has changed which would enable us to deprecate
> > something only for new machines. That's not possible from libvirt's
> > POV as old libvirt will support new machines & thus we have to
> > continue using "mem" for all machines in the scenarios where we
> > currently use it. 
> There are several issues here:
>  1. how old libvirt you are talking about?

Any release prior to the one that changes the use of "mem".

IOW, if we changed "mem" in libvirt 5.2.0, then it would break compat
with libvirt 5.1.0 from the previous month's release (and of course
all versions before 5.1.0 by implication).

>  2. old libvirt + new QEMU won't be able to start QEMU with
>     a new machine with the 'mem' option, so we don't have live migration;
>     it's rather a management issue where mgmt should not try to migrate
>     to such a host (if it managed to end up with an incompatible package
>     bundle it is neither a QEMU nor a libvirt problem per se).

I don't think this is a mgmt issue. When a new QEMU release comes out
it is valid to use it with an existing release of libvirt. You might
need new libvirt if you want to use new features from QEMU that didn't
exist previously, but existing QEMU features should generally work.

With QEMU's deprecation policy, you're not going to be able to use
arbitrarily old libvirt as at some point you will hit a version of
libvirt that uses the old deprecated approach, instead of the new
preferred approach. Whether this is a problem or not depends on
the features you are using too. E.g. if we deprecate a CLI arg with a new
preferred replacement, & you were never using that CLI arg in the
first place, the incompatibility doesn't affect you.

QEMU deprecation period is two releases, plus however long in the
dev cycle it was deprecated before release. In the best case, libvirt
from 12 months ago will have stopped using the deprecated feature.
In the worst case, where it is very hard to change libvirt, we might
still be using the deprecated feature right up until the end of the
deprecation period. That should be the exception & we try not to get
into that case as it is painful for users to deploy a new QEMU and
find it breaks with their installed libvirt.

>  3. in general, dropping features per machine or for all machines at once
>     is the same, since there would be an old libvirt that uses the removed
>     CLI option and it won't be able to start a new QEMU with that option;
>     even worse, it would affect all machines. So we should agree on a new
>     reasonable deprecation period (if the current one isn't sufficient)
>     that would allow users to adapt to a breaking change.

If a feature is completely dropped by QEMU with no replacement, there's
nothing libvirt can do to preserve existing VMs that use that feature.
Obviously this is painful for users, so QEMU doesn't do that without
compelling reason, such as the feature being unfixably broken.

This isn't the case with "mem" though - it is an existing feature
whose impl is being changed for a different impl. We're just telling
apps to change the way they implement the feature from "mem" to "memdev",
which breaks live migration compat across whichever version of the
app makes the change.
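
For reference, the two invocation styles in question look roughly like the
following (node IDs and sizes here are arbitrary illustrations, not values
taken from the patch):

  # legacy: per-node size only, no control over host-side backing
  -numa node,nodeid=0,cpus=0-1,mem=2G

  # recommended: an explicit memory backend object per node
  -object memory-backend-ram,id=ram0,size=2G \
  -numa node,nodeid=0,cpus=0-1,memdev=ram0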

>  4. in case of downstream, it ships a compatible bundle, and if a user
>     installs a QEMU from a newer release without the other new bits it would
>     fall under the unsupported category, and the first thing support would
>     say is to update the other parts along with QEMU. What I'm saying is that
>     it's the downstream distro's job to organize the upgrade path/track
>     dependencies and backport/invent a compat layer for earlier releases if
>     necessary.

In terms of preserving back compat, the distro's hands are tied by what
the upstream QEMU does to some extent. If upstream rips out the infra
needed to provide the back compat in the distro, they'll have to revert
all those upstream changes, which can be non-trivial. Considering the
distro maintainers are often upstream maintainers too, that's not a net
win. The maintainer has saved themselves some work upstream, but created
themselves a bigger amount of work downstream.

>     So it's rather questionable if we should care about arbitrarily old
>     libvirt with new QEMU in case of new machines (especially upstream).

As noted above, with the feature deprecation policy new QEMU is not
likely to be compatible with arbitrarily old libvirt, but can usually
be expected to be compatible with up to 12-month-old libvirt in the
best case, unless libvirt is really slow at adapting to deprecation
warnings.

So the challenge with tying it to the new QEMU machine type is that
the machine type is potentially used by a libvirt up to perhaps 12
months old.

Somehow the older libvirt has to know to use the new QEMU feature
"memdev" that wasn't required for any of the machine types it knew
about when it was first released.


This could be solved if QEMU had some machine-type-based property
that indicates whether "memdev" is required for a given machine,
but crucially *does not* actually activate that property until
several releases later.

We're too late for 4.0, so let's consider QEMU 4.1 as the
next release of QEMU, which opens for dev in April 2019.

QEMU 4.1 could introduce a machine type property "requires-memdev"
which defaults to "false" for all existing machine types. It
could add a deprecation warning that says a *future* machine type will
report "requires-memdev=true". IOW, "pc-i440fx-4.1" and
"pc-i440fx-4.2" must still report "requires-memdev=false".
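
A rough sketch of how such a flag might be wired up inside QEMU (purely
illustrative; the "requires_memdev" field and the machine-options function
names below are assumptions made for this example, not existing code):

  /* hypothetical MachineClass flag, sketch only */
  static void pc_i440fx_4_1_machine_options(MachineClass *m)
  {
      /* existing and near-future machine types keep accepting legacy "mem" */
      m->requires_memdev = false;
  }

  static void pc_i440fx_5_0_machine_options(MachineClass *m)
  {
      /* only a machine type several releases later flips the switch */
      m->requires_memdev = true;
  }

  /* and in parse_numa_node(), reject "mem" only where memdev is required */
  MachineClass *mc = MACHINE_GET_CLASS(ms);
  if (node->has_mem && mc->requires_memdev) {
      error_setg(errp, "-numa node,mem is not supported by this machine"
                 " type, use -numa node,memdev instead");
      return;
  }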

Libvirt 5.4.0 (May 2019) can now add support for the "requires-memdev"
property. This would be effectively a no-op at the time of this libvirt
release, since no QEMU would be reporting "requires-memdev=true"
for many months to come yet.
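
For illustration, libvirt (from 5.4.0 onwards in this scenario) could probe
the flag through the existing query-machines QMP command. Hypothetically (the
"requires-memdev" field shown here is the proposed one, not something QEMU
reports today), the exchange might look like:

  -> { "execute": "query-machines" }
  <- { "return": [
         { "name": "pc-i440fx-4.1", "requires-memdev": false, ... },
         { "name": "pc-i440fx-5.0", "requires-memdev": true, ... } ] }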

Now, after 2 QEMU releases with the deprecation warning, when
the QEMU 5.0.0 dev cycle opens in Jan 2020, the new "pc-i440fx-5.0"
machine type can be made to report "requires-memdev=true".

IOW, in April 2020 when QEMU 5.0.0 comes out, "mem" would no
longer be supported for new machine types. Libvirt at this
time would be up to 6.4.0, but that's coincidental since it
would already have been doing the right thing since 5.4.0.

IOW, this QEMU 5.0.0 would work correctly with libvirt versions
in the range 5.4.0 to 6.4.0 (and future).

If a user had libvirt < 5.4.0 (i.e. older than May 2019), nothing
would stop them using the "pc-i440fx-5.0" machine type, but
libvirt would be liable to use "mem" instead of "memdev", and
if that happened they would be unable to live migrate to a
host with newer libvirt which honours "requires-memdev=true".


So in summary, the key to being able to tie deprecations to machine
type versions is for QEMU to add a mechanism to report the desired
new feature usage approach against the machine type, but then ensure
the mechanism continues to report the old approach for 2 more releases.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


