From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH v2] numa: warn if numa 'mem' option or default RAM splitting between nodes is used.
Date: Wed, 20 Mar 2019 16:20:19 +0100

On Wed, 20 Mar 2019 11:51:51 +0000
Daniel P. Berrangé <address@hidden> wrote:

> On Wed, Mar 20, 2019 at 11:26:34AM +0100, Igor Mammedov wrote:
> > On Tue, 19 Mar 2019 14:51:07 +0000
> > Daniel P. Berrangé <address@hidden> wrote:
> >   
> > > On Tue, Mar 19, 2019 at 02:08:01PM +0100, Igor Mammedov wrote:  
> > > > On Thu, 7 Mar 2019 10:07:05 +0000
> > > > Daniel P. Berrangé <address@hidden> wrote:
> > > >     
> > > > > On Wed, Mar 06, 2019 at 07:54:17PM +0100, Igor Mammedov wrote:    
> > > > > > On Wed, 6 Mar 2019 18:16:08 +0000
> > > > > > Daniel P. Berrangé <address@hidden> wrote:
> > > > > >       
> > > > > > > On Wed, Mar 06, 2019 at 06:33:25PM +0100, Igor Mammedov wrote:    
> > > > > > >   
> > > > > > > > Amend the -numa option docs and print warnings if the 'mem'
> > > > > > > > option or default RAM splitting between nodes is used. This is
> > > > > > > > intended to discourage users from using a configuration that
> > > > > > > > only fakes NUMA on the guest side while reducing guest
> > > > > > > > performance, due to the inability to properly configure the
> > > > > > > > VM's RAM on the host.
> > > > > > > > 
> > > > > > > > In the NUMA case, it's recommended to always configure guest
> > > > > > > > RAM explicitly, using the -numa node,memdev={backend-id} option.
> > > > > > > > 
> > > > > > > > Signed-off-by: Igor Mammedov <address@hidden>
> > > > > > > > ---
> > > > > > > >  numa.c          |  5 +++++
> > > > > > > >  qemu-options.hx | 12 ++++++++----
> > > > > > > >  2 files changed, 13 insertions(+), 4 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/numa.c b/numa.c
> > > > > > > > index 3875e1e..42838f9 100644
> > > > > > > > --- a/numa.c
> > > > > > > > +++ b/numa.c
> > > > > > > > @@ -121,6 +121,8 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
> > > > > > > >  
> > > > > > > >      if (node->has_mem) {
> > > > > > > >          numa_info[nodenr].node_mem = node->mem;
> > > > > > > > +        warn_report("Parameter -numa node,mem is obsolete,"
> > > > > > > > +                    " use -numa node,memdev instead");      
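[For context, the replacement form the warning points at pairs each NUMA node with an explicit host memory backend. A sketch; sizes, object IDs and the binary name are illustrative, and this is configuration rather than something to run verbatim:]

```shell
# Legacy form the patch warns about: guest-visible NUMA only, no host backing
qemu-system-x86_64 -m 4G \
    -numa node,nodeid=0,mem=2G -numa node,nodeid=1,mem=2G

# Recommended form: one explicit memory backend per node, which can later be
# bound to host NUMA nodes (e.g. via memory-backend-file or host-nodes=)
qemu-system-x86_64 -m 4G \
    -object memory-backend-ram,id=ram0,size=2G \
    -object memory-backend-ram,id=ram1,size=2G \
    -numa node,nodeid=0,memdev=ram0 \
    -numa node,nodeid=1,memdev=ram1
```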
> > > > > > > 
> > > > > > > My comments from v1 still apply. We must not do this as long as
> > > > > > > libvirt has no choice but to continue using this feature.      
> > > > > > It has a choice: use 'memdev' whenever creating a new VM, and
> > > > > > continue using 'mem' with existing VMs.
> > > > > 
> > > > > Unfortunately we don't have such a choice. Libvirt has no concept of
> > > > > the distinction between an 'existing' and a 'new' VM. It just
> > > > > receives an XML file from the mgmt application, and with transient
> > > > > guests we have no persistent configuration record of the VM. So we've
> > > > > no way of knowing whether this VM was previously running on this same
> > > > > host, on another host, or is completely new.
> > > > In the case of a transient VM, libvirt might be able to use the
> > > > machine version to decide which option to use (memdev has been around
> > > > for more than 4 years, since 2.1), or QEMU could provide introspection
> > > > into what a machine version does (not) support, as was discussed
> > > > before.
> > > > 
> > > > As discussed elsewhere (v1 thread|IRC), there are users (mainly CI)
> > > > for which fake NUMA is sufficient and who do not ask for explicit
> > > > pinning, so libvirt defaults to the legacy -numa node,mem option.
> > > > Those users neither care nor are aware that they should use memdev
> > > > instead (I'm not sure they are even able to ask libvirt for non-pinned
> > > > NUMA memory that results in memdev being used).
> > > > This patch doesn't obsolete anything yet; it serves to inform users
> > > > that they are using a legacy option and advises a replacement, so
> > > > that users know what they should adapt to.
> > > > 
> > > > Once we deprecate and then remove 'mem' for new machine types only
> > > > (while keeping 'mem' working on old machine versions), neither new
> > > > nor old libvirt will be able to start a new machine type with the
> > > > 'mem' option and will have to use the memdev variant. So we won't
> > > > have migration issues with new machines, and old ones continue
> > > > working with 'mem'.    
> > > 
> > > I'm not seeing what has changed which would enable us to deprecate
> > > something only for new machines. That's not possible from libvirt's
> > > POV as old libvirt will support new machines & thus we have to
> > > continue using "mem" for all machines in the scenarios where we
> > > currently use it.   
> > There are several issues here:
> >  1. How old a libvirt are you talking about?  
> 
> Any release prior to the one that changes the use of "mem".
> 
> IOW, if we changed "mem" in libvirt 5.2.0, then it would break compat
> with libvirt 5.1.0 from the previous month's release (and of course
> all versions before 5.1.0 by implication).
> 
> >  2. Old libvirt + new QEMU won't be able to start QEMU with a new
> >     machine type and the 'mem' option, so we lose live migration.
> >     It's rather a management issue, where mgmt should not try to
> >     migrate to such a host (if it managed to end up with an
> >     incompatible package bundle, that is neither QEMU's nor libvirt's
> >     problem per se).  
> 
> I don't think this is a mgmt issue. When a new QEMU release comes out
> it is valid to use it with an existing release of libvirt. You might
> need new libvirt if you want to use new features from QEMU that didn't
> exist previously, but existing QEMU features should generally work.
> 
> With QEMU's deprecation policy, you're not going to be able to use
> arbitrarily old libvirt as at some point you will hit a version of
> libvirt that uses the old deprecated approach, instead of the new
> preferred approach. Whether this is a problem or not depends on
> the features you are using too, e.g. if we deprecate a CLI arg with a
> new preferred replacement, and you were never using that CLI arg in
> the first place, the incompatibility doesn't affect you.
> 
> QEMU deprecation period is two releases, plus however long in the
> dev cycle it was deprecated before release. In the best case, libvirt
> from 12 months ago will have stopped using the deprecated feature.
> In the worst case, where it is very hard to change libvirt, we might
> still be using the deprecated feature right up until the end of the
> deprecation period. That should be the exception & we try not to get
> into that case as it is painful for users to deploy a new QEMU and
> find it breaks with their installed libvirt.
> 
> >  3. In general, dropping features per machine type or for all machines
> >     at once is the same, since there will be old libvirt that uses the
> >     removed CLI option and won't be able to start new QEMU with that
> >     option; even worse, it would affect all machines. So we should
> >     agree on a new reasonable deprecation period (if the current one
> >     isn't sufficient) that allows users to adapt to a breaking change.  
> 
> If a feature is completely dropped by QEMU with no replacement, there's
> nothing libvirt can do to preserve existing VMs that use that feature.
> Obviously this is painful for users, so QEMU doesn't do that without
> compelling reason, such as the feature being unfixably broken.
> 
> This isn't the case with "mem" though - it is an existing feature whose
> implementation is being changed for a different one. We're just telling
> apps to change the way they implement the feature from "mem" to
> "memdev", which breaks live migration compat across whichever version
> of the app makes the change.
> 
> >  4. In the case of downstream, it ships a compatible bundle, and if a
> >     user installs a QEMU from a newer release without the other new
> >     bits, it falls under the unsupported category, and the first thing
> >     support would say is to update the other parts along with QEMU.
> >     What I'm saying is that it's the downstream distro's job to
> >     organize the upgrade path, track dependencies, and backport/invent
> >     a compat layer for earlier releases if necessary.  
> 
> In terms of preserving back compat, the distro's hands are tied by what
> the upstream QEMU does to some extent. If upstream rips out the infra
> needed to provide the back compat in the distro, they'll have to revert
> all those upstream changes which can be non-trivial. Considering the
> distro maintainers are often upstream maintainers too, that's not a net
> win. The maintainer has saved themselves some work upstream, but created
> themselves a bigger amount of work downstream.
I don't agree with some of the points above, but I will put this discussion
off for later and jump straight down to the more technical part below.

> >     So it's rather questionable whether we should care about
> >     arbitrarily old libvirt with new QEMU in the case of new machines
> >     (especially upstream).  
> 
> As noted above, with the deprecation feature policy new QEMU is not
> likely to be compatible with arbitrarily old libvirt, but can usually
> be expected to be compatible with up to 12-month-old libvirt in the
> best case, unless libvirt is really slow at adapting to deprecation
> warnings.
> 
> So the challenge with tying it to the new QEMU machine type is that
> the machine type is potentially used by a libvirt up to perhaps 12
> months old.
Seems a bit much, but if there is consensus I'd go with it; at least it
allows us to move forward in a year (when 'mem' is banned on new
machines).

> Somehow the older libvirt has to know to use the new QEMU feature
> "memdev" that wasn't required for any of the machine types it knew
> about when it was first released.
> 
> 
> This could be solved if QEMU has some machine type based property
> that indicates whether "memdev" is required for a given machine,
> but crucially *does not* actually activate that property until
> several releases later.
> 
> We're too late for 4.0, so lets consider QEMU 4.1 as the
> next release of QEMU, which opens for dev in April 2019.
> 
> QEMU 4.1 could introduce a machine type property "requires-memdev"
> which defaults to "false" for all existing machine types. It could add
> a deprecation warning that says a *future* machine type will report
> "requires-memdev=true". IOW, "pc-i440fx-4.1" and "pc-i440fx-4.2" must
> still report "requires-memdev=false".
> 
> Libvirt 5.4.0 (May 2019) can now add support for "requires-memdev"
> property. This would be effectively a no-op at time of this libvirt
> release, since no QEMU would be reporting "requires-memdev=true" 
> for many months to come yet.
> 
> Now, after 2 QEMU releases with the deprecation warning, when the
> QEMU 5.0.0 dev cycle opens in Jan 2020, the new "pc-i440fx-5.0"
> machine type can be made to report "requires-memdev=true".
> 
> IOW, in April 2020 when QEMU 5.0.0 comes out, "mem" would
> no longer be supported for new machine types. Libvirt at this
  ^^^^^^^^^^^^^^^^^^^^^^^

> time would be up to 6.4.0, but that's coincidental since it would
> already be doing the right thing since 5.4.0.
> 
> IOW, this QEMU 5.0.0 would work correctly with libvirt versions
> in the range 5.4.0 to 6.4.0 (and future).

> If a user had libvirt < 5.4.0 (ie older than May 2019) nothing
> would stop them using the "pc-i440fx-5.0" machine type, but
> libvirt would be liable to use "mem" instead of "memdev" and

> if that happened they would be unable to live migrate to a host with
> newer libvirt, which honours "requires-memdev=true"
I failed to parse this section in connection with the '^'-underlined
part. I'm reading 'no longer be supported' as: it's not possible to
start QEMU -M machine_foo,requires-memdev=true with the 'mem' option.
Is that what you meant?
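For what it's worth, the app-side behaviour such a flag would enable can be sketched like this (the property name "requires-memdev" and the data shapes are assumptions taken from the proposal above, not from any released QEMU):

```python
def numa_args(machine_props, node_sizes_mb):
    """Build the -numa part of a QEMU command line for the given
    per-node RAM sizes, choosing the syntax based on the machine-type
    flag proposed above (hypothetical name: "requires-memdev")."""
    args = []
    if machine_props.get("requires-memdev", False):
        # New machine types: one explicit memory backend per NUMA node
        for node, size in enumerate(node_sizes_mb):
            args += ["-object",
                     f"memory-backend-ram,id=ram{node},size={size}M",
                     "-numa", f"node,nodeid={node},memdev=ram{node}"]
    else:
        # Old machine types: keep legacy 'mem' for live-migration compat
        for node, size in enumerate(node_sizes_mb):
            args += ["-numa", f"node,nodeid={node},mem={size}M"]
    return args
```

An old libvirt that predates the flag would simply never see "requires-memdev=true" for the machine types it knows, and so keeps emitting the legacy form, which is exactly the compatibility window being discussed.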

> So in summary the key to being able to tie deprecations to machine 
> type versions, is for QEMU to add a mechanism to report the desired 
> new feature usage approach against the machine type, but then ensure
> the mechanism continues to report the old approach for 2 more releases.

So that makes QEMU's deprecation period effectively 3 releases
(assuming a 4-month cadence).


> 
> Regards,
> Daniel



