Re: [Qemu-devel] invtsc + migration + TSC scaling
From: Eduardo Habkost
Subject: Re: [Qemu-devel] invtsc + migration + TSC scaling
Date: Wed, 19 Oct 2016 15:42:20 -0200
User-agent: Mutt/1.7.0 (2016-08-17)
On Wed, Oct 19, 2016 at 05:42:16PM +0200, Radim Krčmář wrote:
> 2016-10-19 11:55-0200, Eduardo Habkost:
> > On Wed, Oct 19, 2016 at 03:27:52PM +0200, Radim Krčmář wrote:
> >> 2016-10-18 19:05-0200, Eduardo Habkost:
> >> > On Tue, Oct 18, 2016 at 10:52:14PM +0200, Radim Krčmář wrote:
> >> > [...]
> >> >> The main problem is that QEMU changes virtual_tsc_khz when migrating
> >> >> without hardware scaling, so KVM is forced to get nanoseconds wrong ...
> >> >>
> >> >> If QEMU doesn't want to keep the TSC frequency constant, then it would
> >> >> be better if it didn't expose TSC in CPUID -- guest would just use
> >> >> kvmclock without being tempted by direct TSC accesses.
> >> >
> >> > Isn't it enough to simply not expose invtsc? Aren't guests expected
> >> > to assume the TSC frequency can change if invtsc isn't set on
> >> > CPUID?
> >>
> >> There are exceptions. An OS can assume constant TSC on some models that
> >> QEMU emulates: coreduo, core2duo, Conroe, Penryn, n270, kvm32 and kvm64.
> >> The list from SDM (17.15 TIME-STAMP COUNTER):
> >>
> >> Pentium 4 processors, Intel Xeon processors (family [0FH], models [03H
> >> and higher]); Intel Core Solo and Intel Core Duo processors (family
> >> [06H], model [0EH]); the Intel Xeon processor 5100 series and Intel
> >> Core 2 Duo processors (family [06H], model [0FH]); Intel Core 2 and
> >> Intel Xeon processors (family [06H], DisplayModel [17H]); Intel Atom
> >> processors (family [06H], DisplayModel [1CH])
> >>
> >> Another sad part is that Linux uses the following condition to assume
> >> constant TSC frequency:
> >>
> >>   if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
> >>       (c->x86 == 0x6 && c->x86_model >= 0x0e))
> >>           set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
> >>
> >> which sets constant TSC for all modern processors. It's not a
> >> problem on real hardware, because all modern processors likely have
> >> invariant TSC.
> >>
> >> Fun fact: Linux shows constant_tsc flag in /proc/cpuinfo even if the
> >> modern CPU doesn't expose TSC in CPUID.
> >>
> >> Considering that Linux is fixed on Nehalem and newer processors, we have
> >> a few options for the rest:
> >> 1) treat TSC like invariant TSC on those models (the guest cannot use
> >> ACPI state, so its OS might assume that they are equivalent)
> >> 2) hide TSC on those models
> >> 3) ignore the problem
> >> 4) remove those models
> >>
> >> I don't know enough about QEMU design goals to guess which one is the
> >> most appropriate. (4) is the clear winner for me, followed by (3). :)
> >
> > (4) can't be implemented because it breaks existing
> > configurations. (3) is the current solution.
>
> Existing machine types must remain compatible, but isn't it possible to
> cull options in new machine types?
We specifically promised libvirt developers that a CPU model
that can be started with a machine-type should still be runnable
with other versions of the same machine-type family. In other
words, a running config should keep working if only the
machine-type version changed.
>
> > Option (2) sounds attractive to me, but seems risky.
>
> Definitely.
> If users have a setup that works, then any change can break it.
>
> It would have been the best option a few years back when we wrote the code, but
> now the change will happen *in* the guest, so we can't control it as in
> the case of (4), where broken guests won't start, or (1), where broken
> guests won't migrate.
>
> > I would like
> > to understand the consequences for guests. What could stop
> > working if we remove TSC? What about kvmclock?
>
> Hiding TSC in CPUID doesn't disable the RDTSC instruction in the guest.
>
> kvmclock is a paravirtual device on top of TSC, so if kvmclock is
> present, then it should be safe to assume that the guest can use TSC for
> operations with kvmclock.
> Linux does that, but I don't think this behavior was ever written down,
> so other kvmclock users could break.
>
> Maybe Hyper-V TSC page would stop working, because Windows and other
> users could have a check for CPUID.1:EDX.TSC separately.
> Linux's implementation would work, because it just checks for the
> paravirtual feature, like in case of kvmclock.
>
> And minor cases are: an OS that has no other option than TSC for clock;
> userspace that checks TSC before using it; an OS that stops setting
> CR4.TSD and its userspace starts to use TSC; and probably many others.
OK, that sounds very risky. This means it is probably better to
let management software explicitly choose the new stricter
behavior.
...and we already have a mechanism to request stricter behavior:
explicitly disabling TSC, or setting tsc-frequency explicitly on
the command-line.
>
> > If we implement (2), we could even add an extra check that blocks
> > migration (or at least prints a warning) in case:
> > 1) TSC is forcibly enabled in the configuration;
> > 2) TSC scaling is not available on destination; and
> > 3) the family/model values match the ones on the list above.
> >
> > And we could even keep TSC enabled by default for users who don't
> > want migration (using migratable=false).
>
> That would be nice.
We already print a warning if there's a TSC frequency mismatch
without TSC scaling. I wonder if we should reduce false positives
by printing it only when family/model is on the list above (or if
invtsc is enabled).
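
To make the narrower warning condition concrete, here is a rough
sketch in C. It is not existing QEMU code: the function names and
parameters are hypothetical, and the family/model values are taken
only from the SDM list quoted above. The idea is to warn only when
the frequency differs, TSC scaling is unavailable, and the guest has
a reason to assume a constant TSC (invtsc, or a listed model).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if the family/model pair is on the SDM
 * (17.15) list quoted above, i.e. a guest OS may assume constant TSC
 * even without the invtsc CPUID bit.
 */
static bool sdm_constant_tsc_model(uint32_t family, uint32_t model)
{
    if (family == 0x0f) {
        return model >= 0x03;          /* Pentium 4 / Xeon, model 03H and higher */
    }
    if (family == 0x06) {
        return model == 0x0e ||        /* Core Solo / Core Duo */
               model == 0x0f ||        /* Xeon 5100 / Core 2 Duo */
               model == 0x17 ||        /* Core 2 / Xeon (DisplayModel 17H) */
               model == 0x1c;          /* Atom (DisplayModel 1CH) */
    }
    return false;
}

/*
 * Hypothetical predicate for the proposed warning: only warn when the
 * TSC frequency differs, scaling cannot hide the difference, and the
 * guest may be relying on a constant TSC frequency.
 */
static bool should_warn_tsc_mismatch(uint32_t family, uint32_t model,
                                     bool invtsc, bool freq_mismatch,
                                     bool tsc_scaling)
{
    if (!freq_mismatch || tsc_scaling) {
        return false;
    }
    return invtsc || sdm_constant_tsc_model(family, model);
}
```

With this sketch, a family 06H model 0FH guest with mismatched
frequency and no scaling would trigger the warning, while a newer
model without invtsc would not.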
--
Eduardo