From: | Alexander Graf |
Subject: | Re: [Qemu-devel] [PATCH] kvmclock: Ensure time in migration never goes backward |
Date: | Thu, 08 May 2014 01:29:33 +0200 |
User-agent: | Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 |
On 08.05.14 01:21, Marcelo Tosatti wrote:
> On Tue, May 06, 2014 at 09:18:27AM +0200, Alexander Graf wrote:
>> On 06.05.14 01:31, Marcelo Tosatti wrote:
>>> On Mon, May 05, 2014 at 08:23:43PM -0300, Marcelo Tosatti wrote:
>>>> Hi Alexander,
>>>>
>>>> On Mon, May 05, 2014 at 03:51:22PM +0200, Alexander Graf wrote:
>>>>> When we migrate we ask the kernel about its current belief on
>>>>> what the guest time would be.
>>>>
>>>> KVM_GET_CLOCK which returns the time in "struct kvm_clock_data".
>>>>
>>>>> However, I've seen cases where the kvmclock guest structure
>>>>> indicates a time more recent than the kvm returned time.
>>>>
>>>> This should not happen because the value returned by KVM_GET_CLOCK
>>>> (get_kernel_ns() + kvmclock_offset) should be relatively in sync
>>>> with what is seen in the guest via kvmclock read.
>>
>> Yes, and it isn't. Any ideas why it's not? This patch really just
>> uses the guest visible kvmclock time rather than the host view of it
>> on migration.
>>
>>> Effective frequency of TSC clock could be higher than NTP frequency.
>>> So NTP correction would slow down the host clock.
>>
>> There is definitely something very broken on the host's side since it
>> does return a smaller time than the guest exposed interface indicates.
>>
>> Alex
>
> Can you please retrieve the values of system_timestamp, tsc_timestamp
> and the host clock on the trace you have ?
These are the kvmclock data fields at the point in time of migration:

  KVM Time
  |- version            0x22
  |- tsc_timestamp      0xa6af10dbce
  |- system_time        0x4eac2ae060
  |- tsc_to_system_mul  0xf28f5431
  |- tsc_shift          0xffffffff
  |- flags              0

and this is what other bits I could fetch from the migration:

  "env.tsc_offset": "0x0000000000000000",
  "env.tsc": "0x000004288b11c6f8",
  "env.tsc_aux": "0x0000000000000000",
  "timer (0)": {
    "cpu_ticks_offset": "0x00000427ee4d1efd",
    "dummy": "0x0000000000000000",
    "cpu_clock_offset": "0x000001f803249a8d"
  },
  "kvmclock (7)": {
    "clock": "0x000001f80325f418"
  },

I'm not sure how I could get access to the host clock, as this is a post mortem analysis and I haven't been able to reproduce this myself. However, I'm sure the two other folks who already replied to the thread would be more than happy to run something if we tell them what :).
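For readers following along, here is a rough sketch of how a guest turns fields like the ones dumped above into a nanosecond time, following the usual pvclock scaling (TSC delta, shift, then a 32.32 fixed-point multiply). The function names are made up for illustration. Treating the dumped tsc_shift of 0xffffffff as a sign-extended 8-bit -1 is an assumption about how the dump was printed, and whether env.tsc is the right TSC value to plug in for this trace is not established in the thread.

```python
MASK64 = (1 << 64) - 1


def to_signed8(value):
    """Interpret the low 8 bits as signed (assumes tsc_shift is an s8 field)."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def pvclock_ns(tsc, tsc_timestamp, system_time, tsc_to_system_mul, tsc_shift):
    """Guest-visible kvmclock time in ns for a given TSC reading (sketch)."""
    # TSC ticks elapsed since the host last updated the structure.
    delta = (tsc - tsc_timestamp) & MASK64
    shift = to_signed8(tsc_shift)
    if shift >= 0:
        delta = (delta << shift) & MASK64
    else:
        delta >>= -shift
    # tsc_to_system_mul is a 32-bit fraction: multiply, then drop 32 bits.
    scaled = (delta * tsc_to_system_mul) >> 32
    return (system_time + scaled) & MASK64


# Fields copied from the dump above, for anyone who wants to experiment.
DUMP = {
    "tsc_timestamp": 0xA6AF10DBCE,
    "system_time": 0x4EAC2AE060,
    "tsc_to_system_mul": 0xF28F5431,
    "tsc_shift": 0xFFFFFFFF,
}

# Synthetic example: 1000 ticks, shift of -1, multiplier of 0.5.
print(pvclock_ns(1000, 0, 500, 0x80000000, 0xFFFFFFFF))  # -> 750
```

Comparing the result of such a calculation against the migrated "clock" value is the kind of check the question about system_timestamp and tsc_timestamp is after, but it needs a TSC reading taken at the same instant as the host clock sample.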
> Nothing forbids backwards time even without migration, in case the
> problem is the difference in frequency between TSC and NTP corrected
> host clock (point is, should figure out what is happening).
Yes, I fully agree. We still need a patch similar to this one to ensure that we can fetch a guest from a broken host though. Maybe we should just have a CAP that indicates the host is working once we fix kvm and bump the QEMU kvmclock version to v2 with a new field that means "trust the value". Default it to "no" for v1 migrations and when it's set to "no", use the guest values instead.
Alex
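A minimal sketch of the selection rule proposed above, purely for illustration: every name here is hypothetical and none of this is actual QEMU code. It assumes a v1 migration stream carries no trust field and is therefore treated as "no", while a v2 stream carries an explicit flag.

```python
def host_clock_trusted(stream_version, trust_flag=False):
    """v1 streams have no trust field, so they default to untrusted."""
    if stream_version < 2:
        return False
    return bool(trust_flag)


def select_clock_ns(stream_version, host_clock_ns, guest_clock_ns,
                    trust_flag=False):
    """Pick the clock value to use for the migrated guest.

    host_clock_ns:  value the host reported (e.g. via KVM_GET_CLOCK)
    guest_clock_ns: value derived from the guest-visible kvmclock structure
    """
    if host_clock_trusted(stream_version, trust_flag):
        return host_clock_ns
    # Untrusted host: fall back to what the guest itself has already seen.
    return guest_clock_ns


# A v1 stream from a possibly broken host uses the guest view.
print(select_clock_ns(1, host_clock_ns=100, guest_clock_ns=120))  # -> 120
```

The point of defaulting to "no" is that old senders cannot vouch for their host clock, so only a sender that knows its kernel is fixed would opt in to having its value trusted.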