Re: [Qemu-devel] latest rc: virtio-blk hangs forever after migration


From: Andrey Korolyov
Subject: Re: [Qemu-devel] latest rc: virtio-blk hangs forever after migration
Date: Wed, 16 Jul 2014 01:25:17 +0400

On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti <address@hidden> wrote:
> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote:
>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov <address@hidden> wrote:
>> > On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah <address@hidden> wrote:
>> >> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote:
>> >>> Hello,
>> >>>
>> >>> the issue is not specific to the iothread code because generic
>> >>> virtio-blk also hangs up:
>> >>
>> >> Do you know which version works well?  If you could bisect, that'll
>> >> help a lot.
>> >>
>> >> Thanks,
>> >>                 Amit
>> >
>> > Hi,
>> >
>> > 2.0 definitely works well. I'll try to finish the bisection today, though
>> > every step takes about 10 minutes to complete.
>>
>> Yay! It is even outside of virtio-blk.
>>
>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56
>> Author: Marcelo Tosatti <address@hidden>
>> Date:   Tue Jun 3 13:34:48 2014 -0300
>>
>>     kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec 
>> calculation
>>
>>     Ensure proper env->tsc value for kvmclock_current_nsec calculation.
>>
>>     Reported-by: Marcin Gibuła <address@hidden>
>>     Cc: address@hidden
>>     Signed-off-by: Marcelo Tosatti <address@hidden>
>>     Signed-off-by: Paolo Bonzini <address@hidden>
>
> Andrey,
>
> Can you please provide instructions on how to create a reproducible
> environment?
>
> The following patch is equivalent to the original patch, for
> the purposes of fixing the kvmclock problem.
>
> Perhaps that makes it easier to spot the reason for the hang you are
> experiencing.
>
>
> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
> index 272a88a..feb5fc5 100644
> --- a/hw/i386/kvm/clock.c
> +++ b/hw/i386/kvm/clock.c
> @@ -17,7 +17,6 @@
>  #include "qemu/host-utils.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> -#include "sysemu/cpus.h"
>  #include "hw/sysbus.h"
>  #include "hw/kvm/clock.h"
>
> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
>
>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
>
> -    assert(time.tsc_timestamp <= migration_tsc);
>      delta = migration_tsc - time.tsc_timestamp;
>      if (time.tsc_shift < 0) {
>          delta >>= -time.tsc_shift;
> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int 
> running,
>          if (s->clock_valid) {
>              return;
>          }
> -
> -        cpu_synchronize_all_states();
>          ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
>          if (ret < 0) {
>              fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
> diff --git a/migration.c b/migration.c
> index 8d675b3..34f2325 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque)
>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>                  old_vm_running = runstate_is_running();
>
> +                cpu_synchronize_all_states();
>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>                  if (ret >= 0) {
>                      qemu_file_set_rate_limit(s->file, INT64_MAX);
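
To make the failure mode concrete: kvmclock_current_nsec() converts the
difference between the migration-time TSC (env->tsc) and the pvclock
tsc_timestamp into nanoseconds. If env->tsc is stale, i.e. smaller than
time.tsc_timestamp, the unsigned subtraction wraps around and the result is
garbage, which is what the assert() removed above guarded against. A minimal
sketch of that scaling (names are illustrative, not copied from
hw/i386/kvm/clock.c):

#include <stdint.h>

/* Sketch only: pvclock-style scaling of a TSC delta to nanoseconds. */
static uint64_t scale_tsc_delta_to_ns(uint64_t migration_tsc,
                                      uint64_t tsc_timestamp,
                                      int8_t tsc_shift,
                                      uint32_t tsc_to_system_mul)
{
    /* Wraps to a huge value if migration_tsc (env->tsc) is stale. */
    uint64_t delta = migration_tsc - tsc_timestamp;

    if (tsc_shift < 0) {
        delta >>= -tsc_shift;
    } else {
        delta <<= tsc_shift;
    }
    /* 64x32-bit multiply with a wide intermediate, keeping bits [95:32];
     * QEMU uses mulu64() here, a 128-bit type is enough for a sketch. */
    return (uint64_t)(((unsigned __int128)delta * tsc_to_system_mul) >> 32);
}

Both variants of the patch appear to address this by refreshing env->tsc via
cpu_synchronize_all_states() before that computation runs; the difference is
whether the call lives in the vm_state_change handler or in the migration
thread.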

Marcelo, I do not see an easier way than creating a proof-of-concept
deployment (involving at least two separate physical nodes) that acts both
as sender and receiver for the migration and as the Ceph storage
(http://ceph.com/docs/master/start/). For simplicity you probably want to
disable cephx, so that you do not have to put the secret on the CLI. You can
also get a minimal QEMU-ready installation using Mirantis' Fuel with the
Ceph deployment settings (it will deploy some OpenStack as a side effect,
but the point of doing it this way is the very high level of provisioning
automation: you get the necessary multi-node environment with an RBD backend
in a matter of a few clicks and a few hours). In the meantime, I'll try to
reproduce the issue with iSCSI, because I do not want to mess with shared
storage and the sanlock plugin.
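
For reference, the rough shape of such a two-node reproduction (every guest
name, pool and address below is a placeholder, and the exact rbd: drive
string depends on the Ceph configuration):

# destination node: start the same guest, waiting for the incoming migration
qemu-system-x86_64 -enable-kvm -m 2048 -smp 2 \
    -drive file=rbd:rbd/testvm,format=raw,if=virtio,cache=writeback \
    -incoming tcp:0:4444

# source node: boot the guest from the shared RBD image, then start the
# migration from the HMP monitor:
#   (qemu) migrate -d tcp:DEST_IP:4444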


