Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM


From: Juan Quintela
Subject: Re: [Qemu-devel] [PATCH] migration: flush the bdrv before stopping VM
Date: Tue, 17 Mar 2015 13:12:16 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.4 (gnu/linux)

Liang Li <address@hidden> wrote:
> If there are file write operations in the guest while doing live
> migration, the VM downtime will be much longer than max_downtime.
> This is caused by bdrv_flush_all(), which is a time-consuming
> operation when there is a lot of data to be flushed to disk.
>
> By calling bdrv_flush_all() before stopping the VM, we can reduce the
> time consumed by bdrv_flush_all() in vm_stop_force_state, which means
> the VM downtime can be reduced.
>
> The test shows this optimization can reduce the VM downtime from more
> than 20 seconds to about 100 milliseconds.
>
> Signed-off-by: Liang Li <address@hidden>


This needs further review/changes on the block layer.

First, an explanation of why I think this doesn't fix the full problem.
With this patch, we fix the case where the block layer is dirty but
basically nothing is dirtying memory on the guest: we move the 20
seconds of block-layer flushing out of max_downtime, to the point
where we have decided that the amount of dirty memory is small enough
to be transferred within max_downtime.  But it is still going to take
20 seconds to flush the block layer, and during those 20 seconds the
amount of memory that can get dirtied is HUGE.

I think our options are:

- Tell the block layer at the beginning of migration:
  Hey, we are migrating, could you please start flushing data now, and
  don't let the caches grow too much, please, pretty please.
  (I leave the API for that to the block layer.)
- Add a new function at that point:
  bdrv_flush_all_start()
  That starts flushing the data in the background, and we "hope" that
  by the time we have migrated all the memory, the flush has also
  finished (so our last call to bdrv_flush_all() has less work to do).
- Add another function:
  int bdrv_flush_all_timeout(int timeout)
  that returns once the timeout expires, telling us whether everything
  has been flushed or the timeout has passed, so we can go back to the
  iterative stage if the flush has taken too long.  (A rough sketch of
  what these two helpers could look like is below.)

Notice that *normally* bdrv_flush_all() is very fast; the problem is
that sometimes it gets really, really slow (NFS decides to go slow,
TCP drops a packet, whatever).

Right now, we don't have an interface to detect those cases and go
back to the iterative stage.
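Just to illustrate the shape of such a change (not a real patch -- it
assumes the hypothetical bdrv_flush_all_timeout() above, and the
downtime limit used is only an example value), the completion path in
migration_thread() could do something like:

                /* Sketch only: bound the flush and retry the
                 * iterative stage on timeout. */
                if (bdrv_flush_all_timeout(downtime_limit_ms) < 0) {
                    /* The flush did not finish in time: keep the guest
                     * running, keep iterating over dirty RAM, and try
                     * to complete the migration again later. */
                    continue;
                }
                ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);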

So, I agree with the diagnosis that there is a problem there, but I
think the solution is more complex than this.  You helped one workload
while making a different one worse.  I am not sure which of the two
compromises is better :-(

Does this make sense?

Later, Juan.


> ---
>  migration/migration.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 2c805f1..fc4735c 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -655,6 +655,10 @@ static void *migration_thread(void *opaque)
>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>                  old_vm_running = runstate_is_running();
>  
> +                /* Flush here to shorten the VM downtime;
> +                 * bdrv_flush_all() is a time-consuming operation
> +                 * when the guest has done a lot of file writing. */
> +                bdrv_flush_all();
>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>                  if (ret >= 0) {
>                      qemu_file_set_rate_limit(s->file, INT64_MAX);


