Re: [Qemu-devel] [PATCH] ram_save_live: add a no-progress convergence ru
From: Uri Lublin
Subject: Re: [Qemu-devel] [PATCH] ram_save_live: add a no-progress convergence rule
Date: Wed, 20 May 2009 20:17:39 +0300
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Lightning/1.0pre Thunderbird/3.0b2
On 05/19/2009 09:15 PM, Anthony Liguori wrote:
> Dor Laor wrote:
>> The problem is that if migration is not progressing since the guest is
>> dirtying pages faster than the migration protocol can send them, then we
>> just waste time and CPU.
>> The minimum is to notify the monitor interface so that the mgmt daemon
>> can trap it.
>> We can easily see this issue while running iperf in the guest or in any
>> other high-load/dirty-pages scenario.
>
> The problem is, what's the metric for determining the guest isn't
> progressing? A raw iteration count is not a valid metric. It may be
> expected that the migration take 50 iterations.
We've defined "no-progress" as a memory transfer iteration where the number of
pages that got dirty is larger than the number of pages transferred. For such
iterations we have more data to transfer when the iteration completes.
Note that we did not limit the number of iterations (yet), we want to limit the
number of no-progress iterations. Migrations with many such iterations just
waste resources (cpu, network, etc).
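To make the definition above concrete, here is a minimal Python sketch. It is not the patch code: the names `is_no_progress` and `count_no_progress`, and the idea of feeding in per-iteration (dirtied, transferred) page counts, are assumptions made only for illustration.

```python
from typing import Iterable, Tuple


def is_no_progress(pages_dirtied: int, pages_transferred: int) -> bool:
    """True when more pages got dirty during the iteration than were sent."""
    return pages_dirtied > pages_transferred


def count_no_progress(iterations: Iterable[Tuple[int, int]]) -> int:
    """Count no-progress iterations in a sequence of
    (pages_dirtied, pages_transferred) samples (hypothetical input format)."""
    return sum(1 for dirtied, sent in iterations if is_no_progress(dirtied, sent))
```

An iteration that dirties exactly as many pages as it sends is not counted here, because the definition says "larger than".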
> The management tool knows the guest isn't progressing when it decides
> that a guest isn't progressing :-)
Currently the management tool only knows the migration is still active.
>> We can also make it configurable using the monitor migrate command.
>> For example:
>> migrate -d -no_progress -threshold=x tcp:....
>
> Threshold is really a bad metric to use. You have no idea how much data
> has been passed in each iteration. If you only needed one more
> iteration, then stopping the migration short was a really bad idea.
You can never know there is only one more iteration needed, no matter what
metric you use.
Again this threshold limits the number of no-progress iterations.
We can extend this rule (or add another flag/command) to raise the bandwidth
limit after a no-progress iteration.
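As a rough illustration of such a rule (a sketch under stated assumptions, not what the patch implements): the function `apply_rule`, its parameters, the result strings, and the choice to count no-progress iterations in total rather than consecutively are all hypothetical. Bandwidth is in arbitrary units.

```python
from typing import Iterable, Tuple


def apply_rule(iterations: Iterable[Tuple[int, int]], threshold: int,
               bandwidth: int, bandwidth_step: int = 0) -> Tuple[str, int]:
    """Walk (pages_dirtied, pages_transferred) samples and apply the rule.

    Returns ("limit-reached", bandwidth) once `threshold` no-progress
    iterations have been seen, otherwise ("continue", bandwidth).
    """
    no_progress = 0
    for dirtied, sent in iterations:
        if dirtied > sent:                # no-progress iteration
            no_progress += 1
            if no_progress >= threshold:  # assumption: total, not consecutive
                return "limit-reached", bandwidth
            bandwidth += bandwidth_step   # optional extension: raise the limit
    return "continue", bandwidth
```

With `bandwidth_step=0` this is only the counting rule; a positive step models the proposed extension, where each no-progress iteration that does not hit the threshold raises the bandwidth limit before the next one.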
> The only thing that this does is give a false sense of security.
> Management tools have to deal with forcing migration convergence based
> on policies. If a management tool isn't doing this today, it's broken IMHO.
I agree migration convergence rules should be based on policies.
What Dor is suggesting is that the management tool do that by passing parameters
to the migrate command (or using other migrate_X monitor commands).
I'm not sure management tools can have good policies of this kind today. The only
information they have is how much time has passed since the migration started.
The only actions they can take are to stop the guest or cancel the migration.
> Basically, threshold introduces a regression. If you run iperf and
> migrate a guest with a very large memory size, after migration, you'll
> get soft lockups because the guest hasn't been running for 10 seconds.
> This is bad.
Just continuing to resend pages that are constantly changing is bad too, probably
worse.
Regards,
Uri.