Re: [Qemu-devel] Block Migration and CPU throttling


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] Block Migration and CPU throttling
Date: Wed, 7 Feb 2018 18:29:30 +0000
User-agent: Mutt/1.9.1 (2017-09-22)

* Peter Lieven (address@hidden) wrote:
> Am 12.12.2017 um 18:05 schrieb Dr. David Alan Gilbert:
> > * Peter Lieven (address@hidden) wrote:
> > > Am 21.09.2017 um 14:36 schrieb Dr. David Alan Gilbert:
> > > > * Peter Lieven (address@hidden) wrote:
> > > > > Am 19.09.2017 um 16:41 schrieb Dr. David Alan Gilbert:
> > > > > > * Peter Lieven (address@hidden) wrote:
> > > > > > > Am 19.09.2017 um 16:38 schrieb Dr. David Alan Gilbert:
> > > > > > > > * Peter Lieven (address@hidden) wrote:
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I just noticed that CPU throttling and block migration don't
> > > > > > > > > work together very well. During block migration, the throttling
> > > > > > > > > heuristic detects that we apparently make no progress in RAM
> > > > > > > > > transfer, but the reason is the running block migration, not an
> > > > > > > > > excessively high dirty-page rate.
> > > > > > > > > 
> > > > > > > > > The result is that the VM is throttled by 99% during block
> > > > > > > > > migration.
> > > > > > > > Hmm, that's unfortunate; do you have a bandwidth set lower than
> > > > > > > > your actual network connection? I'm just wondering if it's
> > > > > > > > actually going between the block and RAM iterative sections or
> > > > > > > > getting stuck in one.
> > > > > > > It also happens if source and dest are on the same machine and
> > > > > > > the speed is set to 100G.
> > > > > > But does it happen if they're not and the speed is set low?
> > > > > Yes, it does. I noticed it in our test environment between different
> > > > > nodes with a 10G link in between. But it's totally clear why it
> > > > > happens. During block migration we transfer all dirty memory pages in
> > > > > each round (if there is moderate memory load), and the dirty pages
> > > > > obviously amount to more than 50% of the RAM transferred in that
> > > > > round; in fact, it is exactly 100%. The current logic triggers on
> > > > > exactly this condition.
> > > > > 
> > > > > I think I will go forward and send a patch which disables auto
> > > > > converge during the block migration bulk stage.
> > > > Yes, that's fair;  it probably would also make sense to throttle the RAM
> > > > migration during the block migration bulk stage, since the chances are
> > > > it's not going to get far.  (I think in the nbd setup, the main
> > > > migration process isn't started until the end of bulk).
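
(For context: the auto-converge heuristic being discussed compares the bytes
the guest dirtied during a sync period with the RAM bytes the migration
actually transferred in that period, and throttles the vCPUs harder when
dirtying outpaces transfer twice in a row. The sketch below is a simplified,
self-contained illustration of that shape, with made-up names rather than the
actual QEMU identifiers; during the block bulk stage almost no RAM bytes are
sent, so the condition fires even at a modest dirty rate.)

    #include <stdbool.h>
    #include <stdint.h>

    #define TOY_PAGE_SIZE 4096

    /* State carried between sync periods (illustrative, not QEMU's RAMState). */
    typedef struct {
        uint64_t bytes_xfer_prev;     /* RAM bytes sent up to the previous period */
        int      dirty_rate_high_cnt; /* periods in a row with a high dirty rate */
    } ConvergeState;

    /* True when the guest should be throttled: the bytes dirtied in this period
     * exceed half of the RAM bytes transferred in it, for two periods in a row. */
    static bool should_throttle(ConvergeState *s, uint64_t dirty_pages_period,
                                uint64_t bytes_xfer_now)
    {
        uint64_t sent = bytes_xfer_now - s->bytes_xfer_prev;
        bool throttle = false;

        if (dirty_pages_period * TOY_PAGE_SIZE > sent / 2 &&
            ++s->dirty_rate_high_cnt >= 2) {
            s->dirty_rate_high_cnt = 0;
            throttle = true;          /* QEMU would bump the CPU throttle here */
        }
        s->bytes_xfer_prev = bytes_xfer_now;
        return throttle;
    }
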
> > > Catching up with the idea of delaying ram migration until block bulk has 
> > > completed.
> > > What do you think is the easiest way to achieve this?
> > <excavates inbox, and notices I never replied>
> > 
> > I think the answer depends on whether we think this is a special case or
> > we need a new general-purpose mechanism.
> > 
> > If it was really general then we'd probably want to split the iterative
> > stage in two somehow, and only do RAM in the second half.
> > 
> > But I'm not sure it's worth it; I suspect the easiest way is:
> > 
> >     a) Add a counter in migration/ram.c or in the RAM state somewhere
> >     b) Make ram_save_inhibit increment the counter
> >     c) Check the counter at the head of ram_save_iterate and just exit
> >       if it's non-zero
> >     d) Call ram_save_inhibit from block_save_setup
> >     e) Then release it when you've finished the bulk stage
> > 
> > Make sure you still count the RAM in the pending totals, otherwise
> > migration might think it's finished a bit early.
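
(A rough sketch of what (a) to (e) could look like; every name below is
hypothetical rather than existing QEMU API, and the real version would keep
the counter in the RAM state rather than in a file-scope static.)

    #include <stdbool.h>

    /* (a) counter in the RAM migration state (a plain static here for brevity) */
    static unsigned int ram_save_inhibit_count;

    /* (b)/(d) block_save_setup() would call this before starting the bulk stage */
    void ram_save_inhibit(void)
    {
        ram_save_inhibit_count++;
    }

    /* (e) released once the block bulk stage has finished */
    void ram_save_uninhibit(void)
    {
        ram_save_inhibit_count--;
    }

    /* (c) checked at the head of ram_save_iterate(): when true, the iteration
     * exits early, but should still emit an end-of-section marker and keep
     * reporting pending RAM so migration doesn't think it has finished. */
    bool ram_save_inhibited(void)
    {
        return ram_save_inhibit_count != 0;
    }
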
> 
> Is there any catch I don't see, or is it as easy as this?

Hmm, looks promising, doesn't it; it might need an include or two tidied
up, but it looks worth a try. Just be careful that there are no cases
where block migration can't transfer data in that state, otherwise we'll
keep coming back here and spewing empty sections.

Dave

> diff --git a/migration/ram.c b/migration/ram.c
> index cb1950f..c67bcf1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2255,6 +2255,13 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      int64_t t0;
>      int done = 0;
> 
> +    if (blk_mig_bulk_active()) {
> +        /* Avoid transferring RAM during bulk phase of block migration as
> +         * the bulk phase will usually take a lot of time and transferring
> +         * RAM updates again and again is pointless. */
> +        goto out;
> +    }
> +
>      rcu_read_lock();
>      if (ram_list.version != rs->last_version) {
>          ram_state_reset(rs);
> @@ -2301,6 +2308,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>       */
>      ram_control_after_iterate(f, RAM_CONTROL_ROUND);
> 
> +out:
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>      ram_counters.transferred += 8;
> 
> 
> Peter
> 
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
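
(For reference, the blk_mig_bulk_active() helper the patch relies on is
expected to report whether block migration is in use and its bulk phase has
not yet completed. Assuming it is built on the bulk_completed flag that
migration/block.c already tracks, it would look roughly like this; a sketch,
not necessarily the committed code:)

    bool blk_mig_bulk_active(void)
    {
        /* block migration requested and still copying the initial bulk data */
        return migrate_use_block() && !block_mig_state.bulk_completed;
    }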


