From: David Gibson
Subject: Re: [Qemu-devel] [PATCH v2 1/1] migration: calculate expected_downtime with ram_bytes_remaining()
Date: Thu, 3 May 2018 12:14:42 +1000
User-agent: Mutt/1.9.3 (2018-01-21)

On Sun, Apr 22, 2018 at 12:42:49AM +0530, Balamuruhan S wrote:
> On Thu, Apr 19, 2018 at 09:48:17PM +1000, David Gibson wrote:
> > On Thu, Apr 19, 2018 at 10:14:52AM +0530, Balamuruhan S wrote:
> > > On Wed, Apr 18, 2018 at 09:36:33AM +0100, Dr. David Alan Gilbert wrote:
> > > > * Balamuruhan S (address@hidden) wrote:
[snip]
> > That said, I thought about this a bunch, and I think there is a case
> > to be made for it - although it's a lot more subtle than what's been
> > suggested so far.
> > 
> > So.  AFAICT the estimate of page dirty rate is based on the assumption
> > that page dirties are independent of each other - one page is as
> > likely to be dirtied as any other.  If we don't make that assumption,
> > I don't see how we can really have an estimate as a single number.
> > 
> > But if that's the assumption, then predicting downtime based on it is
> > futile: if the dirty rate is less than bandwidth, we can wait long
> > enough and make the downtime as small as we want.  If the dirty rate
> > is higher than bandwidth, then we don't converge and no downtime short
> > of (ram size / bandwidth) will be sufficient.
> > 
> > The only way a predicted downtime makes any sense is if we assume that
> > although the "instantaneous" dirty rate is high, the pages being
> > dirtied are within a working set that's substantially smaller than the
> > full RAM size.  In that case the expected down time becomes (working
> > set size / bandwidth).
> 
> Thank you Dave and David for such a nice explanation and for your time.
> 
> I thought about it after the explanation given by you and Dave.  In
> expected downtime we are trying to predict downtime based on values
> sampled at that instant, so we need to take those values and integrate
> them.

No, not really.  The problem is that as you accumulate dirties over a
longer interval, you'll get more duplicate dirties, which means you'll
get a lower effective value than simply integrating the results over
shorter intervals.
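
As a purely illustrative example: if the guest rewrites the same
100 MiB of pages every 100 ms, ten 100 ms samples add up to roughly
1 GiB/s of apparent dirtying, while a single 1 s sample that counts
each page only once reports just ~100 MiB/s.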

> 1. we are currently using bandwidth, but I think we actually have to use
> the rate of change of bandwidth, because bandwidth is not always constant.

Again, not really.  It's true that bandwidth isn't necessarily
constant, but in most cases it will be pretty close.  The real noise
here is coming in the dirty rate.

> 2. we are using dirty_pages_rate, and as Dave suggested,
> 
> when we enter an iteration with 'Db' bytes dirty we should be
> considering ['Db' + 'Dr' * iteration time of the previous iteration],
> where for the first iteration the previous iteration time would be 0.
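
(For instance, with made-up numbers: entering an iteration with
Db = 1 GiB already dirty, a dirty rate Dr = 200 MiB/s and a previous
iteration that took 2 s, that rule would budget
1 GiB + 200 MiB/s * 2 s = ~1.4 GiB for the coming iteration.)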
> 
> 3. As you have said, ram_bytes_remaining / bandwidth is the time to
> transfer all RAM, so this should be the limit for our integration.  When
> we calculate it for any instant, the range would be 0 to
> ram_bytes_remaining / bandwidth at that instant.
> 
> Regards,
> Bala
> 
> > 
> > Predicting downtime as (ram_bytes_remaining / bandwidth) is
> > essentially always wrong early in the migration, although it will be a
> > poor upper bound - it will basically give you the time to transfer all
> > RAM.
> > 
> > For a nicely converging migration it will also be wrong (but an upper
> > bound) until it isn't: it will gradually decrease until it dips below
> > the requested downtime threshold, at which point the migration
> > completes.
> > 
> > For a diverging migration with a working set, as discussed above,
> > ram_bytes_remaining will eventually converge on (roughly) the size of
> > that working set - it won't dip (much) below that, because we can't
> > keep up with the dirties within that working set.  At that point this
> > does become a reasonable estimate of the necessary downtime in order
> > to get the migration to complete, which I believe is the point of the
> > value.
> > 
> > So the question is: for the purposes of this value, is a gross
> > overestimate that gradually approaches a reasonable value good enough?
> > 
> > An estimate that would get closer, quicker would be (ram dirtied in
> > interval) / bandwidth.  Where (ram dirtied in interval) is a measure
> > of total ram dirtied over some measurement interval - only counting a
> > page once even if it's dirtied multiple times during the interval.  And
> > obviously you'd want some sort of averaging on that.  I think that
> > would be a bit of a pain to measure, though.
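
For the record, here is a minimal standalone sketch of what such a
measurement could look like.  This is not QEMU's implementation, and
the names (sample_interval_bytes, avg_bytes_per_interval, etc.) are
made up for illustration; it just assumes a per-page dirty bitmap for
the sampling interval is available:

/* Illustrative only.  Each page counts at most once per sampling
 * interval, however many times it was actually written, and the noisy
 * per-interval figure is smoothed with an exponential moving average
 * before being divided by the measured bandwidth. */
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096

/* Bytes dirtied in one interval: one count per set bit. */
static uint64_t sample_interval_bytes(const unsigned long *dirty_bitmap,
                                      size_t nr_pages)
{
    const size_t bits = 8 * sizeof(unsigned long);
    uint64_t pages = 0;

    for (size_t i = 0; i < nr_pages; i++) {
        if (dirty_bitmap[i / bits] & (1UL << (i % bits))) {
            pages++;
        }
    }
    return pages * PAGE_SIZE;
}

/* Exponential moving average, alpha in [0, 1]. */
static double ewma(double prev, double sample, double alpha)
{
    return alpha * sample + (1.0 - alpha) * prev;
}

/* Expected downtime: averaged bytes dirtied per interval divided by
 * the measured bandwidth in bytes per second. */
static double expected_downtime(double avg_bytes_per_interval,
                                double bandwidth_bytes_per_sec)
{
    return avg_bytes_per_interval / bandwidth_bytes_per_sec;
}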
> > 
> 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
