Re: [Qemu-devel] [RFC 29/29] migration: reset migrate thread vars when r

qemu-devel

From:	Dr. David Alan Gilbert
Subject:	Re: [Qemu-devel] [RFC 29/29] migration: reset migrate thread vars when resumed
Date:	Fri, 4 Aug 2017 10:52:27 +0100
User-agent:	Mutt/1.8.3 (2017-05-23)

* Peter Xu (address@hidden) wrote:
> On Thu, Aug 03, 2017 at 02:54:35PM +0100, Dr. David Alan Gilbert wrote:
> > * Peter Xu (address@hidden) wrote:
> > > Firstly, the MigThrError enumeration is introduced to describe the error in
> > > migration_detect_error() better. This gives migration_thread() a
> > > chance to know whether a recovery has happened.
> > > 
> > > Then, if a recovery is detected, migration_thread() will reset its local
> > > variables to prepare for that.
> > > 
> > > Signed-off-by: Peter Xu <address@hidden>
> > > ---
> > >  migration/migration.c | 40 +++++++++++++++++++++++++++++-----------
> > >  1 file changed, 29 insertions(+), 11 deletions(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index ecebe30..439bc22 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -2159,6 +2159,15 @@ static bool postcopy_should_start(MigrationState *s)
> > >      return atomic_read(&s->start_postcopy) || s->start_postcopy_fast;
> > >  }
> > >  
> > > +typedef enum MigThrError {
> > > +    /* No error detected */
> > > +    MIG_THR_ERR_NONE = 0,
> > > +    /* Detected error, but resumed successfully */
> > > +    MIG_THR_ERR_RECOVERED = 1,
> > > +    /* Detected fatal error, need to exit */
> > > +    MIG_THR_ERR_FATAL = 2,
> > > +} MigThrError;
> > > +
> > 
> > Could you move this patch earlier to when postcopy_pause is created
> > so it's created with this enum?
> 
> Sure.
> 
> [...]
> 
> > > @@ -2319,6 +2327,7 @@ static void *migration_thread(void *opaque)
> > >      /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
> > >      enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
> > >      bool enable_colo = migrate_colo_enabled();
> > > +    MigThrError thr_error;
> > >  
> > >      rcu_register_thread();
> > >  
> > > @@ -2395,8 +2404,17 @@ static void *migration_thread(void *opaque)
> > >           * Try to detect any kind of failures, and see whether we
> > >           * should stop the migration now.
> > >           */
> > > -        if (migration_detect_error(s)) {
> > > +        thr_error = migration_detect_error(s);
> > > +        if (thr_error == MIG_THR_ERR_FATAL) {
> > > +            /* Stop migration */
> > >              break;
> > > +        } else if (thr_error == MIG_THR_ERR_RECOVERED) {
> > > +            /*
> > > +             * Just recovered from, e.g., a network failure; reset all
> > > +             * the local variables.
> > > +             */
> > > +            initial_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> > > +            initial_bytes = 0;
> > 
> > They don't seem that important to reset?
> 
> The problem is that we have this in migration_thread():
> 
>         if (current_time >= initial_time + BUFFER_DELAY) {
>             uint64_t transferred_bytes = qemu_ftell(s->to_dst_file) -
>                                          initial_bytes;
>             uint64_t time_spent = current_time - initial_time;
>             double bandwidth = (double)transferred_bytes / time_spent;
>             threshold_size = bandwidth * s->parameters.downtime_limit;
>             ...
>         }
> 
> Here qemu_ftell() may return a very small value since we have just
> resumed, so "qemu_ftell(s->to_dst_file) - initial_bytes" is actually
> negative. Because transferred_bytes is a uint64_t, the result wraps
> around to an extremely large number, and we then get an extremely
> large "bandwidth" as well.

Ah yes, that's a good reason to reset them, then; add a comment like
'important to avoid breaking transferred_bytes and bandwidth
calculation'.
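
For reference, a minimal standalone sketch of the wrap-around Peter
describes. This is not QEMU code: bandwidth_bytes_per_ms() is a
hypothetical helper that only mirrors the arithmetic in the quoted
snippet, and the byte counts and timestamps used with it are made-up
values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): mirrors the bandwidth
 * arithmetic quoted from migration_thread().
 *
 *   file_pos      - stand-in for qemu_ftell(s->to_dst_file)
 *   initial_bytes - byte count remembered at the start of the window
 *   current_time, initial_time - timestamps in milliseconds
 */
static double bandwidth_bytes_per_ms(uint64_t file_pos,
                                     uint64_t initial_bytes,
                                     int64_t current_time,
                                     int64_t initial_time)
{
    /* Avoid dividing by zero or by a negative interval. */
    assert(current_time > initial_time);

    /*
     * Unsigned subtraction: if file_pos < initial_bytes, the result
     * wraps modulo 2^64 instead of going negative.
     */
    uint64_t transferred_bytes = file_pos - initial_bytes;
    uint64_t time_spent = (uint64_t)(current_time - initial_time);

    return (double)transferred_bytes / time_spent;
}
```

With a stale initial_bytes of 1 GiB and a fresh stream position of
4096 bytes, the call bandwidth_bytes_per_ms(4096, 1u << 30, 1100, 1000)
yields a value on the order of 1e17 bytes/ms, whereas passing
initial_bytes = 0 (the reset in the patch) yields roughly 41 bytes/ms
for the same 100 ms window.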

Dave

> -- 
> Peter Xu
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK
