qemu-devel

From: Dr. David Alan Gilbert
Subject: Re: [PATCH 1/2] qapi/run-state: Add a new shutdown cause 'migration-completed'
Date: Tue, 6 Jul 2021 11:27:15 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

* Kunkun Jiang (jiangkunkun@huawei.com) wrote:
> Hi Daniel,
> 
> On 2021/7/5 20:48, Daniel P. Berrangé wrote:
> > On Mon, Jul 05, 2021 at 08:36:52PM +0800, Kunkun Jiang wrote:
> > > In the current version, the source QEMU process does not automatically
> > > exit after a successful migration. Additional action is required,
> > > such as sending { "execute": "quit" } or ctrl+c. To simplify this, add
> > > a new shutdown cause 'migration-completed' to exit the source QEMU
> > > process after a successful migration.
> > IIUC, 'STATUS_COMPLETED' state is entered on the source host
> > once it has finished sending all VM state, and thus does not
> > guarantee that the target host has successfully received and
> > loaded all VM state.
> Thanks for your reply.
> 
> If the target host doesn't successfully receive and load all VM state,
> we can send { "execute": "cont" } to resume the source in time to
> ensure that the VM will not be lost?

Yes, that's pretty common at the moment; a migration can fail at
lots of different points:
  a) The last part of the actual migration stream/loading the devices
    - that's pretty easy, since the destination hasn't actually got
    the full migration stream.

  b) If the migration itself completes, but the management system
    then tries to reconfigure the networking/storage on the destination,
    and something goes wrong in that, then it can roll that back and
    cont on the source.

So, it's a pretty common type of failure/recovery - the management
application has to be a bit careful not to do anything destructive
until as late as possible, so it knows it can switch back.
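
As a rough sketch of that ordering (Python, purely illustrative; the
function name and its arguments are made up here, and this is not how
libvirt is structured), the decision about what to send to the source
might look like the following. It assumes the caller has already polled
query-migrate on the source and query-status on the destination, and it
sends cont in both failure cases for simplicity:

```python
from typing import Optional


def source_command(src_migrate_status: str,
                   dest_running: bool,
                   dest_setup_failed: bool) -> Optional[dict]:
    """Pick the QMP command to send to the source, or None to keep waiting.

    Hypothetical helper: src_migrate_status is the "status" field reported
    by query-migrate on the source; dest_running and dest_setup_failed are
    facts the management application has established about the destination.
    """
    if src_migrate_status in ("failed", "cancelled"):
        # Case (a): the destination never got the full stream,
        # so fall back to the source.
        return {"execute": "cont"}
    if src_migrate_status != "completed":
        # Migration still in flight: nothing to decide yet.
        return None
    if dest_setup_failed:
        # Case (b): stream completed, but network/storage setup on the
        # destination went wrong; roll back and resume the source.
        return {"execute": "cont"}
    if dest_running:
        # Only now is it safe to do the destructive step.
        return {"execute": "quit"}
    # Completed on the source, destination not confirmed yet: keep waiting.
    return None
```

So quit only goes to the source once the destination is confirmed to be
running; on every failure path the source is still there to resume.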

> > Typically a mgmt app will need to directly confirm that the
> > target host QEMU has successfully started running, before it
> > will tell the source QEMU to quit.
> 'a mgmt app', such as libvirt?

Yes, it's currently libvirt that does that; but any management layer
could (it's just that libvirt has been around long enough to know
about lots and lots of nasty cases of migration failure, and how to
recover from them properly).

Can you explain why you want the source to quit automatically?
Where does it help in a real setup?

Dave


> Thanks,
> Kunkun Jiang
> > So, AFAICT, this automatic exit after STATUS_COMPLETED is
> > not safe and could lead to total loss of the running VM in
> > error scenarios.
> > 
> > 
> > 
> > > Signed-off-by: Kunkun Jiang <jiangkunkun@huawei.com>
> > > ---
> > >   migration/migration.c | 1 +
> > >   qapi/run-state.json   | 4 +++-
> > >   2 files changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 4228635d18..16782c93c2 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -3539,6 +3539,7 @@ static void migration_iteration_finish(MigrationState *s)
> > >       case MIGRATION_STATUS_COMPLETED:
> > >           migration_calculate_complete(s);
> > >           runstate_set(RUN_STATE_POSTMIGRATE);
> > > +        qemu_system_shutdown_request(SHUTDOWN_CAUSE_MIGRATION_COMPLETED);
> > >           break;
> > >       case MIGRATION_STATUS_ACTIVE:
> > > diff --git a/qapi/run-state.json b/qapi/run-state.json
> > > index 43d66d700f..66aaef4e2b 100644
> > > --- a/qapi/run-state.json
> > > +++ b/qapi/run-state.json
> > > @@ -86,12 +86,14 @@
> > >   #                   ignores --no-reboot. This is useful for sanitizing
> > >   #                   hypercalls on s390 that are used during kexec/kdump/boot
> > >   #
> > > +# @migration-completed: Reaction to the successful migration
> > > +#
> > >   ##
> > >   { 'enum': 'ShutdownCause',
> > >     # Beware, shutdown_caused_by_guest() depends on enumeration order
> > >     'data': [ 'none', 'host-error', 'host-qmp-quit', 'host-qmp-system-reset',
> > >               'host-signal', 'host-ui', 'guest-shutdown', 'guest-reset',
> > > -            'guest-panic', 'subsystem-reset'] }
> > > +            'guest-panic', 'subsystem-reset', 'migration-completed'] }
> > >   ##
> > >   # @StatusInfo:
> > > -- 
> > > 2.23.0
> > > 
> > > 
> > Regards,
> > Daniel
> 
> 
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



