qemu-devel

Re: [PATCH V6 05/14] migration: propagate suspended runstate


From: Fabiano Rosas
Subject: Re: [PATCH V6 05/14] migration: propagate suspended runstate
Date: Mon, 04 Dec 2023 18:09:16 -0300

Peter Xu <peterx@redhat.com> writes:

> On Mon, Dec 04, 2023 at 04:31:56PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >> >>          return -EINVAL;
>> >> >>      }
>> >> >>      s->state = r;
>> >> >> +    vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> >> > 
>> >> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> >> > same as a boolean representing "whether VM is suspended", but only a
>> >> > temporary field to remember that for a VM stop request.  To be explicit, I
>> >> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> >> > 
>> >> > If so, we can already do:
>> >> > 
>> >> >   vm_set_suspended(s->vm_was_suspended);
>> >> > 
>> >> > Regardless of RUN_STATE_SUSPENDED?
>> >> 
>> >> We need both terms of the expression.
>> >> 
>> >> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> >> We call global_state_store prior to vm_stop_force_state, so the incoming
>> >> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>> >
>> > Right.
>> >
>> >> However, the runstate is RUN_STATE_INMIGRATE.  When incoming finishes by
>> >> calling vm_start, we need to restore the suspended state.  Thus in 
>> >> global_state_post_load, we must set vm_was_suspended = true.
>> >
>> > With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
>> > already?  Then I think it should call vm_start(SUSPENDED) if to start.
>> >
>> > Maybe you're talking about the special case where autostart==false?  We
>> > used to have this (existing process_incoming_migration_bh()):
>> >
>> >     if (!global_state_received() ||
>> >         global_state_get_runstate() == RUN_STATE_RUNNING) {
>> >         if (autostart) {
>> >             vm_start();
>> >         } else {
>> >             runstate_set(RUN_STATE_PAUSED);
>> >         }
>> >     }
>> >
>> > If so maybe I get you, because in the "else" path we do seem to lose the
>> > SUSPENDED state again, but in that case IMHO we should logically set
>> > vm_was_suspended only when we "lose" it - we didn't lose it during
>> > migration, but only until we decided to switch to PAUSED (due to
>> > autostart==false). IOW, change above to something like:
>> >
>> >     state = global_state_get_runstate();
>> >     if (!global_state_received() || runstate_is_alive(state)) {
>> >         if (autostart) {
>> >             vm_start(state);
>> >         } else {
>> >             if (runstate_is_suspended(state)) {
>> >                 /* Remember suspended state before setting system to STOPPED */
>> >                 vm_was_suspended = true;
>> >             }
>> >             runstate_set(RUN_STATE_PAUSED);
>> >         }
>> >     }
>> >
>> > It may or may not make a functional difference with the current patch,
>> > though.  However, it may be clearer to follow vm_was_suspended's strict
>> > definition.
>> >
>> >> 
>> >> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> >> then vm_was_suspended = true.  Migration from that state sets
>> >> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load and
>> >> ends with runstate_set(RUN_STATE_PAUSED).
>> >> 
>> >> I will add a comment here in the code.
>> >>  
>> >> >>      return 0;
>> >> >>  }
>> >> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >> >>      .fields = (VMStateField[]) {
>> >> >>          VMSTATE_UINT32(size, GlobalState),
>> >> >>          VMSTATE_BUFFER(runstate, GlobalState),
>> >> >> +        VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >> >>          VMSTATE_END_OF_LIST()
>> >> >>      },
>> >> >>  };
>> >> > 
>> >> > I think this will break migration between old/new, unfortunately.  And
>> >> > since the global state exists for nearly every VM, all VM setups would be
>> >> > affected, on all archs.
>> >> 
>> >> Thanks, I keep forgetting that my binary tricks are no good here.  However,
>> >> I have one other trick up my sleeve, which is to store vm_was_suspended in
>> >> global_state.runstate[strlen(runstate) + 2].  It is forwards and backwards
>> >> compatible, since that byte is always 0 in older qemu.  It can be implemented
>> >> with a few lines of code change confined to global_state.c, versus many lines
>> >> spread across files to do it the conventional way using a compat property and
>> >> a subsection.  Sound OK?
>> >
>> > Tricky!  But sounds okay to me.  I think you're inventing your own way of
>> > being compatible, one that does not rely on the machine type.  If you go
>> > this route, please document the layout clearly, and also what it looked
>> > like in old binaries.
>> >
>> > I think maybe it'll be good to keep using strings, so in the new binaries
>> > we allow >1 strings, then we define properly on those strings (index 0:
>> > runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
>> > express, while 0x00 means old binary, etc.).
>> >
>> > I hope this trick will need less code than the subsection solution,
>> > otherwise I'd still consider going with that, which is the "common
>> > solution".
>> >
>> > Let's also see whether Juan/Fabiano/others have any opinions.
>> 
>> Can't we pack the structure and just go ahead and slash 'runstate' in
>> half? That would claim some unused bytes for future backward
>> compatibility issues.
>
> What I meant is something like:
>
>   runstate[100] = {"str1", 0x00, "str2", 0x00, ...}
>
> Where str1 is runstate, and str2 can be either "0"/"1" to reflect suspended
> value.  We define all the strings separated by 0x00, then IIUC we save the
> most chars for potential future extension of this string.
>
> Thanks,

Right, I got your point. I just think we could avoid designing this new
string format by creating new fields with the extra space:

typedef struct QEMU_PACKED {
    uint32_t size;
    uint8_t runstate[50];
    uint8_t unused[50];
    RunState state;
    bool received;
} GlobalState;

In my mind this works seamlessly, or am I mistaken?
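
To make the question concrete, here is a rough standalone sketch (not a
patch) that only checks the in-memory layout.  Assumptions, none of which
come from the tree as-is: PACKED stands in for QEMU_PACKED, RunState is a
plain int instead of the real enum, and the "old" struct is what I infer
from runstate[100] in this thread.  It does not say anything about the
vmstate field list, which would still have to emit both buffers in order
for the stream to stay at 100 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PACKED __attribute__((packed))  /* stand-in for QEMU_PACKED */
typedef int RunState;                   /* stand-in for the real enum */

/* Layout as inferred from this thread (assumption). */
typedef struct PACKED {
    uint32_t size;
    uint8_t runstate[100];
    RunState state;
    bool received;
} OldGlobalState;

/* Proposed layout: same 100 bytes, split into two fields. */
typedef struct PACKED {
    uint32_t size;
    uint8_t runstate[50];
    uint8_t unused[50];
    RunState state;
    bool received;
} NewGlobalState;

_Static_assert(sizeof(OldGlobalState) == sizeof(NewGlobalState),
               "total size must not change");
_Static_assert(offsetof(OldGlobalState, runstate) ==
               offsetof(NewGlobalState, runstate),
               "runstate must stay at the same offset");
_Static_assert(offsetof(NewGlobalState, unused) ==
               offsetof(NewGlobalState, runstate) + 50,
               "unused must directly follow runstate");
_Static_assert(offsetof(OldGlobalState, state) ==
               offsetof(NewGlobalState, state),
               "fields after the buffer must not move");

/*
 * Hypothetical check: bytes written through the old layout, read back
 * through the new one, keep the runstate string and leave 'unused' zeroed.
 */
static bool old_bytes_leave_unused_zero(const char *runstate_name)
{
    OldGlobalState old;
    NewGlobalState cur;
    size_t i;

    /* The name must fit in the shortened buffer, NUL included. */
    assert(strlen(runstate_name) < sizeof(cur.runstate));

    memset(&old, 0, sizeof(old));
    memcpy(old.runstate, runstate_name, strlen(runstate_name));
    memcpy(&cur, &old, sizeof(cur));

    for (i = 0; i < sizeof(cur.unused); i++) {
        if (cur.unused[i] != 0) {
            return false;
        }
    }
    return strcmp((const char *)cur.runstate, runstate_name) == 0;
}
```

The one thing this makes obvious is the constraint that every runstate
name has to fit in 50 bytes including the terminator.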

In any case, a one-shot hack might be better than both our suggestions
because we can just clean it up a couple of releases from now as if
nothing happened.
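
For comparison, the NUL-separated layout quoted above (runstate string
first, then "0"/"1") could be handled roughly as below.  This is only a
sketch under my own assumptions: the helper names are made up, the buffer
is 100 bytes as in runstate[100], and the second string starts right
after the first terminator.  An old sender leaves everything past the
first string zeroed, so an empty second string is read as "not suspended".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RUNSTATE_BUF_LEN 100    /* assumed, from runstate[100] above */

/* Hypothetical helper: write "<runstate>\0" followed by "1\0" or "0\0". */
static int pack_runstate(uint8_t *buf, const char *runstate, bool suspended)
{
    size_t len;

    assert(buf != NULL && runstate != NULL);
    len = strlen(runstate);

    /* Need room for: runstate, NUL, one flag character, NUL. */
    if (len + 3 > RUNSTATE_BUF_LEN) {
        return -1;
    }
    memset(buf, 0, RUNSTATE_BUF_LEN);
    memcpy(buf, runstate, len);
    buf[len + 1] = suspended ? '1' : '0';
    return 0;
}

/* Hypothetical helper: read the second string; empty means old sender. */
static bool unpack_suspended(const uint8_t *buf)
{
    const uint8_t *end;
    size_t second;

    assert(buf != NULL);
    end = memchr(buf, 0, RUNSTATE_BUF_LEN);
    if (!end) {
        return false;           /* no terminator, treat as unknown */
    }
    second = (size_t)(end - buf) + 1;
    return second < RUNSTATE_BUF_LEN && buf[second] == '1';
}
```

The first string is still a plain C string at offset 0, so a reader that
only knows about the runstate name keeps working unchanged.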


