Re: [PATCH V6 05/14] migration: propagate suspended runstate
From: Fabiano Rosas
Subject: Re: [PATCH V6 05/14] migration: propagate suspended runstate
Date: Mon, 04 Dec 2023 18:09:16 -0300

Peter Xu <peterx@redhat.com> writes:
> On Mon, Dec 04, 2023 at 04:31:56PM -0300, Fabiano Rosas wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>> > On Fri, Dec 01, 2023 at 11:23:33AM -0500, Steven Sistare wrote:
>> >> >> @@ -109,6 +117,7 @@ static int global_state_post_load(void *opaque, int version_id)
>> >> >> return -EINVAL;
>> >> >> }
>> >> >> s->state = r;
>> >> >> + vm_set_suspended(s->vm_was_suspended || r == RUN_STATE_SUSPENDED);
>> >> >
>> >> > IIUC current vm_was_suspended (based on my read of your patch) was not the
>> >> > same as a boolean representing "whether VM is suspended", but only a
>> >> > temporary field to remember that for a VM stop request. To be explicit, I
>> >> > didn't see this flag set in qemu_system_suspend() in your previous patch.
>> >> >
>> >> > If so, we can already do:
>> >> >
>> >> > vm_set_suspended(s->vm_was_suspended);
>> >> >
>> >> > Irrelevant of RUN_STATE_SUSPENDED?
>> >>
>> >> We need both terms of the expression.
>> >>
>> >> If the vm *is* suspended (RUN_STATE_SUSPENDED), then vm_was_suspended = false.
>> >> We call global_state_store prior to vm_stop_force_state, so the incoming
>> >> side sees s->state = RUN_STATE_SUSPENDED and s->vm_was_suspended = false.
>> >
>> > Right.
>> >
>> >> However, the runstate is RUN_STATE_INMIGRATE. When incoming finishes by
>> >> calling vm_start, we need to restore the suspended state. Thus in
>> >> global_state_post_load, we must set vm_was_suspended = true.
>> >
>> > With above, shouldn't global_state_get_runstate() (on dest) fetch SUSPENDED
>> > already? Then I think it should call vm_start(SUSPENDED) if to start.
>> >
>> > Maybe you're talking about the special case where autostart==false? We
>> > used to have this (existing process_incoming_migration_bh()):
>> >
>> > if (!global_state_received() ||
>> > global_state_get_runstate() == RUN_STATE_RUNNING) {
>> > if (autostart) {
>> > vm_start();
>> > } else {
>> > runstate_set(RUN_STATE_PAUSED);
>> > }
>> > }
>> >
>> > If so maybe I get you, because in the "else" path we do seem to lose the
>> > SUSPENDED state again, but in that case IMHO we should logically set
>> > vm_was_suspended only when we "lose" it - we didn't lose it during
>> > migration, but only until we decided to switch to PAUSED (due to
>> > autostart==false). IOW, change above to something like:
>> >
>> > state = global_state_get_runstate();
>> > if (!global_state_received() || runstate_is_alive(state)) {
>> > if (autostart) {
>> > vm_start(state);
>> > } else {
>> > if (runstate_is_suspended(state)) {
>> > /* Remember suspended state before setting system to STOPed */
>> > vm_was_suspended = true;
>> > }
>> > runstate_set(RUN_STATE_PAUSED);
>> > }
>> > }
>> >
>> > It may or may not have a functional difference even if current patch,
>> > though. However maybe clearer to follow vm_was_suspended's strict
>> > definition.
>> >
>> >>
>> >> If the vm *was* suspended, but is currently stopped (eg RUN_STATE_PAUSED),
>> >> then vm_was_suspended = true. Migration from that state sets
>> >> vm_was_suspended = s->vm_was_suspended = true in global_state_post_load
>> >> and ends with runstate_set(RUN_STATE_PAUSED).
>> >>
>> >> I will add a comment here in the code.
>> >>
>> >> >> return 0;
>> >> >> }
>> >> >> @@ -134,6 +143,7 @@ static const VMStateDescription vmstate_globalstate = {
>> >> >> .fields = (VMStateField[]) {
>> >> >> VMSTATE_UINT32(size, GlobalState),
>> >> >> VMSTATE_BUFFER(runstate, GlobalState),
>> >> >> + VMSTATE_BOOL(vm_was_suspended, GlobalState),
>> >> >> VMSTATE_END_OF_LIST()
>> >> >> },
>> >> >> };
>> >> >
>> >> > I think this will break migration between old/new, unfortunately. And
>> >> > since the global state exist mostly for every VM, all VM setup should be
>> >> > affected, and over all archs.
>> >>
>> >> Thanks, I keep forgetting that my binary tricks are no good here. However,
>> >> I have one other trick up my sleeve, which is to store vm_was_running in
>> >> global_state.runstate[strlen(runstate) + 2]. It is forwards and backwards
>> >> compatible, since that byte is always 0 in older qemu. It can be implemented
>> >> with a few lines of code change confined to global_state.c, versus many lines
>> >> spread across files to do it the conventional way using a compat property and
>> >> a subsection. Sound OK?
>> >
>> > Tricky! But sounds okay to me. I think you're inventing some of your own
>> > way of being compatible, not relying on machine type as a benefit. If go
>> > this route please document clearly on the layout and also what it looked
>> > like in old binaries.
>> >
>> > I think maybe it'll be good to keep using strings, so in the new binaries
>> > we allow >1 strings, then we define properly on those strings (index 0:
>> > runstate, existed since start; index 2: suspended, perhaps using "1"/"0" to
>> > express, while 0x00 means old binary, etc.).
>> >
>> > I hope this trick will need less code than the subsection solution,
>> > otherwise I'd still consider going with that, which is the "common
>> > solution".
>> >
>> > Let's also see whether Juan/Fabiano/others has any opinions.
>>
>> Can't we pack the structure and just go ahead and slash 'runstate' in
>> half? That would claim some unused bytes for future backward
>> compatibility issues.
>
> What I meant is something like:
>
> runstate[100] = {"str1", 0x00, "str2", 0x00, ...}
>
> Where str1 is runstate, and str2 can be either "0"/"1" to reflect suspended
> value. We define all the strings separated by 0x00, then IIUC we save the
> most chars for potential future extension of this string.
>
> Thanks,
Right, I got your point. I just think we could avoid designing this new
string format by creating new fields with the extra space:

typedef struct QEMU_PACKED {
    uint32_t size;
    uint8_t runstate[50];
    uint8_t unused[50];
    RunState state;
    bool received;
} GlobalState;

In my mind this works seamlessly, or am I mistaken?
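
As a rough check of the layout claim, here is a minimal standalone sketch that compares the old 100-byte buffer against the split version. Everything prefixed Demo/demo_ is hypothetical and not QEMU code: DemoRunState stands in for the real RunState enum, and __attribute__((packed)) (GCC/Clang) stands in for QEMU_PACKED. It only checks the in-memory layout. Whether the migration stream stays compatible also depends on the vmstate_globalstate field list still covering all 100 bytes, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's RunState enum. */
typedef enum { DEMO_RUN_STATE_RUNNING, DEMO_RUN_STATE_SUSPENDED } DemoRunState;

/* Old layout: a single 100-byte runstate buffer after the size field. */
typedef struct __attribute__((packed)) {
    uint32_t size;
    uint8_t runstate[100];
    DemoRunState state;
    bool received;
} DemoOldGlobalState;

/* Proposed layout: the same 100 bytes split into two 50-byte fields. */
typedef struct __attribute__((packed)) {
    uint32_t size;
    uint8_t runstate[50];
    uint8_t unused[50];
    DemoRunState state;
    bool received;
} DemoNewGlobalState;

/* Compile-time check: splitting the buffer must not change the total size. */
_Static_assert(sizeof(DemoOldGlobalState) == sizeof(DemoNewGlobalState),
               "splitting runstate must not change the struct size");

/* Returns nonzero when every field sits at the same offset in both layouts. */
int demo_layouts_match(void)
{
    return offsetof(DemoNewGlobalState, runstate) ==
               offsetof(DemoOldGlobalState, runstate) &&
           offsetof(DemoNewGlobalState, unused) ==
               offsetof(DemoOldGlobalState, runstate) + 50 &&
           offsetof(DemoNewGlobalState, state) ==
               offsetof(DemoOldGlobalState, state) &&
           offsetof(DemoNewGlobalState, received) ==
               offsetof(DemoOldGlobalState, received);
}
```

The second 50 bytes land exactly where bytes 50..99 of the old buffer were, so a runstate string shorter than 50 bytes reads the same in both layouts.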
In any case, a one-shot hack might be better than both our suggestions,
because we can just clean it up a couple of releases from now as if
nothing had happened.
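
For reference, the one-shot hack discussed earlier in the thread (a flag byte at runstate[strlen(runstate) + 2]) could look roughly like the sketch below. This is an illustration under assumptions, not the actual patch: the demo_ helpers and DEMO_RUNSTATE_LEN are hypothetical names, and the 100-byte length is taken from the runstate[100] example quoted above. It relies on the stated property that older binaries leave every byte after the string set to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_RUNSTATE_LEN 100 /* assumed buffer size, per runstate[100] above */

/* Index of the flag byte, or -1 if the string is unterminated or too long. */
static ptrdiff_t demo_flag_index(const uint8_t *buf)
{
    const uint8_t *nul = memchr(buf, 0, DEMO_RUNSTATE_LEN);
    if (!nul) {
        return -1;
    }
    ptrdiff_t idx = (nul - buf) + 2; /* strlen(runstate) + 2 */
    return idx < DEMO_RUNSTATE_LEN ? idx : -1;
}

/* Sender side: write the runstate name, then the flag two bytes past its end. */
bool demo_store(uint8_t *buf, const char *name, bool was_suspended)
{
    size_t len = strlen(name);
    if (len + 2 >= DEMO_RUNSTATE_LEN) {
        return false; /* no room for the flag byte */
    }
    memset(buf, 0, DEMO_RUNSTATE_LEN);
    memcpy(buf, name, len);
    buf[len + 2] = was_suspended ? 1 : 0;
    return true;
}

/* Receiver side: a buffer from an old sender has 0 there, so reads as false. */
bool demo_load_was_suspended(const uint8_t *buf)
{
    ptrdiff_t idx = demo_flag_index(buf);
    return idx >= 0 && buf[idx] != 0;
}
```

An old receiver only reads the string up to its first NUL and never looks at the flag byte, which is why the same buffer works in both directions.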