[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2 3/3] dump-guest-memory: Block live migration
From: |
Peter Xu |
Subject: |
Re: [PATCH v2 3/3] dump-guest-memory: Block live migration |
Date: |
Fri, 27 Aug 2021 10:50:33 -0400 |
On Fri, Aug 27, 2021 at 11:51:04AM +0400, Marc-André Lureau wrote:
> Hi
>
> On Fri, Aug 27, 2021 at 11:44 AM Marc-André Lureau <
> marcandre.lureau@redhat.com> wrote:
>
> > Hi
> >
> > On Thu, Aug 26, 2021 at 10:58 PM Peter Xu <peterx@redhat.com> wrote:
> >
> >> Both dump-guest-memory and live migration caches vm state at the
> >> beginning.
> >> Either of them entering the other one will cause race on the vm state,
> >> and even
> >> more severe on that (please refer to the crash report in the bug link).
> >>
> >> Let's block live migration in dump-guest-memory, and that'll also block
> >> dump-guest-memory if it detected that we're during a live migration.
> >>
> >> Side note: migrate_del_blocker() can be called even if the blocker is not
> >> inserted yet, so it's safe to unconditionally delete that blocker in
> >> dump_cleanup (g_slist_remove allows no-entry-found case).
> >>
> >> Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1996609
> >> Signed-off-by: Peter Xu <peterx@redhat.com>
> >> ---
> >> dump/dump.c | 24 +++++++++++++++++++-----
> >> 1 file changed, 19 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/dump/dump.c b/dump/dump.c
> >> index ab625909f3..9c1c1fb738 100644
> >> --- a/dump/dump.c
> >> +++ b/dump/dump.c
> >> @@ -29,6 +29,7 @@
> >> #include "qemu/error-report.h"
> >> #include "qemu/main-loop.h"
> >> #include "hw/misc/vmcoreinfo.h"
> >> +#include "migration/blocker.h"
> >>
> >> #ifdef TARGET_X86_64
> >> #include "win_dump.h"
> >> @@ -47,6 +48,8 @@
> >>
> >> #define MAX_GUEST_NOTE_SIZE (1 << 20) /* 1MB should be enough */
> >>
> >> +static Error *dump_migration_blocker;
> >> +
> >> #define ELF_NOTE_SIZE(hdr_size, name_size, desc_size) \
> >> ((DIV_ROUND_UP((hdr_size), 4) + \
> >> DIV_ROUND_UP((name_size), 4) + \
> >> @@ -101,6 +104,7 @@ static int dump_cleanup(DumpState *s)
> >> qemu_mutex_unlock_iothread();
> >> }
> >> }
> >> + migrate_del_blocker(dump_migration_blocker);
> >>
> >> return 0;
> >> }
> >> @@ -1927,11 +1931,6 @@ void qmp_dump_guest_memory(bool paging, const char
> >> *file,
> >> Error *local_err = NULL;
> >> bool detach_p = false;
> >>
> >> - if (runstate_check(RUN_STATE_INMIGRATE)) {
> >> - error_setg(errp, "Dump not allowed during incoming migration.");
> >> - return;
> >> - }
> >> -
> >> /* if there is a dump in background, we should wait until the dump
> >> * finished */
> >> if (dump_in_progress()) {
> >> @@ -2005,6 +2004,21 @@ void qmp_dump_guest_memory(bool paging, const char
> >> *file,
> >> return;
> >> }
> >>
> >> + if (!dump_migration_blocker) {
> >> + error_setg(&dump_migration_blocker,
> >> + "Live migration disabled: dump-guest-memory in
> >> progress");
> >> + }
> >> +
> >> + /*
> >> + * Allows even for -only-migratable, but forbid migration during the
> >> + * process of dump guest memory.
> >> + */
> >> + if (migrate_add_blocker_internal(dump_migration_blocker, errp)) {
> >> + /* Remember to release the fd before passing it over to dump
> >> state */
> >> + close(fd);
> >> + return;
> >> + }
> >>
> >
> > I would move it earlier. Why not leave it at the beginning of the
> > function as it was?
> >
> >
> Ah I think it's because dump_cleanup() isn't called when returning earlier.
> But relying on a cleanup done outside of this function is not obvious.
> Either dump_cleanup() should be called from here, or the blocker code
> should be moved to dump_init() imho.
Note that dump_cleanup() cleans s->fd, however fd has not delievered to s->fd
now.
I can move it into dump_init() but I need to put it after setting s->fd to make
dump_cleanup work. That could look a bit weird (as checking migration blocker
shouldn't really be bound to initialize of dump states).
There's another way to move add blocker to the entry of dump_in_progress()
check, however then all the checks between dump_in_progress() and before
opening the fd will not be able to "return" directly but we'll need to del the
blocker if any of those check fails.
That's how I come up with the current patch, which looks the easiest change and
still looks clean.
Besides, I think dump_in_progress() can be removed too with patch 1 checks
SAVEVM state too, however that's a side effect of migration_is_idle() and
dump_in_progres is doing the check in different ways by looking up dump state,
so I kept it.
Thanks,
--
Peter Xu