[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v3 0/5] migrations: Fix potential rare race of migration-test
From: |
Lukas Straub |
Subject: |
Re: [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank |
Date: |
Sat, 24 Jul 2021 17:21:44 +0000 |
On Thu, 22 Jul 2021 13:58:36 -0400
Peter Xu <peterx@redhat.com> wrote:
> v3:
> - Use WITH_QEMU_LOCK_GUARD() for patch 2 [Eric]
> (potentially I can also replace other existing uses of qemu_file_lock into
> WITH_QEMU_LOCK_GUARD, but I decided to took Dave's r-b first and leave that
> for later)
> - Added r-bs for Dave on patch 2/4
> - Add a comment in patch 5 to explain why it's safe to unregister yank without
> qemu_file_lock, in postcopy_pause() [Dave]
>
> v2:
> - Pick r-b for Dave on patch 1/3
> - Move migration_file_get_ioc() from patch 5 to patch 4, meanwhile rename it
> to
> qemu_file_get_ioc(). [Dave]
> - Patch 2 "migration: Shutdown src in await_return_path_close_on_source()" is
> replaced by patch "migration: Make from_dst_file accesses thread-safe"
> [Dave]
>
> Patch 1 fixes a possible race that migration thread can accidentally skip
> join() of rp_thread even if the return thread is enabled. Patch 1 is
> suspected
> to also be the root cause of the recent hard-to-reproduce migration-test
> failure here reported by PMM:
>
> https://lore.kernel.org/qemu-devel/YPamXAHwan%2FPPXLf@work-vm/
>
> I didn't reproduce it myself; but after co-debugged with Dave it's suspected
> that the race of rp_thread could be the cause. It's not exposed before
> because
> yank is soo strict on releasing instances, while we're not that strict before,
> and didn't join() on rp_thread wasn't so dangerous after all when migration
> succeeded before.
>
> Patch 2 fixes another theoretical race on accessing from_dst_file spotted by
> Dave. I don't think there's known issues with it, but may still worth fixing.
>
> Patch 3 should be a cleanup on yank that I think would be nice to have.
>
> Patch 4-5 are further cleanups to remove the ref==1 check in channel_close(),
> finally, as I always thought that's a bit hackish. So I used explicit
> unregister of the yank function at necessary places to replace that ref==1
> one.
>
> I still think having patch 3-5 altogether would be great, however I think
> patch
> 1 should still be the most important to be reviewed. Also it would be great
> to
> know whether patch 1 could fix the known yank crash.
>
> Please review, thanks.
>
> Peter Xu (5):
> migration: Fix missing join() of rp_thread
> migration: Make from_dst_file accesses thread-safe
> migration: Introduce migration_ioc_[un]register_yank()
> migration: Teach QEMUFile to be QIOChannel-aware
> migration: Move the yank unregister of channel_close out
>
> migration/channel.c | 15 ++-------
> migration/migration.c | 57 +++++++++++++++++++++++++++--------
> migration/migration.h | 15 +++++++--
> migration/multifd.c | 8 ++---
> migration/qemu-file-channel.c | 11 ++-----
> migration/qemu-file.c | 17 ++++++++++-
> migration/qemu-file.h | 4 ++-
> migration/ram.c | 3 +-
> migration/savevm.c | 11 +++++--
> migration/yank_functions.c | 42 ++++++++++++++++++++++++++
> migration/yank_functions.h | 3 ++
> 11 files changed, 138 insertions(+), 48 deletions(-)
>
Looks good to me,
Reviewed-by: Lukas Straub <lukasstraub2@web.de>
--
pgpVAhZChbXYC.pgp
Description: OpenPGP digital signature
- [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank, Peter Xu, 2021/07/22
- [PATCH v3 1/5] migration: Fix missing join() of rp_thread, Peter Xu, 2021/07/22
- [PATCH v3 2/5] migration: Make from_dst_file accesses thread-safe, Peter Xu, 2021/07/22
- [PATCH v3 3/5] migration: Introduce migration_ioc_[un]register_yank(), Peter Xu, 2021/07/22
- [PATCH v3 4/5] migration: Teach QEMUFile to be QIOChannel-aware, Peter Xu, 2021/07/22
- [PATCH v3 5/5] migration: Move the yank unregister of channel_close out, Peter Xu, 2021/07/22
- Re: [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank,
Lukas Straub <=
- Re: [PATCH v3 0/5] migrations: Fix potential rare race of migration-test after yank, Dr. David Alan Gilbert, 2021/07/26