On 09/24/2015 08:53 PM, Denis V. Lunev wrote:
From: Igor Redko <address@hidden>
Release qemu global mutex before call synchronize_rcu().
synchronize_rcu() waiting for all readers to finish their critical
sections. There is at least one critical section in which we try
to get QGM (critical section is in address_space_rw() and
prepare_mmio_access() is trying to aquire QGM).
Both functions (migration_end() and migration_bitmap_extend())
are called from main thread which is holding QGM.
Thus there is a race condition that ends up with deadlock:
main thread working thread
Lock QGA |
| Call KVM_EXIT_IO handler
| |
| Open rcu reader's critical section
Migration cleanup bh |
| |
synchronize_rcu() is |
waiting for readers |
| prepare_mmio_access() is waiting for QGM
\ /
deadlock
The patch just releases QGM before calling synchronize_rcu().
Signed-off-by: Igor Redko <address@hidden>
Reviewed-by: Anna Melekhova <address@hidden>
Signed-off-by: Denis V. Lunev <address@hidden>
CC: Juan Quintela <address@hidden>
CC: Amit Shah <address@hidden>
---
migration/ram.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index 7f007e6..d01febc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1028,12 +1028,16 @@ static void migration_end(void)
{
/* caller have hold iothread lock or is in a bh, so there is
* no writing race against this migration_bitmap
+ * but rcu used not only for migration_bitmap, so we should
+ * release QGM or we get in deadlock.
*/
unsigned long *bitmap = migration_bitmap;
atomic_rcu_set(&migration_bitmap, NULL);
if (bitmap) {
memory_global_dirty_log_stop();
+ qemu_mutex_unlock_iothread();
synchronize_rcu();
+ qemu_mutex_lock_iothread();
migration_end() can called in two cases:
1. migration_completed
2. migration is cancelled
In case 1, you should not unlock iothread, otherwise, the vm's state may be
changed
unexpectedly.