qemu-devel

Re: [RFC v5 0/3] migration: reduce time of loading non-iterable vmstate


From: Chuang Xu
Subject: Re: [RFC v5 0/3] migration: reduce time of loading non-iterable vmstate
Date: Mon, 20 Feb 2023 21:36:31 +0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.6.1

Hi, Peter

On 2023/2/17 11:52 PM, Peter Xu wrote:
> Hello, Chuang,
>
> On Fri, Feb 17, 2023 at 04:11:19PM +0800, Chuang Xu wrote:
>> Error 1 was triggered by our sanity check. I tried adding
>> RCU_READ_LOCK_GUARD() in address_space_init() and it works, but I'm not
>> sure whether this code change is appropriate. If it is not, we may need
>> to consider a different sanity check.
> I'd suggest not adding RCU locks without a good reason.
>
> address_space_init() is definitely a special context because the AS is
> exclusively owned by the caller before it returns.  It means no RCU
> protection is needed at all because no one else is touching it; neither do
> we need qatomic_rcu_read() when reading.
>
> So I suggest we directly reference current_map, even though that'll need a
> rich comment:
>
>   static void address_space_set_flatview(AddressSpace *as)
>   {
> -    FlatView *old_view = address_space_to_flatview(as);
> +    /*
> +     * NOTE: we don't use the RCU flavour of address_space_to_flatview()
> +     * because we exclusively own as->current_map here: it's either during
> +     * init of an address space, or during commit() with BQL held.
> +     */
> +    FlatView *old_view = as->current_map;
>
> We can have address_space_to_flatview_raw(), but since we'll directly
> modify as->current_map very soon in the same function, it may not even be
> worth bothering.

I agree with you.
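The raw accessor you mention would presumably be nothing more than a plain
read of as->current_map - a minimal sketch (this helper is hypothetical and
doesn't exist in the tree yet):

    /*
     * Hypothetical non-RCU accessor, per the discussion above: only valid
     * for callers that exclusively own the AddressSpace, i.e. during
     * address_space_init() or inside commit() with the BQL held.
     */
    static inline FlatView *address_space_to_flatview_raw(AddressSpace *as)
    {
        return as->current_map;
    }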

But now I am facing a new problem: our sanity check is not as reliable as we
thought.

My current code passes all the tests that Juan told me about in his email,
but if I configure with no options and run 'make check', my patch triggers
an error in the AHCI migration test:

G_TEST_DBUS_DAEMON=/data00/migration/qemu-open/tests/dbus-vmstate-daemon.sh 
QTEST_QEMU_BINARY=./qemu-system-x86_64 QTEST_QEMU_IMG=./qemu-img 
MALLOC_PERTURB_=1 
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
/data00/migration/qemu-open/build/tests/qtest/ahci-test --tap -k
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
 ✀  
――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
stderr:
qemu-system-x86_64: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
qemu-system-x86_64: Failed to load ich9_ahci:ahci
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:1f.2/ich9_ahci'
qemu-system-x86_64: load of migration failed: Operation not permitted
Migration did not complete, status: failed

It seems that ahci_state_post_load() calls memory_access_is_direct() and
accesses mr->ram. Because the mr transaction is still delayed at that point,
this value doesn't meet expectations. Here is the call chain, so you can
follow the code:
->ahci_state_post_load
->ahci_cond_start_engines
->ahci_map_fis_address
->map_page
->dma_memory_map
->address_space_map
->memory_access_is_direct
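
In other words, the post_load hook tries to re-map the FIS receive buffer
through the device's AddressSpace while the delayed transaction has not yet
produced an up-to-date FlatView, so the lookup fails. A simplified
illustration (a hypothetical helper, not the real map_page() in
hw/ide/ahci.c):

    #include "qemu/osdep.h"
    #include "sysemu/dma.h"

    /*
     * Hypothetical sketch of the failing step: map the FIS receive area
     * through the AS.  If the FlatView is stale (the RAM region backing
     * fis_addr is not visible yet), dma_memory_map() returns NULL or a
     * short mapping, and the caller reports "bad FIS receive buffer
     * address".
     */
    static bool example_map_fis(AddressSpace *as, dma_addr_t fis_addr,
                                void **fis_buf)
    {
        dma_addr_t len = 256;   /* size of the FIS receive area */

        *fis_buf = dma_memory_map(as, fis_addr, &len,
                                  DMA_DIRECTION_FROM_DEVICE,
                                  MEMTXATTRS_UNSPECIFIED);
        if (!*fis_buf) {
            return false;
        }
        if (len < 256) {
            /* a partial mapping is useless for the FIS area; drop it */
            dma_memory_unmap(as, *fis_buf, len, DMA_DIRECTION_FROM_DEVICE, 0);
            *fis_buf = NULL;
            return false;
        }
        return true;
    }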

I think we need a memory_region_transaction_commit_force() to force-commit
pending transactions while loading the vmstate. The function is designed
like this:

/*
 * memory_region_transaction_commit_force() is designed to force the mr
 * transaction to be committed in the process of loading vmstate.
 */
void memory_region_transaction_commit_force(void)
{
    AddressSpace *as;
    unsigned int memory_region_transaction_depth_copy =
        memory_region_transaction_depth;

    /*
     * Temporarily replace memory_region_transaction_depth with 0 to prevent
     * memory_region_transaction_commit_force() and address_space_to_flatview()
     * from calling each other recursively.
     */
    memory_region_transaction_depth = 0;

    assert(qemu_mutex_iothread_locked());

    if (memory_region_update_pending) {
        flatviews_reset();

        MEMORY_LISTENER_CALL_GLOBAL(begin, Forward);

        QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
            address_space_set_flatview(as);
            address_space_update_ioeventfds(as);
        }
        memory_region_update_pending = false;
        ioeventfd_update_pending = false;
        MEMORY_LISTENER_CALL_GLOBAL(commit, Forward);
    } else if (ioeventfd_update_pending) {
        QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
            address_space_update_ioeventfds(as);
        }
        ioeventfd_update_pending = false;
    }

    /* recover memory_region_transaction_depth */
    memory_region_transaction_depth = memory_region_transaction_depth_copy;
}
Now there are two options to use this function:
1. call it in address_space_to_flatview():

static inline FlatView *address_space_to_flatview(AddressSpace *as)
{
    /*
     * Before using any flatview, check whether we're during a memory
     * region transaction. If so, force commit the memory transaction.
     */
    if (memory_region_transaction_in_progress()) {
        memory_region_transaction_commit_force();
    }
    return qatomic_rcu_read(&as->current_map);
}
2. call it before each post_load()
I prefer the former, as it is more general than the latter.
And with this function, the sanity check is not necessary any more.
Although we may inevitably call memory_region_transaction_commit_force()
several times, in my actual tests the number of calls to
memory_region_transaction_commit_force() is still insignificant compared
with the number of calls to memory_region_transaction_commit() before the
optimization. As a result, this code won't affect the optimization effect,
but it can ensure reliability.


>> Error 2 was related to postcopy. I read the official postcopy document
>> (I hope it is the latest) and learned that two threads will call
>> qemu_loadvm_state_main() in the process of postcopy. The one called by the
>> main thread takes the BQL, and the one called by the ram_listen thread
>> doesn't. The latter calls memory_region_transaction_commit(), which checks
>> whether the BQL is held, and thus triggers the assertion. Creating a new
>> function qemu_loadvm_state_ram_listen() without
>> memory_region_transaction_commit() will solve this error.
> Sounds right, because the whole qemu_loadvm_state_main() process shouldn't
> load any device state or anything that requires the BQL at all; in most
> cases that should be only leftover RAM states.
>
> I think we only want to optimize precopy, not the postcopy phase. Note
> that it should include the precopy -> postcopy transition too, so it does
> cover postcopy, just not the "post" phase of migration - that transition
> is a nested call to qemu_loadvm_state_main() with a whole MIG_CMD_PACKAGED
> package, which does get covered and which is the real meat of the device
> transition for postcopy.
>
> So in short: instead of creating qemu_loadvm_state_ram_listen(), how about
> modifying your patch 3 so that, instead of changing qemu_loadvm_state_main(),
> we do it for qemu_loadvm_state() only (so you can wrap the begin() and
> commit() around qemu_loadvm_state_main() there)?

In patch v6 I'll adopt the change you suggested. Thanks!
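
For clarity, that change would presumably look roughly like this (a minimal
sketch of the suggested direction, not the actual patch; setup and error
handling elided):

    /*
     * Sketch: batch memory-region updates across the whole load by wrapping
     * qemu_loadvm_state_main() in qemu_loadvm_state(), instead of changing
     * qemu_loadvm_state_main() itself.  This way the nested MIG_CMD_PACKAGED
     * load used at the precopy -> postcopy transition is covered too, while
     * the BQL-less ram_listen thread is left untouched.
     */
    int qemu_loadvm_state(QEMUFile *f)
    {
        MigrationIncomingState *mis = migration_incoming_get_current();
        int ret;

        /* ... header checks and setup as in the existing code ... */

        memory_region_transaction_begin();
        ret = qemu_loadvm_state_main(f, mis);
        memory_region_transaction_commit();

        /* ... post-load cleanup as in the existing code ... */
        return ret;
    }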

>> I don't know whether you suggest using this patch for postcopy. If this
>> patch is applicable to postcopy, then considering the difference between
>> how postcopy and precopy load device state, do we need to consider more
>> details?
> See above.  Yes, I definitely hope postcopy will be covered too.

Thanks!



