From: peng.hao2
Subject: Re: [Qemu-devel] [PATCH V2] vhost: fix a migration failed because of vhost region merge
Date: Fri, 28 Jul 2017 22:21:46 +0800 (CST)
>On Wed, 26 Jul 2017 19:01:39 +0300
>"Michael S. Tsirkin" <address@hidden> wrote:
>> On Wed, Jul 26, 2017 at 04:05:43PM +0200, Igor Mammedov wrote:
>> > On Tue, 25 Jul 2017 22:47:18 +0300
>> > "Michael S. Tsirkin" <address@hidden> wrote:
>> >
>> > > On Tue, Jul 25, 2017 at 10:44:38AM +0200, Igor Mammedov wrote:
>> > > > On Mon, 24 Jul 2017 23:50:00 +0300
>> > > > "Michael S. Tsirkin" <address@hidden> wrote:
>> > > >
>> > > > > On Mon, Jul 24, 2017 at 11:14:19AM +0200, Igor Mammedov wrote:
>> > > > > > On Sun, 23 Jul 2017 20:46:11 +0800
>> > > > > > Peng Hao <address@hidden> wrote:
>> > > > > >
>> > > > > > > When a guest that has several hotplugged dimms is migrated, it
>> > > > > > > fails to resume on the destination. Regions on the source are
>> > > > > > > merged, but on the destination the devices are realized in a
>> > > > > > > different order than on the source with dimms, so while only
>> > > > > > > part of the devices are realized some regions cannot be merged.
>> > > > > > > The region count may then exceed the vhost slot limit.
>> > > > > > >
>> > > > > > > Signed-off-by: Peng Hao <address@hidden>
>> > > > > > > Signed-off-by: Wang Yechao <address@hidden>
>> > > > > > > ---
>> > > > > > > hw/mem/pc-dimm.c | 2 +-
>> > > > > > > include/sysemu/sysemu.h | 1 +
>> > > > > > > vl.c | 5 +++++
>> > > > > > > 3 files changed, 7 insertions(+), 1 deletion(-)
>> > > > > > >
>> > > > > > > diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
>> > > > > > > index ea67b46..13f3db5 100644
>> > > > > > > --- a/hw/mem/pc-dimm.c
>> > > > > > > +++ b/hw/mem/pc-dimm.c
>> > > > > > > @@ -101,7 +101,7 @@ void pc_dimm_memory_plug(DeviceState *dev, MemoryHotplugState *hpms,
>> > > > > > >          goto out;
>> > > > > > >      }
>> > > > > > >
>> > > > > > > -    if (!vhost_has_free_slot()) {
>> > > > > > > +    if (!vhost_has_free_slot() && qemu_is_machine_init_done()) {
>> > > > > > >          error_setg(&local_err, "a used vhost backend has no free"
>> > > > > > >                                 " memory slots left");
>> > > > > > that doesn't fix the issue.
>> > > > > > 1st: the number of used entries keeps changing after
>> > > > > > machine_init_done() is called,
>> > > > > > as regions continue to be mapped/unmapped during runtime
>> > > > >
>> > > > > But that's fine, we want hotplug to fail if we can not guarantee
>> > > > > vhost will work.
>> > > > don't we want to guarantee that vhost will work with dimm devices at
>> > > > startup if it was requested on the CLI, or fail startup cleanly if it can't?
>> > >
>> > > Yes. And failure to start vhost will achieve this without the need to
>> > > muck with DIMMs. The issue is only with DIMM hotplug when vhost is already running,
>> > > specifically because notifiers have no way to report or handle errors.
>> > >
>> > > > > > 2nd: it brings a regression and allows QEMU to start with more
>> > > > > > memory regions than the backend supports, which, combined
>> > > > > > with the missing error handling in vhost, will lead to qemu
>> > > > > > crashes or obscure bugs in the guest that break vhost-enabled
>> > > > > > drivers,
>> > > > > > i.e. the patch undoes what was fixed by
>> > > > > >
>> > > > > > https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg00789.html
>> > > > > >
>> > > > >
>> > > > > Why does it? The issue you fixed there is hotplug, and that means
>> > > > > pc_dimm_memory_plug being called after machine done.
>> > > > I wasn't able to crash an fc24 guest with current qemu/rhel7 kernel;
>> > > > it falls back to virtio and switches off vhost.
>> > >
>> > > I think vhostforce should make vhost fail and not fall back,
>> > > but that is another bug.
>> > currently vhostforce is broken: qemu continues to happily run with this
>> > patch, and without the patch it fails to start up. So I'd just NACK this patch
>> > on this behavioral change and ask to fix both issues in the same series.
>>
>> Please do not send nacks. They are not really helpful.
>>
>> Ack is like +1. You save some space since all you are saying is "all's
>> well". But if there's an issue you want to explain what it is 99% of the
>> time. So nack does not save any space and just pushes contributors away.
>> Especially if it's in all caps, that's just against netiquette.
>I'm sorry if the author took it as an offense;
>the intent was to say that by itself the patch allows QEMU to start in an invalid
>configuration and that this should be fixed as well.
>Anyway I've just posted an alternative patch that should work around the issue
>at hand while not removing the check:
>[PATCH for 2.10] pc: make 'pc.rom' readonly when machine has PCI enabled
>Peng Hao,
>could you check if it solves the problem for you?
Yes, it works. I never thought of that before.
Thanks. Mst, too.
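
For anyone reading the archive later, the slot counting problem from the original report can be sketched roughly as below. This is only an illustration and not QEMU code: `struct region`, `can_merge()` and `slots_needed()` are hypothetical names, and the merge rule (two regions share one slot only when they are adjacent in both guest-physical and host-virtual address space) is a simplifying assumption about how the vhost memory table is built.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one guest memory region; not a QEMU type. */
struct region {
    uint64_t gpa;   /* guest physical start address */
    uint64_t size;  /* length in bytes */
    uint64_t hva;   /* host virtual start address */
};

/*
 * Assumed merge rule: b can share a slot with a only if b starts exactly
 * where a ends, in both guest-physical and host-virtual address space.
 */
int can_merge(const struct region *a, const struct region *b)
{
    return a->gpa + a->size == b->gpa && a->hva + a->size == b->hva;
}

/*
 * Count the slots needed for n regions sorted by gpa. Each run of
 * mergeable neighbours collapses into a single slot.
 */
size_t slots_needed(const struct region *r, size_t n)
{
    size_t slots = 0;

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (i == 0 || !can_merge(&r[i - 1], &r[i])) {
            slots++;
        }
    }
    return slots;
}
```

With three back-to-back 1 GiB dimms this gives one slot once all of them are present, but two slots while only the first and the third have been realized. That is the transient state on the destination that can push the count over the backend limit.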