Re: [Qemu-devel] [RFC PATCH v2 1/1] s390x/css: unrestrict cssids
From: Dong Jia Shi
Subject: Re: [Qemu-devel] [RFC PATCH v2 1/1] s390x/css: unrestrict cssids
Date: Thu, 30 Nov 2017 11:16:24 +0800
User-agent: Mutt/1.5.21 (2010-09-15)

* Halil Pasic <address@hidden> [2017-11-29 17:30:15 +0100]:
>
>
> On 11/29/2017 12:47 PM, Cornelia Huck wrote:
> > On Wed, 29 Nov 2017 16:17:35 +0800
> > Dong Jia Shi <address@hidden> wrote:
> >
> >> * Halil Pasic <address@hidden> [2017-11-28 14:07:58 +0100]:
> >>
> >> [...]
> >>> The auto-generated bus ids are affected by both changes. We hope to not
> >>> encounter any auto-generated bus ids in production as Libvirt is always
> >>> explicit about the bus id. Since 8ed179c937 ("s390x/css: catch section
> >>> mismatch on load", 2017-05-18) the worst that can happen because the same
> >>> device ended up having a different bus id is a cleanly failed migration.
> >>> I find it hard to reason about the impact of changed auto-generated bus
> >>> ids on migration for command line users as I don't know which rules
> >>> such a user is supposed to follow.
> >> For this paragraph, Halil pointed out to me a case that he is thinking of.
> >> 1. VM configuration with 3 devices:
> >> -device virtio (e.g. virtio-blk-ccw,id=disk0)
> >> -device vfio-ccw (e.g. id=vfio0)
> >> -device virtio (e.g. virtio-rng-ccw,id=rng0)
> >> 2. Start the vm.
> >> 3. device_del vfio0
> >> 4. migrate "exec:gzip -c > /tmp/tmp_vmstate.gz"
> >> 5. modify cmd line from step 1 by removing the vfio0 device, and adding:
> >> -incoming "exec:gzip -c -d /tmp/tmp_vmstate.gz"
> >>
> >> Let me list my test results here for everybody's reference.
> >>
> >> W/o this patch
> >> ==============
> >>
> >> ------------+---------------+-------------
> >>             | squashing off | squashing on
> >> ------------+---------------+-------------
> >> auto id     |       F       |      F
> >> ------------+---------------+-------------
> >> explicit id |       F       |      S
> >> ------------+---------------+-------------
> >>
> >> T1. squashing off + auto id
> >> qemu-system-s390x: vmstate: get_nullptr expected VMS_NULLPTR_MARKER
> >> qemu-system-s390x: Failed to load s390_css:css
> >> qemu-system-s390x: error while loading state for instance 0x0 of device
> >> 's390_css'
> >> qemu-system-s390x: load of migration failed: Invalid argument
> >> [Fail due to css mismatch - there is no css 0 in the new vm.]
> >>
> >> T2. squashing off + explicitly given id
> >> qemu-system-s390x: vmstate: get_nullptr expected VMS_NULLPTR_MARKER
> >> qemu-system-s390x: Failed to load s390_css:css
> >> qemu-system-s390x: error while loading state for instance 0x0 of device
> >> 's390_css'
> >> qemu-system-s390x: load of migration failed: Invalid argument
> >> [Fail due to css mismatch - there is no css 0 in the new vm.]
> > Hmm... so should we even try to migrate an empty css 0? It only exists
> > because we have created a device that we had to detach anyway because
> > it was non-migratable...
> >
> > [Probably no easy way to deal with this, though.]
> >
>
> We could make the thing go away when the last device is gone.
Is it possible to free the empty css in a .pre_save handler somewhere?
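To make the question a bit more concrete, here is a toy model in plain C of
the idea. It is only a sketch under assumptions: ToyCss, ToyChannelSubsys,
MAX_TOY_CSSID and the toy_css_* helpers are all made up for illustration and
do not exist in QEMU, and the cssid values used with it are arbitrary. It only
shows the "drop css images that have no devices left before saving" step, not
how such a hook would actually be wired into the vmstate code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_TOY_CSSID 255 /* hypothetical bound, for this model only */

/* Toy stand-in for one channel subsystem image: just a device count. */
typedef struct ToyCss {
    unsigned int nr_devices;
} ToyCss;

/* Toy stand-in for the global state: css images are created on demand. */
typedef struct ToyChannelSubsys {
    ToyCss *css[MAX_TOY_CSSID + 1];
} ToyChannelSubsys;

/* Plugging a device implicitly creates its css image if it is missing. */
bool toy_css_add_device(ToyChannelSubsys *cs, unsigned int cssid)
{
    if (cssid > MAX_TOY_CSSID) {
        return false;
    }
    if (!cs->css[cssid]) {
        cs->css[cssid] = calloc(1, sizeof(ToyCss));
        if (!cs->css[cssid]) {
            return false;
        }
    }
    cs->css[cssid]->nr_devices++;
    return true;
}

/* Unplugging a device leaves the (possibly empty) css image behind. */
void toy_css_del_device(ToyChannelSubsys *cs, unsigned int cssid)
{
    assert(cssid <= MAX_TOY_CSSID);
    assert(cs->css[cssid] && cs->css[cssid]->nr_devices > 0);
    cs->css[cssid]->nr_devices--;
}

/*
 * What a pre-save step could do: free every css image that has no
 * devices left, so that it is not part of the saved state.
 * Returns the number of images freed.
 */
int toy_css_pre_save(ToyChannelSubsys *cs)
{
    int freed = 0;

    for (size_t i = 0; i <= MAX_TOY_CSSID; i++) {
        if (cs->css[i] && cs->css[i]->nr_devices == 0) {
            free(cs->css[i]);
            cs->css[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

In this model, adding a device to css 0, deleting it again and then calling
toy_css_pre_save() leaves no css 0 behind, while images that still have
devices are untouched.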
> I see a general problem with implicitly generated shared stuff.
>
> Obviously we can't fix the past.
Nod.
>
> @Dong Jia:
>
> Thanks for doing the experiments and publishing your findings.
>
Just wanted to ease the review. No need to mention it. :)
--
Dong Jia Shi