qemu-devel

Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON


From: Peter Xu
Subject: Re: [PATCH v1 2/2] virtio-balloon: disallow postcopy with VIRTIO_BALLOON_F_FREE_PAGE_HINT
Date: Wed, 7 Jul 2021 18:40:37 -0400

On Wed, Jul 07, 2021 at 02:22:32PM -0700, Alexander Duyck wrote:
> On Wed, Jul 7, 2021 at 1:08 PM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Wed, Jul 07, 2021 at 08:57:29PM +0200, David Hildenbrand wrote:
> > > On 07.07.21 20:02, Peter Xu wrote:
> > > > On Wed, Jul 07, 2021 at 04:06:55PM +0200, David Hildenbrand wrote:
> > > > > As it never worked properly, let's disable it via the postcopy notifier
> > > > > on the destination. Trying to set "migrate_set_capability postcopy-ram
> > > > > on" on the destination now results in "virtio-balloon: 'free-page-hint'
> > > > > does not support postcopy Error: Postcopy is not supported".
> > > >
> > > > Would it be possible to do this in reversed order?  Say, dynamically
> > > > disable free-page-hinting if postcopy capability is set when migration
> > > > starts?  Perhaps it can also be re-enabled automatically when migration
> > > > completes?
> > >
> > > I remember that this might be quite racy. We would have to make sure that
> > > no hinting happens before we enable the capability.
> > >
> > > As soon as we messed with the dirty bitmap (during precopy), postcopy is
> > > no longer safe. As noted in the patch, the only runtime alternative is to
> > > disable postcopy as soon as we actually do clear a bit. Alternatively, we
> > > could ignore any hints if the postcopy capability was enabled.
> >
> > Logically migration capabilities are applied at VM starts, and these
> > capabilities should be constant during migration (I didn't check if there's
> > a hard requirement; easy to add that if we want to assure it), and in most
> > cases for the lifecycle of the vm.
> 
> Would it make sense to maybe just look at adding a postcopy value to
> the PrecopyNotifyData that you could populate with
> migration_in_postcopy() in precopy_notify()?

Should we check migrate_postcopy_ram() rather than migration_in_postcopy()?
It's the precopy phase that's dropping the dirty bits and can potentially hang
a postcopy vcpu, afaiu.
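
To spell out the difference, here is a minimal standalone sketch. It is not
QEMU code: MockMigration and the mock_* helpers are hypothetical stand-ins
that only model "capability is set" versus "currently in the postcopy phase".

```c
#include <stdbool.h>

/* Hypothetical, simplified model of migration state (not QEMU's types). */
typedef enum { PHASE_NONE, PHASE_PRECOPY, PHASE_POSTCOPY } MockPhase;

typedef struct {
    bool postcopy_ram_cap;  /* models what migrate_postcopy_ram() reports */
    MockPhase phase;        /* models the phase migration is currently in */
} MockMigration;

/* Stand-in for migrate_postcopy_ram(): is the capability enabled? */
static bool mock_migrate_postcopy_ram(const MockMigration *m)
{
    return m->postcopy_ram_cap;
}

/* Stand-in for migration_in_postcopy(): are we in postcopy right now? */
static bool mock_migration_in_postcopy(const MockMigration *m)
{
    return m->phase == PHASE_POSTCOPY;
}

/* Gate on the current phase: still lets hints through during precopy. */
static bool hint_allowed_by_phase(const MockMigration *m)
{
    return !mock_migration_in_postcopy(m);
}

/* Gate on the capability: refuses hints whenever postcopy may follow. */
static bool hint_allowed_by_capability(const MockMigration *m)
{
    return !mock_migrate_postcopy_ram(m);
}
```

With postcopy-ram=on and the migration still in precopy, the phase-based gate
would still allow hinting (and so dirty bits could be dropped), while the
capability-based gate refuses it.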

> 
> Then all you would need to do is check for that value and, if it is set,
> shut down the page hinting or don't start it. I suspect flagging unused
> pages doesn't add much value in a postcopy environment anyway.
> 
> > >
> > > Whatever we do, we have to make sure that a user cannot trick the system
> > > into an inconsistent state. Like enabling hinting, starting migration,
> > > then enabling the postcopy capability and kicking off postcopy. I did not
> > > check if we allow for that, though.
> >
> > We could turn free page hinting off when migration starts with
> > postcopy-ram=on, then re-enable it after migration finishes.  That looks
> > very safe to me.  And I don't even worry about a user trying to mess it up,
> > as that only puts their own VM at risk; that's mostly fine to me.
> 
> We wouldn't necessarily even need to really turn it off, just don't
> start it. I wonder if we couldn't just get away with adding a check to
> the existing virtio_balloon_free_page_hint_notify to see if we are in
> the postcopy state there and just shut things down or not start them.

This makes me wonder whether qemu_guest_free_page_hint() should be called at
all on the destination host when an incoming postcopy migration is in progress.

Right now the check migration_is_setup_or_active() should return true on the
destination host; however, I am not sure that's necessary, as we don't track
dirty pages at all there.
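
Roughly what I have in mind, as a standalone sketch only. MockSide and both
gate functions are hypothetical; nothing here is the existing QEMU code, and
the "is_incoming" flag is an assumption about how the destination side would
be identified.

```c
#include <stdbool.h>

/* Hypothetical model of one side of a migration (not a QEMU type). */
typedef struct {
    bool setup_or_active;  /* models migration_is_setup_or_active() */
    bool is_incoming;      /* assumed flag: true on the destination host */
} MockSide;

/* Models the current behaviour: only the setup/active check is applied. */
static bool hint_gate_current(const MockSide *s)
{
    return s->setup_or_active;
}

/*
 * Possible stricter gate: also ignore hints on the destination, since no
 * dirty tracking happens there and a hint has nothing useful to clear.
 */
static bool hint_gate_source_only(const MockSide *s)
{
    return s->setup_or_active && !s->is_incoming;
}
```

On the source both gates behave the same; they only differ on the destination,
where the stricter one would drop the hint.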

-- 
Peter Xu



