Re: [Qemu-devel] [PATCH v5 45/45] Inhibit ballooning during postcopy


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PATCH v5 45/45] Inhibit ballooning during postcopy
Date: Mon, 23 Mar 2015 12:21:50 +0000
User-agent: Mutt/1.5.23 (2014-03-12)

* David Gibson (address@hidden) wrote:
> On Wed, Feb 25, 2015 at 04:52:08PM +0000, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <address@hidden>
> > 
> > The userfault mechanism used for postcopy generates faults
> > for us on pages that are 'not present', inflating a balloon in
> > the guest causes host pages to be marked as 'not present'; doing
> > this during a postcopy, as potentially the same pages were being
> > received from the source, would confuse the state of the received
> > page -> disable ballooning during postcopy.
> 
> That is a ludicrously long sentence, which I have great difficulty parsing.

OK, how about:

-----
Postcopy uses userfaultfd to detect accesses to pages that haven't been
transferred yet; it raises a fault on any page that is 'not present'.
Ballooning also causes pages to be marked as 'not present' when the guest
inflates the balloon.
A balloon inflated during postcopy could therefore discard pages that are
currently in flight from the source and may arrive at about the same time,
leaving the state of those pages ambiguous.

To avoid this confusion, disable ballooning during postcopy.

-----

> > When disabled we drop balloon requests from the guest.  Since ballooning
> > is generally initiated by the host, the management system should avoid
> > initiating any balloon instructions to the guest during migration,
> > although it's not possible to know how long it would take a guest to
> > process a request made prior to the start of migration.
> 
> Yeah :/.  It would be nice if it could queue the guest actions,
> instead of dropping them.

Yes, I did look at that briefly, but it's not trivial.  For example,
consider a guest that discards some pages by inflating and later
deflates: it expects to have lost that data, but then starts using
those physical pages again.
If you replay that sequence at the end of postcopy, you lose the newly
written data.
So you have to filter out inflates that have later been deflated, and
order the remaining ones correctly against changes made to those pages
after the deflation.
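
To make the filtering half of that concrete, here is a rough sketch of
what queuing might look like.  It is not part of this patch and nothing
like it exists in the series; NPAGES, pending_discard,
queue_balloon_request() and collect_pending_discards() are hypothetical
names used only for illustration.  It covers only the "inflate cancelled
by a later deflate" case and does nothing about ordering against pages
still arriving from the source, which is the harder part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8   /* hypothetical: a tiny guest, for illustration only */

/* One flag per guest page: true if a discard is queued for it. */
static bool pending_discard[NPAGES];

/* Record a balloon request seen while postcopy is still running. */
static void queue_balloon_request(size_t page, bool deflate)
{
    assert(page < NPAGES);
    /*
     * An inflate queues a discard; a later deflate cancels it, because
     * the guest may start using the page again after deflating.
     */
    pending_discard[page] = !deflate;
}

/*
 * Once postcopy has finished, collect the pages whose discard is still
 * queued.  Writes at most 'max' page numbers to 'out', clears those
 * entries, and returns how many were written.
 */
static size_t collect_pending_discards(size_t *out, size_t max)
{
    size_t n = 0;

    for (size_t i = 0; i < NPAGES && n < max; i++) {
        if (pending_discard[i]) {
            out[n++] = i;
            pending_discard[i] = false;
        }
    }
    return n;
}
```

With this, inflating pages 2 and 5 and then deflating page 2 leaves only
page 5 to be discarded at the end, so data the guest wrote to page 2 after
the deflate is not thrown away.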

Dave

> 
> > 
> > Signed-off-by: Dr. David Alan Gilbert <address@hidden>
> > ---
> >  balloon.c                  | 11 +++++++++++
> >  hw/virtio/virtio-balloon.c |  4 +++-
> >  include/sysemu/balloon.h   |  2 ++
> >  migration/postcopy-ram.c   |  9 +++++++++
> >  4 files changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/balloon.c b/balloon.c
> > index dea19a4..faedb60 100644
> > --- a/balloon.c
> > +++ b/balloon.c
> > @@ -35,6 +35,17 @@
> >  static QEMUBalloonEvent *balloon_event_fn;
> >  static QEMUBalloonStatus *balloon_stat_fn;
> >  static void *balloon_opaque;
> > +static bool balloon_inhibited;
> > +
> > +bool qemu_balloon_is_inhibited(void)
> > +{
> > +    return balloon_inhibited;
> > +}
> > +
> > +void qemu_balloon_inhibit(bool state)
> > +{
> > +    balloon_inhibited = state;
> > +}
> >  
> >  static bool have_ballon(Error **errp)
> >  {
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 7bfbb75..b0e94ee 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> > @@ -36,9 +36,11 @@
> >  static void balloon_page(void *addr, int deflate)
> >  {
> >  #if defined(__linux__)
> > -    if (!kvm_enabled() || kvm_has_sync_mmu())
> > +    if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
> > +                                         kvm_has_sync_mmu())) {
> >          qemu_madvise(addr, TARGET_PAGE_SIZE,
> >                  deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
> > +    }
> >  #endif
> >  }
> >  
> > diff --git a/include/sysemu/balloon.h b/include/sysemu/balloon.h
> > index 0345e01..6851d99 100644
> > --- a/include/sysemu/balloon.h
> > +++ b/include/sysemu/balloon.h
> > @@ -23,5 +23,7 @@ typedef void (QEMUBalloonStatus)(void *opaque, BalloonInfo *info);
> >  int qemu_add_balloon_handler(QEMUBalloonEvent *event_func,
> >                          QEMUBalloonStatus *stat_func, void *opaque);
> >  void qemu_remove_balloon_handler(void *opaque);
> > +bool qemu_balloon_is_inhibited(void);
> > +void qemu_balloon_inhibit(bool state);
> >  
> >  #endif
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index d8f5ccd..b9f5848 100644
> > --- a/migration/postcopy-ram.c
> > +++ b/migration/postcopy-ram.c
> > @@ -24,6 +24,7 @@
> >  #include "migration/migration.h"
> >  #include "migration/postcopy-ram.h"
> >  #include "sysemu/sysemu.h"
> > +#include "sysemu/balloon.h"
> >  #include "qemu/bitmap.h"
> >  #include "qemu/error-report.h"
> >  #include "trace.h"
> > @@ -531,6 +532,8 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> >          mis->have_fault_thread = false;
> >      }
> >  
> > +    qemu_balloon_inhibit(false);
> > +
> >      if (enable_mlock) {
> >          if (os_mlock() < 0) {
> >              error_report("mlock: %s", strerror(errno));
> > @@ -780,6 +783,12 @@ int postcopy_ram_enable_notify(MigrationIncomingState *mis)
> >          return -1;
> >      }
> >  
> > +    /*
> > +     * Ballooning can mark pages as absent while we're postcopying
> > +     * that would cause false userfaults.
> > +     */
> > +    qemu_balloon_inhibit(true);
> > +
> >      trace_postcopy_ram_enable_notify();
> >  
> >      return 0;
> 
> -- 
> David Gibson                  | I'll have my music baroque, and my code
> david AT gibson.dropbear.id.au        | minimalist, thank you.  NOT _the_ _other_
>                               | _way_ _around_!
> http://www.ozlabs.org/~dgibson


--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


