Re: [Qemu-devel] [PULL v1 0/7] MMIO Exec pull request


From: Dr. David Alan Gilbert
Subject: Re: [Qemu-devel] [PULL v1 0/7] MMIO Exec pull request
Date: Fri, 21 Jul 2017 11:31:11 +0100
User-agent: Mutt/1.8.3 (2017-05-23)

* Peter Maydell (address@hidden) wrote:
> On 21 July 2017 at 10:13, Dr. David Alan Gilbert <address@hidden> wrote:
> > I don't fully understand the way memory_region_do_invalidate_mmio_ptr
> > works; I see it dropping the memory region. If that's also dropping
> > the RAMBlock then it will upset migration. Even if the CPU is stopped,
> > I don't think that stops the migration thread walking through the
> > list of RAMBlocks.
> 
> memory_region_do_invalidate_mmio_ptr() calls memory_region_unref(),
> which will eventually result in memory_region_finalize() being
> called, which will call the MR destructor, which in this case is
> memory_region_destructor_ram(), which calls qemu_ram_free() on
> the RAMBlock, which removes the RAMBlock from the list (after
> taking the ramlist lock).

OK
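
For reference, that chain looks roughly like this (paraphrased from the
QEMU source rather than quoted verbatim; the function names are real but
the bodies are trimmed right down):

    static void memory_region_destructor_ram(MemoryRegion *mr)
    {
        qemu_ram_free(mr->ram_block);         /* drops the RAMBlock */
    }

    static void memory_region_finalize(Object *obj)
    {
        MemoryRegion *mr = MEMORY_REGION(obj);

        mr->destructor(mr);   /* memory_region_destructor_ram() for RAM */
    }

    void qemu_ram_free(RAMBlock *block)
    {
        qemu_mutex_lock_ramlist();
        QLIST_REMOVE_RCU(block, next);    /* unlink from ram_list.blocks */
        /* the block itself is reclaimed later, once RCU readers are
         * done with it */
        qemu_mutex_unlock_ramlist();
    }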

> > Even then, the problem is migration keeps a 'dirty_pages' count which is
> > calculated at the start of migration and updated as we dirty and send
> > pages; if we add/remove a RAMBlock then that dirty_pages count is wrong
> > and we either never finish migration (since dirty_pages never reaches
> > zero) or finish early with some unsent data.
> > And then there's the 'received' bitmap currently being added for
> > postcopy, which tracks each page that's been received (that's not in
> > yet, though).
> 
> It sounds like we really need to make migration robust against
> RAMBlock changes -- in the hotplug case it's certainly possible
> for RAMBlocks to be newly created or destroyed while migration
> is in progress.

Juan recently added a patch that disables hotplug/unplug during
migration (b06424de) - although we later figured out that it's not
quite enough: it blocks new hot-unplug requests, but one of those
might already be in flight when migration starts.
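
(For reference, the guard is something like this - paraphrased from
the qdev-monitor.c change in that commit, not the exact code:)

    /* In qdev_device_add(), per b06424de (paraphrased) */
    if (!migration_is_idle()) {
        error_setg(errp, "device_add not allowed while migrating");
        return NULL;
    }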

Adding a RAMBlock isn't too bad a problem; removing one is much hairier.

  a) Somehow it has to be coordinated with the destination - my
  suggestion for this is to send a QMP command down the migration stream
  to do the hot-add.

  b) Migration traverses the RAMBlock list over quite a long time; we
  can't hold an RCU read lock for the whole period we're doing
  migration, so the FOREACH_RCU isn't really protecting us from much -
  see the sketch below.  (We probably should hold ram_list.mutex for
  the whole of migration at the moment.)

  c) RAMBlocks themselves aren't reference counted.

  d) We've got secondary structures - such as that dirty_pages counter,
  a queue of page requests, bitmaps etc. - that are all either derived
  from RAMBlock list data or actually contain pointers to RAMBlocks.

  e) Then the RAMBlocks are registered with the kernel for dirty
  tracking.

None of those feel easily fixable.
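
To make (b) and (d) a bit more concrete, the migration side looks
roughly like this (a simplified sketch, not the real migration/ram.c;
migration_pass() and the dirty_pages handling are illustrative):

    static uint64_t dirty_pages;    /* computed once at migration start */

    static void migration_pass(void)
    {
        RAMBlock *block;

        rcu_read_lock();
        QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
            /* scan this block's dirty bitmap, send the dirty pages,
             * and decrement dirty_pages for each page sent */
        }
        rcu_read_unlock();
    }

The RCU read lock only covers a single pass, but migration does many
passes and keeps dirty_pages, the request queue and the bitmaps alive
across all of them; if a RAMBlock goes away between passes, dirty_pages
still includes its pages (so it never reaches zero), and any stored
RAMBlock pointers dangle once the block is reclaimed.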

Dave


> thanks
> -- PMM
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


