qemu-devel

Re: [PATCH v1] docs/devel: Add VFIO device migration documentation


From: Kirti Wankhede
Subject: Re: [PATCH v1] docs/devel: Add VFIO device migration documentation
Date: Wed, 4 Nov 2020 01:18:12 +0530



On 10/30/2020 12:35 AM, Alex Williamson wrote:
On Thu, 29 Oct 2020 23:11:16 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:


<snip>

+System memory dirty pages tracking
+----------------------------------
+
+A ``log_sync`` memory listener callback is added to mark system memory pages

s/is added to mark/marks those/
+as dirty which are used for DMA by VFIO device. Dirty pages bitmap is queried

s/by/by the/
s/Dirty/The dirty/
+per container. All pages pinned by vendor driver through vfio_pin_pages()

s/by/by the/
+external API have to be marked as dirty during migration. When there are CPU
+writes, CPU dirty page tracking can identify dirtied pages, but any page pinned
+by vendor driver can also be written by device. There is currently no device

s/by/by the/ (x2)
+which has hardware support for dirty page tracking. So all pages which are
+pinned by vendor driver are considered as dirty.
+Dirty pages are tracked when device is in stop-and-copy phase because if pages
+are marked dirty during pre-copy phase and content is transferred from source to
+destination, there is no way to know newly dirtied pages from the point they
+were copied earlier until device stops. To avoid repeated copy of same content,
+pinned pages are marked dirty only during stop-and-copy phase.


Let me take a quick stab at rewriting this paragraph (not sure if I
understood it correctly):

"Dirty pages are tracked when the device is in the stop-and-copy phase.
During the pre-copy phase, it is not possible to distinguish a dirty
page that has been transferred from the source to the destination from
newly dirtied pages, which would lead to repeated copying of the same
content. Therefore, pinned pages are only marked dirty during the
stop-and-copy phase." ?

I think the above rephrasing only covers repeated copying in the pre-copy
phase. I used "copied earlier until device stops" to cover both
pre-copy and stop-and-copy, up to the point where the device stops.


Now I'm confused, I thought we had abandoned the idea that we can only
report pinned pages during stop-and-copy.  Doesn't the device need to
expose its dirty memory footprint during the iterative phase regardless
of whether that causes repeat copies?  If QEMU iterates and sees that
all memory is still dirty, it may have transferred more data, but it
can actually predict if it can achieve its downtime tolerances.  Which
is more important, less data transfer or predictability?  Thanks,


Even if QEMU copies and transfers the content of all system memory pages during pre-copy (the worst case: an IOMMU-backed mdev device whose vendor driver is not smart enough to pin pages explicitly, so all system memory pages are marked dirty), its prediction about downtime tolerance will still not be correct, because during stop-and-copy all of those pages need to be copied again, as the device can write to any of the pinned pages.
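
As a rough illustration of the trade-off being discussed, here is a toy model in Python. It is not QEMU or VFIO code. The function name, the `Result` fields, and the page counts are all made up for the example, and it assumes that pinned pages and CPU-dirtied pages do not overlap and that every pinned page has to be sent again in stop-and-copy because the device may have written to it.

```python
from dataclasses import dataclass


@dataclass
class Result:
    precopy_pages: int          # pages sent over all pre-copy iterations
    estimated_final_pages: int  # pages reported dirty in the last pre-copy iteration
    actual_final_pages: int     # pages that stop-and-copy really has to send


def simulate(pinned_pages, cpu_dirty_per_iter, iterations, report_pinned_in_precopy):
    """Hypothetical model of one migration under one reporting policy.

    report_pinned_in_precopy=True  : pinned pages are reported dirty in every
                                     pre-copy iteration.
    report_pinned_in_precopy=False : pinned pages are reported dirty only in
                                     stop-and-copy.
    """
    precopy_pages = 0
    reported = 0
    for _ in range(iterations):
        # CPU dirty tracking always reports the pages the CPU wrote to.
        reported = cpu_dirty_per_iter
        if report_pinned_in_precopy:
            # Assumption: no hardware dirty tracking, so every pinned page
            # is treated as dirty each time it is reported.
            reported += pinned_pages
        precopy_pages += reported

    # Assumption: once the device stops, all pinned pages plus the last
    # batch of CPU-dirtied pages must be sent, whichever policy was used.
    actual_final = pinned_pages + cpu_dirty_per_iter
    return Result(precopy_pages, reported, actual_final)


if __name__ == "__main__":
    for policy in (True, False):
        print(policy, simulate(1000, 10, 3, policy))
```

Under these assumptions, with 1000 pinned pages, 10 CPU-dirtied pages per iteration and 3 iterations, both policies end with 1010 pages to send in stop-and-copy. Reporting pinned pages during pre-copy sends 3030 pages beforehand and shows 1010 pages as still dirty. Reporting them only in stop-and-copy sends 30 pages beforehand and shows 10 pages as still dirty.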

Thanks,
Kirti



