Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscard

qemu-devel


From:	David Hildenbrand
Subject:	Re: [PATCH v2 0/6] migration/ram: Optimize for virtio-mem via RamDiscardManager
Date:	Tue, 27 Jul 2021 11:25:09 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0

On 24.07.21 00:19, Peter Xu wrote:

On Fri, Jul 23, 2021 at 08:41:40PM +0200, David Hildenbrand wrote:

On 23.07.21 18:12, Peter Xu wrote:

On Thu, Jul 22, 2021 at 01:43:41PM +0200, David Hildenbrand wrote:

a) In precopy code, always clearing all dirty bits from the bitmap that
   correspond to discarded ranges, whenever we update the dirty bitmap. This
   results in logically unplugged memory never getting migrated.


Have you seen cases where discarded areas are being marked as dirty?
That suggests something somewhere is writing to them and shouldn't be.


I have, due to sub-optimal clear_bmap handling that is to be sorted out by

https://lkml.kernel.org/r/20210722083055.23352-1-wei.w.wang@intel.com

Whereby the issue is rather that initially dirty bits don't get cleared in
lower layers and keep popping up as dirty.

The issue with postcopy recovery code setting discarded ranges dirty in
the dirty bitmap, I did not try reproducing. But from looking at the
code, it's pretty clear that it would happen.

Apart from that, nothing should dirty that memory. Of course,
malicious guests could trigger it for now, in which case we wouldn't catch it
and migrate such pages with postcopy, because the final bitmap sync in
ram_postcopy_send_discard_bitmap() is performed without calling notifiers
right now.


I have the same concern with Dave: does it mean that we don't need to touch at
least ramblock_sync_dirty_bitmap in patch 3?


Yes, see the comment in patch #3:

"
Note: If discarded ranges span complete clear_bmap chunks, we'll never
clear the corresponding bits from clear_bmap and consequently never call
memory_region_clear_dirty_bitmap on the affected regions. While this is
perfectly fine, we're still synchronizing the bitmap of discarded ranges,
for example, in
ramblock_sync_dirty_bitmap()->cpu_physical_memory_sync_dirty_bitmap()
but also during memory_global_dirty_log_sync().

In the future, it might make sense to never even synchronize the dirty log
of these ranges, for example in KVM code, skipping discarded ranges
completely.
"
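The effect described in that note can be sketched with a toy model in C.
This is not QEMU code: struct toy_ram, the toy_* functions and the sizes are
made up for illustration, and the log_clears counter merely stands in for
calls to memory_region_clear_dirty_bitmap(). The assumption is that the
dirty log of a chunk only gets cleared lazily, right before the first page
of that chunk is sent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES      16 /* pages in the toy RAM block (arbitrary) */
#define TOY_CHUNK_PAGES 4  /* pages covered by one clear_bmap bit (arbitrary) */
#define TOY_NCHUNKS     (TOY_NPAGES / TOY_CHUNK_PAGES)

/* Hypothetical stand-in for the per-RAMBlock migration state. */
struct toy_ram {
    bool dirty[TOY_NPAGES];           /* migration dirty bitmap */
    bool clear_pending[TOY_NCHUNKS];  /* models clear_bmap */
    unsigned log_clears[TOY_NCHUNKS]; /* counts simulated dirty log clears */
};

/* Mark all pages dirty except the discarded range; all chunks pending. */
void toy_ram_init(struct toy_ram *r, size_t discard_start, size_t discard_len)
{
    assert(discard_start + discard_len <= TOY_NPAGES);

    for (size_t i = 0; i < TOY_NPAGES; i++) {
        bool discarded = i >= discard_start &&
                         i < discard_start + discard_len;
        r->dirty[i] = !discarded;
    }
    for (size_t c = 0; c < TOY_NCHUNKS; c++) {
        r->clear_pending[c] = true;
        r->log_clears[c] = 0;
    }
}

/* "Send" every dirty page once; returns the number of pages sent. */
size_t toy_send_dirty_pages(struct toy_ram *r)
{
    size_t sent = 0;

    for (size_t i = 0; i < TOY_NPAGES; i++) {
        size_t chunk = i / TOY_CHUNK_PAGES;

        if (!r->dirty[i]) {
            continue; /* clean or discarded: never touched */
        }
        if (r->clear_pending[chunk]) {
            /* Lazily clear the lower-layer dirty log of this chunk. */
            r->clear_pending[chunk] = false;
            r->log_clears[chunk]++;
        }
        r->dirty[i] = false;
        sent++;
    }
    return sent;
}
```

With pages 4..10 discarded, chunk 1 (pages 4..7) is covered completely, so
its pending bit stays set and its log is never cleared, while chunk 2 (pages
8..11) still contains one populated page and is cleared once.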

The KVM path might be even more interesting (with !dirty ring IIRC).

So that might certainly be worth looking into if we find it to be a real
performance problem.


OK; hmm then I feel like what's missing is we didn't have the dirty bmap and
the clear map synced - say, what if we do memory_region_clear_dirty_bitmap()
when dropping the virtio-mem unplugged ranges too?

Is it a problem that we leave clear_bmap set and actually never clear
some ranges? I don't think so. To me, this feels like the right thing to
do: no need to clear something (in QEMU, in KVM) nobody cares about.

IMHO, the real optimization should be to not even sync discarded ranges
(not from the accelerator, not from the memory region), skipping these
ranges completely (no sync, no clear). With what you propose, we might
end up calling into KVM to clear bitmaps of ranges we are not interested
in, no?


If discarded ranges are static during migration, the clear dirty log should
happen once for them at bitmap init time.  Then IIUC when sync we don't need to
worry about unplugged memory anymore.


Again, I'm not sure why we want to clear something we don't care about.


There are 3 cases to handle I think:

1) Initially, when the bitmap is set to 1, we want to exclude all
discarded ranges.

2) Whenever we sync the bitmap, we don't want to get discarded ranges
set dirty. (e.g., bits still or again dirty in KVM or the memory region)

3) When reloading the bitmap during postcopy errors.

I think for 1) and 3) we seem to agree that clearing the discarded
ranges from the dirty bitmap is conceptually the right thing.



For 2) I see 3 options:

a) Sync everything, fixup the dirty bitmap, never clear the dirty log of
discarded parts. It's fairly simple and straightforward, as I can
simply reuse the existing helper. Something that's discarded will never
be dirty, not even if a misbehaving guest touches memory it shouldn't.
[this patch]

b) Sync only populated parts, no need to fixup the dirty bitmap, never
clear the dirty log of discarded parts. It's a bit more complicated but
achieves the same goal as a). [optimization I propose for the future]

c) Sync everything, don't fixup the dirty bitmap, clear the dirty log of
discarded parts initially. There are ways we still might migrate
discarded ranges, for example, if a misbehaving guest touches memory it
shouldn't. [what you propose]

Is my understanding correct? Any reasons why we should choose c) over b)
long term or c) over a) short term?
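
To make the difference between a) and c) concrete, here is a toy model in C.
It is only a sketch, not the code from the patch: struct toy_block and all
toy_* names are hypothetical, the bitmaps are plain bool arrays, and clearing
the lower-layer dirty log is deliberately left out, so the log here simply
records what a guest touched since the initial state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 16 /* size of the toy RAM block in pages (arbitrary) */

/* Hypothetical stand-in for a RAMBlock; one bool per page for readability. */
struct toy_block {
    bool dirty[TOY_NPAGES];     /* migration dirty bitmap */
    bool log[TOY_NPAGES];       /* dirty log of a lower layer (e.g. KVM) */
    bool discarded[TOY_NPAGES]; /* logically unplugged pages */
};

/* Clear the dirty bits of all discarded pages; returns how many were set. */
size_t toy_exclude_discarded(struct toy_block *b)
{
    size_t cleared = 0;

    assert(b != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        if (b->discarded[i] && b->dirty[i]) {
            b->dirty[i] = false;
            cleared++;
        }
    }
    return cleared;
}

/* Cases 1) and 3): mark everything dirty, then exclude discarded pages. */
void toy_bitmap_init(struct toy_block *b)
{
    assert(b != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        b->dirty[i] = true;
    }
    toy_exclude_discarded(b);
}

/*
 * Case 2): merge the lower-layer log into the dirty bitmap.
 * fixup == true  models option a): sync everything, then fix up.
 * fixup == false models option c): sync everything, no fixup.
 */
void toy_bitmap_sync(struct toy_block *b, bool fixup)
{
    assert(b != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        b->dirty[i] = b->dirty[i] || b->log[i];
    }
    if (fixup) {
        toy_exclude_discarded(b);
    }
}

/* Number of pages that would still be migrated. */
size_t toy_count_dirty(const struct toy_block *b)
{
    size_t n = 0;

    assert(b != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        n += b->dirty[i] ? 1 : 0;
    }
    return n;
}
```

If pages 4 and 5 are discarded and a misbehaving guest dirties page 5 in the
log after toy_bitmap_init(), toy_bitmap_sync(b, true) leaves page 5 clean,
whereas toy_bitmap_sync(b, false) marks it dirty and it would get migrated.
Option b) would behave like the fixup variant in this model, except that the
loop would skip discarded pages instead of clearing them afterwards.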


Thanks!

--
Thanks,

David / dhildenb

