Re: [PATCH RFC 11/21] migration: Add hugetlb-doublemap cap
From: Peter Xu
Subject: Re: [PATCH RFC 11/21] migration: Add hugetlb-doublemap cap
Date: Tue, 24 Jan 2023 16:15:37 -0500
On Tue, Jan 24, 2023 at 12:45:38PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > Add a new cap to allow mapping hugetlbfs backed RAMs in small page sizes.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Thanks.
>
> although, I'm curious if the protocol actually changes
Yes it does.
It differs not in the form of a changed header or any frame definitions,
but in the format of how huge pages are sent. The old binary can only send
a huge page by sending all the small pages sequentially starting from index
0 to index N_HUGE-1; while the new binary can send the huge page out of
order. For the latter it's the same as when huge pages are not used.
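
To make the difference concrete, here is a minimal sketch of the two send
orders for the small pages of one huge page. It is an illustration only, not
QEMU code: both function names are hypothetical, and "requested page first,
then the remainder in index order" is just one assumed out-of-order policy;
the text above only says the new binary may send out of order.

```python
from typing import Iterator


def send_order_sequential(n_huge: int) -> Iterator[int]:
    """Old behaviour as described: small pages go out as 0 .. N_HUGE-1."""
    return iter(range(n_huge))


def send_order_out_of_order(n_huge: int, requested: int) -> Iterator[int]:
    """Assumed new behaviour: the requested small page goes out first.

    The remaining pages follow in index order; that part is an
    assumption made for this sketch only.
    """
    if not 0 <= requested < n_huge:
        raise ValueError("requested index outside the huge page")
    yield requested
    for index in range(n_huge):
        if index != requested:
            yield index


if __name__ == "__main__":
    # 4 small pages per huge page keeps the output readable.
    print(list(send_order_sequential(4)))
    print(list(send_order_out_of_order(4, requested=2)))
```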
> or whether a doublepage enabled destination would work with an unmodified
> source?
This is an interesting question.
I would expect old -> new to work as usual, because the page frames are not
modified, so the dest node will just see pages being migrated in a
sequential manner. The latency of a page request will be the same as with
the old binary though, because even if the dest host can handle small pages
it won't be able to get the pages it wants as soon as possible - the src
host decides which page to send.
Meanwhile new -> old shouldn't work I think, as described above, because the
dest host should see weird things happening, e.g., a huge page being sent
not starting from index 0 but from index X (0 < X <= N_HUGE-1). It should
quickly bail out assuming there's something wrong.
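
The compatibility argument can be sketched with a toy receiver that only
accepts the small pages of a huge page in strict index order. This is a
model of the behaviour described above, not the actual check in the QEMU
destination code; the class and helper names are hypothetical.

```python
from typing import Iterable


class StrictHugePageReceiver:
    """Toy model of an old destination: pages must arrive as 0, 1, 2, ..."""

    def __init__(self, n_huge: int) -> None:
        self.n_huge = n_huge
        self.expected = 0  # next small-page index within the current huge page

    def receive(self, index: int) -> None:
        if index != self.expected:
            raise ValueError(
                f"expected small page {self.expected}, got {index}"
            )
        # Wrap around once the whole huge page has been received.
        self.expected = (self.expected + 1) % self.n_huge


def stream_accepted(order: Iterable[int], n_huge: int) -> bool:
    """Return True if the strict receiver consumes the stream without bailing out."""
    receiver = StrictHugePageReceiver(n_huge)
    try:
        for index in order:
            receiver.receive(index)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    print(stream_accepted([0, 1, 2, 3], n_huge=4))  # old source, sequential
    print(stream_accepted([2, 0, 1, 3], n_huge=4))  # new source, out of order
```

The opposite direction (old -> new) corresponds to feeding a sequential
stream to a receiver without the ordering check, which trivially succeeds.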
> I guess potentially you can get away without the dirty clearing
> of the partially sent hugepages that the source normally does.
Good point. It's actually more relevant to the other patch later on
reworking the discard logic. I kept it as-is for two main reasons:
  1) It is still not 100% confirmed how MADV_DONTNEED should behave on
     HGM enabled memory ranges where huge pages used to be mapped. It's
     part of the discussion upstream on the kernel patchset. I think it's
     settling, but in the current series I kept it in a form so it'll work
     in all cases.
  2) Not dirtying the partially sent huge pages can always reduce the
     number of small pages being migrated, but it can also change the
     content of discard messages due to the frame format of
     MIG_CMD_POSTCOPY_RAM_DISCARD, in that we can have a lot more
     scattered ranges, so a lot more messaging can be needed. With the
     existing logic, since we'll always re-dirty the partially sent
     pages, the ranges are more likely to be contiguous and so cheaper
     to send.
     * CMD_POSTCOPY_RAM_DISCARD consist of:
     *      byte   version (0)
     *      byte   Length of name field (not including 0)
     *  n x byte   RAM block name
     *      byte   0 terminator (just for safety)
     *  n x        Byte ranges within the named RAMBlock
     *      be64   Start of the range
     *      be64   Length
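
As a rough illustration of why scattered ranges cost more, the layout quoted
above can be packed as follows. This is a sketch of the payload only: the
function name `encode_discard` and the block name "pc.ram" are assumptions
for the example, and the outer migration command framing is not modelled.

```python
import struct


def encode_discard(block_name, ranges, version=0):
    """Pack one discard payload following the quoted field layout.

    ranges is a sequence of (start, length) pairs in bytes.
    """
    name = block_name.encode("ascii")
    if len(name) > 255:
        raise ValueError("RAM block name does not fit in a one-byte length")
    payload = struct.pack("B", version)      # byte: version (0)
    payload += struct.pack("B", len(name))   # byte: name length, without the 0
    payload += name                          # n x byte: RAM block name
    payload += b"\x00"                       # byte: 0 terminator
    for start, length in ranges:
        # Each range is two big-endian 64-bit values: start, then length.
        payload += struct.pack(">QQ", start, length)
    return payload


if __name__ == "__main__":
    one_range = encode_discard("pc.ram", [(0, 8 * 4096)])
    # Same number of discarded bytes, but split into alternating 4K pieces.
    scattered = encode_discard("pc.ram", [(i * 8192, 4096) for i in range(8)])
    print(len(one_range), len(scattered))
```

Every extra range adds 16 bytes, so the payload grows linearly with the
number of discontiguous ranges.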
I think 1) may not hold as the kernel series evolves, so it may not be true
anymore. 2) may still be true, but I think it's worth some testing
(especially on 1G pages) to see how much it could interfere with the discard
procedure. Maybe it won't be as bad as I think. Even if it is, we can
evaluate the tradeoff between "slower discard sync" and "fewer pages to
send". E.g., we can consider changing the frame layout by bumping
postcopy_ram_discard_version.
I'll take a note on this one and provide an update in the next version.
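
The tradeoff in 2) can be explored with a toy model that collapses a
per-small-page dirty bitmap into (start, length) runs, with and without
re-dirtying partially dirty huge pages. This is a simplified sketch, not the
QEMU discard code: the helper names are hypothetical, units are small pages
rather than bytes, and "any dirty small page marks the whole huge page
dirty" is an assumed stand-in for the existing re-dirty logic.

```python
def dirty_ranges(bitmap):
    """Collapse a list of booleans into (start, length) runs of dirty pages."""
    runs = []
    start = None
    for index, dirty in enumerate(bitmap):
        if dirty and start is None:
            start = index
        elif not dirty and start is not None:
            runs.append((start, index - start))
            start = None
    if start is not None:
        runs.append((start, len(bitmap) - start))
    return runs


def redirty_partial_huge_pages(bitmap, n_huge):
    """Mark a whole huge page dirty if any of its small pages is dirty."""
    result = list(bitmap)
    for base in range(0, len(bitmap), n_huge):
        chunk = bitmap[base:base + n_huge]
        if any(chunk):
            result[base:base + len(chunk)] = [True] * len(chunk)
    return result


if __name__ == "__main__":
    # Two huge pages of 4 small pages each, both partially dirty.
    bitmap = [True, False, True, False, False, True, False, True]
    fine = dirty_ranges(bitmap)
    coarse = dirty_ranges(redirty_partial_huge_pages(bitmap, n_huge=4))
    print(len(fine), sum(length for _, length in fine))
    print(len(coarse), sum(length for _, length in coarse))
```

In this example the fine-grained bitmap needs four ranges covering four
pages, while the re-dirtied one needs a single range covering eight pages.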
--
Peter Xu