Re: [PATCH] migration/rdma: Use huge page register VM memory
From: Dr. David Alan Gilbert
Subject: Re: [PATCH] migration/rdma: Use huge page register VM memory
Date: Mon, 7 Jun 2021 16:00:28 +0100
User-agent: Mutt/2.0.7 (2021-05-04)
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Jun 07, 2021 at 01:57:02PM +0000, LIZHAOXIN1 [李照鑫] wrote:
> > When using libvirt for RDMA live migration, if the VM memory is large,
> > deregistering the VM's memory at the source side takes a long time,
> > resulting in a long downtime (for a 64G VM, deregistration takes about
> > 400ms).
> >
> > Although the VM's memory uses 2M huge pages, the MLNX driver still pins
> > and unpins memory in 4K pages. So we use huge pages to skip the pin and
> > unpin process and reduce downtime.
> >
> > The test environment:
> > kernel: linux-5.12
> > MLNX: ConnectX-4 LX
> > libvirt command:
> > virsh migrate --live --p2p --persistent --copy-storage-inc \
> >     --listen-address 0.0.0.0 --rdma-pin-all \
> >     --migrateuri rdma://192.168.0.2 [VM] qemu+tcp://192.168.0.2/system
> >
> > Signed-off-by: lizhaoxin <lizhaoxin1@kingsoft.com>
> >
> > diff --git a/migration/rdma.c b/migration/rdma.c
> > index 1cdb4561f3..9823449297 100644
> > --- a/migration/rdma.c
> > +++ b/migration/rdma.c
> > @@ -1123,13 +1123,26 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
> > RDMALocalBlocks *local = &rdma->local_ram_blocks;
> >
> > for (i = 0; i < local->nb_blocks; i++) {
> > - local->block[i].mr =
> > - ibv_reg_mr(rdma->pd,
> > - local->block[i].local_host_addr,
> > - local->block[i].length,
> > - IBV_ACCESS_LOCAL_WRITE |
> > - IBV_ACCESS_REMOTE_WRITE
> > - );
> > + if (strcmp(local->block[i].block_name,"pc.ram") == 0) {
>
> 'pc.ram' is an x86 architecture specific name, so this will still
> leave a problem on other architectures I assume.
Yes, and it will also break even on PC when using NUMA.

I think the thing to do here is to call qemu_ram_pagesize() on the
RAMBlock:

    if (qemu_ram_pagesize(RAMBlock....) != qemu_real_host_page_size)
        /* it's a huge page */

I guess it's probably best to do that in qemu_rdma_init_one_block or
somewhere similar?

I wonder how that all works when there's a mix of different huge page
sizes?
Dave
> > + local->block[i].mr =
> > + ibv_reg_mr(rdma->pd,
> > + local->block[i].local_host_addr,
> > + local->block[i].length,
> > + IBV_ACCESS_LOCAL_WRITE |
> > + IBV_ACCESS_REMOTE_WRITE |
> > + IBV_ACCESS_ON_DEMAND |
> > + IBV_ACCESS_HUGETLB
> > + );
> > + } else {
> > + local->block[i].mr =
> > + ibv_reg_mr(rdma->pd,
> > + local->block[i].local_host_addr,
> > + local->block[i].length,
> > + IBV_ACCESS_LOCAL_WRITE |
> > + IBV_ACCESS_REMOTE_WRITE
> > + );
> > + }
> > +
> > if (!local->block[i].mr) {
> > perror("Failed to register local dest ram block!\n");
> > break;
>
> Regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK