qemu-devel

Re: [PATCH] migration/rdma: Use huge page register VM memory


From: Dr. David Alan Gilbert
Subject: Re: [PATCH] migration/rdma: Use huge page register VM memory
Date: Mon, 7 Jun 2021 16:00:28 +0100
User-agent: Mutt/2.0.7 (2021-05-04)

* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Mon, Jun 07, 2021 at 01:57:02PM +0000, LIZHAOXIN1 [李照鑫] wrote:
> > When using libvirt for RDMA live migration, if the VM memory is too large,
> > it will take a lot of time to deregister the VM at the source side, resulting
> > in a long downtime (VM 64G, deregister vm time is about 400ms).
> >     
> > Although the VM's memory uses 2M huge pages, the MLNX driver still uses 4K
> > pages for pin memory, as well as for unpin. So we use huge pages to skip the
> > process of pin memory and unpin memory to reduce downtime.
> >    
> > The test environment:
> > kernel: linux-5.12
> > MLNX: ConnectX-4 LX
> > libvirt command:
> > virsh migrate --live --p2p --persistent --copy-storage-inc --listen-address \
> > 0.0.0.0 --rdma-pin-all --migrateuri rdma://192.168.0.2 [VM] qemu+tcp://192.168.0.2/system
> >     
> > Signed-off-by: lizhaoxin <lizhaoxin1@kingsoft.com>
> > 
> > diff --git a/migration/rdma.c b/migration/rdma.c
> > index 1cdb4561f3..9823449297 100644
> > --- a/migration/rdma.c
> > +++ b/migration/rdma.c
> > @@ -1123,13 +1123,26 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
> >      RDMALocalBlocks *local = &rdma->local_ram_blocks;
> >  
> >      for (i = 0; i < local->nb_blocks; i++) {
> > -        local->block[i].mr =
> > -            ibv_reg_mr(rdma->pd,
> > -                    local->block[i].local_host_addr,
> > -                    local->block[i].length,
> > -                    IBV_ACCESS_LOCAL_WRITE |
> > -                    IBV_ACCESS_REMOTE_WRITE
> > -                    );
> > +        if (strcmp(local->block[i].block_name,"pc.ram") == 0) {
> 
> 'pc.ram' is an x86 architecture specific name, so this will still
> leave a problem on other architectures I assume.

Yes, and it would also break even on PC when using NUMA.
I think the thing to do here is to call qemu_ram_pagesize() on the
RAMBlock:

  if (qemu_ram_pagesize(RAMBlock....) != qemu_real_host_page_size)
     it's a huge page

I guess it's probably best to do that in qemu_rdma_init_one_block or
something?

I wonder how that all works when there's a mix of different huge page
sizes?
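
To make the suggestion above concrete, here is a minimal, standalone sketch of
choosing the registration flags per block from its page size instead of from
its name. It does not use QEMU or libibverbs: DemoBlock, the DEMO_ACCESS_*
bits and both helper functions are hypothetical stand-ins, and the page_size
field only represents what qemu_ram_pagesize() would report for the block.
Real code would pass the resulting IBV_ACCESS_* flags to ibv_reg_mr().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IBV_ACCESS_* bits; real code would use
 * the constants from <infiniband/verbs.h> instead. */
enum {
    DEMO_ACCESS_LOCAL_WRITE  = 1 << 0,
    DEMO_ACCESS_REMOTE_WRITE = 1 << 1,
    DEMO_ACCESS_ON_DEMAND    = 1 << 2,
    DEMO_ACCESS_HUGETLB      = 1 << 3,
};

/* Hypothetical, simplified view of a RAM block (not QEMU's RAMBlock). */
typedef struct DemoBlock {
    const char *name;     /* e.g. "pc.ram"; deliberately never inspected */
    size_t page_size;     /* assumed: what qemu_ram_pagesize() would return */
} DemoBlock;

/* A block counts as huge-page backed when its page size differs from the
 * host's base page size, as in the pseudocode above. */
static bool demo_block_is_hugepage(const DemoBlock *b, size_t host_page_size)
{
    assert(host_page_size > 0);
    return b->page_size != host_page_size;
}

/* Pick the access flags for one block. Only huge-page blocks get the two
 * extra flags that the patch adds for "pc.ram". */
static int demo_block_access_flags(const DemoBlock *b, size_t host_page_size)
{
    int access = DEMO_ACCESS_LOCAL_WRITE | DEMO_ACCESS_REMOTE_WRITE;

    if (demo_block_is_hugepage(b, host_page_size)) {
        access |= DEMO_ACCESS_ON_DEMAND | DEMO_ACCESS_HUGETLB;
    }
    return access;
}
```

Because the decision is made per block, blocks with different page sizes are
each classified on their own. Whether the driver copes well with a mix of
huge page sizes is a separate question that this sketch does not answer.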

Dave

> > +            local->block[i].mr =
> > +                ibv_reg_mr(rdma->pd,
> > +                        local->block[i].local_host_addr,
> > +                        local->block[i].length,
> > +                        IBV_ACCESS_LOCAL_WRITE |
> > +                        IBV_ACCESS_REMOTE_WRITE |
> > +                        IBV_ACCESS_ON_DEMAND |
> > +                        IBV_ACCESS_HUGETLB
> > +                        );
> > +        } else {
> > +            local->block[i].mr =
> > +                ibv_reg_mr(rdma->pd,
> > +                        local->block[i].local_host_addr,
> > +                        local->block[i].length,
> > +                        IBV_ACCESS_LOCAL_WRITE |
> > +                        IBV_ACCESS_REMOTE_WRITE
> > +                        );
> > +        }
> > +
> >          if (!local->block[i].mr) {
> >              perror("Failed to register local dest ram block!\n");
> >              break;
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



