qemu-devel

From: Paolo Bonzini
Subject: Re: [Qemu-devel] [PATCH V17 00/11] Add support for binding guest numa nodes to host numa nodes
Date: Fri, 06 Dec 2013 10:06:16 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130923 Thunderbird/17.0.9

On 04/12/2013 08:58, Wanlong Gao wrote:
> As you know, QEMU cannot currently direct its own memory allocation, and
> this may cause cross-node access performance regressions in the guest.
> Worse, if PCI passthrough is used, the directly attached device does DMA
> between the device and the QEMU process, so all of the guest's pages get
> pinned by get_user_pages():
> 
> KVM_ASSIGN_PCI_DEVICE ioctl
>   kvm_vm_ioctl_assign_device()
>     =>kvm_assign_device()
>       => kvm_iommu_map_memslots()
>         => kvm_iommu_map_pages()
>            => kvm_pin_pages()
> 
> So, with a directly attached device, every guest page's reference count
> is raised by one, page migration no longer works, and AutoNUMA does not
> work either.
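A minimal, hypothetical sketch of the effect described above (this is not the
actual kvm_pin_pages() code): every page pinned through get_user_pages_fast()
gains a reference, and that extra reference is exactly what blocks migration.

    #include <linux/mm.h>

    /* Hypothetical sketch, not the kernel's kvm_pin_pages(): pin the
     * pages backing one memory slot.  Each successful call takes a
     * reference on the page (refcount +1), which is what prevents the
     * page from being migrated afterwards. */
    static int pin_slot_pages(unsigned long hva, unsigned long npages,
                              struct page **pages)
    {
        unsigned long i;

        for (i = 0; i < npages; i++) {
            /* write access for DMA, one page at a time */
            if (get_user_pages_fast(hva + i * PAGE_SIZE, 1, 1,
                                    &pages[i]) != 1)
                return -EFAULT;
        }
        return 0;
    }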
> 
> So we should set the memory allocation policy of the guest's nodes before
> the pages are actually mapped: a new policy only steers pages faulted in
> after it is set, and once pages are pinned they can no longer be moved to
> match it.
> 
> According to this patch set, we are able to set guest nodes memory policy
> like following:
> 
>  -numa node,nodeid=0,cpus=0 \
>  -numa mem,size=1024M,policy=membind,host-nodes=0-1 \
>  -numa node,nodeid=1,cpus=1 \
>  -numa mem,size=1024M,policy=interleave,host-nodes=1
> 
> This supports a
> "policy={default|membind|interleave|preferred},relative=true,host-nodes=N-N"
> style of format.
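These policy names correspond to the kernel's mbind(2) modes: MPOL_DEFAULT,
MPOL_BIND, MPOL_INTERLEAVE and MPOL_PREFERRED, with MPOL_F_RELATIVE_NODES
covering relative=true.  Below is a minimal sketch of applying "membind" to
one guest node's RAM before it is touched; the helper name and the addr/len
arguments are illustrative assumptions, and it uses libnuma's numaif.h
wrapper for brevity even though the series itself dropped libnuma (see
V9->V10 below):

    #include <numaif.h>  /* mbind() wrapper, MPOL_* modes; link with -lnuma */
    #include <stdio.h>

    /* Illustrative helper (hypothetical name): bind one guest node's RAM,
     * already mmap()ed at addr/len, to a single host node before the
     * pages are faulted in and pinned. */
    static int bind_guest_node(void *addr, unsigned long len,
                               unsigned long host_node)
    {
        unsigned long nodemask = 1UL << host_node;

        /* maxnode is the size of the mask in bits; host_node + 2 leaves
         * headroom for the known off-by-one between libc and the kernel
         * here (cf. the V11 fix to mbind's maxnode argument below). */
        if (mbind(addr, len, MPOL_BIND, &nodemask, host_node + 2, 0) < 0) {
            perror("mbind");
            return -1;
        }
        return 0;
    }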
> 
> A QMP command "query-numa" is also added to show NUMA info through
> this API.
> 
> The "info numa" monitor command is then converted to use this QMP
> command "query-numa".
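On the wire this would presumably follow the usual QMP pattern; the reply
layout comes from the series' qapi-schema.json additions (a list of NUMANode
objects, per the V12->V13 rename below) and is elided here rather than
guessed at:

    -> { "execute": "query-numa" }
    <- { "return": [ ... ] }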
> 
> This version removes the "set-mem-policy" QMP and HMP commands
> temporarily, as Marcelo and Paolo suggested.
> 
> 
> A simple test looks like the following:
> =====================================================
> Before:
> # numactl -H && /qemu/x86_64-softmmu/qemu-system-x86_64 -m 4096 -smp 2 \
>     -numa node,nodeid=0,cpus=0,mem=2048 -numa node,nodeid=1,cpus=1,mem=2048 \
>     -hda 6u4ga2.qcow2 -enable-kvm \
>     -device pci-assign,host=07:00.1,id=hostdev0,bus=pci.0,addr=0x7 \
>     & sleep 40 && numactl -H
> [1] 13320
> available: 2 nodes (0-1)
> node 0 cpus: 0 2
> node 0 size: 5111 MB
> node 0 free: 4653 MB
> node 1 cpus: 1 3
> node 1 size: 5120 MB
> node 1 free: 4764 MB
> node distances:
> node   0   1 
>   0:  10  20 
>   1:  20  10 
> available: 2 nodes (0-1)
> node 0 cpus: 0 2
> node 0 size: 5111 MB
> node 0 free: 4317 MB
> node 1 cpus: 1 3
> node 1 size: 5120 MB
> node 1 free: 876 MB
> node distances:
> node   0   1 
>   0:  10  20 
>   1:  20  10 
> 
> 
> 
> After:
> # numactl -H && /qemu/x86_64-softmmu/qemu-system-x86_64 -m 4096 -smp 4 \
>     -numa node,nodeid=0,cpus=0,cpus=2 \
>     -numa mem,size=2048M,policy=membind,host-nodes=0 \
>     -numa node,nodeid=1,cpus=1,cpus=3 \
>     -numa mem,size=2048M,policy=membind,host-nodes=1 \
>     -hda 6u4ga2.qcow2 -enable-kvm \
>     -device pci-assign,host=07:00.1,id=hostdev0,bus=pci.0,addr=0x7 \
>     & sleep 40 && numactl -H
> [1] 10862
> available: 2 nodes (0-1)
> node 0 cpus: 0 2
> node 0 size: 5111 MB
> node 0 free: 4718 MB
> node 1 cpus: 1 3
> node 1 size: 5120 MB
> node 1 free: 4799 MB
> node distances:
> node   0   1 
>   0:  10  20 
>   1:  20  10 
> available: 2 nodes (0-1)
> node 0 cpus: 0 2
> node 0 size: 5111 MB
> node 0 free: 2544 MB
> node 1 cpus: 1 3
> node 1 size: 5120 MB
> node 1 free: 2725 MB
> node distances:
> node   0   1 
>   0:  10  20 
>   1:  20  10 
> ===================================================
> 
> 
> V1->V2:
>     change to use QemuOpts in numa options (Paolo)
>     handle Error in mpol parser (Paolo)
>     change qmp command format to a mem-policy=membind,mem-hostnode=0-1 style (Paolo)
> V2->V3:
>     also handle Error in cpus parser (5/10)
>     split out common parser from cpus and hostnode parser (Bandan 6/10)
> V3->V4:
>     rebase to request for comments
> V4->V5:
>     use OptVisitor and split -numa option (Paolo)
>      - s/set-mpol/set-mem-policy (Andreas)
>      - s/mem-policy/policy
>      - s/mem-hostnode/host-nodes
>     fix hmp command process after error (Luiz)
>     add qmp command query-numa and convert info numa to it (Luiz)
> V5->V6:
>     remove tabs in json file (Laszlo, Paolo)
>     add back "-numa node,mem=xxx" as legacy (Paolo)
>     change cpus and host-nodes to array (Laszlo, Eric)
>     change "nodeid" to "uint16"
>     add NumaMemPolicy enum type (Eric)
>     rebased on Laszlo's "OptsVisitor: support / flatten integer ranges for repeating options" patch set, thanks for Laszlo's help
> V6->V7:
>     change UInt16 to uint16 (Laszlo)
>     fix a typo in adding qmp command set-mem-policy
> V7-V8:
>     rebase to current master with Laszlo's V2 of OptsVisitor patch set
>     fix an error caused by an added whitespace line
> V8->V9:
>     rebase to current master
>     check if total numa memory size is equal to ram_size (Paolo)
>     add comments to the OptsVisitor stuff in qapi-schema.json (Eric, Laszlo)
>     replace the use of numa_num_configured_nodes() (Andrew)
>     avoid abusing the fact i==nodeid (Andrew)
> V9->V10:
>     rebase to current master
>     remove libnuma (Andrew)
>     MAX_NODES=64 -> MAX_NODES=128 since libnuma selected 128 (Andrew)
>     use MAX_NODES instead of MAX_CPUMASK_BITS for host_mem bitmap (Andrew)
>     remove a useless clear_bit() operation (Andrew)
> V10->V11:
>     rebase to current master
>     fix "maxnode" argument of mbind(2)
> V11->V12:
>     rebase to current master
>     split patch 02/11 of V11 (Eduardo)
>     add some max value check (Eduardo)
>     split MAX_NODES change patch (Eduardo)
> V12->V13:
>     rebase to current master
>     thanks for Luiz's review (Luiz)
>     doc hmp command set-mem-policy (Luiz)
>     rename: NUMAInfo -> NUMANode (Luiz)
> V13->V14:
>     remove "set-mem-policy" qmp and hmp commands (Marcelo, Paolo)
> V14->V15:
>     rebase to the current master
> V15->V16:
>     rebase to current master
>     add more test log
> V16->V17:
>     use MemoryRegion to set policy instead of using "pc.ram" (Paolo); see
>       the sketch below
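A hypothetical sketch of what the V17 change amounts to, using two accessors
from QEMU's memory API; the function name here and the bind_guest_node()
helper from the earlier sketch are assumptions, not the series' actual code:

    #include "exec/memory.h"  /* memory_region_get_ram_ptr(), memory_region_size() */

    /* Illustrative only: resolve one guest node's RAM through its
     * MemoryRegion instead of looking up the "pc.ram" block by name,
     * then apply the configured host-node binding to that mapping. */
    static void numa_apply_node_policy(MemoryRegion *mr, unsigned long host_node)
    {
        void *host_ptr = memory_region_get_ram_ptr(mr); /* host virtual address */
        uint64_t size = memory_region_size(mr);

        bind_guest_node(host_ptr, size, host_node);     /* mbind() sketch above */
    }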
> 
> Wanlong Gao (11):
>   NUMA: move numa related code to new file numa.c
>   NUMA: check if the total numa memory size is equal to ram_size
>   NUMA: Add numa_info structure to contain numa nodes info
>   NUMA: convert -numa option to use OptsVisitor
>   NUMA: introduce NumaMemOptions
>   NUMA: add "-numa mem," options
>   NUMA: expand MAX_NODES from 64 to 128
>   NUMA: parse guest numa nodes memory policy
>   NUMA: set guest numa nodes memory policy
>   NUMA: add qmp command query-numa
>   NUMA: convert hmp command info_numa to use qmp command query_numa
> 
>  Makefile.target         |   2 +-
>  cpus.c                  |  14 --
>  hmp.c                   |  57 +++++++
>  hmp.h                   |   1 +
>  hw/i386/pc.c            |  21 ++-
>  include/exec/memory.h   |  15 ++
>  include/sysemu/cpus.h   |   1 -
>  include/sysemu/sysemu.h |  18 ++-
>  monitor.c               |  21 +--
>  numa.c                  | 408 ++++++++++++++++++++++++++++++++++++++++++++++++
>  qapi-schema.json        | 112 +++++++++++++
>  qemu-options.hx         |   6 +-
>  qmp-commands.hx         |  49 ++++++
>  vl.c                    | 160 +++----------------
>  14 files changed, 698 insertions(+), 187 deletions(-)
>  create mode 100644 numa.c
> 

I think patches 1-4 and 7 are fine.  For the rest, I'd rather wait for
Igor's memory hotplug patches and try to integrate with them.

Paolo


