Re: [Qemu-devel] [PATCH 1/2] pc-dimm: No numa option shouldn't break hot


From: Igor Mammedov
Subject: Re: [Qemu-devel] [PATCH 1/2] pc-dimm: No numa option shouldn't break hotplug memory feature
Date: Tue, 23 Sep 2014 10:40:47 +0200

On Mon, 22 Sep 2014 17:03:12 +0800
Tang Chen <address@hidden> wrote:

> Hi Igor,
> 
> On 09/19/2014 08:26 PM, Igor Mammedov wrote:
> > On Wed, 17 Sep 2014 16:32:20 +0800
> > Hu Tao <address@hidden> wrote:
> >
> >> On Tue, Sep 16, 2014 at 06:39:15PM +0800, zhanghailiang wrote:
> >>> If we do not configure numa option, memory hotplug should work as well.
> >>> It should not depend on numa option.
> >>>
> >>> Steps to reproduce:
> >>> (1) Start VM: qemu-kvm -m 1024,slots=4,maxmem=8G
> >>> (2) Hotplug memory
> >>> It fails and reports:
> >>> "'DIMM property node has value 0' which exceeds the number of numa nodes: 0"
> >>>
> >> I remember Tang Chen had a patch for this bug; this is what Andrey
> >> suggested:
> >>
> >>    I think that there should be no
> >>    cases when a dimm is plugged (and the check from the patch fires) without
> >>    an actually populated NUMA, because not every OS will work around this by
> >>    faking the node.
> > This doesn't take into account that the dimm device by itself has nothing
> > to do with numa (numa is just an optional property of its representation in
> > ACPI land and nothing else).
> >
> > In case initial memory is converted to dimm devices, qemu can be
> > started without the numa option and it still must work.
> >
> > So I'm in favor of this patch.
> 
> I just did some tests. Even if I modify the qemu code and set the
> hotpluggable bit in SRAT to 0, memory hotplug still works on a Linux guest,
> which means the Linux kernel doesn't check SRAT info after the system is up
> when doing memory hotplug.
> 
> I did the following modification in hw/i386/acpi-build.c
> -    ram_addr_t hotplugabble_address_space_size =
> -        object_property_get_int(OBJECT(pcms), PC_MACHINE_MEMHP_REGION_SIZE,
> -                                NULL);
> +    ram_addr_t hotplugabble_address_space_size = 0;
> 
> And when the guest is up, no memory should be hotpluggable, I think. But
> I hot-added memory successfully.
> 
> IMHO, memory hotplug should be based on ACPI on Linux. SRAT tells the
> system which memory ranges are hotpluggable, and we should follow it. So I
> think the Linux kernel has a problem here.
It's fine to use SRAT for these purposes on baremetal NUMA systems, since
chipset constraints there make it possible to statically allocate ranges
for every possible DIMM socket.
However, SRAT (which is an optional table, BTW) entries are not mandatory
and can be overridden by an ACPI device's _PXM/_CRS methods, which replace
the need for SRAT entries. QEMU uses this fact by supplying these methods.
QEMU adds a fake SRAT entry only to work around a Windows limitation,
and for nothing else.

I think Linux does not violate the ACPI spec and behaves as expected. Moreover,
it's more correct than Windows, since memory hotplug will work on non-NUMA
machines as well.

Hence I think this patch is correct and allows memory hotplug in the absence
of a NUMA configuration. It would also allow using pc-dimm as a replacement
for initial memory on non-NUMA configs (which is on my TODO list).
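
To illustrate the behaviour I have in mind, here is a rough sketch of a node
check that tolerates a missing NUMA configuration. It is not the patch itself:
dimm_node_is_valid() and its parameters are made-up names for illustration,
and the real check lives in QEMU's pc-dimm code in a different form.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not QEMU code): decide whether a DIMM's "node"
 * property is acceptable.
 *
 * node          - value of the DIMM's node property
 * nb_numa_nodes - number of nodes configured with -numa, 0 if none
 */
static bool dimm_node_is_valid(unsigned node, unsigned nb_numa_nodes)
{
    if (nb_numa_nodes == 0) {
        /* No NUMA configured: only the default node 0 makes sense. */
        return node == 0;
    }
    /* NUMA configured: the node must be one of the existing nodes. */
    return node < nb_numa_nodes;
}
```

With a plain "node >= nb_numa_nodes" comparison, node 0 is rejected whenever
nb_numa_nodes is 0, which matches the failure reported at the top of the thread.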

As for Windows, QEMU has no idea which OS it will be running.
I see 2 ways to solve the issue:
 1. The user should know that memory hotplug on Windows requires a NUMA
    machine and specify the "-numa ..." option for this case.
    (I've discussed this with the libvirt folks and was promised that
    if the user enables memory hotplug, libvirt would provide the "-numa"
    option to work around the Windows issue.)

 2. QEMU could unconditionally create a single NUMA node if memory hotplug
    is enabled (but that should be enabled only for 2.2 or later machine
    types to avoid migration issues).
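
Option 2 could be sketched roughly as below. This is only an illustration
under my own assumptions: struct machine_cfg, its fields and
effective_numa_nodes() are hypothetical names, not existing QEMU interfaces.

```c
#include <stdbool.h>

/* Hypothetical machine description; the fields are illustrative only. */
struct machine_cfg {
    unsigned nb_numa_nodes;  /* nodes requested with -numa, 0 if none */
    bool memhp_enabled;      /* memory hotplug requested (slots/maxmem) */
    bool implicit_numa_ok;   /* true only for 2.2 or later machine types */
};

/*
 * Return the number of NUMA nodes the guest should see.
 * A single implicit node is created only when memory hotplug is enabled,
 * the user configured no nodes, and the machine type is new enough that
 * the changed guest-visible layout cannot break migration.
 */
static unsigned effective_numa_nodes(const struct machine_cfg *cfg)
{
    if (cfg->nb_numa_nodes == 0 && cfg->memhp_enabled &&
        cfg->implicit_numa_ok) {
        return 1;
    }
    return cfg->nb_numa_nodes;
}
```

Older machine types keep reporting 0 nodes here, so what the guest sees on
them does not change.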

> 
> I'd like to fix it like this:
> 
> 1. Send patches to make the Linux kernel check SRAT info when doing
>    memory hotplug. (Now, SRAT is only checked at boot time.)
> 
> 2. In qemu, when users give a memory hotplug option without NUMA
>    options, we create node0 and node1, and make node1 hotpluggable.
>    This is because when using MOVABLE_NODE, node0, in which the kernel
>    resides, should not be hotpluggable.
>    Of course, making part of the memory in node0 hotpluggable is OK, but
>    on a real machine no one will do this, I think. So I suggest the
>    above idea.
> 
> Thanks. :)
> 
> >
> >> https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg04587.html
> >>
> >> Have you tested this patch with Windows guest?
> >>
> >> Regards,
> >> Hu
> >
> > .
> >
> 
> 



