Re: [Qemu-arm] [Qemu-devel] [RFC v3 06/15] hw/arm/virt: Allocate device_memory


From: Auger Eric
Subject: Re: [Qemu-arm] [Qemu-devel] [RFC v3 06/15] hw/arm/virt: Allocate device_memory
Date: Thu, 5 Jul 2018 16:27:05 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.4.0

Hi Shameer,

On 07/05/2018 03:19 PM, Shameerali Kolothum Thodi wrote:
> 
>> -----Original Message-----
>> From: Auger Eric [mailto:address@hidden
>> Sent: 05 July 2018 13:18
>> To: David Hildenbrand <address@hidden>; address@hidden;
>> address@hidden; address@hidden; address@hidden;
>> Shameerali Kolothum Thodi <address@hidden>;
>> address@hidden
>> Cc: address@hidden; address@hidden; address@hidden;
>> address@hidden; address@hidden
>> Subject: Re: [Qemu-devel] [RFC v3 06/15] hw/arm/virt: Allocate
>> device_memory
>>
>> Hi David,
>>
>> On 07/05/2018 02:09 PM, David Hildenbrand wrote:
>>> On 05.07.2018 14:00, Auger Eric wrote:
>>>> Hi David,
>>>>
>>>> On 07/05/2018 01:54 PM, David Hildenbrand wrote:
>>>>> On 05.07.2018 13:42, Auger Eric wrote:
>>>>>> Hi David,
>>>>>>
>>>>>> On 07/04/2018 02:05 PM, David Hildenbrand wrote:
>>>>>>> On 03.07.2018 21:27, Auger Eric wrote:
>>>>>>>> Hi David,
>>>>>>>> On 07/03/2018 08:25 PM, David Hildenbrand wrote:
>>>>>>>>> On 03.07.2018 09:19, Eric Auger wrote:
>>>>>>>>>> We define a new hotpluggable RAM region (aka. device memory).
>>>>>>>>>> Its base is at the 2TB GPA mark. This obviously requires 42-bit IPA
>>>>>>>>>> support in KVM/ARM, firmware and the guest kernel. At the moment the
>>>>>>>>>> device memory region is at most 2TB.
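(As a quick sanity check of the 42-bit figure, not part of the patch: a 2TB base plus a 2TB maximum size puts the top of device memory at 4TB, i.e. 2^42. A standalone sketch:)

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const uint64_t TiB = 1ULL << 40;
    uint64_t base = 2 * TiB;      /* device memory base GPA (from the patch) */
    uint64_t max_size = 2 * TiB;  /* maximum device memory size (from the patch) */
    uint64_t top = base + max_size;

    /* number of address bits needed to cover [0, top) */
    int ipa_bits = 64 - __builtin_clzll(top - 1);

    printf("top = 0x%" PRIx64 " -> %d IPA bits\n", top, ipa_bits);
    return 0;
}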
>>>>>>>>>
>>>>>>>>> Maybe a stupid question, but why exactly does it have to start at 2TB
>>>>>>>>> (and not e.g. at 1TB)?
>>>>>>>> Not a stupid question. See tentative answer below.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> This is largely inspired by the device memory initialization in
>>>>>>>>>> the pc machine code.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Eric Auger <address@hidden>
>>>>>>>>>> Signed-off-by: Kwangwoo Lee <address@hidden>
>>>>>>>>>> ---
>>>>>>>>>>  hw/arm/virt.c         | 104 ++++++++++++++++++++++++++++++++++++--------------
>>>>>>>>>>  include/hw/arm/arm.h  |   2 +
>>>>>>>>>>  include/hw/arm/virt.h |   1 +
>>>>>>>>>>  3 files changed, 79 insertions(+), 28 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>>>>>>>>>> index 5a4d0bf..6fefb78 100644
>>>>>>>>>> --- a/hw/arm/virt.c
>>>>>>>>>> +++ b/hw/arm/virt.c
>>>>>>>>>> @@ -59,6 +59,7 @@
>>>>>>>>>>  #include "qapi/visitor.h"
>>>>>>>>>>  #include "standard-headers/linux/input.h"
>>>>>>>>>>  #include "hw/arm/smmuv3.h"
>>>>>>>>>> +#include "hw/acpi/acpi.h"
>>>>>>>>>>
>>>>>>>>>>  #define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
>>>>>>>>>>      static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
>>>>>>>>>> @@ -94,34 +95,25 @@
>>>>>>>>>>
>>>>>>>>>>  #define PLATFORM_BUS_NUM_IRQS 64
>>>>>>>>>>
>>>>>>>>>> -/* RAM limit in GB. Since VIRT_MEM starts at the 1GB mark, this means
>>>>>>>>>> - * RAM can go up to the 256GB mark, leaving 256GB of the physical
>>>>>>>>>> - * address space unallocated and free for future use between 256G and 512G.
>>>>>>>>>> - * If we need to provide more RAM to VMs in the future then we need to:
>>>>>>>>>> - *  * allocate a second bank of RAM starting at 2TB and working up
>>>>>>>> I acknowledge this comment was the main justification. Now if you look at
>>>>>>>>
>>>>>>>> Principles of ARM Memory Maps
>>>>>>>>
>>>>>>>> http://infocenter.arm.com/help/topic/com.arm.doc.den0001c/DEN0001C_principles_of_arm_memory_maps.pdf
>>>>>>>> chapter 2.3, you will find that when adding PA bits, you always leave
>>>>>>>> space for reserved regions and mapped IO.
>>>>>>>
>>>>>>> Thanks for the pointer!
>>>>>>>
>>>>>>> So ... we can fit
>>>>>>>
>>>>>>> a) 2GB at 2GB
>>>>>>> b) 32GB at 32GB
>>>>>>> c) 512GB at 512GB
>>>>>>> d) 8TB at 8TB
>>>>>>> e) 128TB at 128TB
>>>>>>>
>>>>>>> (this is a nice rule of thumb if I understand it correctly :) )
>>>>>>>
>>>>>>> We should strive for device memory (maxram_size - ram_size) to fit
>>>>>>> exactly into one of these slots (otherwise things get nasty).
>>>>>>>
>>>>>>> Depending on the ram_size, we might have simpler setups and can support
>>>>>>> more configurations, no?
>>>>>>>
>>>>>>> E.g. ram_size <= 34GB, device_memory <= 512GB
>>>>>>> -> move ram into a) and b)
>>>>>>> -> move device memory into c)
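(A tiny standalone sketch of that "size N lives at address N" rule of thumb; the slot table below just transcribes the a)-e) list above, and pick_slot() is a made-up helper, not QEMU code:)

#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* rule-of-thumb slots, in GiB: a region of size N is placed at address N */
static const uint64_t slots_gib[] = { 2, 32, 512, 8192, 131072 };

/* return the base (== size) of the smallest slot that can hold size_gib,
 * or 0 if it fits nowhere */
static uint64_t pick_slot(uint64_t size_gib)
{
    for (size_t i = 0; i < sizeof(slots_gib) / sizeof(slots_gib[0]); i++) {
        if (size_gib <= slots_gib[i]) {
            return slots_gib[i];
        }
    }
    return 0;
}

int main(void)
{
    /* e.g. ram_size = 34G would fill slots a) and b); device memory goes above */
    printf("512G of device memory -> slot at %" PRIu64 "G\n", pick_slot(512));
    printf("600G of device memory -> slot at %" PRIu64 "G\n", pick_slot(600));
    return 0;
}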
>>>>>>
>>>>>> The issue is that machvirt doesn't comply with that document.
>>>>>> At the moment we have:
>>>>>> 0 -> 1GB MMIO
>>>>>> 1GB -> 256GB RAM
>>>>>> 256GB -> 512GB is theoretically reserved for IO, but most of it is free.
>>>>>> 512GB -> 1TB is reserved for the ECAM MMIO range. This is the top of our
>>>>>> existing 40-bit GPA space.
>>>>>>
>>>>>> We don't want to change this address map due to legacy reasons.
>>>>>>
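(The legacy layout described above, written out as a little table for reference; the identifiers are illustrative only, the real map is defined in hw/arm/virt.c:)

#include <stdint.h>

#define GiB (1ULL << 30)
#define TiB (1ULL << 40)

typedef struct {
    const char *name;
    uint64_t base;
    uint64_t size;
} MapEntry;

/* rough transcription of the layout described above, not the actual array */
static const MapEntry legacy_machvirt_map[] = {
    { "MMIO (peripherals, GIC, PCIe ...)",       0,          1 * GiB   },
    { "RAM",                                     1 * GiB,    255 * GiB },
    { "theoretically reserved for IO",           256 * GiB,  256 * GiB },
    { "ECAM MMIO range (top of 40-bit GPA)",     512 * GiB,  512 * GiB },
    { "proposed device_memory (this series)",    2 * TiB,    2 * TiB   },
};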
>>>>>
>>>>> Thanks, good to know!
>>>>>
>>>>>> Another question: do you know if it would be possible to have the
>>>>>> device_memory region split into several discontinuous segments?
>>>>>
>>>>> It can be implemented for sure, but I would try to avoid that, as it
>>>>> makes certain configurations impossible (and is very unfriendly to the end user).
>>>>>
>>>>> E.g. (numbers completely made up, but it should show what I mean)
>>>>>
>>>>> -m 20G,maxmem=120G:
>>>>> -> Try to add a DIMM with 100G -> error.
>>>>> -> But we can add e.g. two DIMMs with 40G and 60G.
>>>>>
>>>>> This exposes internal details to the end user. And the end user has no
>>>>> idea what is going on.
>>>>>
>>>>> So I think we should try our best to keep that area consecutive.
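(To make the made-up 20G/maxmem=120G example concrete: with a split device-memory area, whether a hotplug succeeds depends on internal segment sizes the user never chose. A toy first-fit check; everything below is hypothetical:)

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define GiB (1ULL << 30)

/* two hypothetical device-memory segments adding up to maxmem - ram = 100G */
static uint64_t seg_free[] = { 40 * GiB, 60 * GiB };

/* first fit: a DIMM must fit entirely inside a single segment */
static bool can_plug(uint64_t size)
{
    for (size_t i = 0; i < sizeof(seg_free) / sizeof(seg_free[0]); i++) {
        if (size <= seg_free[i]) {
            seg_free[i] -= size;
            return true;
        }
    }
    return false;
}

int main(void)
{
    printf("plug 100G DIMM: %s\n", can_plug(100 * GiB) ? "ok" : "error"); /* error */
    printf("plug  40G DIMM: %s\n", can_plug(40 * GiB) ? "ok" : "error");  /* ok */
    printf("plug  60G DIMM: %s\n", can_plug(60 * GiB) ? "ok" : "error");  /* ok */
    return 0;
}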
>>>>
>>>> Actually I didn't detail the context sufficiently. I would like
>>>> 1) one segment to be exposed to the end user through the slot|maxmem
>>>> options (what this series targets), and
>>>> 2) another segment used to instantiate PC-DIMMs for internal needs, as a
>>>> replacement for part of the 1GB -> 256GB static RAM. This was the purpose
>>>> of Shameer's original series.
>>>
>>> I am not sure if PC-DIMMs are exactly what you want for internal purposes.
>>>
>>>>
>>>> [1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
>>>> http://patchwork.ozlabs.org/cover/914694/
>>>> This approach is not yet validated though.
>>>>
>>>> The rationale is that sometimes you must have "holes" in RAM, as some
>>>> GPAs match reserved IOVAs for assigned devices.
>>>
>>> So if I understand it correctly, all you want is some memory region that
>>> a) contains only initially defined memory
>>> b) can have some holes in it
>>>
>>> This is exactly what x86 already does (pc_memory_init): Simply construct
>>> your own memory region leaving holes in it.
>>>
>>>
>>> memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
>>>                          0, pcms->below_4g_mem_size);
>>> memory_region_add_subregion(system_memory, 0, ram_below_4g);
>>> ...
>>> if (pcms->above_4g_mem_size > 0)
>>>     memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
>>>     ...
>>>     memory_region_add_subregion(system_memory, 0x100000000ULL,
>>>     ...
>>>
>>> They "indicate" these different GPA areas to the guest using the e820 map.
>>>
>>> Would that also work for you?
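(For what it's worth, a rough sketch of what that alias trick could look like on machvirt, punching a hole into the 1GB-based RAM around a reserved IOVA window. The function, variable names and parameters are made up, and on ARM the resulting split would have to be described to the guest via DT/ACPI rather than e820:)

/* assumes the usual QEMU headers: "exec/memory.h", "hw/boards.h", glib */
static void map_ram_with_hole(MachineState *machine, MemoryRegion *sysmem,
                              MemoryRegion *ram, /* backing RAM, machine->ram_size bytes */
                              hwaddr ram_base,   /* 1 GiB on machvirt */
                              hwaddr resv_base,  /* start of reserved IOVA window */
                              hwaddr resv_size)
{
    MemoryRegion *below = g_new(MemoryRegion, 1);
    MemoryRegion *above = g_new(MemoryRegion, 1);
    uint64_t below_size = resv_base - ram_base;

    /* first chunk of RAM, mapped at its usual base, up to the hole */
    memory_region_init_alias(below, NULL, "ram-below-hole", ram, 0, below_size);
    memory_region_add_subregion(sysmem, ram_base, below);

    /* remaining RAM, remapped just above the reserved window */
    memory_region_init_alias(above, NULL, "ram-above-hole", ram, below_size,
                             machine->ram_size - below_size);
    memory_region_add_subregion(sysmem, resv_base + resv_size, above);
}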
>>
>> I would tentatively say yes. In fact, I am not sure that PC-DIMM would be
>> the natural choice if we were to actually put holes in the 1GB-256GB RAM
>> segment. Also, I think the reserved IOVA issue affects the device_memory
>> region area as well. I am skeptical that we can put holes in the static
>> RAM and device_memory regions like that.
> 
> The first approach [1] we had to address the holes in memory used the
> memory alias method mentioned above. Based on Drew's review, the pc-dimm
> way of handling it was introduced. I think the main argument was that it
> would be useful when we eventually support hotplug.

That's my understanding too.

> But since that is added
> anyway as part of this series, I am not sure we have any other benefit in
> modeling it as pc-dimm. Maybe I am missing something here.

I tentatively agree with you. I was trying to understand whether the
device_memory region could also cover the original need, but I think the
standard alias approach is better suited to creating holes.

Thanks

Eric
> 
> Thanks,
> Shameer
> 
> [1]. https://lists.gnu.org/archive/html/qemu-arm/2018-04/msg00243.html
> 
> 
>> Thanks!
>>
>> Eric
>>>
>>>>
>>>> Thanks
>>>>
>>>> Eric
>>>>
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Eric
>>>>>
>>>>>
>>>
>>>


