qemu-devel

Re: [PATCH RFC 2/5] s390x: implement diag260


From: David Hildenbrand
Subject: Re: [PATCH RFC 2/5] s390x: implement diag260
Date: Wed, 15 Jul 2020 11:42:37 +0200

On 13.07.20 13:08, Christian Borntraeger wrote:
> On 13.07.20 12:27, David Hildenbrand wrote:
>>
>>
>>> On 13.07.2020 at 11:12, Heiko Carstens <hca@linux.ibm.com> wrote:
>>>
>>> On Fri, Jul 10, 2020 at 05:24:07PM +0200, David Hildenbrand wrote:
>>>>> On 10.07.20 17:18, Heiko Carstens wrote:
>>>>> On Fri, Jul 10, 2020 at 02:12:33PM +0200, David Hildenbrand wrote:
>>>>>>> Note: Reading about diag260 subcode 0xc, we could modify Linux to query
>>>>>>> the maximum possible pfn via diag260 0xc. Then, we maybe could avoid
>>>>>>> indicating maxram size via SCLP, and let diag260-unaware OSs keep
>>>>>>> working as before. Thoughts?
>>>>>>
>>>>>> Implemented it, seems to work fine.
>>>>>
>>>>> The returned value would not include standby/reserved memory within
>>>>> z/VM. So this seems not to work.
>>>>
>>>> Which value exactly are you referencing? diag 0xc returns two values.
>>>> One of them seems to do exactly what we need.
>>>>
>>>> See
>>>> https://github.com/davidhildenbrand/linux/commit/a235f9fb20df7c04ae89bc0d134332d1a01842c7
>>>>
>>>> for my current Linux approach.
>>>>
>>>>> Also: why do you want to change this
>>>>
>>>> Which change exactly do you mean?
>>>>
>>>> If we limit the value returned via SCLP to initial memory, we cannot
>>>> break any guest (e.g., Linux pre 4.2, kvm-unit-tests). diag260 is then
>>>> purely optional.
>>>
>>> Ok, now I see the context. Christian added me just to cc on this
>>> specific patch.
>>
>> I tried to Cc you on all patches but the mail bounced with unknown address
>> (maybe I messed up).
>>
>>> So if I understand you correctly, then you want to use diag 260 in
>>> order to figure out how much memory is _potentially_ available for a
>>> guest?
>>
>> Yes, exactly.
>>
>>>
>>> This does not fit to the current semantics, since diag 260 returns the
>>> address of the highest *currently* accessible address. That is: it
>>> does explicitly *not* include standby memory or anything else that
>>> might potentially be there.
>>
>> The confusing part is that it talks about "addressable" and not "accessible".
>> Now that I understood the "DEFINE STORAGE ..." example, it makes sense that
>> the values change with reserved/standby memory.
>>
>> I agree that reusing that interface might not be what we want. It just seemed
>> too easy to avoid creating something new :)
>>
>>>
>>> So you would need a different interface to tell the guest about your
>>> new hotplug memory interface. If sclp does not work, then maybe a new
>>> diagnose(?).
>>>
>>
>> Yes, I think a new Diagnose makes sense. I'll have a look next week to
>> figure out which codes/subcodes we could use. @Christian @Conny any
>> ideas/pointers?
> 
> Wouldn't sclp be the right thing to provide the max increment number? (and
> thus the max memory address)
> And then (when I got the discussion right) use diag 260 to get the _current_ 
> value.

So, in summary, we want to indicate to the guest a memory region that
will be used to place memory devices ("device memory region"). The
region might have holes and the memory within this region might have
different semantics than ordinary system memory. Memory that belongs to
memory devices should only be detected+used if the guest OS has support
for them (e.g., virtio-mem, virtio-pmem, ...). An unmodified guest
(e.g., no virtio-mem driver) should not accidentally make use of such
memory.

We need a way to
a) Tell the guest about boot memory (currently ram_size)
b) Tell the guest about the maximum possible ram address, including
device memory. (We could also indicate the special "device memory
region" explicitly)


AFAIK, we have three options:


1. Indicate maxram_size via SCLP, indicate ram_size via diag260(0x10)

This is what this series (RFCv1) does.

Advantages:
- No need for a new diag. No need for memory sensing kernel changes.
Disadvantages:
- Older guests without support for diag260 (<v4.2, kvm-unit-tests) will
  assume all memory is accessible. Bad.
- The semantics of the value returned in ry via diag260(0xc) are somewhat
  unclear. Should we return the end address of the highest memory
  device? OTOH, an unmodified guest OS (without support for memory
  devices) should not have to care at all about any such memory.
- If we ever want to also support standby memory, we might be in
  trouble. (see below)

2. Indicate ram_size via SCLP, indicate device memory region
   (currently maxram_size) via new DIAG

Advantages:
- Unmodified guests won't use/sense memory belonging to memory devices.
- We can later have standby memory + memory devices co-exist.
Disadvantages:
- Need a new DIAG.
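
To make the difference between 1) and 2) concrete, here is a rough
sketch of how much memory a guest would treat as usable at boot under
each option. This is only a toy model of the discussion above, not
QEMU or Linux code; VmConfig, sclp_reported_size() and
usable_memory_seen_by_guest() are made-up names for illustration.

```python
from dataclasses import dataclass

GiB = 1 << 30


@dataclass(frozen=True)
class VmConfig:
    ram_size: int     # boot memory
    maxram_size: int  # highest possible address, including device memory


def sclp_reported_size(option: int, cfg: VmConfig) -> int:
    """Size reported via SCLP under option 1 or 2 (toy model)."""
    if option == 1:
        return cfg.maxram_size
    if option == 2:
        return cfg.ram_size
    raise ValueError("only options 1 and 2 are modelled")


def usable_memory_seen_by_guest(option: int, cfg: VmConfig,
                                knows_diag260: bool) -> int:
    """Memory a guest would assume it can use right after boot."""
    if option == 1 and knows_diag260:
        # Option 1: a diag260-aware guest learns ram_size via diag260(0x10).
        return cfg.ram_size
    # Otherwise the guest only has the SCLP value to go by.
    return sclp_reported_size(option, cfg)


cfg = VmConfig(ram_size=2 * GiB, maxram_size=20 * GiB)
for option in (1, 2):
    for aware in (False, True):
        size = usable_memory_seen_by_guest(option, cfg, aware)
        print(f"option {option}, diag260-aware={aware}: {size // GiB} GiB")
```

Only the combination "option 1, guest without diag260" ends up with a
size larger than boot memory, which is the breakage described above.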

3. Indicate maxram_size and ram_size via SCLP (using the SCLP standby
   memory)

I did not look into the details, because -ENODOCUMENTATION. At least we
would run into some alignment issues (again, having to align
ram_size/maxram_size to storage increments - which would no longer be
1MB). We would run into issues later, trying to also support standby memory.
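
As a small illustration of the alignment problem: once the storage
increment is larger than 1MB, a ram_size that is fine today may no
longer be a multiple of the increment. The 256 MiB increment below is
an arbitrary assumption picked for the example, not a value taken from
any documentation, and the helper names are made up.

```python
MiB = 1 << 20


def align_down(value: int, increment: int) -> int:
    """Round value down to a multiple of increment."""
    return value - (value % increment)


def align_up(value: int, increment: int) -> int:
    """Round value up to a multiple of increment."""
    return -(-value // increment) * increment


def is_aligned(value: int, increment: int) -> bool:
    return value % increment == 0


ram_size = 2049 * MiB       # fine with 1 MiB increments
assumed_increment = 256 * MiB  # assumption, for illustration only

print(is_aligned(ram_size, 1 * MiB))
print(is_aligned(ram_size, assumed_increment))
print(align_down(ram_size, assumed_increment) // MiB,
      align_up(ram_size, assumed_increment) // MiB)
```

Either rounding direction changes the size the user asked for, which
is the kind of issue meant above.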



I guess 1) would mostly work; one just has to run a suitable guest
inside the VM. This is no different to running under z/VM where querying
diag260 is required. The nice thing about 2) would be that we can
easily implement standby memory. Something like:

-m 2G,maxram_size=20G,standbyram_size=4G

[ 2G boot RAM ][ 4G standby RAM ][ 14G device memory ]
                                 ^ via SCLP maximum increment
                                                     ^ via new DIAG
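
The boundaries in the picture above can be derived from the three
sizes. The following is just a sketch of that arithmetic; the
"standbyram_size" parameter is the hypothetical option from the
example, and memory_layout() is a made-up helper, not an existing
QEMU function.

```python
GiB = 1 << 30


def memory_layout(ram_size: int, standbyram_size: int, maxram_size: int):
    """Return (start, end) address ranges for boot, standby and device memory."""
    boot_end = ram_size
    standby_end = boot_end + standbyram_size
    if standby_end > maxram_size:
        raise ValueError("boot + standby memory exceeds maxram_size")
    return {
        "boot": (0, boot_end),
        "standby": (boot_end, standby_end),    # end: SCLP maximum increment
        "device": (standby_end, maxram_size),  # end: via new DIAG
    }


layout = memory_layout(2 * GiB, 4 * GiB, 20 * GiB)
for name, (start, end) in layout.items():
    print(f"{name:8s} {start // GiB:3d}G .. {end // GiB:3d}G "
          f"({(end - start) // GiB}G)")
```

With the sizes from the example, the SCLP maximum increment would
correspond to 6G and the new DIAG would report the end of the device
memory region at 20G.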

-- 
Thanks,

David / dhildenb
