qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [Qemu-devel] [qemu-s390x] [PATCH for-4.0?] exec: Only cou


From: David Hildenbrand
Subject: Re: [Qemu-ppc] [Qemu-devel] [qemu-s390x] [PATCH for-4.0?] exec: Only count mapped memory backends for qemu_getrampagesize()
Date: Wed, 27 Mar 2019 14:19:41 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

On 27.03.19 10:03, David Gibson wrote:
> On Wed, Mar 27, 2019 at 09:10:01AM +0100, David Hildenbrand wrote:
>> On 27.03.19 01:12, David Gibson wrote:
>>> On Tue, Mar 26, 2019 at 06:02:51PM +0100, David Hildenbrand wrote:
>>>> On 26.03.19 15:08, Igor Mammedov wrote:
>>>>> On Tue, 26 Mar 2019 14:50:58 +1100
>>>>> David Gibson <address@hidden> wrote:
>>>>>
>>>>>> qemu_getrampagesize() works out the minimum host page size backing any of
>>>>>> guest RAM.  This is required in a few places, such as for POWER8 PAPR KVM
>>>>>> guests, because limitations of the hardware virtualization mean the guest
>>>>>> can't use pagesizes larger than the host pages backing its memory.
>>>>>>
>>>>>> However, it currently checks against *every* memory backend, whether or 
>>>>>> not
>>>>>> it is actually mapped into guest memory at the moment.  This is 
>>>>>> incorrect.
>>>>>>
>>>>>> This can cause a problem attempting to add memory to a POWER8 pseries KVM
>>>>>> guest which is configured to allow hugepages in the guest (e.g.
>>>>>> -machine cap-hpt-max-page-size=16m).  If you attempt to add non-hugepage,
>>>>>> you can (correctly) create a memory backend, however it (correctly) will
>>>>>> throw an error when you attempt to map that memory into the guest by
>>>>>> 'device_add'ing a pc-dimm.
>>>>>>
>>>>>> What's not correct is that if you then reset the guest a startup check
>>>>>> against qemu_getrampagesize() will cause a fatal error because of the new
>>>>>> memory object, even though it's not mapped into the guest.
>>>>> I'd say that backend should be remove by mgmt app since device_add failed
>>>>> instead of leaving it to hang around. (but fatal error either not a nice
>>>>> behavior on QEMU part)
>>>>
>>>> Indeed, it should be removed. Depending on the options (huge pages with
>>>> prealloc?) memory might be consumed for unused memory. Undesired.
>>>
>>> Right, but if the guest initiates a reboot before the management gets
>>> to that, we'll have a crash.
>>>
>>
>> Yes, I agree.
>>
>> At least on s390x (extending on what Igor said):
>>
>> mc->init() -> s390_memory_init() ->
>> memory_region_allocate_system_memory() -> host_memory_backend_set_mapped()
>>
>>
>> ac->init_machine() -> kvm_arch_init() ->
>> kvm_s390_configure_mempath_backing() -> qemu_getrampagesize()
>>
>>
>> And in vl.c
>>
>> configure_accelerator(current_machine, argv[0]);
>> ...
>> machine_run_board_init()
>>
>> So memory is indeed not mapped before calling qemu_getrampagesize().
>>
>>
>> We *could* move the call to kvm_s390_configure_mempath_backing() to
>> s390_memory_init().
>>
>> cap_hpage_1m is not needed before we create VCPUs, so this would work fine.
>>
>> We could than eventually make qemu_getrampagesize() asssert if no
>> backends are mapped at all, to catch other user that rely on this being
>> correct.
> 
> So.. I had a look at the usage in kvm_s390_configure_mempath_backing()
> and I'm pretty sure it's broken.  It will work in the case where
> there's only one backend.  And if that's the default -mem-path rather
> than an explicit memory backend then my patch won't break it any
> further.

It works for the current scenarios, where you only have one (maximum
two) backings of the same kind. Your patch would break that.

> 
> qemu_getrampagesize() returns the smallest host page size for any
> memory backend.  That's what matters for pcc KVM (in several cases)
> because we need certain things to be host-contiguous, not just
> guest-contiguous.  Bigger host page sizes are fine for that purpose,
> clearly.
> 
> AFAICT on s390 you're looking to determine if any backend is using
> hugepages, because KVM may not support that.  The minimum host page
> size isn't adequate to determine that, so qemu_getrampagesize() won't
> tell you what you need.

Well, as long as we don't support DIMMS or anything like that it works
perfectly fine. But yes it is far from beautiful.

First of all, I'll prepare a patch to do the call from a different
context. Then we can fine tune to using something else than
qemu_getrampagesize()

-- 

Thanks,

David / dhildenb



reply via email to

[Prev in Thread] Current Thread [Next in Thread]