qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [qemu-s390x] [PATCH for-4.0?] exec: Only count mapped m


From: David Hildenbrand
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH for-4.0?] exec: Only count mapped memory backends for qemu_getrampagesize()
Date: Wed, 27 Mar 2019 15:22:58 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0

On 27.03.19 10:03, David Gibson wrote:
> On Wed, Mar 27, 2019 at 09:10:01AM +0100, David Hildenbrand wrote:
>> On 27.03.19 01:12, David Gibson wrote:
>>> On Tue, Mar 26, 2019 at 06:02:51PM +0100, David Hildenbrand wrote:
>>>> On 26.03.19 15:08, Igor Mammedov wrote:
>>>>> On Tue, 26 Mar 2019 14:50:58 +1100
>>>>> David Gibson <address@hidden> wrote:
>>>>>
>>>>>> qemu_getrampagesize() works out the minimum host page size backing any of
>>>>>> guest RAM.  This is required in a few places, such as for POWER8 PAPR KVM
>>>>>> guests, because limitations of the hardware virtualization mean the guest
>>>>>> can't use pagesizes larger than the host pages backing its memory.
>>>>>>
>>>>>> However, it currently checks against *every* memory backend, whether or 
>>>>>> not
>>>>>> it is actually mapped into guest memory at the moment.  This is 
>>>>>> incorrect.
>>>>>>
>>>>>> This can cause a problem attempting to add memory to a POWER8 pseries KVM
>>>>>> guest which is configured to allow hugepages in the guest (e.g.
>>>>>> -machine cap-hpt-max-page-size=16m).  If you attempt to add non-hugepage,
>>>>>> you can (correctly) create a memory backend, however it (correctly) will
>>>>>> throw an error when you attempt to map that memory into the guest by
>>>>>> 'device_add'ing a pc-dimm.
>>>>>>
>>>>>> What's not correct is that if you then reset the guest a startup check
>>>>>> against qemu_getrampagesize() will cause a fatal error because of the new
>>>>>> memory object, even though it's not mapped into the guest.
>>>>> I'd say that backend should be remove by mgmt app since device_add failed
>>>>> instead of leaving it to hang around. (but fatal error either not a nice
>>>>> behavior on QEMU part)
>>>>
>>>> Indeed, it should be removed. Depending on the options (huge pages with
>>>> prealloc?) memory might be consumed for unused memory. Undesired.
>>>
>>> Right, but if the guest initiates a reboot before the management gets
>>> to that, we'll have a crash.
>>>
>>
>> Yes, I agree.
>>
>> At least on s390x (extending on what Igor said):
>>
>> mc->init() -> s390_memory_init() ->
>> memory_region_allocate_system_memory() -> host_memory_backend_set_mapped()
>>
>>
>> ac->init_machine() -> kvm_arch_init() ->
>> kvm_s390_configure_mempath_backing() -> qemu_getrampagesize()
>>
>>
>> And in vl.c
>>
>> configure_accelerator(current_machine, argv[0]);
>> ...
>> machine_run_board_init()
>>
>> So memory is indeed not mapped before calling qemu_getrampagesize().
>>
>>
>> We *could* move the call to kvm_s390_configure_mempath_backing() to
>> s390_memory_init().
>>
>> cap_hpage_1m is not needed before we create VCPUs, so this would work fine.
>>
>> We could than eventually make qemu_getrampagesize() asssert if no
>> backends are mapped at all, to catch other user that rely on this being
>> correct.
> 
> So.. I had a look at the usage in kvm_s390_configure_mempath_backing()
> and I'm pretty sure it's broken.  It will work in the case where
> there's only one backend.  And if that's the default -mem-path rather
> than an explicit memory backend then my patch won't break it any
> further.

On the second look, I think I get your point.

1. Why on earth does "find_max_supported_pagesize" find the "minimum
page size". What kind of nasty stuff is this.

2. qemu_mempath_getpagesize() is not affected by your patch and that
seems to be the only thing used on s390x for now.

I sent a patch to move the call on s390x. But we really have to detect
the maximum page size (what find_max_supported_pagesize promises), not
the minimum page size.

> 
> qemu_getrampagesize() returns the smallest host page size for any
> memory backend.  That's what matters for pcc KVM (in several cases)
> because we need certain things to be host-contiguous, not just
> guest-contiguous.  Bigger host page sizes are fine for that purpose,
> clearly.
> 
> AFAICT on s390 you're looking to determine if any backend is using
> hugepages, because KVM may not support that.  The minimum host page
> size isn't adequate to determine that, so qemu_getrampagesize() won't
> tell you what you need.
> 

Indeed.

-- 

Thanks,

David / dhildenb



reply via email to

[Prev in Thread] Current Thread [Next in Thread]