qemu-s390x
Re: [qemu-s390x] [PATCH v4] s390: diagnose 318 info reset and migration


From: Christian Borntraeger
Subject: Re: [qemu-s390x] [PATCH v4] s390: diagnose 318 info reset and migration support
Date: Tue, 14 May 2019 11:07:41 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.6.1


On 14.05.19 10:59, David Hildenbrand wrote:
> On 14.05.19 10:49, Cornelia Huck wrote:
>> On Tue, 14 May 2019 10:37:32 +0200
>> Christian Borntraeger <address@hidden> wrote:
>>
>>> On 14.05.19 09:28, David Hildenbrand wrote:
>>>>>>> But that can be tested using the runability information if I am not 
>>>>>>> wrong.  
>>>>>>
>>>>>> You mean the cpu level information, right?  
>>>>
>>>> Yes, query-cpu-definitions includes runability information for each
>>>> model via "unavailable-features" (valid under the started QEMU machine).
>>>>   
>>>>>>  
>>>>>>>  
>>>>>>>> and others that we have today.
>>>>>>>>
>>>>>>>> So yes, I think this would be acceptable.    
>>>>>>>
>>>>>>> I guess it is acceptable yes. I doubt anybody uses that many CPUs in
>>>>>>> production either way. But you never know.  
>>>>>>
>>>>>> I think that using that many cpus is a more uncommon setup, but I still
>>>>>> think that having to wait for actual failure  
>>>>>
>>>>> That can happen all the time today. You can easily say z14 in the XML
>>>>> when on a zEC12. Only at startup do you get the error. The question is
>>>>> really:  
>>>>
>>>> "-smp 248 -cpu host" will no longer work, while e.g. "-smp 248 -cpu z12"
>>>> will work. Actually, even "-smp 248" will no longer work on affected
>>>> machines.
>>>>
>>>> That is why I wonder if it is better to disable the feature and print a
>>>> warning. Similar to CMMA, where we want to tolerate when CMMA is not
>>>> possible in the current environment (huge pages).
>>>>
>>>> "Diag318 will not be enabled because it is not compatible with more than
>>>> 240 CPUs".
>>>>
>>>> However, I still think that implementing support for more than one SCLP
>>>> response page is the best solution. Guests will need adaptions for > 240
>>>> CPUs with Diag318, but who cares? Existing setups will continue to work.
>>>>
>>>> Implementing that SCLP thingy will avoid any warnings and any errors. It
>>>> just works from the QEMU perspective.
>>>>
>>>> Is implementing this realistic?  
>>>
>>> Yes, it is, but it will take time. I will try to get this rolling. To make
>>> progress on the diag318 thing, can we error on startup now and simply
>>> remove that check when we have implemented a larger SCCB? If we now
>>> played all kinds of "change the max number" games, that would be harder
>>> to "fix".
>>
>> So, the idea right now is:
>>
>> - fail to start if you try to specify a diag318 device and more than
>>   240 cpus (do we need a knob to turn off the device?)
>> - in the future, support more than one SCLP response page
>>
>> I'm getting a bit lost in the discussion; but the above sounds
>> reasonable to me.
>>
> 
> We can
> 
> 1. Fail to start with #cpus > 240 when diag318=on
> 2. Remove the error once we support more than one SCLP response page
> 
> Or
> 
> 1. Allow to start with #cpus > 240 when diag318=on, but indicate only
>    240 CPUs via SCLP
> 2. Print a warning
> 3. Remove the restriction and the warning once we support more than one
>    SCLP response page
> 
> While I prefer the second approach (similar to defining zPCI devices
> without zpci=on), I could also live with the first approach.

Let's just continue with your other suggestion: simply limit the SCLP
response and do not fail or change the machine. That seems like the
easiest solution.
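
For illustration, a minimal sketch of what "limit the SCLP response" could
mean. The sizes and function names below are assumptions made up for this
example, not taken from the patch: one 4 KiB response page, 16-byte CPU
entries, and a response header that grows from 128 to 256 bytes when the
diag318 data is present. Under those assumptions the page holds 248 entries
without diag318 and 240 with it, which matches the numbers discussed above.

```c
/* Assumed sizes for illustration only; not taken from the patch. */
#define SCCB_SIZE        4096  /* one SCLP response page */
#define CPU_ENTRY_SIZE   16    /* bytes per CPU entry in the response */
#define HDR_SIZE_BASE    128   /* assumed header size without diag318 data */
#define HDR_SIZE_DIAG318 256   /* assumed header size with diag318 data */

/* Number of CPU entries that fit into a single response page. */
static int sclp_cpu_capacity(int diag318)
{
    int hdr = diag318 ? HDR_SIZE_DIAG318 : HDR_SIZE_BASE;

    return (SCCB_SIZE - hdr) / CPU_ENTRY_SIZE;
}

/*
 * Hypothetical helper: limit the number of CPUs reported via SCLP instead
 * of failing at startup. *truncated is set to 1 when fewer CPUs are
 * reported than were configured, so the caller could print a warning.
 */
static int sclp_reported_cpus(int configured, int diag318, int *truncated)
{
    int cap = sclp_cpu_capacity(diag318);

    *truncated = configured > cap;
    return *truncated ? cap : configured;
}
```

With these assumed sizes, sclp_reported_cpus(248, 1, &t) would report 240
CPUs and set t, while the same call with diag318 off would report all 248.
Once more than one response page is supported, the clamp would simply go
away.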



