From: Cédric Le Goater
Subject: Re: [Qemu-devel] [PATCH v3 07/35] spapr/xive: introduce the XIVE Event Queues
Date: Fri, 4 May 2018 15:29:02 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 05/04/2018 07:19 AM, David Gibson wrote:
> On Thu, May 03, 2018 at 04:37:29PM +0200, Cédric Le Goater wrote:
>> On 05/03/2018 08:25 AM, David Gibson wrote:
>>> On Thu, May 03, 2018 at 08:07:54AM +0200, Cédric Le Goater wrote:
>>>> On 05/03/2018 07:45 AM, David Gibson wrote:
>>>>> On Thu, Apr 26, 2018 at 11:48:06AM +0200, Cédric Le Goater wrote:
>>>>>> On 04/26/2018 09:25 AM, David Gibson wrote:
>>>>>>> On Thu, Apr 19, 2018 at 02:43:03PM +0200, Cédric Le Goater wrote:
>>>>>>>> The Event Queue Descriptor (EQD) table is an internal table of the
>>>>>>>> XIVE routing sub-engine. It specifies on which Event Queue the event
>>>>>>>> data should be posted when an exception occurs (to be pulled later
>>>>>>>> by the OS) and which Virtual Processor to notify.
>>>>>>>
>>>>>>> Uhhh.. I thought the IVT said which queue and vp to notify, and the
>>>>>>> EQD gave metadata for event queues.
>>>>>>
>>>>>> Yes, the above is poorly written. The Event Queue Descriptor contains
>>>>>> the guest address of the event queue in which the event data is
>>>>>> written. I will rephrase.
>>>>>>
>>>>>> The IVT contains IVEs, which indeed define, for each IRQ, which EQ to
>>>>>> notify and what data to push on the queue.
>>>>>>  
>>>>>>>> The Event Queue is a much
>>>>>>>> more complex structure but we start with a simple model for the sPAPR
>>>>>>>> machine.
>>>>>>>>
>>>>>>>> There is one XiveEQ per priority and these are stored under the XIVE
>>>>>>>> virtualization presenter (sPAPRXiveNVT). EQs are simply indexed with:
>>>>>>>>
>>>>>>>>        (server << 3) | (priority & 0x7)
>>>>>>>>
>>>>>>>> This is not in the XIVE architecture, but as the EQ index is never
>>>>>>>> exposed to the guest, neither in the hcalls nor in the device tree,
>>>>>>>> we are free to use whatever best fits the current model.
>>>>>>
>>>>>> This EQ indexing is important to notice because it will also show up 
>>>>>> in KVM to build the IVE from the KVM irq state.
>>>>>
>>>>> Ok, are you saying that while this combined EQ index will never appear
>>>>> in guest <-> host interfaces, 
>>>>
>>>> Indeed.
>>>>
>>>>> it might show up in qemu <-> KVM interfaces?
>>>>
>>>> Not directly, but it is part of the IVE as the IVE_EQ_INDEX field.
>>>> When dumped, it has to be built in a way that is compatible with the
>>>> emulated mode in QEMU.
>>>
>>> Hrm.  But are the exact IVE contents visible to qemu (for a PAPR
>>> guest)?
>>
>> The guest only uses hcalls, whose arguments are:
>>
>>      - cpu numbers,
>>      - priority numbers from defined ranges,
>>      - logical interrupt numbers,
>>      - the physical address of the EQ.
>>
>> The parts of the IVE visible to the guest are the 'priority', the 'cpu',
>> and the 'eisn', which is the effective IRQ number the guest is assigning
>> to the source. The 'eisn' will be pushed in the EQ.
> 
> Ok.
> 
>> The IVE EQ index is not visible.
> 
> Good.
> 
>>> I would have thought the qemu <-> KVM interfaces would have
>>> abstracted this the same way the guest <-> KVM interfaces do.  Or is
>>> there a reason not to?
>>
>> It is practical to dump the 64-bit IVEs directly from KVM into the
>> QEMU internal structures because they fit the emulated mode without
>> any translation ... This might be seen as a shortcut. You will tell
>> me when you reach the KVM part.
> 
> Ugh.. exposing to qemu the raw IVEs sounds like a bad idea to me.

You definitely need the raw IVEs in QEMU in emulated mode. The whole
routing relies on them.
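
For illustration, here is a rough sketch (not the actual patch code) of
how the emulated router can resolve an IVE to its EQ with the
(server << 3) | (priority & 0x7) convention. The field layout and the
NVT lookup helper are invented for the example:

/* hypothetical sketch of the emulated routing lookup */
static XiveEQ *spapr_xive_get_eq(sPAPRXive *xive, uint64_t ive)
{
    /* assume IVE_EQ_INDEX sits in the low 32 bits of the IVE word */
    uint32_t eq_idx   = ive & 0xffffffff;
    uint32_t server   = eq_idx >> 3;       /* NVT / vCPU number */
    uint8_t  priority = eq_idx & 0x7;      /* one EQ per priority */
    sPAPRXiveNVT *nvt = spapr_xive_nvt_get(xive, server);  /* invented */

    return nvt ? &nvt->eqt[priority] : NULL;   /* 'eqt' name invented */
}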

> When we migrate, we're going to have to assign the guest (server,
> priority) tuples to host EQ indices, and I think it makes more sense
> to do that in KVM and hide the raw indices from qemu than to have qemu
> mangle them explicitly on migration.

We will need some mangling mechanism in the KVM ioctls that save and
restore the state. This is very similar to XICS.
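
Something like these two helpers is all the mangling amounts to, since
the EQ index is a pure function of the (server, priority) pair. This is
illustrative only, not the actual ioctl code:

static inline uint32_t spapr_xive_eq_idx(uint32_t server, uint8_t priority)
{
    return (server << 3) | (priority & 0x7);
}

static inline void spapr_xive_eq_idx_decode(uint32_t eq_idx,
                                            uint32_t *server,
                                            uint8_t *priority)
{
    *server   = eq_idx >> 3;
    *priority = eq_idx & 0x7;
}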
 
>>>>>>>> Signed-off-by: Cédric Le Goater <address@hidden>
>>>>>>>
>>>>>>> Is the EQD actually modifiable by a guest?  Or are the settings of the
>>>>>>> EQs fixed by PAPR?
>>>>>>
>>>>>> The guest uses the H_INT_SET_QUEUE_CONFIG hcall to define the address
>>>>>> of the event queue for a given (priority, server) pair.
>>>>>
>>>>> Ok, so the EQD can be modified by the guest.  In which case we need to
>>>>> work out what object owns it, since it'll need to migrate it.
>>>>
>>>> Indeed. The EQDs are CPU related, as there is one EQD per (cpu,
>>>> priority) pair. The KVM patchset dumps/restores the eight XiveEQ
>>>> structs using per-cpu ioctls. The EQ in the OS RAM is marked dirty
>>>> at that stage.
>>>
>>> To make sure I'm clear: for PAPR there's a strict relationship between
>>> EQD and CPU (one EQD for each (cpu, priority) tuple).  
>>
>> Yes.
>>
>>> But for powernv that's not the case, right?  
>>
>> It is.
> 
> Uh.. I don't think either of us phrased that well, I'm still not sure
> which way you're answering that.

There's a strict relationship between EQD and CPU (one EQD for each
(cpu, priority) tuple) in both sPAPR and PowerNV.

>>> AIUI the mapping of EQs to cpus was configurable, is that right?
>>
>> Each cpu has 8 EQDs. Same for virtual cpus.
> 
> Hmm.. but is that 8 EQD per cpu something built into the hardware, or
> just a convention of how the host kernel and OPAL operate?

It's not built into the HW itself: the EQD table lives in memory and
the HW uses it to route the notifications. The EQD contains the EQ
characteristics (roughly rendered in the struct sketch after this list):

* functional bits:
  - valid bit
  - enqueue bit, to update (or not) the OS EQ in RAM
  - unconditional notification
  - backlog
  - escalation
  - ...
* OS EQ fields:
  - physical address
  - entry index
  - toggle bit
* NVT fields:
  - block/chip
  - index
* etc.

It's a big structure: 8 words.
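
Rendered as a C struct, purely for illustration (the real EQD packs
these as bitfields in the 8 words, and the field names below are
invented):

typedef struct XiveEQD {
    /* functional bits */
    bool     valid;           /* valid bit */
    bool     enqueue;         /* update the OS EQ in RAM or not */
    bool     uncond_notify;   /* unconditional notification */
    bool     backlog;         /* backlog recording */
    bool     escalate;        /* escalation */
    /* OS EQ fields */
    uint64_t qpage;           /* physical address of the queue */
    uint32_t qindex;          /* entry index */
    bool     toggle;          /* toggle bit */
    /* NVT fields */
    uint32_t nvt_block;       /* block/chip */
    uint32_t nvt_index;       /* index */
} XiveEQD;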

The EQD table is allocated by OPAL/skiboot and fed to the HW for its
use. The PowerNV OS uses OPAL calls to configure the EQDs to its
needs:

int64_t opal_xive_set_queue_info(uint64_t vp, uint32_t prio,
                                 uint64_t qpage,
                                 uint64_t qsize,
                                 uint64_t qflags);
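
A call for VP 'vp' at priority 5 would look something like this
(illustrative values; qsize is assumed here to be the log2 of the
queue byte size, so 16 means a 64K queue):

    rc = opal_xive_set_queue_info(vp, 5, qpage_phys,
                                  16 /* log2 of queue size */,
                                  OPAL_XIVE_EQ_ENABLED);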


sPAPR uses an hcall:

static long plpar_int_set_queue_config(unsigned long flags,
                                       unsigned long target,
                                       unsigned long priority,
                                       unsigned long qpage,
                                       unsigned long qsize)


but it is translated into an OPAL call in KVM.
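
Roughly, the host side does something like this. The helper names are
invented; only the parameter mapping matters:

static long kvmppc_h_int_set_queue_config(unsigned long flags,
                                          unsigned long target,
                                          unsigned long priority,
                                          unsigned long qpage,
                                          unsigned long qsize)
{
    uint32_t vp     = xive_vp(target);       /* invented target -> VP map */
    uint64_t qflags = xive_qflags(flags);    /* invented flag translation */

    return opal_xive_set_queue_info(vp, priority, qpage, qsize, qflags);
}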

C.

 
>> I am not sure what you understood before. It is surely something
>> I wrote; my XIVE understanding is still making progress.
>>
>> C.