qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [PATCH v5 08/36] ppc/xive: introduce a simplified XIVE pr


From: Cédric Le Goater
Subject: Re: [Qemu-ppc] [PATCH v5 08/36] ppc/xive: introduce a simplified XIVE presenter
Date: Mon, 3 Dec 2018 18:05:12 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.3.0

I forgot to reply to this one.

On 11/29/18 1:47 AM, David Gibson wrote:
> On Wed, Nov 28, 2018 at 11:59:58AM +0100, Cédric Le Goater wrote:
>> On 11/28/18 12:49 AM, David Gibson wrote:
>>> On Fri, Nov 16, 2018 at 11:57:01AM +0100, Cédric Le Goater wrote:
>>>> The last sub-engine of the XIVE architecture is the Interrupt
>>>> Virtualization Presentation Engine (IVPE). On HW, they share elements,
>>>> the Power Bus interface (CQ), the routing table descriptors, and they
>>>> can be combined in the same HW logic. We do the same in QEMU and
>>>> combine both engines in the XiveRouter for simplicity.
>>>
>>> Ok, I'm not entirely convinced combining the IVPE and IVRE into a
>>> single object is a good idea, but we can probably discuss that once
>>> I've read further.
>>
>> We could introduce a simplified presenter for sPAPR but I am not even
>> sure of that as it will get more complex if we support the EBB one day. 
> 
> I wasn't really thinking about PAPR for this comment.
> 
>>>> When the IVRE has completed its job of matching an event source with a
>>>> Notification Virtual Target (NVT) to notify, it forwards the event
>>>> notification to the IVPE sub-engine. The IVPE scans the thread
>>>> interrupt contexts of the Notification Virtual Targets (NVT)
>>>> dispatched on the HW processor threads and if a match is found, it
>>>> signals the thread. If not, the IVPE escalates the notification to
>>>> some other targets and records the notification in a backlog queue.
>>>>
>>>> The IVPE maintains the thread interrupt context state for each of its
>>>> NVTs not dispatched on HW processor threads in the Notification
>>>> Virtual Target table (NVTT).
>>>>
>>>> The model currently only supports single NVT notifications.
>>>>
>>>> Signed-off-by: Cédric Le Goater <address@hidden>
>>>> ---
>>>>  include/hw/ppc/xive.h      |  13 +++
>>>>  include/hw/ppc/xive_regs.h |  22 ++++
>>>>  hw/intc/xive.c             | 223 +++++++++++++++++++++++++++++++++++++
>>>>  3 files changed, 258 insertions(+)
>>>>
>>>> diff --git a/include/hw/ppc/xive.h b/include/hw/ppc/xive.h
>>>> index 5987f26ddb98..e715a6c6923d 100644
>>>> --- a/include/hw/ppc/xive.h
>>>> +++ b/include/hw/ppc/xive.h
>>>> @@ -197,6 +197,10 @@ typedef struct XiveRouterClass {
>>>>                     XiveEND *end);
>>>>      int (*set_end)(XiveRouter *xrtr, uint8_t end_blk, uint32_t end_idx,
>>>>                     XiveEND *end);
>>>> +    int (*get_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>>>> +                   XiveNVT *nvt);
>>>> +    int (*set_nvt)(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t nvt_idx,
>>>> +                   XiveNVT *nvt);
>>>
>>> As with the ENDs, I don't think get/set is a good interface for a
>>> bigger-than-word-size object.
>>
>> We need to agree on this interface before I respin. So you would like 
>> to add a extra argument specifying the word being accessed ?
> 
> Yes.  Ok, 3 options I can see at this point:
> 
> 1) read/write accessors which take a word number
> 
> 2) A "get" accessor which copies the whole structure, but "write"
> accessor which takes a word number.  The asymmetry is a bit ugly, but
> it's the non-atomic writeback of the whole structure which I'm most
> uncomfortable with.
> 
> 3) A map/unmap interface which gives you / releases a pointer to the
> "live" structure.  For powernv that would become
> address_space_map()/unmap().  For PAPR it would just be reutn pointer
> / no-op.

This discussion is in progress in another subthread.

>>
>>>
>>>>  } XiveRouterClass;
>>>>  
>>>>  void xive_eas_pic_print_info(XiveEAS *eas, uint32_t lisn, Monitor *mon);
>>>> @@ -207,6 +211,10 @@ int xive_router_get_end(XiveRouter *xrtr, uint8_t 
>>>> end_blk, uint32_t end_idx,
>>>>                          XiveEND *end);
>>>>  int xive_router_set_end(XiveRouter *xrtr, uint8_t end_blk, uint32_t 
>>>> end_idx,
>>>>                          XiveEND *end);
>>>> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t 
>>>> nvt_idx,
>>>> +                        XiveNVT *nvt);
>>>> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t 
>>>> nvt_idx,
>>>> +                        XiveNVT *nvt);
>>>>  
>>>>  /*
>>>>   * XIVE END ESBs
>>>> @@ -274,4 +282,9 @@ extern const MemoryRegionOps xive_tm_ops;
>>>>  
>>>>  void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor *mon);
>>>>  
>>>> +static inline uint32_t xive_tctx_cam_line(uint8_t nvt_blk, uint32_t 
>>>> nvt_idx)
>>>> +{
>>>> +    return (nvt_blk << 19) | nvt_idx;
>>>
>>> I'm guessing this formula is the standard way of combining the NVT
>>> block and index into a single word?  
>>
>> That number is the VP/NVT identifier which is written in the CAM value. 
>> The index is on 19 bits because of the NVT  definition in the END 
>> structure. It is being increased to 24 bits on Power10 
>>
>>> If so, I think we should
>>> standardize on passing a single word "nvt_id" around and only
>>> splitting it when we need to use the block separately.  
>>
>> This is really the only place where we concatenate the two NVT values,
>> block and index. 
> 
> Hm, ok.  I know we don't model them (yet, maybe ever) but could
> combined values appear in the PowerBUS messages that handle remote
> notifications?

They do. 
 
>>> Same goes for
>>> the end_id, assuming there's a standard way of putting that into a
>>> single word.  That will address the point I raised earlier about lisn
>>> being passed around as a single word, but these later stage ids being
>>> split.
>>
>> Hmm, I am not sure this is a good option. It is not how the PowerNV 
>> model would use it, skiboot is very much aware of these blocks and 
>> indexes and for remote accesses chips are identified using the block. 
>> I will take a look at it but I am not found of it. I can add helpers 
>> in some places though.    
> 
> Hm, ok.  Do the block and index appear as an (effectively) single
> field in the EAS?

no. In all XIVE structures, block and index are always distinct.

>> I agree we have some kind of issue linking the HW model with the sPAPR 
>> machine. The guest interface is only  about IRQ numbers, priorities and
>> cpu numbers. We really don't care about XIVE blocks and indexes in that 
>> case. we can clarify the code by bypassing the XiveRouter interfaces
>> to the table and directly use the sPAPR interrupt controller. That 
>> should help a bit for the hcalls but we would still have to fill in 
>> the EAT and the END with some index values if we want to use the router
>> algorithm.
> 
> I don't think this is too much of a problem.  These are essentially
> machine internal details so we can choose an allocation to suit us.
> The obvious one is to put everything in a single block, at least as
> long as that won't limit our numbers too much.
> 
>>> We'll probably want some inlines or macros to build an
>>> nvt/end/lisn/whatever id from block and index as well.
>>>
>>>> +}
>>>> +
>>>>  #endif /* PPC_XIVE_H */
>>>> diff --git a/include/hw/ppc/xive_regs.h b/include/hw/ppc/xive_regs.h
>>>> index 2e3d6cb507da..05cb992d2815 100644
>>>> --- a/include/hw/ppc/xive_regs.h
>>>> +++ b/include/hw/ppc/xive_regs.h
>>>> @@ -158,4 +158,26 @@ typedef struct XiveEND {
>>>>  #define END_W7_F1_LOG_SERVER_ID  PPC_BITMASK32(1, 31)
>>>>  } XiveEND;
>>>>  
>>>> +/* Notification Virtual Target (NVT) */
>>>> +typedef struct XiveNVT {
>>>> +        uint32_t        w0;
>>>> +#define NVT_W0_VALID             PPC_BIT32(0)
>>>> +        uint32_t        w1;
>>>> +        uint32_t        w2;
>>>> +        uint32_t        w3;
>>>> +        uint32_t        w4;
>>>> +        uint32_t        w5;
>>>> +        uint32_t        w6;
>>>> +        uint32_t        w7;
>>>> +        uint32_t        w8;
>>>> +#define NVT_W8_GRP_VALID         PPC_BIT32(0)
>>>> +        uint32_t        w9;
>>>> +        uint32_t        wa;
>>>> +        uint32_t        wb;
>>>> +        uint32_t        wc;
>>>> +        uint32_t        wd;
>>>> +        uint32_t        we;
>>>> +        uint32_t        wf;
>>>> +} XiveNVT;
>>>> +
>>>>  #endif /* PPC_XIVE_REGS_H */
>>>> diff --git a/hw/intc/xive.c b/hw/intc/xive.c
>>>> index 4c6cb5d52975..5ba3b06e6e25 100644
>>>> --- a/hw/intc/xive.c
>>>> +++ b/hw/intc/xive.c
>>>> @@ -373,6 +373,32 @@ void xive_tctx_pic_print_info(XiveTCTX *tctx, Monitor 
>>>> *mon)
>>>>      }
>>>>  }
>>>>  
>>>> +/* The HW CAM (23bits) is hardwired to :
>>>> + *
>>>> + *   0x000||0b1||4Bit chip number||7Bit Thread number.
>>>> + *
>>>> + * and when the block grouping extension is enabled :
>>>> + *
>>>> + *   4Bit chip number||0x001||7Bit Thread number.
>>>> + */
>>>> +static uint32_t tctx_hw_cam_line(bool block_group, uint8_t chip_id, 
>>>> uint8_t tid)
>>>> +{
>>>> +    if (block_group) {
>>>> +        return 1 << 11 | (chip_id & 0xf) << 7 | (tid & 0x7f);
>>>> +    } else {
>>>> +        return (chip_id & 0xf) << 11 | 1 << 7 | (tid & 0x7f);
>>>> +    }
>>>> +}
>>>> +
>>>> +static uint32_t xive_tctx_hw_cam_line(XiveTCTX *tctx, bool block_group)
>>>> +{
>>>> +    PowerPCCPU *cpu = POWERPC_CPU(tctx->cs);
>>>> +    CPUPPCState *env = &cpu->env;
>>>> +    uint32_t pir = env->spr_cb[SPR_PIR].default_value;
>>>
>>> I don't much like reaching into the cpu state itself.  I think a
>>> better idea would be to have the TCTX have its HW CAM id set during
>>> initialization (via a property) and then use that.  This will mean
>>> less mucking about if future cpu revisions don't split the PIR into
>>> chip and tid ids in the same way.
>>
>> yes good idea. I will see how to handle the block_group boolean. may be we
>> can leave it out of the model for now as it is not used.
> 
> Yes, it would be nice to leave the block_group stuff as a later
> extensions when/if we need it.  If we put it in as a stub and nothing
> is using/testing it, it's likely it will be broken if we ever do
> actually try to use it.

> 
>>
>>>
>>>> +    return tctx_hw_cam_line(block_group, (pir >> 8) & 0xf, pir & 0x7f);
>>>> +}
>>>> +
>>>>  static void xive_tctx_reset(void *dev)
>>>>  {
>>>>      XiveTCTX *tctx = XIVE_TCTX(dev);
>>>> @@ -1013,6 +1039,195 @@ int xive_router_set_end(XiveRouter *xrtr, uint8_t 
>>>> end_blk, uint32_t end_idx,
>>>>     return xrc->set_end(xrtr, end_blk, end_idx, end);
>>>>  }
>>>>  
>>>> +int xive_router_get_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t 
>>>> nvt_idx,
>>>> +                        XiveNVT *nvt)
>>>> +{
>>>> +   XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
>>>> +
>>>> +   return xrc->get_nvt(xrtr, nvt_blk, nvt_idx, nvt);
>>>> +}
>>>> +
>>>> +int xive_router_set_nvt(XiveRouter *xrtr, uint8_t nvt_blk, uint32_t 
>>>> nvt_idx,
>>>> +                        XiveNVT *nvt)
>>>> +{
>>>> +   XiveRouterClass *xrc = XIVE_ROUTER_GET_CLASS(xrtr);
>>>> +
>>>> +   return xrc->set_nvt(xrtr, nvt_blk, nvt_idx, nvt);
>>>> +}
>>>> +
>>>> +static bool xive_tctx_ring_match(XiveTCTX *tctx, uint8_t ring,
>>>> +                                 uint8_t nvt_blk, uint32_t nvt_idx,
>>>> +                                 bool cam_ignore, uint32_t logic_serv)
>>>> +{
>>>> +    uint8_t *regs = &tctx->regs[ring];
>>>> +    uint32_t w2 = be32_to_cpu(*((uint32_t *) &regs[TM_WORD2]));
>>>> +    uint32_t cam = xive_tctx_cam_line(nvt_blk, nvt_idx);
>>>> +    bool block_group = false; /* TODO (PowerNV) */
>>>> +
>>>> +    /* TODO (PowerNV): ignore low order bits of nvt id */
>>>> +
>>>> +    switch (ring) {
>>>> +    case TM_QW3_HV_PHYS:
>>>> +        return (w2 & TM_QW3W2_VT) && xive_tctx_hw_cam_line(tctx, 
>>>> block_group) ==
>>>> +            tctx_hw_cam_line(block_group, nvt_blk, nvt_idx);
>>>
>>> The difference between "xive_tctx_hw_cam_line" and "tctx_hw_cam_line"
>>> here is far from obvious.  
>>
>> yes. I lacked inspiration ...
> 
> I'd suggest that the one which takes the tctx as a parameter be
> tctx_hw_cam_line() and the other be nvt_hw_cam_line() or similar.  The
> crucial difference here is that one is what the thread is looking for,
> the other is what the NVT is advertising.
> 
>>> Remember that namespacing prefixes aren't
>>> necessary for static functions, which can let you give more
>>> descriptive names without getting excessively long.
>>
>> OK.
>>  
>>>> +    case TM_QW2_HV_POOL:
>>>> +        return (w2 & TM_QW2W2_VP) && (cam == GETFIELD(TM_QW2W2_POOL_CAM, 
>>>> w2));
>>>> +
>>>> +    case TM_QW1_OS:
>>>> +        return (w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, 
>>>> w2));
>>>> +
>>>> +    case TM_QW0_USER:
>>>> +        return ((w2 & TM_QW1W2_VO) && (cam == GETFIELD(TM_QW1W2_OS_CAM, 
>>>> w2)) &&
>>>> +                (w2 & TM_QW0W2_VU) &&
>>>> +                (logic_serv == GETFIELD(TM_QW0W2_LOGIC_SERV, w2)));
>>>> +
>>>> +    default:
>>>> +        g_assert_not_reached();
>>>> +    }
>>>> +}
>>>> +
>>>> +static int xive_presenter_tctx_match(XiveTCTX *tctx, uint8_t format,
>>>> +                                     uint8_t nvt_blk, uint32_t nvt_idx,
>>>> +                                     bool cam_ignore, uint32_t logic_serv)
>>>> +{
>>>> +    if (format == 0) {
>>>> +        /* F=0 & i=1: Logical server notification */
>>>> +        if (cam_ignore == true) {
>>>> +            qemu_log_mask(LOG_GUEST_ERROR, "XIVE: no support for LS "
>>>> +                          "NVT %x/%x\n", nvt_blk, nvt_idx);
>>>> +             return -1;
>>>> +        }
>>>> +
>>>> +        /* F=0 & i=0: Specific NVT notification */
>>>> +        if (xive_tctx_ring_match(tctx, TM_QW3_HV_PHYS,
>>>> +                                nvt_blk, nvt_idx, false, 0)) {
>>>> +            return TM_QW3_HV_PHYS;
>>>> +        }
>>>> +        if (xive_tctx_ring_match(tctx, TM_QW2_HV_POOL,
>>>> +                                nvt_blk, nvt_idx, false, 0)) {
>>>> +            return TM_QW2_HV_POOL;
>>>> +        }
>>>> +        if (xive_tctx_ring_match(tctx, TM_QW1_OS,
>>>> +                                nvt_blk, nvt_idx, false, 0)) {
>>>> +            return TM_QW1_OS;
>>>> +        }
>>>
>>> Hm.  It's a bit pointless to iterate through each ring calling a
>>> common function, when that "common" function consists entirely of a
>>> switch which makes it not really common at all.
>>>
>>> So I think you want separate helper functions for each ring's match,
>>> or even just fold the previous function into this one.
>>
>> yes. It can be improved. I did try different layouts. I might just fold 
>> both routine in one as you propose.  
>>
>>>> +    } else {
>>>> +        /* F=1 : User level Event-Based Branch (EBB) notification */
>>>> +        if (xive_tctx_ring_match(tctx, TM_QW0_USER,
>>>> +                                nvt_blk, nvt_idx, false, logic_serv)) {
>>>> +            return TM_QW0_USER;
>>>> +        }
>>>> +    }
>>>> +    return -1;
>>>> +}
>>>> +
>>>> +typedef struct XiveTCTXMatch {
>>>> +    XiveTCTX *tctx;
>>>> +    uint8_t ring;
>>>> +} XiveTCTXMatch;
>>>> +
>>>> +static bool xive_presenter_match(XiveRouter *xrtr, uint8_t format,
>>>> +                                 uint8_t nvt_blk, uint32_t nvt_idx,
>>>> +                                 bool cam_ignore, uint8_t priority,
>>>> +                                 uint32_t logic_serv, XiveTCTXMatch 
>>>> *match)
>>>> +{
>>>> +    CPUState *cs;
>>>> +
>>>> +    /* TODO (PowerNV): handle chip_id overwrite of block field for
>>>> +     * hardwired CAM compares */
>>>> +
>>>> +    CPU_FOREACH(cs) {
>>>> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
>>>> +        XiveTCTX *tctx = XIVE_TCTX(cpu->intc);
>>>> +        int ring;
>>>> +
>>>> +        /*
>>>> +         * HW checks that the CPU is enabled in the Physical Thread
>>>> +         * Enable Register (PTER).
>>>> +         */
>>>> +
>>>> +        /*
>>>> +         * Check the thread context CAM lines and record matches. We
>>>> +         * will handle CPU exception delivery later
>>>> +         */
>>>> +        ring = xive_presenter_tctx_match(tctx, format, nvt_blk, nvt_idx,
>>>> +                                         cam_ignore, logic_serv);
>>>> +        /*
>>>> +         * Save the context and follow on to catch duplicates, that we
>>>> +         * don't support yet.
>>>> +         */
>>>> +        if (ring != -1) {
>>>> +            if (match->tctx) {
>>>> +                qemu_log_mask(LOG_GUEST_ERROR, "XIVE: already found a 
>>>> thread "
>>>> +                              "context NVT %x/%x\n", nvt_blk, nvt_idx);
>>>> +                return false;
>>>> +            }
>>>> +
>>>> +            match->ring = ring;
>>>> +            match->tctx = tctx;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    if (!match->tctx) {
>>>> +        qemu_log_mask(LOG_GUEST_ERROR, "XIVE: NVT %x/%x is not 
>>>> dispatched\n",
>>>> +                      nvt_blk, nvt_idx);
>>>> +        return false;
>>>
>>> Hmm.. this isn't actually an error isn't it? At least not for powernv
>>
>> It is on sPAPR, it would mean the END was configured with an unknow CPU. 
> 
> Right.
> 
>> It is not error on PowerNV, when we support escalations.
>>
>>> - that just means the NVT isn't currently dispatched, so we'll need to
>>> trigger the escalation interrupt.  
>>
>> Yes.
>>
>>> Does this get changed later in the series?
>>
>> No.
> 
> But this code is common to PAPR and powernv, yes, so it will need to?

When we add support for escalations, yes, it will change. Would you rather
use an error_report() until then ? 

C.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]