qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and


From: Halil Pasic
Subject: Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH
Date: Wed, 6 Sep 2017 00:30:58 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0


On 09/05/2017 06:25 PM, Cornelia Huck wrote:
> On Tue, 5 Sep 2017 17:55:17 +0200
> Halil Pasic <address@hidden> wrote:
> 
>> On 08/31/2017 11:55 AM, Cornelia Huck wrote:
>>> On Wed, 30 Aug 2017 18:36:04 +0200
>>> Halil Pasic <address@hidden> wrote:
>>>   
>>>> Simplify the error handling of the SSCH and RSCH handler avoiding
>>>> arbitrary and cryptic error codes being mapped to what a subchannel is
>>>> supposed to do.  Let the code detecting the condition tell how it's to be
>>>> handled in a less ambiguous way.  It's best to handle SSCH and RSCH in
>>>> one go as the emulation of the two shares a lot of code.  
>>>
>>> So determining the return code at the point in time where we can
>>> instead of trying to map to return codes and back again makes quite a
>>> bit of sense, but I'll have to look at the rest of this.  
>>
>>
>> Looging forward to.
> 
> Once I manage to find some quiet time for thinking :(

I think it's the best if you ignore the rest until v2.

Since we agreed that cc 3 is not the way to go and that the basic
idea is sane, IMHO it does not make much sense to do a thorough
review of this any more.

Not diverting valuable maintainer resources from my indirect data 
access patch set is also a point (IDA is something I have to deliver, while
this is just for fun).
 
> 
>>>> -    ret = s390_ccw_cmd_request(orb, s, sch->driver_data);
>>>> -    switch (ret) {
>>>> -    /* Currently we don't update control block and just return the cc 
>>>> code. */
>>>> -    case 0:
>>>> -        break;
>>>> -    case -EBUSY:
>>>> -        break;
>>>> -    case -ENODEV:
>>>> -        break;
>>>> -    case -EFAULT:
>>>> -         break;
>>>> -    case -EACCES:
>>>> -        /* Let's reflect an inaccessible host device by cc 3. */
>>>> -    default:
>>>> -        /* Let's make all other return codes map to cc 3.  */
>>>> -        ret = -ENODEV;
>>>> -    };
>>>> -
>>>> -    return ret;
>>>> +    s390_ccw_cmd_request(sch);  
>>>
>>> As you change the handling anyway: Don't change this logic in prior
>>> patches?  
>>
>> I preserve the logic as altered by the previous patches (at least,
>> that is the intention). This is the mapping around part which is going
>> away if we push the handling down.
> 
> My point is that you touch the code path twice. But we disagreed on the
> original change anyway :)

Nod.

> 
>>>> -int do_subchannel_work_passthrough(SubchDev *sch)
>>>> +void do_subchannel_work_passthrough(SubchDev *sch)
>>>>  {
>>>> -    int ret;
>>>>      SCSW *s = &sch->curr_status.scsw;
>>>> 
>>>>      if (s->ctrl & SCSW_FCTL_CLEAR_FUNC) {
>>>>          /* TODO: Clear handling */
>>>>          sch_handle_clear_func(sch);
>>>> -        ret = 0;
>>>>      } else if (s->ctrl & SCSW_FCTL_HALT_FUNC) {
>>>>          /* TODO: Halt handling */
>>>>          sch_handle_halt_func(sch);
>>>> -        ret = 0;
>>>>      } else if (s->ctrl & SCSW_FCTL_START_FUNC) {
>>>> -        ret = sch_handle_start_func_passthrough(sch);
>>>> +        sch_handle_start_func_passthrough(sch);
>>>>      } else {
>>>>          /* Cannot happen. */
>>>> -        return -ENODEV;
>>>> +        sch->iret.cc = 3;  
>>>
>>> ftcl == 0 should have been rejected already by the function handlers
>>> here as well, shouldn't it?  
>>
>> Which function handlers. Basically the instruction handlers set fctl
>> and call do_subchannel_work to get the indicated work done.
> 
> Or set. My point is that we can't get here with fctl == 0. So an assert
> sounds better to me.
> 
Yeah, I agree assert is the way to go here.

> 
>>>> diff --git a/include/hw/s390x/css.h b/include/hw/s390x/css.h
>>>> index 5c5fe6b202..d093181a9e 100644
>>>> --- a/include/hw/s390x/css.h
>>>> +++ b/include/hw/s390x/css.h
>>>> @@ -94,13 +94,31 @@ struct SubchDev {
>>>>      /* transport-provided data: */
>>>>      int (*ccw_cb) (SubchDev *, CCW1);
>>>>      void (*disable_cb)(SubchDev *);
>>>> -    int (*do_subchannel_work) (SubchDev *);
>>>> +    void (*do_subchannel_work) (SubchDev *);
>>>>      SenseId id;
>>>>      void *driver_data;
>>>> +    /* io instructions conclude according to iret */
>>>> +    struct {
>>>> +        /*
>>>> +         * General semantic of cc codes of IO instructions is (brief):
>>>> +         * 0 -- produced expected result
>>>> +         * 1 -- produced alternate result
>>>> +         * 2 -- ineffective, because busy with previously initiated 
>>>> function
>>>> +         * 3 -- ineffective, not operational  
>>>
>>> I'm not sure you can summarize this that way in all cases.
>>>
>>> Also, what does "ineffective" mean? If I get a cc 1 for, say, ssch, I
>>> don't expect something either as the subchannel was already status
>>> pending.  
>>
>> You are right cc 1 would have been better off with
>>  'produced alternate result or status pending'
>>
>> I've tried to make this shorter (from PoP 14-2):
>> """
>> Condition Code 0: Instruction execution produced
>> the expected or most probable result. (See “Deferred
>> Condition Code (CC)” on page 9 for a description of
>> conditions that can be encountered subsequent to
>> the presentation of condition code 0 that result in a
>> nonzero deferred condition code.)
>> Condition Code 1: Instruction execution produced
>> the alternate or second-most-probable result, or sta-
>> tus conditions were present that may or may not have
>> prevented the expected result.
>> Condition Code 2: Instruction execution was inef-
>> fective because the designated subchannel or chan-
>> nel-subsystem facility was busy with a previously
>> initiated function.
>> Condition Code 3: Instruction execution was inef-
>> fective because the designated element was not
>> operational or because some condition precluded ini-
>> tiation of the normal function.
>> """
>>
>> Well ineffective means not-effective ;).
> 
> Yes :)
> 
> I think the summary in the PoP is already a bit on the over-generalized
> side, and condensing it further makes it rather ineffective ;) at
> explaining what happens. Frankly, I'd just drop this and rely on
> interested parties referring to the PoP.
> 

That's what I did in the first place, but then Janosch complained.
I will meditate on this (having some sort of explanation is helpful
and I think cc 0 2 and 3 are actually quite OK).

>>
>>>   
>>>> +         */
>>>> +        uint32_t cc:4;
>>>> +        bool pgm_chk:1;  
>>>
>>> This looks weird?>  
>>
>> What do you mean? Any suggestions how to do it better?
> 
> Taking one bit from a bool looks odd. I'd either use a bool normally or
> take one bit from an uint8_t.
> 

I have to think about this.

>>  
>>>> +        uint32_t irq_code;
>>>> +    } iret;
>>>>  };
>>>>  
>>>>  extern const VMStateDescription vmstate_subch_dev;  
> 
>> But I guess, I was afraid of somebody blaming me for this
>> comment (such TODOs in production code are a bit strange -- what
>> does it mean: we did not bother to figure it out?).
> 
> For one, the TODO is preexisting... and we really should remember to
> figure out if there's something better rather than just drop the
> comment.
> 
> (And I sure hope nobody is using vfio-ccw in production yet...)
>
Since blame says the TODO has been around since 2017-05-17
let me have a critical look at it.

At a first glance I would say addressing exception for SSCH
is not what we want: the only possibility I see for address
exception for SSCH is due to the ORB address. But if that's
the case we will never reach the code in question. So I suppose
we can do better.

Adding Ren. @Ren: Do you agree with my analysis. If you do,
I could come up with a proposal what to do -- after some reading.

Regards,
Halil




reply via email to

[Prev in Thread] Current Thread [Next in Thread]