From: Cornelia Huck
Subject: Re: [Qemu-devel] [PATCH 4/9] s390x: refactor error handling for SSCH and RSCH
Date: Thu, 7 Sep 2017 12:24:00 +0200

On Thu, 7 Sep 2017 16:58:31 +0800
Dong Jia Shi <address@hidden> wrote:

> * Halil Pasic <address@hidden> [2017-09-06 16:43:42 +0200]:
> 
> > 
> > 
> > On 09/06/2017 04:20 PM, Cornelia Huck wrote:  
> > > On Wed, 6 Sep 2017 14:25:13 +0200
> > > Halil Pasic <address@hidden> wrote:
> > >   
> > >> We have basically two possibilities/options which ask for different
> > >> handling:
> > >> 1) EFAULT is due to a bug in the vfio-ccw implementation
> > >> (can be QEMU or kernel).
> > >> 2) EFAULT is due to buggy channel program.
> > >>
> > >> Option 2) is basically to be handled with a channel-program check and
> > >> setting primary, secondary, and alert status. For reference see PoP page
> > >> 15-59 ("Designation of Storage Area").  An exception may be an invalid
> > >> channel program address in the ORB. There the channel-program check is
> > >> not explicitly stated (although I would expect one). It may be implied
> > >> by the things on page 15-59 though.
> > >>
> > >> Option 1) is, however, very similar to other situations where we have
> > >> figured out that the implementation is broken, and should be handled
> > >> consistently with them. The current state of the discussion is to use
> > >> a unit exception.
> > >>
> > >> Does that make sense?  
> > > 
> > > I think the situation is slightly different here, though. For the orb
> > > flags, we reject something out of hand because we have not implemented
> > > it, and for that, unit exception sounds like a good fit. Processing
> > > errors, however, are more similar to errors in the hardware, and as
> > > such can probably be reported via something like equipment check.
> > >   
> > 
> > Noted. Let's see what Dong Jia has to say before we continue a
> > discussion on something (option 1) that may be irrelevant anyway.
> >   
> > >>
> > >> Now, Dong Jia, I need your help to figure out whether we have option 1
> > >> or option 2 here. After a quick look at the kernel code, it appears to
> > >> me that I've seen both option 1 and option 2 (I'm afraid) -- but my
> > >> assessment was really very superficial.  
> There are three cases (all in the kernel) that generate a -EFAULT ret
> code:
> a. vfio_ccw_mdev_write: copy_from_user() fails.
>   This is option 1.
> 
> b. ccwchain_fetch_tic
>   It's most likely that the vfio-ccw driver processed the ccw chains
>   wrongly. (Actually I cannot think of any other reason.)

Me neither, I'd consider hitting this a bug in the implementation.

>   This is option 1.
> 
> c. ccwchain_fetch_idal
>   When we find that an IDAW contains an invalid address.
>   This is option 2.
> 
> > >>
> > >> I would expect option 2 to be handled differently (kernel provides the
> > >> SCSW) though.

So we would do an equipment check for the first two ("equipment", i.e.
the software, is malfunctioning) and use a more appropriate way for the
malformed idaw?
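
To make the proposal concrete, here is a rough sketch of the mapping being
discussed. All names are hypothetical; this is not existing QEMU or kernel
code, and today the kernel only hands back a plain -EFAULT, so the "source"
would first have to be made distinguishable. The channel-program check for
case c is taken from Halil's description of option 2 and is only one
candidate for the "more appropriate way".

```c
/* Hypothetical sketch: none of these names exist in QEMU or the kernel. */

/* The three kernel paths listed above as returning -EFAULT. */
enum efault_source {
    EFAULT_MDEV_WRITE,  /* a. vfio_ccw_mdev_write: copy_from_user() fails */
    EFAULT_FETCH_TIC,   /* b. ccwchain_fetch_tic: chain processed wrongly */
    EFAULT_FETCH_IDAL   /* c. ccwchain_fetch_idal: IDAW has a bad address */
};

/* How the condition would be reported to the guest. */
enum report_kind {
    REPORT_EQUIPMENT_CHECK,       /* option 1: implementation is at fault */
    REPORT_CHANNEL_PROGRAM_CHECK  /* option 2: channel program is at fault */
};

/* Map an -EFAULT source to the kind of report proposed in this thread. */
static enum report_kind classify_efault(enum efault_source src)
{
    switch (src) {
    case EFAULT_FETCH_IDAL:
        /* Guest-supplied IDAW is invalid: blame the channel program. */
        return REPORT_CHANNEL_PROGRAM_CHECK;
    case EFAULT_MDEV_WRITE:
    case EFAULT_FETCH_TIC:
    default:
        /* Bug in vfio-ccw (QEMU or kernel): treat like broken hardware. */
        return REPORT_EQUIPMENT_CHECK;
    }
}

/* Human-readable name, e.g. for a trace message. */
static const char *report_name(enum report_kind kind)
{
    return kind == REPORT_CHANNEL_PROGRAM_CHECK
        ? "channel-program check"
        : "equipment check";
}
```

Unknown sources fall into the equipment-check branch on purpose: an
-EFAULT we cannot attribute to the guest's channel program is, by the
reasoning above, an implementation problem.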


