Re: [Qemu-devel] [PATCH] RFC: hcd-ohci: add dma error handling
From: Benjamin Herrenschmidt
Subject: Re: [Qemu-devel] [PATCH] RFC: hcd-ohci: add dma error handling
Date: Wed, 17 Jul 2013 22:47:04 +1000
On Wed, 2013-07-17 at 14:31 +0200, Alexander Graf wrote:
> On 17.07.2013, at 13:15, Benjamin Herrenschmidt wrote:
>
> > On Wed, 2013-07-17 at 19:46 +1000, Alexey Kardashevskiy wrote:
> >> Current hcd-ohci does not handle DMA errors which can actually
> >> happen.
> >>
> >> However it is not clear what approach should be used here -
> >> for example, get_dwords returns positive number saying that there
> >> is no error as all the callers consider the return value as fail
> >> if it is less than zero. Normally you would expect bool=true/int=0
> >> as success and bool=false/int=-1 as fail.
> >>
> >> Any suggestion?
> >
> > The right thing to do is not only to bring the error up the stack, but
> > essentially to set the error bits in the PCI command status and put the
> > whole HCI in error state (and stop operating)
> >
> > That's how real HW reacts.
>
> Who does that? I always assumed it's the IOMMU that kills the device
> when it accesses regions it's not allowed to access.
Hah, no, IOMMUs only "kill devices" on fancy HW like powerpc :-)
On these, when any kind of error occurs, we isolate the entire thing.
> On real hardware, memory transfers don't have error return codes, do they?
No, they sort of do :-)
For example, on PCI, there are 3 common causes of errors: Parity, Target
Aborts and Master Aborts. The first is somewhat obvious, the second
means the target aborted the cycle before completion, and the last usually
means no target responded (timeout). There are two physical lines used
to convey error information (and potentially abort cycles), PERR and
SERR.
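
To make the three error causes concrete, here is a minimal sketch of how a
bus master could latch them in the 16-bit PCI Status register (config space
offset 0x06). The bit positions are the standard PCI ones; the enum and the
two helper functions are hypothetical names invented for this illustration
and are not taken from QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Error bits of the PCI Status register, as seen by a bus master. */
#define PCI_STATUS_REC_TARGET_ABORT  0x1000  /* bit 12: target aborted our cycle */
#define PCI_STATUS_REC_MASTER_ABORT  0x2000  /* bit 13: no target responded */
#define PCI_STATUS_DETECTED_PARITY   0x8000  /* bit 15: parity error detected */

#define PCI_STATUS_ERROR_BITS \
    (PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \
     PCI_STATUS_DETECTED_PARITY)

/* Hypothetical classification of a failed bus-master (DMA) cycle. */
enum dma_fault {
    DMA_FAULT_PARITY,
    DMA_FAULT_TARGET_ABORT,
    DMA_FAULT_MASTER_ABORT,
};

/* Latch the status bit that matches the fault. The bits are sticky. */
static void pci_status_latch(uint16_t *status, enum dma_fault fault)
{
    assert(status != NULL);
    switch (fault) {
    case DMA_FAULT_PARITY:
        *status |= PCI_STATUS_DETECTED_PARITY;
        break;
    case DMA_FAULT_TARGET_ABORT:
        *status |= PCI_STATUS_REC_TARGET_ABORT;
        break;
    case DMA_FAULT_MASTER_ABORT:
        *status |= PCI_STATUS_REC_MASTER_ABORT;
        break;
    }
}

/* Software clears an error bit by writing 1 to it (write-one-to-clear). */
static void pci_status_write(uint16_t *status, uint16_t val)
{
    assert(status != NULL);
    *status &= (uint16_t)~(val & PCI_STATUS_ERROR_BITS);
}
```

A driver that finds one of these bits set would typically log it and then
write the same bit back to clear it.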
Depending on the details of the bus protocol, the error causes can be a
bit different. On PCIe you can actually send error messages up the
link; transactions are packets and can result in an error response,
etc.
Since QEMU mostly emulates PCI, let's stick to that. An IOMMU error will
typically be a target abort, so the device should react as such.
A typical O/EHCI will stop operating, set itself into error state (which
can be queried by MMIO) and will set something like PERR in its config
space to signal that it got an error.
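
As a rough illustration of that behaviour, here is a self-contained sketch
of a toy controller model. It is not the hcd-ohci code: the struct, the
function names and the "halted" flag are all hypothetical. The UE
(UnrecoverableError) bit position is the one from the OHCI HcInterruptStatus
register, and flagging Received Target Abort rather than a parity bit is an
assumption made here to match the target-abort case above. The helpers use
the 0 = success, -1 = failure convention raised in the original question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PCI_STATUS_REC_TARGET_ABORT 0x1000     /* PCI Status, bit 12 */
#define OHCI_INTR_UE                (1u << 4)  /* HcInterruptStatus: UnrecoverableError */

/* Hypothetical, heavily reduced controller state. */
struct ohci_model {
    uint16_t pci_status;   /* PCI config space Status register */
    uint32_t intr_status;  /* MMIO-visible interrupt status */
    bool     halted;       /* true once the controller has stopped */
};

/* Toy DMA read: 0 on success, -1 if the range falls outside "guest memory". */
static int model_dma_read(const uint8_t *mem, size_t mem_len,
                          uint32_t addr, void *buf, size_t len)
{
    if (addr > mem_len || len > mem_len - addr) {
        return -1;
    }
    memcpy(buf, mem + addr, len);
    return 0;
}

/* Put the controller in error state: flag the abort, raise UE, stop. */
static void model_die(struct ohci_model *s)
{
    assert(s != NULL);
    s->pci_status  |= PCI_STATUS_REC_TARGET_ABORT;
    s->intr_status |= OHCI_INTR_UE;
    s->halted = true;
}

/* Fetch one 4-dword endpoint descriptor; 0 on success, -1 on failure. */
static int model_process_ed(struct ohci_model *s, const uint8_t *mem,
                            size_t mem_len, uint32_t ed_addr)
{
    uint32_t ed[4];

    assert(s != NULL);
    if (s->halted) {
        return -1;  /* a dead controller does no further work */
    }
    if (model_dma_read(mem, mem_len, ed_addr, ed, sizeof(ed)) < 0) {
        model_die(s);  /* the error is not just passed up the stack */
        return -1;
    }
    return 0;
}
```

The point of the design is that the error is recorded in device state that
the guest can query, and every later call fails until the guest resets the
controller, instead of only returning an error code to the caller.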
Cheers,
Ben.