qemu-devel

Re: DMAR fault with qemu 7.2 and Ubuntu 22.04 base image


From: Klaus Jensen
Subject: Re: DMAR fault with qemu 7.2 and Ubuntu 22.04 base image
Date: Wed, 15 Feb 2023 08:35:57 +0100

On Feb 15 12:01, Major Saheb wrote:
> > Assuming you are *not* explicitly configuring shadow doorbells, then I
> > think you might have a broken driver that does not properly reset the
> > controller before using it (are you tripping CC.EN?). That could explain
> > the admin queue size of 32 (default admin queue depth for the Linux nvme
> > driver) as well as the db/ei_addrs being left over. And behavior wrt.
> > how the Linux driver disables the device might have changed between the
> > kernel version used in Ubuntu 20.04 and 22.04.
> 
> Thanks Klaus, I didn't have the driver source, so I acquired it and
> looked into it; the driver was not toggling CC.EN nor waiting for
> CSTS.RDY the right way. So I implemented the proper sequence and it
> started working perfectly.
> - R
> 
> On Tue, Feb 14, 2023 at 8:26 PM Klaus Jensen <its@irrelevant.dk> wrote:
> >
> > On Feb 14 14:05, Klaus Jensen wrote:
> > > On Feb 14 17:34, Major Saheb wrote:
> > > > Thanks Peter for the reply. I tried connecting gdb to qemu and was
> > > > able to break on 'vtd_iova_to_slpte()'. I dumped the following with
> > > > both the Ubuntu 20.04 base image container, which is the success
> > > > case, and the Ubuntu 22.04 base image container, which is the
> > > > failure case.
> > > > One thing I observed is that NvmeSQueue::dma_addr is correctly set
> > > > to '0x800000000' in the success case, however in the failure case
> > > > this value is 0x1196b1000. A closer look indicates more fields in
> > > > NvmeSQueue might be corrupted; for example, we set the admin queue
> > > > size to 512, but in the failure case it shows 32.
> > > >
> > >
> > > Hi Major,
> > >
> > > It's obviously pretty bad if hw/nvme somehow corrupts the SQ structure,
> > > but it's difficult to say from this output.
> > >
> > > Are you configuring shadow doorbells (the db_addr and ei_addrs are
> > > set in both cases)?
> > >
> > > > > > Following is the partial qemu command line that I am using
> > > > > >
> > > > > > -device 
> > > > > > intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on,aw-bits=48
> > > > > >
> > >
> > > I'm not sure if caching-mode=on and device-iotlb=on lead to any issues
> > > here? As far as I understand, these are mostly used with stuff like
> > > vhost. I've tested and developed vfio-based drivers against hw/nvme
> > > extensively and I'm not using anything besides `-device intel-iommu`.
> > >
> > > Do I understand correctly that your setup is "just" an Ubuntu 22.04
> > > guest with a container and a user-space driver to interact with the
> > > nvme devices available on the guest? No nested virtualization with
> > > vfio passthrough?
> >
> > Assuming you are *not* explicitly configuring shadow doorbells, then I
> > think you might have a broken driver that does not properly reset the
> > controller before using it (are you tripping CC.EN?). That could explain
> > the admin queue size of 32 (default admin queue depth for the Linux nvme
> > driver) as well as the db/ei_addrs being left over. And behavior wrt.
> > how the Linux driver disables the device might have changed between the
> > kernel version used in Ubuntu 20.04 and 22.04.

Awesome. Occam's Razor strikes again ;)
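
For anyone else hitting the same symptom, the handshake Major's fix boils
down to is the controller reset/enable sequence from the NVMe spec. A
minimal sketch in C, assuming `regs` points at the controller's mapped
BAR0 (register offsets and bit positions are per the spec; the function
and argument names are illustrative, and timeout handling plus the other
CC fields a real driver must program are omitted):

#include <stdint.h>

#define NVME_REG_CC    0x14
#define NVME_REG_CSTS  0x1c
#define NVME_REG_AQA   0x24
#define NVME_REG_ASQ   0x28
#define NVME_REG_ACQ   0x30

#define NVME_CC_EN     (1u << 0)
#define NVME_CSTS_RDY  (1u << 0)

static void nvme_reset_and_enable(volatile uint8_t *regs,
                                  uint64_t asq_dma, uint64_t acq_dma,
                                  uint16_t qdepth)
{
    volatile uint32_t *cc   = (volatile uint32_t *)(regs + NVME_REG_CC);
    volatile uint32_t *csts = (volatile uint32_t *)(regs + NVME_REG_CSTS);

    /* 1. Clear CC.EN and wait for CSTS.RDY to drop -- the reset step the
     *    broken driver skipped. */
    *cc &= ~NVME_CC_EN;
    while (*csts & NVME_CSTS_RDY)
        ;

    /* 2. Program the admin queue attributes and base addresses while the
     *    controller is disabled (queue sizes are 0's based). */
    *(volatile uint32_t *)(regs + NVME_REG_AQA) =
        ((uint32_t)(qdepth - 1) << 16) | (uint32_t)(qdepth - 1);
    *(volatile uint64_t *)(regs + NVME_REG_ASQ) = asq_dma;
    *(volatile uint64_t *)(regs + NVME_REG_ACQ) = acq_dma;

    /* 3. Re-enable the controller and wait for CSTS.RDY before issuing
     *    any admin commands. */
    *cc |= NVME_CC_EN;
    while (!(*csts & NVME_CSTS_RDY))
        ;
}

A real driver should bound both polls by CAP.TO and also set IOSQES,
IOCQES and MPS in CC; the sketch only shows the ordering that was missing.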

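On the qemu side, the minimal configuration described above (just
`-device intel-iommu`, no extra IOMMU options) looks roughly like the
following; the disk image path and serial are placeholders, not values
taken from the thread:

qemu-system-x86_64 \
    -machine q35 \
    -device intel-iommu \
    -drive file=nvme.img,if=none,id=nvm,format=raw \
    -device nvme,serial=deadbeef,drive=nvm \
    ...

If intremap=on is wanted (as in the original command line), the machine
additionally needs kernel-irqchip=split.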

