qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] intel-iommu: ignore SNP bit in scalable mode


From: Peter Xu
Subject: Re: [PATCH] intel-iommu: ignore SNP bit in scalable mode
Date: Mon, 29 Nov 2021 09:19:08 +0800

On Sun, Nov 28, 2021 at 07:06:18AM +0000, Liu, Yi L wrote:
> > From: Peter Xu <peterx@redhat.com>
> > Sent: Thursday, November 25, 2021 2:14 PM
> > 
> > On Thu, Nov 25, 2021 at 05:49:38AM +0000, Liu, Yi L wrote:
> > > > From: Peter Xu <peterx@redhat.com>
> > > > Sent: Thursday, November 25, 2021 12:31 PM
> > > >
> > > > On Thu, Nov 25, 2021 at 04:03:34AM +0000, Liu, Yi L wrote:
> > > > > > From: Peter Xu <peterx@redhat.com>
> > > > > > Sent: Wednesday, November 24, 2021 3:57 PM
> > > > > >
> > > > > > On Wed, Nov 24, 2021 at 02:03:09PM +0800, Jason Wang wrote:
> > > > > > > When booting with scalable mode, I hit this error:
> > > > > > >
> > > > > > > qemu-system-x86_64: vtd_iova_to_slpte: detected splte reserve
> > non-
> > > > > > zero iova=0xfffff002, level=0x1slpte=0x102681803)
> > > > > > > qemu-system-x86_64: vtd_iommu_translate: detected translation
> > > > failure
> > > > > > (dev=01:00:00, iova=0xfffff002)
> > > > > > > qemu-system-x86_64: New fault is not recorded due to
> > compression
> > > > of
> > > > > > faults
> > > > > > >
> > > > > > > This is because the SNP bit is set since Linux kernel commit
> > > > > > > 6c00612d0cba1 ("iommu/vt-d: Report right snoop capability when
> > > > using
> > > > > > > FL for IOVA") where SNP bit is set if scalable mode is on though 
> > > > > > > this
> > > > > > > seems to be an violation on the spec which said the SNP bit is
> > > > > > > considered to be reserved if SC is not supported.
> > > > > >
> > > > > > When I was reading that commit, I was actually confused by this
> > change:
> > > > > >
> > > > > > ---8<---
> > > > > > diff --git a/drivers/iommu/intel/iommu.c
> > > > b/drivers/iommu/intel/iommu.c
> > > > > > index 956a02eb40b4..0ee5f1bd8af2 100644
> > > > > > --- a/drivers/iommu/intel/iommu.c
> > > > > > +++ b/drivers/iommu/intel/iommu.c
> > > > > > @@ -658,7 +658,14 @@ static int
> > > > domain_update_iommu_snooping(struct
> > > > > > intel_iommu *skip)
> > > > > >         rcu_read_lock();
> > > > > >         for_each_active_iommu(iommu, drhd) {
> > > > > >                 if (iommu != skip) {
> > > > > > -                       if (!ecap_sc_support(iommu->ecap)) {
> > > > > > +                       /*
> > > > > > +                        * If the hardware is operating in the 
> > > > > > scalable mode,
> > > > > > +                        * the snooping control is always supported 
> > > > > > since we
> > > > > > +                        * always set PASID-table-entry.PGSNP bit 
> > > > > > if the domain
> > > > > > +                        * is managed outside (UNMANAGED).
> > > > > > +                        */
> > > > > > +                       if (!sm_supported(iommu) &&
> > > > > > +                           !ecap_sc_support(iommu->ecap)) {
> > > > > >                                 ret = 0;
> > > > > >                                 break;
> > > > > >                         }
> > > > > > ---8<---
> > > > > >
> > > > > > Does it mean that for some hardwares that has
> > sm_supported()==true,
> > > > it'll
> > > > > > have  SC bit cleared in ecap register?  That sounds odd, and not 
> > > > > > sure
> > why.
> > > > Maybe
> > > > > > Yi Liu or Yi Sun may know?
> > > > >
> > > > > scalable mode has no dependency on SC, so it's possible.
> > > >
> > > > I see; thanks, Yi.
> > > >
> > > > However then OTOH I don't understand above comment
> > > >
> > > >   "If the hardware is operating in the scalable mode, the snooping 
> > > > control
> > is
> > > >    always supported since... "
> > > >
> > > > Because the current qemu vt-d emulation should fall into the case that 
> > > > Yi
> > > > mentioned - we support initial scalable mode but no SC yet..
> > >
> > > chapter 3.9 of 3.2 spec says below.
> > >
> > > “If the remapping hardware is setup in scalable-mode
> > (RTADDR_REG.TTM=01b)
> > > and the Page Snoop (PGSNP) field in PASID-table entry is Set, access to
> > the
> > > final page is snooped.”
> > >
> > > It means the PGSNP field is available under scalable mode. And spec also
> > > says below in chapter 96. of 3.2 spec.
> > >
> > > "Requests snoop processor caches irrespective of, other attributes in the
> > > request or other fields in paging structure entries used to translate the
> > > request."
> > >
> > > It means the PGSNP field of PASID table entry is the first class control
> > > of the snoop behaviour. Also it means the scalable mode has the snoop
> > > control by default. ^_^. So the comment in the above commit is correct
> > > since the policy of intel iommu driver is always setting the PGSNP bit.
> > 
> > I see.  Setting PGSNP bit in the pasid entry looks fine to me.
> > 
> > However IIUC what's triggering the crash (that Jason is fixing) is the guest
> > iommu driver "thinks" SC is supported since scalable is enabled (even if
> > qemu vIOMMU has declared ECAP.SC==0 there), then it'll update
> > iommu_snooping bit, then it'll try to attach the SNP bit in the 2nd level
> > pgtable (intel_iommu_map):
> > 
> >     if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> >             prot |= DMA_PTE_SNP;
> > 
> 
> Above is the fact today.
> 
> > So what I'm wondering is: whether the kernel should _not_ set SNP bit in
> > the 2nd level pgtable, even if we set PGSNP in the pasid entry.. because
> > as you mentioned, the hardware (here the emulated vIOMMU) is allowed to have
> > both scalable==1 but sc==0 so it may recognize PGSNP in pasid entry but not
> > the SNP bit in pgtables.
> 
> This is a weird configuration. :( let's fix it in spec. e.g. PGSNP bit is
> supported only when scalable==1 and sc==1. this makes more sense. right?
> Then SNP bit will always depend on sc bit, it won't have any relationship
> with scalable bit.

Sounds reasonable.

> 
> > 
> > If we'll skip pgtable SNP bit anyway for scalable mode, it looks weird to
> > explicitly set it too.
> > 
> > I think it's fine for Jason's solution to just skip checking SNP bit so we
> > ignore it in qemu, however just to double check we're on the same page.
> 
> If we have above fix in spec, the iommu driver would also update its
> behavior on the SNP bit. then the problem encountered by Jason will
> go away. right?

Looks right to me.

I think we can still have Jason's patch continued because the kernel commit to
apply SNP bit is merged in v5.13, so we may need the qemu change to let it
still work with v5.13-v5.15+ guest kernels.  We'll loose the resv bit check a
bit, but looks worthwhile.  Jason?

Thanks,

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]