[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH v3 1/3] memory: introduce IOMMUNotifier and its
From: |
Peter Xu |
Subject: |
Re: [Qemu-devel] [PATCH v3 1/3] memory: introduce IOMMUNotifier and its caps |
Date: |
Thu, 8 Sep 2016 18:00:03 +0800 |
User-agent: |
Mutt/1.5.24 (2015-08-30) |
On Wed, Sep 07, 2016 at 08:20:35PM +1000, David Gibson wrote:
> On Wed, Sep 07, 2016 at 03:09:16PM +0800, Peter Xu wrote:
> > On Wed, Sep 07, 2016 at 04:02:39PM +1000, David Gibson wrote:
> > > On Wed, Sep 07, 2016 at 01:32:22PM +0800, Peter Xu wrote:
> > > > IOMMU Notifier list is used for notifying IO address mapping changes.
> > > > Currently VFIO is the only user.
> > > >
> > > > However it is possible that future consumer like vhost would like to
> > > > only listen to part of its notifications (e.g., cache invalidations).
> > > >
> > > > This patch introduced IOMMUNotifier and IOMMUNotfierCap bits for a finer
> > > > grained control of it.
> > > >
> > > > IOMMUNotifier contains a bitfield for the notify consumer describing
> > > > what kind of notification it is interested in. Currently two kinds of
> > > > notifications are defined:
> > > >
> > > > - IOMMU_NOTIFIER_CHANGE: for entry changes (additions)
> > > > - IOMMU_NOTIFIER_INVALIDATION: for entry removals (cache invalidates)
> > >
> > > As noted on the other thread, I think the correct options for your
> > > bitmap here are "map" and "unmap". Which are triggered depends on the
> > > permissions / existence of the *previous* mapping, as well as the new
> > > one.
> >
> > As mentioned in previous reply, both work for me. :)
> >
> > If you insist on changing it (without anyone that strongly
> > disagree...), I can do it in next spin.
>
> This is not just a name change I'm proposing, but a semantic change
> (or at least a clarification).
I see that kernel IOMMU driver is using map_page() and unmap_page()
for its interface. Now I prefer MAP/UNMAP.
>
> > > You could in fact have "map-read", "map-write", "unmap-read",
> > > "unmap-write" as separate bitmap options (e.g. changing a mapping from
> > > RO to WO would be both a read-unmap and write-map event). I can't see
> > > any real use for that though, so just "map" and "unmap" are probably
> > > sufficient.
> >
> > Agreed. We can enhance it in the future if there is any real
> > requirement. Before that, it would be over-engineering.
> >
> > (Btw, we should not need {read|write}_unmap in all cases. IIUC unmap
> > should not need any permission check.)
>
> In practice probably not, but they are distinct operations.
> read_unmap means a readable mapping has been removed, write_unmap
> means a writable mapping has been removed. Again - the permissions on
> the *old* mapping are what matters here.
Why they are distinct operations? Or could you help explain in what
case would we need this flag (read/write) for an unmap() operation?
>
> >
> > [...]
> >
> > > > -static void vfio_iommu_map_notify(Notifier *n, void *data)
> > > > +static void vfio_iommu_map_notify(IOMMUNotifier *n, void *data)
> > > > {
> > > > VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n);
> > > > VFIOContainer *container = giommu->container;
> > > > @@ -454,6 +454,7 @@ static void vfio_listener_region_add(MemoryListener
> > > > *listener,
> > > > section->offset_within_region;
> > > > giommu->container = container;
> > > > giommu->n.notify = vfio_iommu_map_notify;
> > > > + giommu->n.notifier_caps = IOMMU_NOTIFIER_ALL;
> > >
> > > "caps" isn't really right. It's a *requirement* that VFIO get all the
> > > notifications, not a capability. "caps" would only make sense on the
> > > other side (the vIOMMU implementation).
> >
> > Sounds reasonable. How about "flags"? Or any suggestion?
>
> "flags" would do. I feel like there should be a better name, but I
> can't think of it.
Sure. I can switch.
>
> > [...]
> >
> > > > /**
> > > > @@ -620,11 +642,12 @@ void memory_region_notify_iommu(MemoryRegion *mr,
> > > > * IOMMU translation entries.
> > > > *
> > > > * @mr: the memory region to observe
> > > > - * @n: the notifier to be added; the notifier receives a pointer to an
> > > > - * #IOMMUTLBEntry as the opaque value; the pointer ceases to be
> > > > - * valid on exit from the notifier.
> > > > + * @n: the IOMMUNotifier to be added; the notify callback receives a
> > > > + * pointer to an #IOMMUTLBEntry as the opaque value; the pointer
> > > > + * ceases to be valid on exit from the notifier.
> > > > */
> > > > -void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier
> > > > *n);
> > > > +void memory_region_register_iommu_notifier(MemoryRegion *mr,
> > > > + IOMMUNotifier *n);
> > >
> > > It seems to me that this should be allowed to fail, if the notifier
> > > you're trying to register requires notifications that the MR
> > > implementation can't supply. That seems cleaner than delaying the
> > > checking until the notification actually happens.
> >
> > Do we have real use case for this one? For example, do we have case
> > that VM "registering required notifications that MR cannot support"
> > can still work?
> >
> > Currently there is only one use case AFAIK, which is VFIO+IntelIOMMU.
> > In that case, I take it a configuration error (we should never allow
> > that configuration happen). IMHO All configuration errors should be
> > reported ASAP, and we should never let VM start.
>
> Yes... I'm not proposing changing that. I just think it would be
> cleaner to report the error through the error channels, instead of
> just aborting.
Agree. But since I have no obvious reason to change the return code,
I'd prefer to keep it as it is for this series if you don't mind.
[...]
> > > > @@ -1564,8 +1569,22 @@ void
> > > > memory_region_unregister_iommu_notifier(MemoryRegion *mr, Notifier *n)
> > > > void memory_region_notify_iommu(MemoryRegion *mr,
> > > > IOMMUTLBEntry entry)
> > > > {
> > > > + IOMMUNotifier *iommu_notifier;
> > > > + IOMMUNotifierCap request_cap;
> > > > +
> > > > assert(memory_region_is_iommu(mr));
> > > > - notifier_list_notify(&mr->iommu_notify, &entry);
> > > > +
> > > > + if (entry.perm & IOMMU_RW) {
> > > > + request_cap = IOMMU_NOTIFIER_CHANGE;
> > > > + } else {
> > > > + request_cap = IOMMU_NOTIFIER_INVALIDATION;
> > > > + }
> > >
> > > As noted right at the top, I don't think this logic is really right.
> > > An in-place change should be treated as both a map and unmap.
> >
> > IIUC, guest needs to send two invalidations for an in-place change,
> > right? So a "change" will actually trigger this twice.
>
> No, or at least not necessarily. How the invalidate is reported
> really depends on the guest side vIOMMU interface. Maybe it requires
> an explicit invalidate followed by a set, maybe a direct in place
> replacement is possible (the PAPR hypercall interface allows this at
> least in theory).
>
> What I had in mind here is that assuming the vIOMMU can detect an in
> place change, it would then ping all notifiers that have *either* the
> MAP or UNMAP flag bits set.
Yes, so I think we don't need a "change" interface. And maybe we
should start using MAP/UNMAP for the flags naming to avoid unecessary
misunderstandings (when we talk about CHANGE flag, it's actually
map(), but not unmap() + map()).
>
> > (At least this should be true for Power, and I am guessing the same
> > for Intel, otherwise PowerIOMMU+VFIO should not work before this
> > series...)
>
> I don't really follow what you're saying here.
I meant at least current Power IOMMU should be sending two notifies if
a "change" happened, otherwise current code won't work.
Thanks,
-- peterx
[Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, Peter Xu, 2016/09/07
- Re: [Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, David Gibson, 2016/09/07
- Re: [Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, Paolo Bonzini, 2016/09/07
- Re: [Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, Peter Xu, 2016/09/08
- Re: [Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, David Gibson, 2016/09/11
- Re: [Qemu-devel] [PATCH v3 2/3] memory: generalize iommu_ops.notify_started to notifier_add, Peter Xu, 2016/09/12
[Qemu-devel] [PATCH v3 3/3] intel_iommu: allow invalidation typed notifiers, Peter Xu, 2016/09/07