From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH 1/2] KVM: page track: add a new notifier type: track_flush_slot
Date: Tue, 18 Oct 2016 08:59:18 -0600

On Tue, 18 Oct 2016 20:38:21 +0800
Jike Song <address@hidden> wrote:
> On 10/18/2016 12:02 AM, Alex Williamson wrote:
> > On Fri, 14 Oct 2016 15:19:01 -0700
> > Neo Jia <address@hidden> wrote:
> >
> >> On Fri, Oct 14, 2016 at 10:51:24AM -0600, Alex Williamson wrote:
> >>> On Fri, 14 Oct 2016 09:35:45 -0700
> >>> Neo Jia <address@hidden> wrote:
> >>>
> >>>> On Fri, Oct 14, 2016 at 08:46:01AM -0600, Alex Williamson wrote:
> >>>>> On Fri, 14 Oct 2016 08:41:58 -0600
> >>>>> Alex Williamson <address@hidden> wrote:
> >>>>>
> >>>>>> On Fri, 14 Oct 2016 18:37:45 +0800
> >>>>>> Jike Song <address@hidden> wrote:
> >>>>>>
> >>>>>>> On 10/11/2016 05:47 PM, Paolo Bonzini wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 11/10/2016 11:21, Xiao Guangrong wrote:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 10/11/2016 04:54 PM, Paolo Bonzini wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 11/10/2016 04:39, Xiao Guangrong wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 10/11/2016 02:32 AM, Paolo Bonzini wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 10/10/2016 20:01, Neo Jia wrote:
> >>>>>>>>>>>>>> Hi Neo,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> AFAIK this is needed because KVMGT doesn't paravirtualize the
> >>>>>>>>>>>>>> PPGTT,
> >>>>>>>>>>>>>> while nVidia does.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Paolo and Xiaoguang,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am just wondering how device driver can register a notifier
> >>>>>>>>>>>>> so he
> >>>>>>>>>>>>> can be
> >>>>>>>>>>>>> notified for write-protected pages when writes are happening.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> It can't yet, but the API is ready for that. kvm_vfio_set_group
> >>>>>>>>>>>> is
> >>>>>>>>>>>> currently where a struct kvm_device* and struct vfio_group*
> >>>>>>>>>>>> touch.
> >>>>>>>>>>>> Given
> >>>>>>>>>>>> a struct kvm_device*, dev->kvm provides the struct kvm to be
> >>>>>>>>>>>> passed to
> >>>>>>>>>>>> kvm_page_track_register_notifier. So I guess you could add a
> >>>>>>>>>>>> callback
> >>>>>>>>>>>> that passes the struct kvm_device* to the mdev device.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xiaoguang and Guangrong, what were your plans? We discussed it
> >>>>>>>>>>>> briefly
> >>>>>>>>>>>> at KVM Forum but I don't remember the details.
> >>>>>>>>>>>
> >>>>>>>>>>> Your suggestion was that pass kvm fd to KVMGT via VFIO, so that
> >>>>>>>>>>> we can
> >>>>>>>>>>> figure out the kvm instance based on the fd.
> >>>>>>>>>>>
> >>>>>>>>>>> We got a new idea, how about search the kvm instance by
> >>>>>>>>>>> mm_struct, it
> >>>>>>>>>>> can work as KVMGT is running in the vcpu context and it is much
> >>>>>>>>>>> more
> >>>>>>>>>>> straightforward.
> >>>>>>>>>>
> >>>>>>>>>> Perhaps I didn't understand your suggestion, but the same
> >>>>>>>>>> mm_struct can
> >>>>>>>>>> have more than 1 struct kvm so I'm not sure that it can work.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> vcpu->pid is valid during vcpu running so that it can be used to
> >>>>>>>>> figure
> >>>>>>>>> out which kvm instance owns the vcpu whose pid is the one as current
> >>>>>>>>> thread, i think it can work. :)
> >>>>>>>>
> >>>>>>>> No, don't do that. There's no reason for a thread to run a single
> >>>>>>>> VCPU,
> >>>>>>>> and if you can have multiple VCPUs you can also have multiple VCPUs
> >>>>>>>> from
> >>>>>>>> multiple VMs.
> >>>>>>>>
> >>>>>>>> Passing file descriptors around are the right way to connect
> >>>>>>>> subsystems.
> >>>>>>>
> >>>>>>> [CC Alex, Kevin and Qemu-devel]
> >>>>>>>
> >>>>>>> Hi Paolo & Alex,
> >>>>>>>
> >>>>>>> IIUC, passing file descriptors means touching QEMU and the UAPI
> >>>>>>> between
> >>>>>>> QEMU and VFIO. Would you guys have a look at below draft patch? If
> >>>>>>> it's
> >>>>>>> on the correct direction, I'll send the split ones. Thanks!
> >>>>>>>
> >>>>>>> --
> >>>>>>> Thanks,
> >>>>>>> Jike
> >>>>>>>
> >>>>>>>
> >>>>>>> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> >>>>>>> index bec694c..f715d37 100644
> >>>>>>> --- a/hw/vfio/pci-quirks.c
> >>>>>>> +++ b/hw/vfio/pci-quirks.c
> >>>>>>> @@ -10,12 +10,14 @@
> >>>>>>> * the COPYING file in the top-level directory.
> >>>>>>> */
> >>>>>>>
> >>>>>>> +#include <sys/ioctl.h>
> >>>>>>> #include "qemu/osdep.h"
> >>>>>>> #include "qemu/error-report.h"
> >>>>>>> #include "qemu/range.h"
> >>>>>>> #include "qapi/error.h"
> >>>>>>> #include "hw/nvram/fw_cfg.h"
> >>>>>>> #include "pci.h"
> >>>>>>> +#include "sysemu/kvm.h"
> >>>>>>> #include "trace.h"
> >>>>>>>
> >>>>>>> /* Use uin32_t for vendor & device so PCI_ANY_ID expands and cannot
> >>>>>>> match hw */
> >>>>>>> @@ -1844,3 +1846,15 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice
> >>>>>>> *vdev)
> >>>>>>> break;
> >>>>>>> }
> >>>>>>> }
> >>>>>>> +
> >>>>>>> +void vfio_quirk_kvmgt(VFIOPCIDevice *vdev)
> >>>>>>> +{
> >>>>>>> + int vmfd;
> >>>>>>> +
> >>>>>>> + if (!kvm_enabled() || !vdev->kvmgt)
> >>>>>>> + return;
> >>>>>>> +
> >>>>>>> + /* Tell the device what KVM it attached */
> >>>>>>> + vmfd = kvm_get_vmfd(kvm_state);
> >>>>>>> + ioctl(vdev->vbasedev.fd, VFIO_SET_KVMFD, vmfd);
> >>>>>>> +}
> >>>>>>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >>>>>>> index a5a620a..8732552 100644
> >>>>>>> --- a/hw/vfio/pci.c
> >>>>>>> +++ b/hw/vfio/pci.c
> >>>>>>> @@ -2561,6 +2561,8 @@ static int vfio_initfn(PCIDevice *pdev)
> >>>>>>> return ret;
> >>>>>>> }
> >>>>>>>
> >>>>>>> + vfio_quirk_kvmgt(vdev);
> >>>>>>> +
> >>>>>>> /* Get a copy of config space */
> >>>>>>> ret = pread(vdev->vbasedev.fd, vdev->pdev.config,
> >>>>>>> MIN(pci_config_size(&vdev->pdev), vdev->config_size),
> >>>>>>> @@ -2832,6 +2834,7 @@ static Property vfio_pci_dev_properties[] = {
> >>>>>>> DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
> >>>>>>> sub_device_id, PCI_ANY_ID),
> >>>>>>> DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
> >>>>>>> + DEFINE_PROP_BOOL("kvmgt", VFIOPCIDevice, kvmgt, false),
> >>>>>>
> >>>>>> Just a side note, device options are a headache, users are prone to get
> >>>>>> them wrong and minimally it requires an entire round to get libvirt
> >>>>>> support. We should be able to detect from the device or vfio API
> >>>>>> whether such a call is required. Obviously if we can use the existing
> >>>>>> kvm-vfio device, that's the better option anyway. Thanks,
> >>>>>
> >>>>> Also, vfio devices currently have no hard dependencies on KVM, if kvmgt
> >>>>> does, it needs to produce a device failure when unavailable. Thanks,
> >>>>>
> >>>>
> >>>> Also, I would like to see this as an generic feature instead of
> >>>> kvmgt specific interface, so we don't have to add new options to QEMU
> >>>> and it is
> >>>> up to the vendor driver to proceed with or without it.
> >>>
> >>> In general this should be decided by lack of some required feature
> >>> exclusively provided by KVM. I would not want to add a generic opt-out
> >>> for mdev vendor drivers to decide that they arbitrarily want to disable
> >>> that path. Thanks,
> >>
> >> IIUC, you are suggesting that this path should be controlled by KVM
> >> feature cap
> >> and it will be accessible to VFIO users when such checking is satisfied.
> >
> > Maybe we're getting too loose with our pronouns here, I'm starting to
> > lose track of what "this" is referring to. I agree that there's no
> > reason for the ioctl, as proposed to be kvmgt specific. I would hope
> > that going through the kvm-vfio device to create that linkage would
> > eliminate that, but we'll need to see what Jike can come up with to
> > plumb between KVM and vfio. Vendor drivers can implement their own
> > ioctls, now that we pass them through the mdev layer, but someone needs
> > to call those ioctls. Ideally we want something programmatic to
> > trigger that, without requiring a user to pass an extra device
> > parameter. Additionally, if there is any hope of making use of the
> > device with userspace drivers other than QEMU, hard dependencies on KVM
> > should be avoided. Thanks,
> >
> > Alex
> >
>
> Thanks for the advice, so I cooked another patch for your comments.
> Basically a 'void *usrdata' is added to vfio_group, external users
> can set it (kvm) or get it (kvm or other users like kvmgt).
>
> BTW, in device-model, the open method will return failure to vfio-mdev
> in case that such kvm information is not available.
>
> --
> Thanks,
> Jike
>
>
>
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index d1d70e0..6b8d1d2 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -86,6 +86,7 @@ struct vfio_group {
> struct mutex unbound_lock;
> atomic_t opened;
> bool noiommu;
> + void *usrdata;
> };
>
> struct vfio_device {
> @@ -447,14 +448,13 @@ static struct vfio_group *vfio_group_try_get(struct
> vfio_group *group)
> }
>
> static
> -struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group)
> +struct vfio_group *__vfio_group_get_from_iommu(struct iommu_group
> *iommu_group)
> {
> struct vfio_group *group;
>
> mutex_lock(&vfio.group_lock);
> list_for_each_entry(group, &vfio.group_list, vfio_next) {
> if (group->iommu_group == iommu_group) {
> - vfio_group_get(group);
This is wrong; we can't take our reference after we release the lock.
> mutex_unlock(&vfio.group_lock);
> return group;
> }
> @@ -464,6 +464,17 @@ struct vfio_group *vfio_group_get_from_iommu(struct
> iommu_group *iommu_group)
> return NULL;
> }
>
> +static
> +struct vfio_group *vfio_group_get_from_iommu(struct iommu_group *iommu_group)
> +{
> + struct vfio_group *group = __vfio_group_get_from_iommu(iommu_group);
> + if (!group)
> + return NULL;
> +
> + vfio_group_get(group);
We have no basis to get a reference here. This function cannot exist
separately from the existing function above.
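
The locking rule behind the two comments above can be illustrated with a
minimal userspace sketch. It is a model only, not kernel code and not part of
the patch: struct group, group_add(), group_get_from_key(), group_put() and
group_get_usrdata_by_key() are hypothetical names, an int key stands in for
the iommu_group pointer, and a plain counter guarded by a pthread mutex stands
in for the kref. The point it shows is that the lookup and the reference
increment happen under the same lock, and that usrdata is read only while a
reference is held.

```c
/* Userspace model of "take the reference under the lock"; all names are
 * hypothetical stand-ins, not the vfio API. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct group {
    int key;             /* stands in for the iommu_group pointer */
    int refcount;        /* stands in for the kref; guarded by list_lock */
    void *usrdata;
    struct group *next;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct group *group_list;

/* Publish a group on the list. */
static void group_add(struct group *g)
{
    pthread_mutex_lock(&list_lock);
    g->next = group_list;
    group_list = g;
    pthread_mutex_unlock(&list_lock);
}

/* Find a group and take the reference before the lock is dropped, so the
 * group cannot go away between the lookup and the get. */
static struct group *group_get_from_key(int key)
{
    struct group *g;

    pthread_mutex_lock(&list_lock);
    for (g = group_list; g; g = g->next) {
        if (g->key == key) {
            g->refcount++;
            break;
        }
    }
    pthread_mutex_unlock(&list_lock);
    return g;            /* NULL when no group matched */
}

/* Drop a reference. A real implementation would unlink and free the group
 * when the count reaches zero; this model only counts. */
static void group_put(struct group *g)
{
    pthread_mutex_lock(&list_lock);
    assert(g->refcount > 0);
    g->refcount--;
    pthread_mutex_unlock(&list_lock);
}

/* Read usrdata only while holding a reference on the group. */
static void *group_get_usrdata_by_key(int key)
{
    struct group *g = group_get_from_key(key);
    void *data;

    if (!g)
        return NULL;
    data = g->usrdata;
    group_put(g);
    return data;
}
```

Splitting the lookup from the increment, as the patch does, would correspond
to moving the refcount++ after pthread_mutex_unlock(), which leaves a window
in which the group can be released by another thread.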
> + return group;
> +}
> +
> static struct vfio_group *vfio_group_get_from_minor(int minor)
> {
> struct vfio_group *group;
> @@ -1728,6 +1739,31 @@ long vfio_external_check_extension(struct vfio_group
> *group, unsigned long arg)
> }
> EXPORT_SYMBOL_GPL(vfio_external_check_extension);
>
> +void vfio_group_set_usrdata(struct vfio_group *group, void *data)
> +{
> + group->usrdata = data;
> +}
> +EXPORT_SYMBOL_GPL(vfio_group_set_usrdata);
> +
> +void *vfio_group_get_usrdata(struct vfio_group *group)
> +{
> + return group->usrdata;
> +}
> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata);
> +
> +void *vfio_group_get_usrdata_by_device(struct device *dev)
> +{
> + struct vfio_group *vfio_group;
> +
> + vfio_group = __vfio_group_get_from_iommu(dev->iommu_group);
We actually need to use iommu_group_get() here. Kirti adds a
vfio_group_get_from_dev() in v9 03/12 that does this properly.
> + if (!vfio_group)
> + return NULL;
> +
> + return vfio_group_get_usrdata(vfio_group);
This operates on a group for which we have no reference.
> +}
> +EXPORT_SYMBOL_GPL(vfio_group_get_usrdata_by_device);
> +
> +
> /**
> * Sub-module support
> */
> diff --git a/include/linux/vfio.h b/include/linux/vfio.h
> index 0ecae0b..712588f 100644
> --- a/include/linux/vfio.h
> +++ b/include/linux/vfio.h
> @@ -91,6 +91,10 @@ extern void vfio_unregister_iommu_driver(
> extern int vfio_external_user_iommu_id(struct vfio_group *group);
> extern long vfio_external_check_extension(struct vfio_group *group,
> unsigned long arg);
> +extern void vfio_group_set_usrdata(struct vfio_group *group, void *data);
> +extern void *vfio_group_get_usrdata(struct vfio_group *group);
> +extern void *vfio_group_get_usrdata_by_device(struct device *dev);
> +
>
> /*
> * Sub-module helpers
> diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c
> index 1dd087d..e00d401 100644
> --- a/virt/kvm/vfio.c
> +++ b/virt/kvm/vfio.c
> @@ -60,6 +60,20 @@ static void kvm_vfio_group_put_external_user(struct
> vfio_group *vfio_group)
> symbol_put(vfio_group_put_external_user);
> }
>
> +static void kvm_vfio_group_set_kvm(struct vfio_group *group, void *kvm)
> +{
> + void (*fn)(struct vfio_group *, void *);
> +
> + fn = symbol_get(vfio_group_set_usrdata);
> + if (!fn)
> + return;
> +
> + fn(group, kvm);
> + kvm_get_kvm(kvm);
> +
> + symbol_put(vfio_group_set_usrdata);
> +}
> +
> static bool kvm_vfio_group_is_coherent(struct vfio_group *vfio_group)
> {
> long (*fn)(struct vfio_group *, unsigned long);
> @@ -161,6 +175,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev,
> long attr, u64 arg)
>
> kvm_vfio_update_coherency(dev);
>
> + kvm_vfio_group_set_kvm(vfio_group, dev->kvm);
> +
> return 0;
>
> case KVM_DEV_VFIO_GROUP_DEL:
> @@ -200,6 +216,8 @@ static int kvm_vfio_set_group(struct kvm_device *dev,
> long attr, u64 arg)
>
> kvm_vfio_update_coherency(dev);
>
> + kvm_put_kvm(dev->kvm);
> +
> return ret;
> }
How does anyone getting the usrdata know what it contains? Does the
vendor driver compare it to a pointer it found elsewhere? How does the
vendor driver generate an error back to the user if this linkage is
necessary but unavailable? Thanks,
Alex
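
For illustration only, the two questions above can be made concrete with a
small userspace sketch. It is one hypothetical shape, not something proposed
in this thread: the group stores a typed struct kvm pointer instead of a
void *, so a getter knows what it holds, and the vendor driver's open path
fails when the linkage is required but absent, as Jike describes for the
device-model open method. The struct definitions, group_set_kvm() and
vendor_open() are all assumptions, and -ENODEV is an arbitrary choice of
error code.

```c
/* Userspace model of a typed KVM linkage with an error path; every name
 * here is a hypothetical stand-in, not the vfio or mdev API. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct kvm {
    int id;              /* placeholder so the model can create an instance */
};

struct group {
    struct kvm *kvm;     /* typed, so a getter knows what the pointer is */
};

/* Called by the KVM side when the group is attached to a VM. */
static void group_set_kvm(struct group *g, struct kvm *kvm)
{
    g->kvm = kvm;
}

/* Vendor driver open path: fail the open when the KVM linkage is needed
 * but was never established, so the user sees a device failure. */
static int vendor_open(struct group *g, struct kvm **out)
{
    if (!g->kvm)
        return -ENODEV;
    *out = g->kvm;
    return 0;
}
```

A driver that can work without KVM would simply not treat the NULL case as
an error, which keeps the dependency on KVM optional rather than hard.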