Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver
From: Alex Williamson
Subject: Re: [Qemu-devel] [PATCH 1/3] Mediated device Core driver
Date: Tue, 21 Jun 2016 15:30:54 -0600
On Mon, 20 Jun 2016 22:01:46 +0530
Kirti Wankhede <address@hidden> wrote:
> Design for Mediated Device Driver:
> Main purpose of this driver is to provide a common interface for mediated
> device management that can be used by different drivers of different
> devices.
>
> This module provides a generic interface to create the device, add it to
> mediated bus, add device to IOMMU group and then add it to vfio group.
>
> Below is the high Level block diagram, with Nvidia, Intel and IBM devices
> as example, since these are the devices which are going to actively use
> this module as of now.
>
> +---------------+
> | |
> | +-----------+ | mdev_register_driver() +--------------+
> | | | +<------------------------+ __init() |
> | | | | | |
> | | mdev | +------------------------>+ |<-> VFIO user
> | | bus | | probe()/remove() | vfio_mpci.ko | APIs
> | | driver | | | |
> | | | | +--------------+
> | | | | mdev_register_driver() +--------------+
> | | | +<------------------------+ __init() |
> | | | | | |
> | | | +------------------------>+ |<-> VFIO user
> | +-----------+ | probe()/remove() | vfio_mccw.ko | APIs
> | | | |
> | MDEV CORE | +--------------+
> | MODULE |
> | mdev.ko |
> | +-----------+ | mdev_register_device() +--------------+
> | | | +<------------------------+ |
> | | | | | nvidia.ko |<-> physical
> | | | +------------------------>+ | device
> | | | | callback +--------------+
> | | Physical | |
> | | device | | mdev_register_device() +--------------+
> | | interface | |<------------------------+ |
> | | | | | i915.ko |<-> physical
> | | | +------------------------>+ | device
> | | | | callback +--------------+
> | | | |
> | | | | mdev_register_device() +--------------+
> | | | +<------------------------+ |
> | | | | | ccw_device.ko|<-> physical
> | | | +------------------------>+ | device
> | | | | callback +--------------+
> | +-----------+ |
> +---------------+
>
> Core driver provides two types of registration interfaces:
> 1. Registration interface for mediated bus driver:
>
> /**
> * struct mdev_driver - Mediated device's driver
> * @name: driver name
> * @probe: called when new device created
> * @remove:called when device removed
> * @match: called when new device or driver is added for this bus.
> Return 1 if given device can be handled by given driver and
> zero otherwise.
> * @driver:device driver structure
> *
> **/
> struct mdev_driver {
> const char *name;
> int (*probe) (struct device *dev);
> void (*remove) (struct device *dev);
> int (*match)(struct device *dev);
> struct device_driver driver;
> };
>
> int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
> void mdev_unregister_driver(struct mdev_driver *drv);
>
> Mediated device's driver for mdev should use this interface to register
> with Core driver. With this, mediated devices driver for such devices is
> responsible to add mediated device to VFIO group.
>
> 2. Physical device driver interface
> This interface provides vendor driver the set APIs to manage physical
> device related work in their own driver. APIs are :
> - supported_config: provide supported configuration list by the vendor
> driver
> - create: to allocate basic resources in vendor driver for a mediated
> device.
> - destroy: to free resources in vendor driver when mediated device is
> destroyed.
> - start: to initiate mediated device initialization process from vendor
> driver when VM boots and before QEMU starts.
> - shutdown: to teardown mediated device resources during VM teardown.
> - read : read emulation callback.
> - write: write emulation callback.
> - set_irqs: send interrupt configuration information that QEMU sets.
> - get_region_info: to provide region size and its flags for the mediated
> device.
> - validate_map_request: to validate remap pfn request.
>
> This registration interface should be used by vendor drivers to register
> each physical device to mdev core driver.
>
> Signed-off-by: Kirti Wankhede <address@hidden>
> Signed-off-by: Neo Jia <address@hidden>
> Change-Id: I73a5084574270b14541c529461ea2f03c292d510
> ---
> drivers/vfio/Kconfig | 1 +
> drivers/vfio/Makefile | 1 +
> drivers/vfio/mdev/Kconfig | 11 +
> drivers/vfio/mdev/Makefile | 5 +
> drivers/vfio/mdev/mdev_core.c | 595 +++++++++++++++++++++++++++++++++++++++
> drivers/vfio/mdev/mdev_driver.c | 138 +++++++++
> drivers/vfio/mdev/mdev_private.h | 33 +++
> drivers/vfio/mdev/mdev_sysfs.c | 300 ++++++++++++++++++++
> include/linux/mdev.h | 232 +++++++++++++++
> 9 files changed, 1316 insertions(+)
> create mode 100644 drivers/vfio/mdev/Kconfig
> create mode 100644 drivers/vfio/mdev/Makefile
> create mode 100644 drivers/vfio/mdev/mdev_core.c
> create mode 100644 drivers/vfio/mdev/mdev_driver.c
> create mode 100644 drivers/vfio/mdev/mdev_private.h
> create mode 100644 drivers/vfio/mdev/mdev_sysfs.c
> create mode 100644 include/linux/mdev.h
>
> diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
> index da6e2ce77495..23eced02aaf6 100644
> --- a/drivers/vfio/Kconfig
> +++ b/drivers/vfio/Kconfig
> @@ -48,4 +48,5 @@ menuconfig VFIO_NOIOMMU
>
> source "drivers/vfio/pci/Kconfig"
> source "drivers/vfio/platform/Kconfig"
> +source "drivers/vfio/mdev/Kconfig"
> source "virt/lib/Kconfig"
> diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
> index 7b8a31f63fea..7c70753e54ab 100644
> --- a/drivers/vfio/Makefile
> +++ b/drivers/vfio/Makefile
> @@ -7,3 +7,4 @@ obj-$(CONFIG_VFIO_IOMMU_SPAPR_TCE) += vfio_iommu_spapr_tce.o
> obj-$(CONFIG_VFIO_SPAPR_EEH) += vfio_spapr_eeh.o
> obj-$(CONFIG_VFIO_PCI) += pci/
> obj-$(CONFIG_VFIO_PLATFORM) += platform/
> +obj-$(CONFIG_MDEV) += mdev/
> diff --git a/drivers/vfio/mdev/Kconfig b/drivers/vfio/mdev/Kconfig
> new file mode 100644
> index 000000000000..951e2bb06a3f
> --- /dev/null
> +++ b/drivers/vfio/mdev/Kconfig
> @@ -0,0 +1,11 @@
> +
> +config MDEV
> + tristate "Mediated device driver framework"
> + depends on VFIO
> + default n
> + help
> + MDEV provides a framework to virtualize device without SR-IOV cap
> + See Documentation/mdev.txt for more details.
The documentation file this points to still doesn't exist. Perhaps
Documentation/mdev.txt would be a more appropriate place than the
commit log for some of the information above.
Every time I review this I'm struggling to figure out why this isn't
VFIO_MDEV since it's really tied to vfio and difficult to evaluate it
as some sort of standalone mediated device interface. I don't know
the answer, but it always strikes me as a discontinuity.
> +
> + If you don't know what to do here, say N.
> +
> diff --git a/drivers/vfio/mdev/Makefile b/drivers/vfio/mdev/Makefile
> new file mode 100644
> index 000000000000..2c6d11f7bc24
> --- /dev/null
> +++ b/drivers/vfio/mdev/Makefile
> @@ -0,0 +1,5 @@
> +
> +mdev-y := mdev_core.o mdev_sysfs.o mdev_driver.o
> +
> +obj-$(CONFIG_MDEV) += mdev.o
> +
> diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c
> new file mode 100644
> index 000000000000..3c45ed2ae1e9
> --- /dev/null
> +++ b/drivers/vfio/mdev/mdev_core.c
> @@ -0,0 +1,595 @@
> +/*
> + * Mediated device Core Driver
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia <address@hidden>
> + * Kirti Wankhede <address@hidden>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/init.h>
> +#include <linux/module.h>
> +#include <linux/device.h>
> +#include <linux/kernel.h>
> +#include <linux/fs.h>
> +#include <linux/slab.h>
> +#include <linux/cdev.h>
> +#include <linux/sched.h>
> +#include <linux/uuid.h>
> +#include <linux/vfio.h>
> +#include <linux/iommu.h>
> +#include <linux/sysfs.h>
> +#include <linux/mdev.h>
> +
> +#include "mdev_private.h"
> +
> +#define DRIVER_VERSION "0.1"
> +#define DRIVER_AUTHOR "NVIDIA Corporation"
> +#define DRIVER_DESC "Mediated device Core Driver"
> +
> +#define MDEV_CLASS_NAME "mdev"
> +
> +static struct devices_list {
> + struct list_head dev_list;
> + struct mutex list_lock;
> +} parent_devices;
> +
I imagine this is following the example of struct vfio in vfio.c but
for this usage the following seems much easier:
static LIST_HEAD(parent_list);
static DEFINE_MUTEX(parent_list_lock);
Then you can also remove the initialization from mdev_init().
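To illustrate the static-initialization point outside the kernel, here is a minimal userspace sketch. It assumes a pthread mutex as a stand-in for the kernel mutex and a hand-rolled struct list_node in place of struct list_head; both names and the helper are hypothetical, not part of the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (hypothetical). */
struct list_node {
	struct list_node *next, *prev;
};

/* Analogue of LIST_HEAD(): an empty list points at itself. */
#define LIST_NODE_INIT(name) { &(name), &(name) }

/* Both objects are valid at load time, so no init function is required. */
static struct list_node parent_list = LIST_NODE_INIT(parent_list);
static pthread_mutex_t parent_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Example user: check for emptiness under the lock. */
static int parent_list_is_empty(void)
{
	int empty;

	pthread_mutex_lock(&parent_list_lock);
	empty = (parent_list.next == &parent_list);
	pthread_mutex_unlock(&parent_list_lock);
	return empty;
}
```

Because the list and lock are initialized by the compiler, the module init path has one less thing to get wrong.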
> +static int mdev_add_attribute_group(struct device *dev,
> + const struct attribute_group **groups)
> +{
> + return sysfs_create_groups(&dev->kobj, groups);
> +}
> +
> +static void mdev_remove_attribute_group(struct device *dev,
> + const struct attribute_group **groups)
> +{
> + sysfs_remove_groups(&dev->kobj, groups);
> +}
> +
> +static struct mdev_device *find_mdev_device(struct parent_device *parent,
> + uuid_le uuid, int instance)
> +{
> + struct mdev_device *mdev = NULL, *p;
> +
> + list_for_each_entry(p, &parent->mdev_list, next) {
> + if ((uuid_le_cmp(p->uuid, uuid) == 0) &&
> + (p->instance == instance)) {
> + mdev = p;
Locking here is still broken: the callers are create and destroy, which
can still race with each other and with themselves.
> + break;
> + }
> + }
> + return mdev;
> +}
> +
> +/* Should be called holding parent_devices.list_lock */
> +static struct parent_device *find_parent_device(struct device *dev)
> +{
> + struct parent_device *parent = NULL, *p;
> +
> + WARN_ON(!mutex_is_locked(&parent_devices.list_lock));
> + list_for_each_entry(p, &parent_devices.dev_list, next) {
> + if (p->dev == dev) {
> + parent = p;
> + break;
> + }
> + }
> + return parent;
> +}
> +
> +static void mdev_release_parent(struct kref *kref)
> +{
> + struct parent_device *parent = container_of(kref, struct parent_device,
> + ref);
> + kfree(parent);
> +}
> +
> +static
> +inline struct parent_device *mdev_get_parent(struct parent_device *parent)
> +{
> + if (parent)
> + kref_get(&parent->ref);
> +
> + return parent;
> +}
> +
> +static inline void mdev_put_parent(struct parent_device *parent)
> +{
> + if (parent)
> + kref_put(&parent->ref, mdev_release_parent);
> +}
> +
> +static struct parent_device *mdev_get_parent_by_dev(struct device *dev)
> +{
> + struct parent_device *parent = NULL, *p;
> +
> + mutex_lock(&parent_devices.list_lock);
> + list_for_each_entry(p, &parent_devices.dev_list, next) {
> + if (p->dev == dev) {
> + parent = mdev_get_parent(p);
> + break;
> + }
> + }
> + mutex_unlock(&parent_devices.list_lock);
> + return parent;
> +}
> +
> +static int mdev_device_create_ops(struct mdev_device *mdev, char *mdev_params)
> +{
> + struct parent_device *parent = mdev->parent;
> + int ret;
> +
> + mutex_lock(&parent->ops_lock);
> + if (parent->ops->create) {
How would a parent_device without ops->create or ops->destroy be useful?
Perhaps mdev_register_device() should enforce required ops. mdev.h
should at least document which ops are optional if they really are
optional.
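A minimal sketch of enforcing required ops at registration time, written as plain userspace C. The struct and function names are hypothetical and the ops table is cut down to three callbacks; which ops are truly required is an assumption here, not something the patch states.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Cut-down, hypothetical version of the parent ops table. */
struct sketch_parent_ops {
	int (*create)(const char *uuid, int instance);   /* assumed required */
	int (*destroy)(const char *uuid, int instance);  /* assumed required */
	int (*start)(const char *uuid);                  /* assumed optional */
};

/* Reject an ops table that is missing a required callback. */
static int sketch_register_parent(const struct sketch_parent_ops *ops)
{
	if (!ops || !ops->create || !ops->destroy)
		return -EINVAL;
	/* ... the real code would go on to add the parent to the list ... */
	return 0;
}

/* Trivial callbacks so the check can be exercised. */
static int dummy_create(const char *uuid, int instance)
{
	(void)uuid;
	(void)instance;
	return 0;
}

static int dummy_destroy(const char *uuid, int instance)
{
	(void)uuid;
	(void)instance;
	return 0;
}
```

With a check like this in the registration path, the per-call "if (parent->ops->create)" tests become unnecessary for the required ops.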
> + ret = parent->ops->create(mdev->dev.parent, mdev->uuid,
> + mdev->instance, mdev_params);
> + if (ret)
> + goto create_ops_err;
> + }
> +
> + ret = mdev_add_attribute_group(&mdev->dev,
> + parent->ops->mdev_attr_groups);
An error here seems to put us in a bad place: the device is created but
the attributes are broken. Is it the caller's responsibility to
destroy? Seems like we need a cleanup if this fails.
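One possible shape for that cleanup, as a userspace sketch. The vendor callbacks and the attribute step are replaced by hypothetical stubs with counters so the unwind is observable; this is an illustration of the error path, not the patch's code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; the counters only make the flow observable. */
static int vendor_devices;    /* devices the "vendor driver" currently holds */
static int fail_attr_groups;  /* set to 1 to simulate a sysfs failure */

static int vendor_create(void)   { vendor_devices++; return 0; }
static void vendor_destroy(void) { vendor_devices--; }
static int add_attr_groups(void) { return fail_attr_groups ? -ENOMEM : 0; }

/* Create the device, then undo the create if attributes cannot be added. */
static int sketch_create_ops(void)
{
	int ret;

	ret = vendor_create();
	if (ret)
		return ret;

	ret = add_attr_groups();
	if (ret)
		vendor_destroy();  /* unwind so nothing half-built is left */

	return ret;
}
```

The caller then sees either a fully set up device or an error with no vendor state left behind.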
> +create_ops_err:
> + mutex_unlock(&parent->ops_lock);
It seems like ops_lock isn't used so much as a lock as a serialization
mechanism. Why? Where is this serialization per parent device
documented?
> + return ret;
> +}
> +
> +static int mdev_device_destroy_ops(struct mdev_device *mdev, bool force)
> +{
> + struct parent_device *parent = mdev->parent;
> + int ret = 0;
> +
> + /*
> + * If vendor driver doesn't return success that means vendor
> + * driver doesn't support hot-unplug
> + */
> + mutex_lock(&parent->ops_lock);
> + if (parent->ops->destroy) {
> + ret = parent->ops->destroy(parent->dev, mdev->uuid,
> + mdev->instance);
> + if (ret && !force) {
It seems this is not so much a 'force' as an 'ignore errors'; we never
actually force the mdev driver to destroy the device... which makes me
wonder if there are leaks there.
> + ret = -EBUSY;
> + goto destroy_ops_err;
> + }
> + }
> + mdev_remove_attribute_group(&mdev->dev,
> + parent->ops->mdev_attr_groups);
> +destroy_ops_err:
> + mutex_unlock(&parent->ops_lock);
> +
> + return ret;
> +}
> +
> +static void mdev_release_device(struct kref *kref)
> +{
> + struct mdev_device *mdev = container_of(kref, struct mdev_device, ref);
> + struct parent_device *parent = mdev->parent;
> +
> + device_unregister(&mdev->dev);
> + wake_up(&parent->release_done);
> + mdev_put_parent(parent);
> +}
> +
> +struct mdev_device *mdev_get_device(struct mdev_device *mdev)
> +{
> + if (mdev)
> + kref_get(&mdev->ref);
> +
> + return mdev;
> +}
> +EXPORT_SYMBOL(mdev_get_device);
> +
> +void mdev_put_device(struct mdev_device *mdev)
> +{
> + if (mdev)
> + kref_put(&mdev->ref, mdev_release_device);
> +}
> +EXPORT_SYMBOL(mdev_put_device);
> +
> +/*
> + * Find first mediated device from given uuid and increment refcount of
> + * mediated device. Caller should call mdev_put_device() when the use of
> + * mdev_device is done.
> + */
> +static struct mdev_device *mdev_get_first_device_by_uuid(uuid_le uuid)
> +{
> + struct mdev_device *mdev = NULL, *p;
> + struct parent_device *parent;
> +
> + mutex_lock(&parent_devices.list_lock);
> + list_for_each_entry(parent, &parent_devices.dev_list, next) {
> + mutex_lock(&parent->mdev_list_lock);
This lock ordering is something we'll need to keep in mind.
> + list_for_each_entry(p, &parent->mdev_list, next) {
> + if (uuid_le_cmp(p->uuid, uuid) == 0) {
> + mdev = mdev_get_device(p);
> + break;
> + }
> + }
> + mutex_unlock(&parent->mdev_list_lock);
> +
> + if (mdev)
> + break;
> + }
> + mutex_unlock(&parent_devices.list_lock);
> + return mdev;
> +}
> +
> +/*
> + * Find mediated device from given iommu_group and increment refcount of
> + * mediated device. Caller should call mdev_put_device() when the use of
> + * mdev_device is done.
> + */
> +struct mdev_device *mdev_get_device_by_group(struct iommu_group *group)
> +{
> + struct mdev_device *mdev = NULL, *p;
> + struct parent_device *parent;
> +
> + mutex_lock(&parent_devices.list_lock);
> + list_for_each_entry(parent, &parent_devices.dev_list, next) {
> + mutex_lock(&parent->mdev_list_lock);
> + list_for_each_entry(p, &parent->mdev_list, next) {
> + if (!p->group)
> + continue;
> +
> + if (iommu_group_id(p->group) == iommu_group_id(group)) {
> + mdev = mdev_get_device(p);
> + break;
> + }
> + }
> + mutex_unlock(&parent->mdev_list_lock);
> +
> + if (mdev)
> + break;
> + }
> + mutex_unlock(&parent_devices.list_lock);
> + return mdev;
> +}
> +EXPORT_SYMBOL(mdev_get_device_by_group);
> +
> +/*
> + * mdev_register_device : Register a device
> + * @dev: device structure representing parent device.
> + * @ops: Parent device operation structure to be registered.
> + *
> + * Add device to list of registered parent devices.
> + * Returns a negative value on error, otherwise 0.
> + */
> +int mdev_register_device(struct device *dev, const struct parent_ops *ops)
> +{
> + int ret = 0;
> + struct parent_device *parent;
> +
> + if (!dev || !ops)
> + return -EINVAL;
> +
> + mutex_lock(&parent_devices.list_lock);
> +
> + /* Check for duplicate */
> + parent = find_parent_device(dev);
> + if (parent) {
> + ret = -EEXIST;
> + goto add_dev_err;
> + }
> +
> + parent = kzalloc(sizeof(*parent), GFP_KERNEL);
> + if (!parent) {
> + ret = -ENOMEM;
> + goto add_dev_err;
> + }
> +
> + kref_init(&parent->ref);
> + list_add(&parent->next, &parent_devices.dev_list);
> + mutex_unlock(&parent_devices.list_lock);
find_parent_device() matches based on parent->dev, but we're dropping
the list lock before we set up parent->dev. There are other ways to
shorten the time this lock is held, but releasing it with an incomplete
entry in the list is not the way I would choose.
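A userspace sketch of one such alternative: allocate and fill in the entry before taking the lock, then do only the duplicate check and the insertion while holding it. A pthread mutex and a singly linked list stand in for the kernel primitives, and all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>

struct sketch_parent {
	const void *dev;             /* key used for lookups */
	const void *ops;
	struct sketch_parent *next;
};

static struct sketch_parent *parents;
static pthread_mutex_t parents_lock = PTHREAD_MUTEX_INITIALIZER;

static int sketch_register(const void *dev, const void *ops)
{
	struct sketch_parent *parent, *p;

	if (!dev || !ops)
		return -EINVAL;

	/* Allocate and fill in every field before taking the lock. */
	parent = calloc(1, sizeof(*parent));
	if (!parent)
		return -ENOMEM;
	parent->dev = dev;
	parent->ops = ops;

	pthread_mutex_lock(&parents_lock);
	for (p = parents; p; p = p->next) {
		if (p->dev == dev) {
			pthread_mutex_unlock(&parents_lock);
			free(parent);
			return -EEXIST;
		}
	}
	/* The entry is complete by the time other threads can see it. */
	parent->next = parents;
	parents = parent;
	pthread_mutex_unlock(&parents_lock);
	return 0;
}
```

The lock is held only for the list walk and the insert, and a concurrent lookup can never match against an entry whose dev field is still unset.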
> +
> + parent->dev = dev;
> + parent->ops = ops;
> + mutex_init(&parent->ops_lock);
> + mutex_init(&parent->mdev_list_lock);
> + INIT_LIST_HEAD(&parent->mdev_list);
> + init_waitqueue_head(&parent->release_done);
> +
> + ret = mdev_create_sysfs_files(dev);
> + if (ret)
> + goto add_sysfs_error;
> +
> + ret = mdev_add_attribute_group(dev, ops->dev_attr_groups);
> + if (ret)
> + goto add_group_error;
> +
> + dev_info(dev, "MDEV: Registered\n");
> + return 0;
> +
> +add_group_error:
> + mdev_remove_sysfs_files(dev);
> +add_sysfs_error:
> + mutex_lock(&parent_devices.list_lock);
> + list_del(&parent->next);
> + mutex_unlock(&parent_devices.list_lock);
> + mdev_put_parent(parent);
> + return ret;
> +
> +add_dev_err:
> + mutex_unlock(&parent_devices.list_lock);
> + return ret;
> +}
> +EXPORT_SYMBOL(mdev_register_device);
> +
> +/*
> + * mdev_unregister_device : Unregister a parent device
> + * @dev: device structure representing parent device.
> + *
> + * Remove device from list of registered parent devices. Give a chance to free
> + * existing mediated devices for given device.
> + */
> +
> +void mdev_unregister_device(struct device *dev)
> +{
> + struct parent_device *parent;
> + struct mdev_device *mdev, *n;
> + int ret;
> +
> + mutex_lock(&parent_devices.list_lock);
> + parent = find_parent_device(dev);
> +
> + if (!parent) {
> + mutex_unlock(&parent_devices.list_lock);
> + return;
> + }
> + dev_info(dev, "MDEV: Unregistering\n");
> +
> + /*
> + * Remove parent from the list and remove create and destroy sysfs
> + * files so that no new mediated device could be created for this parent
> + */
> + list_del(&parent->next);
> + mdev_remove_sysfs_files(dev);
> + mutex_unlock(&parent_devices.list_lock);
> +
> + mutex_lock(&parent->ops_lock);
> + mdev_remove_attribute_group(dev,
> + parent->ops->dev_attr_groups);
> + mutex_unlock(&parent->ops_lock);
> +
> + mutex_lock(&parent->mdev_list_lock);
> + list_for_each_entry_safe(mdev, n, &parent->mdev_list, next) {
> + mdev_device_destroy_ops(mdev, true);
> + list_del(&mdev->next);
> + mdev_put_device(mdev);
> + }
> + mutex_unlock(&parent->mdev_list_lock);
> +
> + do {
> + ret = wait_event_interruptible_timeout(parent->release_done,
> + list_empty(&parent->mdev_list), HZ * 10);
But we do a list_del for each mdev in mdev_list above, how could the
list not be empty here? I think you're trying to wait for all the mdev
devices to be released, but I don't think this does that. Isn't the
list empty regardless?
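The difference between the two conditions can be modelled in a few lines of userspace C. This is a bookkeeping-only sketch with hypothetical names; the "live" counter is an assumed alternative to wait on, not something the patch provides.

```c
#include <assert.h>

/* Hypothetical model: per-parent bookkeeping only, no real devices. */
struct sketch_parent {
	int listed;  /* entries still on mdev_list */
	int live;    /* devices whose final reference has not been dropped */
};

static void sketch_create(struct sketch_parent *p)
{
	p->listed++;
	p->live++;
}

/* What the unregister loop does: unlink every entry from the list. */
static void sketch_unlink_all(struct sketch_parent *p)
{
	p->listed = 0;
}

/* Called from the release path when a device's last reference goes away. */
static void sketch_release(struct sketch_parent *p)
{
	p->live--;
}

/* The condition the patch waits on: true straight after unlinking. */
static int sketch_list_empty(const struct sketch_parent *p)
{
	return p->listed == 0;
}

/* A condition that tracks what matters: no device is still in use. */
static int sketch_all_released(const struct sketch_parent *p)
{
	return p->live == 0;
}
```

After sketch_unlink_all() the list-empty test passes immediately, even while a device is still referenced elsewhere; only the release-driven counter reflects when it is safe to drop the parent.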
> + if (ret == -ERESTARTSYS) {
> + dev_warn(dev, "Mediated devices are in use, task"
> + " \"%s\" (%d) "
> + "blocked until all are released",
> + current->comm, task_pid_nr(current));
> + }
> + } while (ret <= 0);
> +
> + mdev_put_parent(parent);
> +}
> +EXPORT_SYMBOL(mdev_unregister_device);
> +
> +/*
> + * Functions required for mdev-sysfs
> + */
> +static void mdev_device_release(struct device *dev)
> +{
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + dev_dbg(&mdev->dev, "MDEV: destroying\n");
> + kfree(mdev);
> +}
> +
> +int mdev_device_create(struct device *dev, uuid_le uuid, uint32_t instance,
> + char *mdev_params)
> +{
> + int ret;
> + struct mdev_device *mdev;
> + struct parent_device *parent;
> +
> + parent = mdev_get_parent_by_dev(dev);
> + if (!parent)
> + return -EINVAL;
> +
> + /* Check for duplicate */
> + mdev = find_mdev_device(parent, uuid, instance);
But this doesn't actually prevent duplicates because we're not
holding any lock to guarantee that another racing process doesn't
create the same {uuid,instance} between where we check and the below
list_add.
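A userspace sketch of closing that window by doing the duplicate check and the list insertion under one lock. A pthread mutex stands in for mdev_list_lock, the UUID is a plain string, and the device registration and vendor create steps that sit between the check and the list_add in the patch are left out; all names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct sketch_mdev {
	char uuid[37];
	int instance;
	struct sketch_mdev *next;
};

static struct sketch_mdev *mdev_list;
static pthread_mutex_t mdev_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Caller must hold mdev_list_lock. */
static struct sketch_mdev *sketch_find(const char *uuid, int instance)
{
	struct sketch_mdev *p;

	for (p = mdev_list; p; p = p->next)
		if (p->instance == instance && strcmp(p->uuid, uuid) == 0)
			return p;
	return NULL;
}

static int sketch_create(const char *uuid, int instance)
{
	struct sketch_mdev *mdev;

	if (strlen(uuid) >= sizeof(mdev->uuid))
		return -EINVAL;

	mdev = calloc(1, sizeof(*mdev));
	if (!mdev)
		return -ENOMEM;
	strcpy(mdev->uuid, uuid);
	mdev->instance = instance;

	/* Duplicate check and insertion happen under the same lock. */
	pthread_mutex_lock(&mdev_list_lock);
	if (sketch_find(uuid, instance)) {
		pthread_mutex_unlock(&mdev_list_lock);
		free(mdev);
		return -EEXIST;
	}
	mdev->next = mdev_list;
	mdev_list = mdev;
	pthread_mutex_unlock(&mdev_list_lock);
	return 0;
}
```

In the real driver the steps between check and insert may sleep, so the same guarantee would need either a lock that can be held across them or a placeholder entry added at check time.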
> + if (mdev) {
> + ret = -EEXIST;
> + goto create_err;
> + }
> +
> + mdev = kzalloc(sizeof(*mdev), GFP_KERNEL);
> + if (!mdev) {
> + ret = -ENOMEM;
> + goto create_err;
> + }
> +
> + memcpy(&mdev->uuid, &uuid, sizeof(uuid_le));
> + mdev->instance = instance;
> + mdev->parent = parent;
> + mutex_init(&mdev->ops_lock);
> + kref_init(&mdev->ref);
> +
> + mdev->dev.parent = dev;
> + mdev->dev.bus = &mdev_bus_type;
> + mdev->dev.release = mdev_device_release;
> + dev_set_name(&mdev->dev, "%pUb-%d", uuid.b, instance);
> +
> + ret = device_register(&mdev->dev);
> + if (ret) {
> + put_device(&mdev->dev);
> + goto create_err;
> + }
> +
> + ret = mdev_device_create_ops(mdev, mdev_params);
> + if (ret)
> + goto create_failed;
> +
> + mutex_lock(&parent->mdev_list_lock);
> + list_add(&mdev->next, &parent->mdev_list);
> + mutex_unlock(&parent->mdev_list_lock);
> +
> + dev_dbg(&mdev->dev, "MDEV: created\n");
> +
> + return ret;
> +
> +create_failed:
> + device_unregister(&mdev->dev);
> +
> +create_err:
> + mdev_put_parent(parent);
> + return ret;
> +}
> +
> +int mdev_device_destroy(struct device *dev, uuid_le uuid, uint32_t instance)
> +{
> + struct mdev_device *mdev;
> + struct parent_device *parent;
> + int ret;
> +
> + parent = mdev_get_parent_by_dev(dev);
> + if (!parent) {
> + ret = -EINVAL;
> + goto destroy_err;
> + }
> +
> + mdev = find_mdev_device(parent, uuid, instance);
> + if (!mdev) {
> + ret = -EINVAL;
> + goto destroy_err;
> + }
Likewise, without locking multiple callers can get here with the same
mdev.
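One way to make sure only a single caller proceeds is to look up and unlink the entry in the same critical section. The following userspace sketch assumes a pthread mutex in place of mdev_list_lock and an integer id in place of {uuid,instance}; the names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sketch_mdev {
	int id;                      /* stands in for {uuid, instance} */
	struct sketch_mdev *next;
};

static struct sketch_mdev *mdev_list;
static pthread_mutex_t mdev_list_lock = PTHREAD_MUTEX_INITIALIZER;

static void sketch_add(struct sketch_mdev *mdev)
{
	pthread_mutex_lock(&mdev_list_lock);
	mdev->next = mdev_list;
	mdev_list = mdev;
	pthread_mutex_unlock(&mdev_list_lock);
}

/*
 * Find the entry and unlink it in one critical section. Only one caller
 * can get a non-NULL result for a given id; everyone else sees NULL.
 */
static struct sketch_mdev *sketch_take(int id)
{
	struct sketch_mdev **pp, *found = NULL;

	pthread_mutex_lock(&mdev_list_lock);
	for (pp = &mdev_list; *pp; pp = &(*pp)->next) {
		if ((*pp)->id == id) {
			found = *pp;
			*pp = found->next;
			found->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&mdev_list_lock);
	return found;
}
```

If the vendor destroy callback can still fail afterwards, as the -EBUSY path in the patch suggests, the caller would have to put the entry back on the list, which is a cost of this approach.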
> +
> + ret = mdev_device_destroy_ops(mdev, false);
> + if (ret)
> + goto destroy_err;
> +
> + mdev_put_parent(parent);
> +
> + mutex_lock(&parent->mdev_list_lock);
> + list_del(&mdev->next);
> + mutex_unlock(&parent->mdev_list_lock);
> +
> + mdev_put_device(mdev);
> + return ret;
> +
> +destroy_err:
> + mdev_put_parent(parent);
> + return ret;
> +}
> +
> +void mdev_device_supported_config(struct device *dev, char *str)
> +{
> + struct parent_device *parent;
> +
> + parent = mdev_get_parent_by_dev(dev);
> +
> + if (parent) {
> + mutex_lock(&parent->ops_lock);
> + if (parent->ops->supported_config)
> + parent->ops->supported_config(parent->dev, str);
> + mutex_unlock(&parent->ops_lock);
> + mdev_put_parent(parent);
> + }
> +}
> +
> +int mdev_device_start(uuid_le uuid)
> +{
> + int ret = 0;
> + struct mdev_device *mdev;
> + struct parent_device *parent;
> +
> + mdev = mdev_get_first_device_by_uuid(uuid);
> + if (!mdev)
> + return -EINVAL;
> +
> + parent = mdev->parent;
> +
> + mutex_lock(&parent->ops_lock);
> + if (parent->ops->start)
> + ret = parent->ops->start(mdev->uuid);
> + mutex_unlock(&parent->ops_lock);
> +
> + if (ret)
> + pr_err("mdev_start failed %d\n", ret);
> + else
> + kobject_uevent(&mdev->dev.kobj, KOBJ_ONLINE);
> +
> + mdev_put_device(mdev);
> +
> + return ret;
> +}
> +
> +int mdev_device_shutdown(uuid_le uuid)
> +{
> + int ret = 0;
> + struct mdev_device *mdev;
> + struct parent_device *parent;
> +
> + mdev = mdev_get_first_device_by_uuid(uuid);
> + if (!mdev)
> + return -EINVAL;
> +
> + parent = mdev->parent;
> +
> + mutex_lock(&parent->ops_lock);
> + if (parent->ops->shutdown)
> + ret = parent->ops->shutdown(mdev->uuid);
> + mutex_unlock(&parent->ops_lock);
> +
> + if (ret)
> + pr_err("mdev_shutdown failed %d\n", ret);
> + else
> + kobject_uevent(&mdev->dev.kobj, KOBJ_OFFLINE);
> +
> + mdev_put_device(mdev);
> + return ret;
> +}
> +
> +static struct class mdev_class = {
> + .name = MDEV_CLASS_NAME,
> + .owner = THIS_MODULE,
> + .class_attrs = mdev_class_attrs,
> +};
> +
> +static int __init mdev_init(void)
> +{
> + int ret;
> +
> + mutex_init(&parent_devices.list_lock);
> + INIT_LIST_HEAD(&parent_devices.dev_list);
> +
> + ret = class_register(&mdev_class);
> + if (ret) {
> + pr_err("Failed to register mdev class\n");
> + return ret;
> + }
> +
> + ret = mdev_bus_register();
> + if (ret) {
> + pr_err("Failed to register mdev bus\n");
> + class_unregister(&mdev_class);
> + return ret;
> + }
> +
> + return ret;
> +}
> +
> +static void __exit mdev_exit(void)
> +{
> + mdev_bus_unregister();
> + class_unregister(&mdev_class);
> +}
> +
> +module_init(mdev_init)
> +module_exit(mdev_exit)
> +
> +MODULE_VERSION(DRIVER_VERSION);
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR(DRIVER_AUTHOR);
> +MODULE_DESCRIPTION(DRIVER_DESC);
> diff --git a/drivers/vfio/mdev/mdev_driver.c b/drivers/vfio/mdev/mdev_driver.c
> new file mode 100644
> index 000000000000..f1aed541111d
> --- /dev/null
> +++ b/drivers/vfio/mdev/mdev_driver.c
> @@ -0,0 +1,138 @@
> +/*
> + * MDEV driver
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia <address@hidden>
> + * Kirti Wankhede <address@hidden>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/device.h>
> +#include <linux/iommu.h>
> +#include <linux/mdev.h>
> +
> +#include "mdev_private.h"
> +
> +static int mdev_attach_iommu(struct mdev_device *mdev)
> +{
> + int ret;
> + struct iommu_group *group;
> +
> + group = iommu_group_alloc();
> + if (IS_ERR(group)) {
> + dev_err(&mdev->dev, "MDEV: failed to allocate group!\n");
> + return PTR_ERR(group);
> + }
> +
> + ret = iommu_group_add_device(group, &mdev->dev);
> + if (ret) {
> + dev_err(&mdev->dev, "MDEV: failed to add dev to group!\n");
> + goto attach_fail;
> + }
> +
> + mdev->group = group;
> +
> + dev_info(&mdev->dev, "MDEV: group_id = %d\n",
> + iommu_group_id(group));
> +attach_fail:
> + iommu_group_put(group);
> + return ret;
> +}
> +
> +static void mdev_detach_iommu(struct mdev_device *mdev)
> +{
> + iommu_group_remove_device(&mdev->dev);
mdev->group = NULL; seems prudent
> + dev_info(&mdev->dev, "MDEV: detaching iommu\n");
> +}
> +
> +static int mdev_probe(struct device *dev)
> +{
> + struct mdev_driver *drv = to_mdev_driver(dev->driver);
> + struct mdev_device *mdev = to_mdev_device(dev);
> + int ret;
> +
> + ret = mdev_attach_iommu(mdev);
> + if (ret) {
> + dev_err(dev, "Failed to attach IOMMU\n");
> + return ret;
> + }
> +
> + if (drv && drv->probe)
> + ret = drv->probe(dev);
> +
> + return ret;
> +}
> +
> +static int mdev_remove(struct device *dev)
> +{
> + struct mdev_driver *drv = to_mdev_driver(dev->driver);
> + struct mdev_device *mdev = to_mdev_device(dev);
> +
> + if (drv && drv->remove)
> + drv->remove(dev);
> +
> + mdev_detach_iommu(mdev);
> +
> + return 0;
> +}
> +
> +static int mdev_match(struct device *dev, struct device_driver *drv)
> +{
> + struct mdev_driver *mdrv = to_mdev_driver(drv);
nit: the mdev_driver pointer is named drv in the functions above but mdrv here
> +
> + if (mdrv && mdrv->match)
> + return mdrv->match(dev);
> +
> + return 0;
> +}
> +
> +struct bus_type mdev_bus_type = {
> + .name = "mdev",
> + .match = mdev_match,
> + .probe = mdev_probe,
> + .remove = mdev_remove,
> +};
> +EXPORT_SYMBOL_GPL(mdev_bus_type);
> +
> +/*
> + * mdev_register_driver - register a new MDEV driver
> + * @drv: the driver to register
> + * @owner: module owner of driver to be registered
> + *
> + * Returns a negative value on error, otherwise 0.
> + */
> +int mdev_register_driver(struct mdev_driver *drv, struct module *owner)
> +{
> + /* initialize common driver fields */
> + drv->driver.name = drv->name;
> + drv->driver.bus = &mdev_bus_type;
> + drv->driver.owner = owner;
> +
> + /* register with core */
> + return driver_register(&drv->driver);
> +}
> +EXPORT_SYMBOL(mdev_register_driver);
> +
> +/*
> + * mdev_unregister_driver - unregister MDEV driver
> + * @drv: the driver to unregister
> + *
> + */
> +void mdev_unregister_driver(struct mdev_driver *drv)
> +{
> + driver_unregister(&drv->driver);
> +}
> +EXPORT_SYMBOL(mdev_unregister_driver);
> +
> +int mdev_bus_register(void)
> +{
> + return bus_register(&mdev_bus_type);
> +}
> +
> +void mdev_bus_unregister(void)
> +{
> + bus_unregister(&mdev_bus_type);
> +}
> diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h
> new file mode 100644
> index 000000000000..991d7f796169
> --- /dev/null
> +++ b/drivers/vfio/mdev/mdev_private.h
> @@ -0,0 +1,33 @@
> +/*
> + * Mediated device internal definitions
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia <address@hidden>
> + * Kirti Wankhede <address@hidden>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef MDEV_PRIVATE_H
> +#define MDEV_PRIVATE_H
> +
> +int mdev_bus_register(void);
> +void mdev_bus_unregister(void);
> +
> +/* Function prototypes for mdev_sysfs */
> +
> +extern struct class_attribute mdev_class_attrs[];
> +
> +int mdev_create_sysfs_files(struct device *dev);
> +void mdev_remove_sysfs_files(struct device *dev);
> +
> +int mdev_device_create(struct device *dev, uuid_le uuid, uint32_t instance,
> + char *mdev_params);
> +int mdev_device_destroy(struct device *dev, uuid_le uuid, uint32_t instance);
> +void mdev_device_supported_config(struct device *dev, char *str);
> +int mdev_device_start(uuid_le uuid);
> +int mdev_device_shutdown(uuid_le uuid);
nit, stop is to start as shutdown is to startup. IOW, should this be
mdev_device_stop()?
> +
> +#endif /* MDEV_PRIVATE_H */
> diff --git a/drivers/vfio/mdev/mdev_sysfs.c b/drivers/vfio/mdev/mdev_sysfs.c
> new file mode 100644
> index 000000000000..48b66e40009e
> --- /dev/null
> +++ b/drivers/vfio/mdev/mdev_sysfs.c
> @@ -0,0 +1,300 @@
> +/*
> + * File attributes for Mediated devices
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia <address@hidden>
> + * Kirti Wankhede <address@hidden>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/sysfs.h>
> +#include <linux/ctype.h>
> +#include <linux/device.h>
> +#include <linux/slab.h>
> +#include <linux/uuid.h>
> +#include <linux/mdev.h>
> +
> +#include "mdev_private.h"
> +
> +/* Prototypes */
> +static ssize_t mdev_supported_types_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf);
> +static DEVICE_ATTR_RO(mdev_supported_types);
> +
> +static ssize_t mdev_create_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count);
> +static DEVICE_ATTR_WO(mdev_create);
> +
> +static ssize_t mdev_destroy_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count);
> +static DEVICE_ATTR_WO(mdev_destroy);
> +
> +/* Static functions */
> +
> +#define UUID_CHAR_LENGTH 36
> +#define UUID_BYTE_LENGTH 16
> +
> +#define SUPPORTED_TYPE_BUFFER_LENGTH 1024
> +
> +static inline bool is_uuid_sep(char sep)
> +{
> + if (sep == '\n' || sep == '-' || sep == ':' || sep == '\0')
> + return true;
> + return false;
> +}
> +
> +static int uuid_parse(const char *str, uuid_le *uuid)
> +{
> + int i;
> +
> + if (strlen(str) < UUID_CHAR_LENGTH)
> + return -EINVAL;
> +
> + for (i = 0; i < UUID_BYTE_LENGTH; i++) {
> + if (!isxdigit(str[0]) || !isxdigit(str[1])) {
> + pr_err("%s err", __func__);
> + return -EINVAL;
> + }
> +
> + uuid->b[i] = (hex_to_bin(str[0]) << 4) | hex_to_bin(str[1]);
> + str += 2;
> + if (is_uuid_sep(*str))
> + str++;
> + }
> +
> + return 0;
> +}
> +
> +/* mdev sysfs Functions */
> +static ssize_t mdev_supported_types_show(struct device *dev,
> + struct device_attribute *attr,
> + char *buf)
> +{
> + char *str, *ptr;
> + ssize_t n;
> +
> + str = kzalloc(sizeof(*str) * SUPPORTED_TYPE_BUFFER_LENGTH, GFP_KERNEL);
> + if (!str)
> + return -ENOMEM;
> +
> + ptr = str;
> + mdev_device_supported_config(dev, str);
> +
> + n = sprintf(buf, "%s\n", str);
> + kfree(ptr);
> +
> + return n;
> +}
> +
> +static ssize_t mdev_create_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + char *str, *pstr;
> + char *uuid_str, *instance_str, *mdev_params = NULL, *params = NULL;
> + uuid_le uuid;
> + uint32_t instance;
> + int ret;
> +
> + pstr = str = kstrndup(buf, count, GFP_KERNEL);
> +
> + if (!str)
> + return -ENOMEM;
> +
> + uuid_str = strsep(&str, ":");
> + if (!uuid_str) {
> + pr_err("mdev_create: Empty UUID string %s\n", buf);
> + ret = -EINVAL;
> + goto create_error;
> + }
> +
> + if (!str) {
> + pr_err("mdev_create: mdev instance not present %s\n", buf);
> + ret = -EINVAL;
> + goto create_error;
> + }
> +
> + instance_str = strsep(&str, ":");
> + if (!instance_str) {
> + pr_err("mdev_create: Empty instance string %s\n", buf);
> + ret = -EINVAL;
> + goto create_error;
> + }
> +
> + ret = kstrtouint(instance_str, 0, &instance);
> + if (ret) {
> + pr_err("mdev_create: mdev instance parsing error %s\n", buf);
> + goto create_error;
> + }
> +
> + if (str)
> + params = mdev_params = kstrdup(str, GFP_KERNEL);
> +
> + ret = uuid_parse(uuid_str, &uuid);
> + if (ret) {
> + pr_err("mdev_create: UUID parse error %s\n", buf);
> + goto create_error;
> + }
> +
> + ret = mdev_device_create(dev, uuid, instance, mdev_params);
> + if (ret)
> + pr_err("mdev_create: Failed to create mdev device\n");
> + else
> + ret = count;
> +
> +create_error:
> + kfree(params);
> + kfree(pstr);
> + return ret;
> +}
> +
> +static ssize_t mdev_destroy_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t count)
> +{
> + char *uuid_str, *str, *pstr;
> + uuid_le uuid;
> + unsigned int instance;
> + int ret;
> +
> + str = pstr = kstrndup(buf, count, GFP_KERNEL);
> +
> + if (!str)
> + return -ENOMEM;
> +
> + uuid_str = strsep(&str, ":");
> + if (!uuid_str) {
> + pr_err("mdev_destroy: Empty UUID string %s\n", buf);
> + ret = -EINVAL;
> + goto destroy_error;
> + }
> +
> + if (str == NULL) {
> + pr_err("mdev_destroy: instance not specified %s\n", buf);
> + ret = -EINVAL;
> + goto destroy_error;
> + }
> +
> + ret = kstrtouint(str, 0, &instance);
> + if (ret) {
> + pr_err("mdev_destroy: instance parsing error %s\n", buf);
> + goto destroy_error;
> + }
> +
> + ret = uuid_parse(uuid_str, &uuid);
> + if (ret) {
> + pr_err("mdev_destroy: UUID parse error %s\n", buf);
> + goto destroy_error;
> + }
> +
> + ret = mdev_device_destroy(dev, uuid, instance);
> + if (ret == 0)
> + ret = count;
> +
> +destroy_error:
> + kfree(pstr);
> + return ret;
> +}
> +
> +ssize_t mdev_start_store(struct class *class, struct class_attribute *attr,
> + const char *buf, size_t count)
> +{
> + char *uuid_str, *ptr;
> + uuid_le uuid;
> + int ret;
> +
> + ptr = uuid_str = kstrndup(buf, count, GFP_KERNEL);
> +
> + if (!uuid_str)
> + return -ENOMEM;
> +
> + ret = uuid_parse(uuid_str, &uuid);
> + if (ret) {
> + pr_err("mdev_start: UUID parse error %s\n", buf);
> + goto start_error;
> + }
> +
> + ret = mdev_device_start(uuid);
> + if (ret == 0)
> + ret = count;
> +
> +start_error:
> + kfree(ptr);
> + return ret;
> +}
> +
> +ssize_t mdev_shutdown_store(struct class *class, struct class_attribute *attr,
> + const char *buf, size_t count)
> +{
> + char *uuid_str, *ptr;
> + uuid_le uuid;
> + int ret;
> +
> + ptr = uuid_str = kstrndup(buf, count, GFP_KERNEL);
> +
> + if (!uuid_str)
> + return -ENOMEM;
> +
> + ret = uuid_parse(uuid_str, &uuid);
> + if (ret) {
> + pr_err("mdev_shutdown: UUID parse error %s\n", buf);
> + goto shutdown_error;
> + }
> +
> + ret = mdev_device_shutdown(uuid);
> + if (ret == 0)
> + ret = count;
> +
> +shutdown_error:
> + kfree(ptr);
> + return ret;
> +
> +}
> +
> +struct class_attribute mdev_class_attrs[] = {
> + __ATTR_WO(mdev_start),
> + __ATTR_WO(mdev_shutdown),
> + __ATTR_NULL
> +};
> +
> +int mdev_create_sysfs_files(struct device *dev)
> +{
> + int ret;
> +
> + ret = sysfs_create_file(&dev->kobj,
> + &dev_attr_mdev_supported_types.attr);
> + if (ret) {
> + pr_err("Failed to create mdev_supported_types sysfs entry\n");
> + return ret;
> + }
> +
> + ret = sysfs_create_file(&dev->kobj, &dev_attr_mdev_create.attr);
> + if (ret) {
> + pr_err("Failed to create mdev_create sysfs entry\n");
> + goto create_sysfs_failed;
> + }
> +
> + ret = sysfs_create_file(&dev->kobj, &dev_attr_mdev_destroy.attr);
> + if (ret) {
> + pr_err("Failed to create mdev_destroy sysfs entry\n");
> + sysfs_remove_file(&dev->kobj, &dev_attr_mdev_create.attr);
> + } else
> + return ret;
> +
> +create_sysfs_failed:
> + sysfs_remove_file(&dev->kobj, &dev_attr_mdev_supported_types.attr);
> + return ret;
> +}
> +
> +void mdev_remove_sysfs_files(struct device *dev)
> +{
> + sysfs_remove_file(&dev->kobj, &dev_attr_mdev_supported_types.attr);
> + sysfs_remove_file(&dev->kobj, &dev_attr_mdev_create.attr);
> + sysfs_remove_file(&dev->kobj, &dev_attr_mdev_destroy.attr);
> +}
> diff --git a/include/linux/mdev.h b/include/linux/mdev.h
> new file mode 100644
> index 000000000000..31b6f8572cfa
> --- /dev/null
> +++ b/include/linux/mdev.h
> @@ -0,0 +1,232 @@
> +/*
> + * Mediated device definition
> + *
> + * Copyright (c) 2016, NVIDIA CORPORATION. All rights reserved.
> + * Author: Neo Jia <address@hidden>
> + * Kirti Wankhede <address@hidden>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#ifndef MDEV_H
> +#define MDEV_H
> +
> +/* Common Data structures */
> +
> +struct pci_region_info {
> + uint64_t start;
> + uint64_t size;
> + uint32_t flags; /* VFIO region info flags */
> +};
> +
> +enum mdev_emul_space {
> + EMUL_CONFIG_SPACE, /* PCI configuration space */
> + EMUL_IO, /* I/O register space */
> + EMUL_MMIO /* Memory-mapped I/O space */
> +};
I'm still confused about why this is needed; perhaps a description here
would be useful so I can stop asking. Clearly config space is PCI only,
so it's strange to have it in the common code. Everyone not on x86 will
say I/O space is also strange. I can't keep it in my head why the
read/write offsets aren't sufficient for the driver to figure out which
type of access it is.
> +
> +struct parent_device;
> +
> +/*
> + * Mediated device
> + */
> +
> +struct mdev_device {
> + struct device dev;
> + struct parent_device *parent;
> + struct iommu_group *group;
> + void *iommu_data;
> + uuid_le uuid;
> + uint32_t instance;
> +
> + /* internal only */
> + struct kref ref;
> + struct mutex ops_lock;
> + struct list_head next;
> +};
> +
> +
> +/**
> + * struct parent_ops - Structure to be registered for each parent device to
> + * register the device to mdev module.
> + *
> + * @owner: The module owner.
> + * @dev_attr_groups: Default attributes of the parent device.
> + * @mdev_attr_groups: Default attributes of the mediated device.
> + * @supported_config: Called to get information about supported types.
> + * @dev : device structure of parent device.
> + * @config: should return string listing supported config
> + * Returns integer: success (0) or error (< 0)
> + * @create: Called to allocate basic resources in parent device's
> + * driver for a particular mediated device
> + * @dev: parent device structure on which mediated device
> + * should be created
> + * @uuid: VM's uuid for which VM it is intended to
> + * @instance: mediated instance in that VM
> + * @mdev_params: extra parameters required by parent
> + * device's driver.
> + * Returns integer: success (0) or error (< 0)
> + * @destroy: Called to free resources in parent device's driver for a
> + * a mediated device instance of that VM.
> + * @dev: parent device structure to which this mediated
> + * device points to.
> + * @uuid: VM's uuid for which the mediated device belongs
> + * @instance: mdev instance in that VM
> + * Returns integer: success (0) or error (< 0)
> + * If VM is running and destroy() is called that means the
> + * mdev is being hotunpluged. Return error if VM is running
> + * and driver doesn't support mediated device hotplug.
> + * @start: Called to initiate mediated device initialization
> + * process in parent device's driver when VM boots before
> + * VMM starts
> + * @uuid: VM's UUID which is booting.
> + * Returns integer: success (0) or error (< 0)
> + * @shutdown: Called to teardown mediated device related resources for
> + * the VM
> + * @uuid: VM's UUID which is shutting down .
> + * Returns integer: success (0) or error (< 0)
> + * @read: Read emulation callback
> + * @mdev: mediated device structure
> + * @buf: read buffer
> + * @count: number of bytes to read
> + * @address_space: specifies for which address space the
> + * request is intended for - pci_config_space, IO register
> + * space or MMIO space.
> + * @addr: address.
> + * Retuns number on bytes read on success or error.
> + * @write: Write emulation callback
> + * @mdev: mediated device structure
> + * @buf: write buffer
> + * @count: number of bytes to be written
> + * @address_space: specifies for which address space the
> + * request is intended for - pci_config_space, IO register
> + * space or MMIO space.
> + * @addr: address.
> + * Retuns number on bytes written on success or error.
> + * @set_irqs: Called to send about interrupts configuration
> + * information that VMM sets.
> + * @mdev: mediated device structure
> + * @flags, index, start, count and *data : same as that of
> + * struct vfio_irq_set of VFIO_DEVICE_SET_IRQS API.
> + * @get_region_info: Called to get VFIO region size and flags of mediated
> + * device.
> + * @mdev: mediated device structure
> + * @region_index: VFIO region index
> + * @region_info: output, returns size and flags of
> + * requested region.
> + * Returns integer: success (0) or error (< 0)
> + * @validate_map_request: Validate remap pfn request
> + * @mdev: mediated device structure
> + * @virtaddr: target user address to start at
> + * @pfn: parent address of kernel memory, vendor driver
> + * can change if required.
> + * @size: size of map area, vendor driver can change the
> + * size of map area if desired.
> + * @prot: page protection flags for this mapping, vendor
> + * driver can change, if required.
> + * Returns integer: success (0) or error (< 0)
> + *
> + * Parent device that support mediated device should be registered with mdev
> + * module with parent_ops structure.
> + */
> +
> +struct parent_ops {
> + struct module *owner;
> + const struct attribute_group **dev_attr_groups;
> + const struct attribute_group **mdev_attr_groups;
> +
> + int (*supported_config)(struct device *dev, char *config);
> + int (*create)(struct device *dev, uuid_le uuid,
> + uint32_t instance, char *mdev_params);
> + int (*destroy)(struct device *dev, uuid_le uuid,
> + uint32_t instance);
> + int (*start)(uuid_le uuid);
> + int (*shutdown)(uuid_le uuid);
> + ssize_t (*read)(struct mdev_device *vdev, char *buf, size_t count,
> + enum mdev_emul_space address_space, loff_t pos);
> + ssize_t (*write)(struct mdev_device *vdev, char *buf, size_t count,
> + enum mdev_emul_space address_space, loff_t pos);
> + int (*set_irqs)(struct mdev_device *vdev, uint32_t flags,
> + unsigned int index, unsigned int start,
> + unsigned int count, void *data);
> + int (*get_region_info)(struct mdev_device *vdev, int region_index,
> + struct pci_region_info *region_info);
This can't be *pci_*region_info. How do you intend to support things
like sparse mmap capabilities in the user REGION_INFO ioctl when such
things are not part of the mediated device API? Seems like the driver
should just return a buffer.
> + int (*validate_map_request)(struct mdev_device *vdev,
> + unsigned long virtaddr,
> + unsigned long *pfn, unsigned long *size,
> + pgprot_t *prot);
> +};
> +
> +/*
> + * Parent Device
> + */
> +struct parent_device {
> + struct device *dev;
> + const struct parent_ops *ops;
> +
> + /* internal */
> + struct kref ref;
> + struct mutex ops_lock;
> + struct list_head next;
> + struct list_head mdev_list;
> + struct mutex mdev_list_lock;
> + wait_queue_head_t release_done;
> +};
> +
> +/**
> + * struct mdev_driver - Mediated device driver
> + * @name: driver name
> + * @probe: called when new device created
> + * @remove: called when device removed
> + * @match: called when new device or driver is added for this bus. Return 1 if
> + * given device can be handled by given driver and zero otherwise.
> + * @driver: device driver structure
> + *
> + **/
> +struct mdev_driver {
> + const char *name;
> + int (*probe)(struct device *dev);
> + void (*remove)(struct device *dev);
> + int (*match)(struct device *dev);
> + struct device_driver driver;
> +};
> +
> +static inline struct mdev_driver *to_mdev_driver(struct device_driver *drv)
> +{
> + return drv ? container_of(drv, struct mdev_driver, driver) : NULL;
> +}
> +
> +static inline struct mdev_device *to_mdev_device(struct device *dev)
> +{
> + return dev ? container_of(dev, struct mdev_device, dev) : NULL;
> +}
> +
> +static inline void *mdev_get_drvdata(struct mdev_device *mdev)
> +{
> + return dev_get_drvdata(&mdev->dev);
> +}
> +
> +static inline void mdev_set_drvdata(struct mdev_device *mdev, void *data)
> +{
> + dev_set_drvdata(&mdev->dev, data);
> +}
> +
> +extern struct bus_type mdev_bus_type;
> +
> +#define dev_is_mdev(d) ((d)->bus == &mdev_bus_type)
> +
> +extern int mdev_register_device(struct device *dev,
> + const struct parent_ops *ops);
> +extern void mdev_unregister_device(struct device *dev);
> +
> +extern int mdev_register_driver(struct mdev_driver *drv, struct module *owner);
> +extern void mdev_unregister_driver(struct mdev_driver *drv);
> +
> +extern struct mdev_device *mdev_get_device(struct mdev_device *mdev);
> +extern void mdev_put_device(struct mdev_device *mdev);
> +
> +extern struct mdev_device *mdev_get_device_by_group(struct iommu_group *group);
> +
> +#endif /* MDEV_H */