[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [RFC PATCH] vfio: Add sysfsdev property for pci & platf
From: |
Eric Auger |
Subject: |
Re: [Qemu-devel] [RFC PATCH] vfio: Add sysfsdev property for pci & platform |
Date: |
Tue, 26 Jan 2016 16:03:28 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1 |
Hi Alex,
I did a try with both legacy cmd line and new one and it works fine for
vfio platform too:
-device vfio-calxeda-xgmac,host="fff51000.ethernet"
-device
vfio-calxeda-xgmac,sysfsdev="/sys/bus/platform/devices/fff51000.ethernet"
Tested-by: Eric Auger <address@hidden>
Reviewed-by: Eric Auger <address@hidden>
just 1 question below.
Best Regards
Eric
On 01/20/2016 07:06 PM, Alex Williamson wrote:
> vfio-pci currently requires a host= parameter, which comes in the
> form of a PCI address in [domain:]<bus:slot.function> notation. We
> expect to find a matching entry in sysfs for that under
> /sys/bus/pci/devices/. vfio-platform takes a similar approach, but
> defines the host= parameter to be a string, which can be matched
> directly under /sys/bus/platform/devices/. On the PCI side, we have
> some interest in using vfio to expose vGPU devices. These are not
> actual discrete PCI devices, so they don't have a compatible host PCI
> bus address or a device link where QEMU wants to look for it. There's
> also really no requirement that vfio can only be used to expose
> physical devices, a new vfio bus and iommu driver could expose a
> completely emulated device. To fit within the vfio framework, it
> would need a kernel struct device and associated IOMMU group, but
> those are easy constraints to manage.
>
> To support such devices, which would include vGPUs, that honor the
> VFIO PCI programming API, but are not necessarily backed by a unique
> PCI address, add support for specifying any device in sysfs. The
> vfio API already has support for probing the device type to ensure
> compatibility with either vfio-pci or vfio-platform.
>
> With this, a vfio-pci device could either be specified as:
>
> -device vfio-pci,host=02:00.0
>
> or
>
> -device vfio-pci,sysfsdev=/sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0
>
> or even
>
> -device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:02:00.0
>
> When vGPU support comes along, this might look something more like:
>
> -device
> vfio-pci,sysfsdev=/sys/devices/virtual/intel-vgpu/address@hidden:00:02.0
>
> NB - This is only a made up example path, but it should be noted that
> the device namespace is global for vfio, a virtual device cannot
> overlap with existing namespaces and should not create a name prone to
> conflict, such as a simple instance number.
>
> The same changes is made for vfio-platform but is only compile tested.
> In both cases, specifying sysfsdev has precedence over the old host
> option.
>
> Signed-off-by: Alex Williamson <address@hidden>
> ---
> hw/vfio/pci.c | 130
> +++++++++++++++++------------------------
> hw/vfio/platform.c | 55 ++++++++++-------
> include/hw/vfio/vfio-common.h | 1
> 3 files changed, 86 insertions(+), 100 deletions(-)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 1fb868c..bfe4215 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -880,12 +880,8 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
> if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
> /* Since pci handles romfile, just print a message and return */
> if (vfio_blacklist_opt_rom(vdev) && vdev->pdev.romfile) {
> - error_printf("Warning : Device at %04x:%02x:%02x.%x "
> - "is known to cause system instability issues during
> "
> - "option rom execution. "
> - "Proceeding anyway since user specified romfile\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Device at %s is known to cause system
> instability issues during option rom execution. Proceeding anyway since user
> specified romfile\n",
> + vdev->vbasedev.name);
> }
> return;
> }
> @@ -898,9 +894,7 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
> pwrite(fd, &size, 4, offset) != 4 ||
> pread(fd, &size, 4, offset) != 4 ||
> pwrite(fd, &orig, 4, offset) != 4) {
> - error_report("%s(%04x:%02x:%02x.%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function);
> + error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name);
> return;
> }
>
> @@ -912,29 +906,18 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
>
> if (vfio_blacklist_opt_rom(vdev)) {
> if (dev->opts && qemu_opt_get(dev->opts, "rombar")) {
> - error_printf("Warning : Device at %04x:%02x:%02x.%x "
> - "is known to cause system instability issues during
> "
> - "option rom execution. "
> - "Proceeding anyway since user specified non zero
> value for "
> - "rombar\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Device at %s is known to cause system
> instability issues during option rom execution. Proceeding anyway since user
> specified non zero value for rombar\n",
> + vdev->vbasedev.name);
> } else {
> - error_printf("Warning : Rom loading for device at "
> - "%04x:%02x:%02x.%x has been disabled due to "
> - "system instability issues. "
> - "Specify rombar=1 or romfile to force\n",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + error_printf("Warning : Rom loading for device at %s has been
> disabled due to system instability issues. Specify rombar=1 or romfile to
> force\n",
> + vdev->vbasedev.name);
> return;
> }
> }
>
> trace_vfio_pci_size_rom(vdev->vbasedev.name, size);
>
> - snprintf(name, sizeof(name), "vfio[%04x:%02x:%02x.%x].rom",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> + snprintf(name, sizeof(name), "vfio[%s].rom", vdev->vbasedev.name);
>
> memory_region_init_io(&vdev->pdev.rom, OBJECT(vdev),
> &vfio_rom_ops, vdev, name, size);
> @@ -1048,9 +1031,8 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t
> addr, int len)
> ret = pread(vdev->vbasedev.fd, &phys_val, len,
> vdev->config_offset + addr);
> if (ret != len) {
> - error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function, addr, len);
> + error_report("%s(%s, 0x%x, 0x%x) failed: %m",
> + __func__, vdev->vbasedev.name, addr, len);
> return -errno;
> }
> phys_val = le32_to_cpu(phys_val);
> @@ -1074,9 +1056,8 @@ void vfio_pci_write_config(PCIDevice *pdev,
> /* Write everything to VFIO, let it filter out what we can't write */
> if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
> != len) {
> - error_report("%s(%04x:%02x:%02x.%x, 0x%x, 0x%x, 0x%x) failed: %m",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function, addr, val, len);
> + error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m",
> + __func__, vdev->vbasedev.name, addr, val, len);
> }
>
> /* MSI/MSI-X Enabling/Disabling */
> @@ -1347,9 +1328,7 @@ static void vfio_map_bar(VFIOPCIDevice *vdev, int nr)
> return;
> }
>
> - snprintf(name, sizeof(name), "VFIO %04x:%02x:%02x.%x BAR %d",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function, nr);
> + snprintf(name, sizeof(name), "VFIO %s BAR %d", vdev->vbasedev.name, nr);
>
> /* Determine what type of BAR this is for registration */
> ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar),
> @@ -1719,9 +1698,8 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev,
> uint8_t pos)
> }
>
> if (ret < 0) {
> - error_report("vfio: %04x:%02x:%02x.%x Error adding PCI capability "
> - "address@hidden: %d", vdev->host.domain,
> - vdev->host.bus, vdev->host.slot, vdev->host.function,
> + error_report("vfio: %s Error adding PCI capability "
> + "address@hidden: %d", vdev->vbasedev.name,
> cap_id, size, pos, ret);
> return ret;
> }
> @@ -1783,11 +1761,14 @@ static void vfio_pci_post_reset(VFIOPCIDevice *vdev)
> vfio_intx_enable(vdev);
> }
>
> -static bool vfio_pci_host_match(PCIHostDeviceAddress *host1,
> - PCIHostDeviceAddress *host2)
> +static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name)
> {
> - return (host1->domain == host2->domain && host1->bus == host2->bus &&
> - host1->slot == host2->slot && host1->function ==
> host2->function);
> + char tmp[13];
> +
> + sprintf(tmp, "%04x:%02x:%02x.%1x", addr->domain,
> + addr->bus, addr->slot, addr->function);
> +
> + return (strcmp(tmp, name) == 0);
> }
>
> static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single)
> @@ -1812,9 +1793,8 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool
> single)
> if (ret && errno != ENOSPC) {
> ret = -errno;
> if (!vdev->has_pm_reset) {
> - error_report("vfio: Cannot reset device %04x:%02x:%02x.%x, "
> - "no available reset mechanism.", vdev->host.domain,
> - vdev->host.bus, vdev->host.slot,
> vdev->host.function);
> + error_report("vfio: Cannot reset device %s, "
> + "no available reset mechanism.",
> vdev->vbasedev.name);
> }
> goto out_single;
> }
> @@ -1847,7 +1827,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool
> single)
> trace_vfio_pci_hot_reset_dep_devices(host.domain,
> host.bus, host.slot, host.function, devices[i].group_id);
>
> - if (vfio_pci_host_match(&host, &vdev->host)) {
> + if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
> continue;
> }
>
> @@ -1873,7 +1853,7 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool
> single)
> continue;
> }
> tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
> - if (vfio_pci_host_match(&host, &tmp->host)) {
> + if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> if (single) {
> ret = -EINVAL;
> goto out_single;
> @@ -1935,7 +1915,7 @@ out:
> host.slot = PCI_SLOT(devices[i].devfn);
> host.function = PCI_FUNC(devices[i].devfn);
>
> - if (vfio_pci_host_match(&host, &vdev->host)) {
> + if (vfio_pci_host_match(&host, vdev->vbasedev.name)) {
> continue;
> }
>
> @@ -1954,7 +1934,7 @@ out:
> continue;
> }
> tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev);
> - if (vfio_pci_host_match(&host, &tmp->host)) {
> + if (vfio_pci_host_match(&host, tmp->vbasedev.name)) {
> vfio_pci_post_reset(tmp);
> break;
> }
> @@ -2160,10 +2140,7 @@ static void vfio_err_notifier_handler(void *opaque)
> * guest to contain the error.
> */
>
> - error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected. "
> - "Please collect any data possible and then kill the guest",
> - __func__, vdev->host.domain, vdev->host.bus,
> - vdev->host.slot, vdev->host.function);
> + error_report("%s(%s) Unrecoverable error detected. Please collect any
> data possible and then kill the guest", __func__, vdev->vbasedev.name);
>
> vm_stop(RUN_STATE_INTERNAL_ERROR);
> }
> @@ -2344,42 +2321,43 @@ static int vfio_initfn(PCIDevice *pdev)
> VFIOPCIDevice *vdev = DO_UPCAST(VFIOPCIDevice, pdev, pdev);
> VFIODevice *vbasedev_iter;
> VFIOGroup *group;
> - char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> + char *tmp, group_path[PATH_MAX], *group_name;
> ssize_t len;
> struct stat st;
> int groupid;
> int ret;
>
> - /* Check that the host device exists */
> - snprintf(path, sizeof(path),
> - "/sys/bus/pci/devices/%04x:%02x:%02x.%01x/",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> - if (stat(path, &st) < 0) {
> - error_report("vfio: error: no such host device: %s", path);
> + if (!vdev->vbasedev.sysfsdev) {
> + vdev->vbasedev.sysfsdev =
> + g_strdup_printf("/sys/bus/pci/devices/%04x:%02x:%02x.%01x",
> + vdev->host.domain, vdev->host.bus,
> + vdev->host.slot, vdev->host.function);
> + }
> +
> + if (stat(vdev->vbasedev.sysfsdev, &st) < 0) {
> + error_report("vfio: error: no such host device: %s",
> + vdev->vbasedev.sysfsdev);
> return -errno;
> }
>
> + vdev->vbasedev.name = g_strdup(basename(vdev->vbasedev.sysfsdev));
> vdev->vbasedev.ops = &vfio_pci_ops;
> -
> vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
> - vdev->vbasedev.name = g_strdup_printf("%04x:%02x:%02x.%01x",
> - vdev->host.domain, vdev->host.bus,
> - vdev->host.slot,
> vdev->host.function);
>
> - strncat(path, "iommu_group", sizeof(path) - strlen(path) - 1);
> + tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
> + len = readlink(tmp, group_path, sizeof(group_path));
> + g_free(tmp);
>
> - len = readlink(path, iommu_group_path, sizeof(path));
> - if (len <= 0 || len >= sizeof(path)) {
> + if (len <= 0 || len >= sizeof(group_path)) {
> error_report("vfio: error no iommu_group for device");
> return len < 0 ? -errno : -ENAMETOOLONG;
> }
>
> - iommu_group_path[len] = 0;
> - group_name = basename(iommu_group_path);
> + group_path[len] = 0;
>
> + group_name = basename(group_path);
> if (sscanf(group_name, "%d", &groupid) != 1) {
> - error_report("vfio: error reading %s: %m", path);
> + error_report("vfio: error reading %s: %m", group_path);
> return -errno;
> }
>
> @@ -2391,21 +2369,18 @@ static int vfio_initfn(PCIDevice *pdev)
> return -ENOENT;
> }
>
> - snprintf(path, sizeof(path), "%04x:%02x:%02x.%01x",
> - vdev->host.domain, vdev->host.bus, vdev->host.slot,
> - vdev->host.function);
> -
> QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> if (strcmp(vbasedev_iter->name, vdev->vbasedev.name) == 0) {
> - error_report("vfio: error: device %s is already attached", path);
> + error_report("vfio: error: device %s is already attached",
> + vdev->vbasedev.name);
> vfio_put_group(group);
> return -EBUSY;
> }
> }
>
> - ret = vfio_get_device(group, path, &vdev->vbasedev);
> + ret = vfio_get_device(group, vdev->vbasedev.name, &vdev->vbasedev);
> if (ret) {
> - error_report("vfio: failed to get device %s", path);
> + error_report("vfio: failed to get device %s", vdev->vbasedev.name);
> vfio_put_group(group);
> return ret;
> }
> @@ -2622,6 +2597,7 @@ static void vfio_instance_init(Object *obj)
>
> static Property vfio_pci_dev_properties[] = {
> DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host),
> + DEFINE_PROP_STRING("sysfsdev", VFIOPCIDevice, vbasedev.sysfsdev),
> DEFINE_PROP_UINT32("x-intx-mmap-timeout-ms", VFIOPCIDevice,
> intx.mmap_timeout, 1100),
> DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
> diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c
> index 289b498..99f0642 100644
> --- a/hw/vfio/platform.c
> +++ b/hw/vfio/platform.c
> @@ -559,38 +559,45 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
> {
> VFIOGroup *group;
> VFIODevice *vbasedev_iter;
> - char path[PATH_MAX], iommu_group_path[PATH_MAX], *group_name;
> + char *tmp, group_path[PATH_MAX], *group_name;
> ssize_t len;
> struct stat st;
> int groupid;
> int ret;
>
> - /* name must be set prior to the call */
> - if (!vbasedev->name || strchr(vbasedev->name, '/')) {
> - return -EINVAL;
> - }
> + /* @sysfsdev takes precedence over @host */
> + if (vbasedev->sysfsdev) {
> + g_free(vbasedev->name);
> + vbasedev->name = g_strdup(basename(vbasedev->sysfsdev));
do we need the g_strdup here?
> + } else {
> + if (!vbasedev->name || strchr(vbasedev->name, '/')) {
> + return -EINVAL;
> + }
>
> - /* Check that the host device exists */
> - g_snprintf(path, sizeof(path), "/sys/bus/platform/devices/%s/",
> - vbasedev->name);
> + vbasedev->sysfsdev = g_strdup_printf("/sys/bus/platform/devices/%s",
> + vbasedev->name);
> + }
>
> - if (stat(path, &st) < 0) {
> - error_report("vfio: error: no such host device: %s", path);
> + if (stat(vbasedev->sysfsdev, &st) < 0) {
> + error_report("vfio: error: no such host device: %s",
> + vbasedev->sysfsdev);
> return -errno;
> }
>
> - g_strlcat(path, "iommu_group", sizeof(path));
> - len = readlink(path, iommu_group_path, sizeof(iommu_group_path));
> - if (len < 0 || len >= sizeof(iommu_group_path)) {
> + tmp = g_strdup_printf("%s/iommu_group", vbasedev->sysfsdev);
> + len = readlink(tmp, group_path, sizeof(group_path));
> + g_free(tmp);
> +
> + if (len < 0 || len >= sizeof(group_path)) {
> error_report("vfio: error no iommu_group for device");
> return len < 0 ? -errno : -ENAMETOOLONG;
> }
>
> - iommu_group_path[len] = 0;
> - group_name = basename(iommu_group_path);
> + group_path[len] = 0;
>
> + group_name = basename(group_path);
> if (sscanf(group_name, "%d", &groupid) != 1) {
> - error_report("vfio: error reading %s: %m", path);
> + error_report("vfio: error reading %s: %m", group_path);
> return -errno;
> }
>
> @@ -602,25 +609,24 @@ static int vfio_base_device_init(VFIODevice *vbasedev)
> return -ENOENT;
> }
>
> - g_snprintf(path, sizeof(path), "%s", vbasedev->name);
> -
> QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
> if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
> - error_report("vfio: error: device %s is already attached", path);
> + error_report("vfio: error: device %s is already attached",
> + vbasedev->name);
> vfio_put_group(group);
> return -EBUSY;
> }
> }
> - ret = vfio_get_device(group, path, vbasedev);
> + ret = vfio_get_device(group, vbasedev->name, vbasedev);
> if (ret) {
> - error_report("vfio: failed to get device %s", path);
> + error_report("vfio: failed to get device %s", vbasedev->name);
> vfio_put_group(group);
> return ret;
> }
>
> ret = vfio_populate_device(vbasedev);
> if (ret) {
> - error_report("vfio: failed to populate device %s", path);
> + error_report("vfio: failed to populate device %s", vbasedev->name);
> vfio_put_group(group);
> }
>
> @@ -680,7 +686,9 @@ static void vfio_platform_realize(DeviceState *dev, Error
> **errp)
> vbasedev->type = VFIO_DEVICE_TYPE_PLATFORM;
> vbasedev->ops = &vfio_platform_ops;
>
> - trace_vfio_platform_realize(vbasedev->name, vdev->compat);
> + trace_vfio_platform_realize(vbasedev->sysfsdev ?
> + vbasedev->sysfsdev : vbasedev->name,
> + vdev->compat);
>
> ret = vfio_base_device_init(vbasedev);
> if (ret) {
> @@ -702,6 +710,7 @@ static const VMStateDescription vfio_platform_vmstate = {
>
> static Property vfio_platform_dev_properties[] = {
> DEFINE_PROP_STRING("host", VFIOPlatformDevice, vbasedev.name),
> + DEFINE_PROP_STRING("sysfsdev", VFIOPlatformDevice, vbasedev.sysfsdev),
> DEFINE_PROP_BOOL("x-no-mmap", VFIOPlatformDevice, vbasedev.no_mmap,
> false),
> DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice,
> mmap_timeout, 1100),
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index f037f3c..7e00ffc 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -89,6 +89,7 @@ typedef struct VFIODeviceOps VFIODeviceOps;
> typedef struct VFIODevice {
> QLIST_ENTRY(VFIODevice) next;
> struct VFIOGroup *group;
> + char *sysfsdev;
> char *name;
> int fd;
> int type;
>