[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 2/3] VFIO driver for mediated PCI device
From: |
Kirti Wankhede |
Subject: |
Re: [Qemu-devel] [PATCH 2/3] VFIO driver for mediated PCI device |
Date: |
Sat, 25 Jun 2016 00:04:27 +0530 |
Thanks Alex.
On 6/22/2016 4:18 AM, Alex Williamson wrote:
> On Mon, 20 Jun 2016 22:01:47 +0530
> Kirti Wankhede <address@hidden> wrote:
>
>> +
>> +static int get_mdev_region_info(struct mdev_device *mdev,
>> + struct pci_region_info *vfio_region_info,
>> + int index)
>> +{
>> + int ret = -EINVAL;
>> + struct parent_device *parent = mdev->parent;
>> +
>> + if (parent && dev_is_pci(parent->dev) && parent->ops->get_region_info) {
>> + mutex_lock(&mdev->ops_lock);
>> + ret = parent->ops->get_region_info(mdev, index,
>> + vfio_region_info);
>> + mutex_unlock(&mdev->ops_lock);
>
> Why do we have two ops_lock, one on the parent_device and one on the
> mdev_device?! Is this one actually locking anything or also just
> providing serialization? Why do some things get serialized at the
> parent level and some things at the device level? Very confused by
> ops_lock.
>
There are two sets of callback:
* parent device callbacks: supported_config, create, destroy, start, stop
* mdev device callbacks: read, write, set_irqs, get_region_info,
validate_map_request
parent->ops_lock is to serialize per parent device callbacks.
mdev->ops_lock is to serialize per mdev device callbacks.
I'll add above comment in mdev.h.
>> +
>> +static int mdev_get_irq_count(struct vfio_mdev *vmdev, int irq_type)
>> +{
>> + /* Don't support MSIX for now */
>> + if (irq_type == VFIO_PCI_MSIX_IRQ_INDEX)
>> + return -1;
>> +
>> + return 1;
>
> Too much hard coding here, the mediated driver should define this.
>
I'm testing INTX and MSI, I don't have a way to test MSIX for now. So we
thought we can add supported for MSIX later. Till then hard code it to 1.
>> +
>> + if (parent && parent->ops->set_irqs) {
>> + mutex_lock(&mdev->ops_lock);
>> + ret = parent->ops->set_irqs(mdev, hdr.flags, hdr.index,
>> + hdr.start, hdr.count, data);
>> + mutex_unlock(&mdev->ops_lock);
>
> Device level serialization on set_irqs... interesting.
>
Hope answer above helps to clarify this.
>> + }
>> +
>> + kfree(ptr);
>> + return ret;
>> + }
>> + }
>> + return -ENOTTY;
>> +}
>> +
>> +ssize_t mdev_dev_config_rw(struct vfio_mdev *vmdev, char __user *buf,
>> + size_t count, loff_t *ppos, bool iswrite)
>> +{
>> + struct mdev_device *mdev = vmdev->mdev;
>> + struct parent_device *parent = mdev->parent;
>> + int size = vmdev->vfio_region_info[VFIO_PCI_CONFIG_REGION_INDEX].size;
>> + int ret = 0;
>> + uint64_t pos = *ppos & VFIO_PCI_OFFSET_MASK;
>> +
>> + if (pos < 0 || pos >= size ||
>> + pos + count > size) {
>> + pr_err("%s pos 0x%llx out of range\n", __func__, pos);
>> + ret = -EFAULT;
>> + goto config_rw_exit;
>> + }
>> +
>> + if (iswrite) {
>> + char *usr_data, *ptr;
>> +
>> + ptr = usr_data = memdup_user(buf, count);
>> + if (IS_ERR(usr_data)) {
>> + ret = PTR_ERR(usr_data);
>> + goto config_rw_exit;
>> + }
>> +
>> + ret = parent->ops->write(mdev, usr_data, count,
>> + EMUL_CONFIG_SPACE, pos);
>
> No serialization on this ops, thank goodness, but why?
>
Its there at caller of mdev_dev_rw().
> This read/write interface still seems strange to me...
>
Replied on this in 1st Patch.
>> +
>> + memcpy((void *)(vmdev->vconfig + pos), (void *)usr_data, count);
>> + kfree(ptr);
>> + } else {
>> + char *ret_data, *ptr;
>> +
>> + ptr = ret_data = kzalloc(count, GFP_KERNEL);
>> +
>> + if (IS_ERR(ret_data)) {
>> + ret = PTR_ERR(ret_data);
>> + goto config_rw_exit;
>> + }
>> +
>> + ret = parent->ops->read(mdev, ret_data, count,
>> + EMUL_CONFIG_SPACE, pos);
>> +
>> + if (ret > 0) {
>> + if (copy_to_user(buf, ret_data, ret))
>> + ret = -EFAULT;
>> + else
>> + memcpy((void *)(vmdev->vconfig + pos),
>> + (void *)ret_data, count);
>> + }
>> + kfree(ptr);
>
> So vconfig caches all of config space for the mdev, but we only ever
> use it to read the BAR address via mdev_read_base()... why? I hope the
> mdev driver doesn't freak out if the user reads the mmio region before
> writing a base address (remember the vfio API aspect of the interface
> doesn't necessarily follow the VM PCI programming API)
>
How could user read mmio region from guest before writing base address?
Isn't that would be a bug?
>From our driver if pos is not within the base address range, then we
return error for read/write.
>> + }
>> +config_rw_exit:
>> +
>> + if (ret > 0)
>> + *ppos += ret;
>> +
>> + return ret;
>> +}
>> +
>> +ssize_t mdev_dev_bar_rw(struct vfio_mdev *vmdev, char __user *buf,
>> + size_t count, loff_t *ppos, bool iswrite)
>> +{
>> + struct mdev_device *mdev = vmdev->mdev;
>> + struct parent_device *parent = mdev->parent;
>> + loff_t offset = *ppos & VFIO_PCI_OFFSET_MASK;
>> + loff_t pos;
>> + int bar_index = VFIO_PCI_OFFSET_TO_INDEX(*ppos);
>> + int ret = 0;
>> +
>> + if (!vmdev->vfio_region_info[bar_index].start)
>> + mdev_read_base(vmdev);
>> +
>> + if (offset >= vmdev->vfio_region_info[bar_index].size) {
>> + ret = -EINVAL;
>> + goto bar_rw_exit;
>> + }
>> +
>> + count = min(count,
>> + (size_t)(vmdev->vfio_region_info[bar_index].size - offset));
>> +
>> + pos = vmdev->vfio_region_info[bar_index].start + offset;
>
> In the case of a mpci dev, @start is the vconfig BAR value, so it's
> user (guest) writable, and the mediated driver is supposed to
> understand that? I suppose is saw the config write too, if there was
> one, but the mediated driver gives us region info based on region index.
> We have the region index here. Why wouldn't we do reads and writes
> based on region index and offset and eliminate vconfig? Seems like
> that would consolidate a lot of this, we don't care what we're reading
> and writing, just pass it through. Mediated pci drivers would simply
> need to match indexes to those already defined for vfio-pci.
>
Ok, looking at it. so this will remove vconfig completely.
Thanks,
Kirti.
Re: [Qemu-devel] [PATCH 2/3] VFIO driver for mediated PCI device, Xiao Guangrong, 2016/06/30
[Qemu-devel] [PATCH 1/3] Mediated device Core driver, Kirti Wankhede, 2016/06/20