[Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support

From:	Anthony Liguori
Subject:	[Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support
Date:	Fri, 26 Feb 2010 09:18:03 -0600
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.5) Gecko/20091209 Fedora/3.0-4.fc12 Lightning/1.0pre Thunderbird/3.0

On 02/26/2010 08:49 AM, Michael S. Tsirkin wrote:


KVM code needs all kind of work-arounds for KVM specific issues.
It also assumes that KVM is registered at startup, so it
does not try to optimize finding slots.

No, the slot mapping changes dynamically so KVM certainly needs tooptimize this.

But the point is, why can't we keep a central list of slots somewherethat KVM and vhost-net can both use? I'm not saying we use a commonfunction to do this work, I'm saying qemu should maintain a proper slotlist than anyone can access.

I propose merging this as is, and then someone who has an idea
how to do this better can come and unify the code.

Like I said, this has been a huge source of very subtle bugs in thepast. I'm open to hearing what other people think, but I'm concernedthat if we merge this code, we'll end up facing some nasty bugs thatcould easily be eliminated by just using the code in kvm-all that hasalready been tested rather extensively.

There really aren't that many work-arounds in the code BTW. The workarounds just result in a couple of extra slots too so they shouldn't bea burden to vhost.

Mine has no bugs, let's switch to it!

Seriously, need to tread very carefully here.
This is why I say: merge it, then look at how to reuse code.

Once it's merged, there's no incentive to look at reusing code. Again,I don't think this is a huge burden to vhost. The two bits of codeliterally do exactly the same thing. They just use different datastructures that ultimately contain the same values.

C++ habits die hard :-)


What's that about?

'++i' is an odd thing to do in C in a for() loop. We're not explicitabout it in Coding Style but the vast majority of code just does 'i++'.

+    vq->desc = cpu_physical_memory_map(a,&l, 0);
+    if (!vq->desc || l != s) {
+        r = -ENOMEM;
+        goto fail_alloc;
+    }
+    s = l = offsetof(struct vring_avail, ring) +
+        sizeof(u_int64_t) * vq->num;
+    a = virtio_queue_get_avail(vdev, idx);
+    vq->avail = cpu_physical_memory_map(a,&l, 0);
+    if (!vq->avail || l != s) {
+        r = -ENOMEM;
+        goto fail_alloc;
+    }

You don't unmap avail/desc on failure.  map() may fail because the ring
cross MMIO memory and you run out of a bounce buffer.

IMHO, it would be better to attempt to map the full ring at once and
then if that doesn't succeed, bail out.  You can still pass individual
pointers via vhost ioctls but within qemu, it's much easier to deal with
the whole ring at a time.

+ a = virtio_queue_get_desc(vdev, idx);
I prefer to keep as much logic about ring layout as possible
in virtio.c

Well, the downside is that you need to deal with the error path andcleanup paths and it becomes more complicated.

+    s = l = offsetof(struct vring_used, ring) +
+        sizeof(struct vring_used_elem) * vq->num;

This is unfortunate.  We redefine this structures in qemu to avoid
depending on Linux headers.

And we should for e.g. windows portability.

  But you're using the linux versions instead
of the qemu versions.  Is it really necessary for vhost.h to include
virtio.h?

Yes. And anyway, vhost does not exist on non-linux systems so there
is no issue IMO.

Yeah, like I said, it's unfortunate because it means a read of vhost anda reader of virtio.c is likely to get confused. I'm not saying there'san easy solution, it's just unfortunate.

+    vq->used_phys = a = virtio_queue_get_used(vdev, idx);
+    vq->used = cpu_physical_memory_map(a,&l, 1);
+    if (!vq->used || l != s) {
+        r = -ENOMEM;
+        goto fail_alloc;
+    }
+
+    r = vhost_virtqueue_set_addr(dev, vq, idx, dev->log_enabled);
+    if (r<   0) {
+        r = -errno;
+        goto fail_alloc;
+    }
+    if (!vdev->binding->guest_notifier || !vdev->binding->host_notifier) {
+        fprintf(stderr, "binding does not support irqfd/queuefd\n");
+        r = -ENOSYS;
+        goto fail_alloc;
+    }
+    r = vdev->binding->guest_notifier(vdev->binding_opaque, idx, true);
+    if (r<   0) {
+        fprintf(stderr, "Error binding guest notifier: %d\n", -r);
+        goto fail_guest_notifier;
+    }
+
+    r = vdev->binding->host_notifier(vdev->binding_opaque, idx, true);
+    if (r<   0) {
+        fprintf(stderr, "Error binding host notifier: %d\n", -r);
+        goto fail_host_notifier;
+    }
+
+    file.fd = event_notifier_get_fd(virtio_queue_host_notifier(q));
+    r = ioctl(dev->control, VHOST_SET_VRING_KICK,&file);
+    if (r) {
+        goto fail_kick;
+    }
+
+    file.fd = event_notifier_get_fd(virtio_queue_guest_notifier(q));
+    r = ioctl(dev->control, VHOST_SET_VRING_CALL,&file);
+    if (r) {
+        goto fail_call;
+    }

This function would be a bit more reasonable if it were split into
sections FWIW.

Not sure what do you mean here.

Just a suggestion. For instance, moving the setting up of the notifiersto a separate function would help with readability IMHO.

You never unmap() the mapped memory and you're cheating by assuming that
the virtio rings have a constant mapping for the life time of a guest.
That's not technically true.  My concern is that since a guest can
trigger remappings (by adjusting PCI mappings) badness can ensue.

I do not know how this can happen. What do PCI mappings have to do with this?
Please explain. If it can, vhost will need notification to update.

If a guest modifies the bar for an MMIO region such that it happens toexist in RAM, while this is a bad thing for the guest to do, I don'tthink we do anything to stop it. When the region gets remapped, theresult will be that the mapping will change.

Within qemu, because we carry the qemu_mutex, we know that the mappingsare fixed as long as we're in qemu. We're very careful to assume thatwe don't rely on a mapping past when we drop the qemu_mutex.

With vhost, you register a slot table and update it whenever mappingschange. I think that's good enough for dealing with ram addresses. Butyou pass the virtual address for the rings and assume those mappingsnever change.

I'm pretty sure a guest can cause those to change and I'm not 100% sure,but I think it's a potential source of exploits if you assume amapping. In the very least, a guest can trick vhost into writing to ramthat it wouldn't normally write to.

If you're going this way, I'd suggest making static inlines in the
header file instead of polluting the C file.  It's more common to search
within a C file and having two declarations can get annoying.

Regards,

Anthony Liguori

The issue with inline is that this means that virtio net will depend on
target (need to be recompiled).  As it is, a single object can link with
vhost and non-vhost versions.


Fair enough.

Regards,

Anthony Liguori

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCHv2 05/12] virtio: add APIs for queue fields, (continued)
- [Qemu-devel] [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/25
  - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Juan Quintela, 2010/02/25
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/26
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Anthony Liguori, 2010/02/26
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/26
  - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Anthony Liguori, 2010/02/25
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/26
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Anthony Liguori <=
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/27
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Paul Brook, 2010/02/27
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/28
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Paul Brook, 2010/02/28
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/28
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Paul Brook, 2010/02/28
    - Re: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Michael S. Tsirkin, 2010/02/28
    - [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support, Anthony Liguori, 2010/02/28
- [Qemu-devel] [PATCHv2 02/12] kvm: add API to set ioeventfd, Michael S. Tsirkin, 2010/02/25
  - [Qemu-devel] Re: [PATCHv2 02/12] kvm: add API to set ioeventfd, Anthony Liguori, 2010/02/25

Prev by Date: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support
Next by Date: [Qemu-devel] Re: [PATCHv2 10/12] tap: add vhost/vhostfd options
Previous by thread: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support
Next by thread: [Qemu-devel] Re: [PATCHv2 09/12] vhost: vhost net support
Index(es):
- Date
- Thread