[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [kvm-devel] [Qemu-devel] [PATCH 3/6] virtio for QEMU

From: Anthony Liguori
Subject: Re: [kvm-devel] [Qemu-devel] [PATCH 3/6] virtio for QEMU
Date: Sun, 30 Mar 2008 17:59:05 -0500
User-agent: Thunderbird (X11/20080227)

Dor Laor wrote:
On Sat, 2008-03-29 at 16:55 -0500, Anthony Liguori wrote:
This patch introduces virtio support over PCI.  virtio is a generic virtual IO
framework for Linux first introduced in 2.6.23.  Since 2.6.25, virtio has
supported a PCI transport which this patch implements.

Since the last time these patches were posted to qemu-devel, I've reworked it
to use the proper access functions to manipulate guest memory.

Signed-off-by: Anthony Liguori <address@hidden>

It's will be great to drop the nasty hacks :)
Do you still get 1G net performance using the extra copy from tap

We take a bit of a hit (probably 10-20%) doing copy. The "tap hacks" require a more invasive set of patches to refactor the VLAN support in QEMU. The fundamental problem with tap is that it only supports one tap device per VLAN. What we really want is a VLAN were each VLAN client has it's own tap device. This is also necessary to properly support the upcoming ring queue support for TAP along with GSO.

That patch set is why I revisited this one in fact :-) Once we get virtio merged, I'll then send out the VLAN refactoring. The nice thing though is that once we have the VLAN refactoring, we can optimize the e1000 device to make use of it.


+static uint32_t vring_desc_len(VirtQueue *vq, unsigned int i)

Below there were place you did use offsetof(vq->vring.desc[i], len) so
we better be consistent + its nicer

+ return ldl_phys(vq->vring.desc + i * sizeof(VRingDesc) + + offsetof(VRingDesc, len));

Yup, I just missed this one.  Thanks for the catch!

+VirtQueueElement *virtqueue_pop(VirtQueue *vq)
+    unsigned int i, head;
+    unsigned int position;
+    VirtQueueElement *elem;
+    /* Check it isn't doing very strange things with descriptor numbers. */
+    if ((uint16_t)(vring_avail_idx(vq) - vq->last_avail_idx) > vq->vring.num)
+       errx(1, "Guest moved used index from %u to %u",
+            vq->last_avail_idx, vring_avail_idx(vq));
+    /* If there's nothing new since last we looked, return invalid. */
+    if (vring_avail_idx(vq) == vq->last_avail_idx)
+       return NULL;
+    /* Grab the next descriptor number they're advertising, and increment
+     * the index we've seen. */
+    head = vring_avail_ring(vq, vq->last_avail_idx++ % vq->vring.num);
+    /* If their number is silly, that's a fatal mistake. */
+    if (head >= vq->vring.num)
+       errx(1, "Guest says index %u is available", head);
+    /* When we start there are none of either input nor output. */
+    position = 0;
+    elem = qemu_mallocz(sizeof(VirtQueueElement));
+    elem->phys_in = qemu_mallocz(sizeof(PhysIOVector) +
+                                vq->vring.num * sizeof(PhysIOVectorElement));
+    elem->phys_out = qemu_mallocz(sizeof(PhysIOVector) +
+                                 vq->vring.num * sizeof(PhysIOVectorElement));

I was wondering whether it can be optimized since vring.num is sometimes
512 so and we can either use a pool of these or calculate the vring.num
from the descriptors but it seems like your way is the best.

My thinking right now is to use qemu_mallocz() for everything and then we can go back and optimize with pooling if necessary.

+    i = head;
+    do {
+       PhysIOVectorElement *sge;
+       if (vring_desc_flags(vq, i) & VRING_DESC_F_WRITE)
+           sge = &elem->phys_in->sg[elem->phys_in->num++];
+       else
+           sge = &elem->phys_out->sg[elem->phys_out->num++];
+       /* Grab the first descriptor, and check it's OK. */
+       sge->len = vring_desc_len(vq, i);
+       sge->base = vring_desc_addr(vq, i);
+       /* If we've got too many, that implies a descriptor loop. */
+       if ((elem->phys_in->num + elem->phys_out->num) > vq->vring.num)
+           errx(1, "Looped descriptor");
+    } while ((i = virtqueue_next_desc(vq, i)) != vq->vring.num);
+    elem->virt_in = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_in);
+    elem->virt_out = pci_device_dma_map(&vq->vdev->pci_dev, elem->phys_out);
+    elem->index = head;
+    if (elem->virt_in == NULL || elem->virt_out == NULL)
+       errx(1, "Bad DMA");
+    return elem;

The name below is a bit misleading since when enable is true you
actually set no_notify.
So I name it something like virtio_vring_set_no_notify(...) or similar.

Yeah, that's not a bad suggestion.


Anthony Liguori

+void virtio_ring_set_used_notify(VirtQueue *vq, int enable)
+    if (enable)
+       vring_used_set_flag(vq, VRING_USED_F_NO_NOTIFY);
+    else
+       vring_used_unset_flag(vq, VRING_USED_F_NO_NOTIFY);


Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
kvm-devel mailing list

reply via email to

[Prev in Thread] Current Thread [Next in Thread]