qemu-devel
Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature


From: Maxime Coquelin
Subject: Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
Date: Thu, 29 Sep 2016 22:05:22 +0200



On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
On Thu, Sep 29, 2016 at 05:30:53PM +0200, Maxime Coquelin wrote:
...

Before enabling anything by default, we should first optimize the 1 slot
case. Indeed, a micro-benchmark using testpmd in txonly mode[0] shows a ~17%
perf regression for the 64-byte case:
 - 2 descs per packet: 11.6Mpps
 - 1 desc per packet: 9.6Mpps

This is due to the virtio header clearing in virtqueue_enqueue_xmit().
With the clearing removed, we get better results than with 2 descs (12.0Mpps).
Since the Virtio PMD doesn't support offloads, I wonder whether we can
just drop the memset?
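
For illustration, the per-packet clearing discussed above boils down to something like the sketch below. The field layout is the virtio-net header as described in the virtio spec (including num_buffers); the struct name and the clear_tx_hdr() helper are made up for this example and are not the actual DPDK code in virtqueue_enqueue_xmit().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* virtio-net header fields, in spec order (12 bytes with num_buffers).
 * The struct name is hypothetical, chosen for this sketch only. */
struct virtio_net_hdr_sketch {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Hypothetical helper: with no offloads in use, every field ends up
 * zero, so the driver clears the whole header once per packet. */
static void clear_tx_hdr(struct virtio_net_hdr_sketch *hdr)
{
    assert(hdr != NULL);
    memset(hdr, 0, sizeof(*hdr));
}
```

Dropping the memset would leave whatever bytes were previously in that memory in all of these fields.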

What will happen? Will the header be uninitialized?
Yes.
I didn't look closely at the spec, but just looked at the DPDK and Linux
vhost implementations. IIUC, both implementations simply skip the
header.

The spec says:
        The driver can send a completely checksummed packet. In this case,
        flags will be zero, and gso_type will be VIRTIO_NET_HDR_GSO_NONE.

and
        The driver MUST set num_buffers to zero.
        If VIRTIO_NET_F_CSUM is not negotiated, the driver MUST set flags to
        zero and SHOULD supply a fully checksummed packet to the device.

and
        If none of the VIRTIO_NET_F_HOST_TSO4, TSO6 or UFO options have been
        negotiated, the driver MUST set gso_type to VIRTIO_NET_HDR_GSO_NONE.

so doing this unconditionally would be a spec violation, but if you see
value in this, we can add a feature bit.
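
To make the three quoted requirements concrete, here is a minimal sketch of a check that a TX header satisfies them for a given set of negotiated features. The feature bit numbers and VIRTIO_NET_HDR_GSO_NONE are the values defined in the virtio spec; the struct and the function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Feature bit numbers and GSO type as defined in the virtio spec. */
#define VIRTIO_NET_F_CSUM        0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12
#define VIRTIO_NET_F_HOST_UFO   14
#define VIRTIO_NET_HDR_GSO_NONE  0

/* Only the header fields the quoted rules talk about (hypothetical type). */
struct tx_hdr_fields {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t num_buffers;
};

static bool feature_on(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/* Returns true if hdr meets the three driver requirements quoted above. */
static bool tx_hdr_conforms(uint64_t features, const struct tx_hdr_fields *hdr)
{
    assert(hdr != NULL);

    /* The driver MUST set num_buffers to zero. */
    if (hdr->num_buffers != 0)
        return false;

    /* Without VIRTIO_NET_F_CSUM, flags MUST be zero. */
    if (!feature_on(features, VIRTIO_NET_F_CSUM) && hdr->flags != 0)
        return false;

    /* Without any of HOST_TSO4/TSO6/UFO, gso_type MUST be GSO_NONE. */
    bool gso = feature_on(features, VIRTIO_NET_F_HOST_TSO4) ||
               feature_on(features, VIRTIO_NET_F_HOST_TSO6) ||
               feature_on(features, VIRTIO_NET_F_HOST_UFO);
    if (!gso && hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE)
        return false;

    return true;
}
```

A zeroed header passes this check with no offload features negotiated, while an uninitialized one would pass only by accident.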
Right, it would be a spec violation, so it should be done conditionally.
If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER?
It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set.
If negotiated, we wouldn't need to prepend a header.
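
As a rough sketch of that rule: VIRTIO_NET_F_NO_TX_HEADER is only a proposal here and has no assigned bit number, so the example takes the bit as a parameter instead of inventing one. The function name is hypothetical; the other bit numbers are the ones defined in the virtio spec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as defined in the virtio spec. */
#define VIRTIO_NET_F_CSUM        0
#define VIRTIO_NET_F_HOST_TSO4  11
#define VIRTIO_NET_F_HOST_TSO6  12
#define VIRTIO_NET_F_HOST_UFO   14

/* Hypothetical check: the proposed feature is usable only when it was
 * negotiated and neither checksum offload nor any host GSO feature was.
 * no_tx_header_bit is a placeholder for a bit that does not exist yet. */
static bool can_skip_tx_header(uint64_t features, unsigned no_tx_header_bit)
{
    assert(no_tx_header_bit < 64);

    const uint64_t offloads = (1ULL << VIRTIO_NET_F_CSUM) |
                              (1ULL << VIRTIO_NET_F_HOST_TSO4) |
                              (1ULL << VIRTIO_NET_F_HOST_TSO6) |
                              (1ULL << VIRTIO_NET_F_HOST_UFO);

    if (!((features >> no_tx_header_bit) & 1))
        return false;              /* feature not negotiated */

    return (features & offloads) == 0;
}
```

In that case the driver could post the packet data alone, without the header in front of it.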

From the micro-benchmark results, we can expect +10% compared to
indirect descriptors, and +5% compared to using 2 descs in the
virtqueue.
Also, it should have the same benefits as indirect descriptors for zero
packet loss (as we can fit 2x more packets in the virtqueue).
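
The 2x figure is simple descriptor arithmetic, sketched below. The helper name is hypothetical, and it assumes each packet consumes a fixed number of descriptors from the ring, with no indirect descriptors.

```c
#include <assert.h>
#include <stdint.h>

/* Number of packets that fit in a ring of ring_size descriptors when
 * each packet consumes descs_per_pkt descriptors from the ring. */
static uint32_t pkts_in_ring(uint32_t ring_size, uint32_t descs_per_pkt)
{
    assert(descs_per_pkt > 0);
    return ring_size / descs_per_pkt;
}
```

For example, with an assumed ring of 256 descriptors, pkts_in_ring(256, 2) gives 128 packets for header plus data, and pkts_in_ring(256, 1) gives 256 when a single descriptor is enough.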

What do you think?

Thanks,
Maxime
