Re: [Qemu-devel] RFC: handling "backend too fast" in virtio-net
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] RFC: handling "backend too fast" in virtio-net
Date: Fri, 15 Feb 2013 11:24:29 +0100
User-agent: Mutt/1.5.21 (2010-09-15)
On Thu, Feb 14, 2013 at 07:21:57PM +0100, Luigi Rizzo wrote:
CCed Michael Tsirkin
> virtio-style network devices (where the producer and consumer chase
> each other through a shared memory block) can enter into a
> bad operating regime when the consumer is too fast.
>
> I am hitting this case right now when virtio is used on top of the
> netmap/VALE backend that I posted a few weeks ago: what happens is that
> the backend is so fast that the io thread keeps re-enabling notifications
> every few packets. This results, on my test machine, in a throughput of
> 250-300Kpps (and extremely unstable, oscillating between 200 and 600Kpps).
>
> I'd like to get some feedback on the following trivial patch to have
> the thread keep spinning for a bounded amount of time when the producer
> is slower than the consumer. This gives a relatively stable throughput
> between 700 and 800 Kpps (we have something similar in our paravirtualized
> e1000 driver, which is slightly faster at 900-1100 Kpps).
Did you experiment with the tx timer instead of the bh? It seems that
hw/virtio-net.c has two tx mitigation strategies: the bh approach that
you've tweaked, and a true timer.
It seems you don't really want tx batching but you do want to avoid
guest->host notifications?
> EXISTING LOGIC: reschedule the bh when at least a burst of packets has
> been received. Otherwise enable notifications and do another
> race-prevention check.
>
> NEW LOGIC: when the bh is scheduled the first time, establish a
> budget (currently 20) of reschedule attempts.
> Every time virtio_net_flush_tx() returns 0 packets [this could
> be changed to 'less than a full burst'] the counter is increased.
> The bh is rescheduled until the counter reaches the budget,
> at which point we re-enable notifications as above.
>
> I am not 100% happy with this patch because there is a "magic"
> constant (the maximum number of retries) which should really be
> adaptive; or perhaps we should even bound the amount of time the
> bh runs, rather than the number of retries.
> In practice, having the thread spin (or sleep) for 10-20us
> even without new packets is probably preferable to
> re-enabling notifications and requesting a kick.
>
> opinions ?
>
> cheers
> luigi
>
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index 573c669..5389088 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -49,6 +51,7 @@ typedef struct VirtIONet
> NICState *nic;
> uint32_t tx_timeout;
> int32_t tx_burst;
> + int32_t tx_retries; /* keep spinning a bit on tx */
> uint32_t has_vnet_hdr;
> size_t host_hdr_len;
> size_t guest_hdr_len;
> @@ -1062,7 +1065,9 @@ static void virtio_net_tx_bh(void *opaque)
>
> /* If we flush a full burst of packets, assume there are
> * more coming and immediately reschedule */
> - if (ret >= n->tx_burst) {
> + if (ret == 0)
> + n->tx_retries++;
> + if (n->tx_retries < 20) {
> qemu_bh_schedule(q->tx_bh);
> q->tx_waiting = 1;
> return;
> @@ -1076,6 +1082,8 @@ static void virtio_net_tx_bh(void *opaque)
> virtio_queue_set_notification(q->tx_vq, 0);
> qemu_bh_schedule(q->tx_bh);
> q->tx_waiting = 1;
> + } else {
> + n->tx_retries = 0;
> }
> }
>
>
>