qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH RFC] virtio-net: fix bottom-half packet TX on asynchronous co


From: Laurent Vivier
Subject: Re: [PATCH RFC] virtio-net: fix bottom-half packet TX on asynchronous completion
Date: Fri, 14 Oct 2022 11:23:30 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.3.1

On 10/14/22 09:06, Laurent Vivier wrote:
On 10/14/22 05:10, Jason Wang wrote:
On Thu, Oct 13, 2022 at 10:48 PM Michael S. Tsirkin <mst@redhat.com> wrote:

On Thu, Oct 13, 2022 at 04:00:57PM +0200, Laurent Vivier wrote:
When virtio-net is used with the socket netdev backend, the backend
can be busy and not able to collect new packets.

In this case, net_socket_receive() returns 0 and registers a poll function
to detect when the socket is ready again.

In virtio_net_tx_bh(), virtio_net_flush_tx() forwards the 0, the virtio
notifications are disabled and the function is not rescheduled, waiting
for the backend to be ready.

When the socket netdev backend is again able to send packets, the poll
function re-starts to flush remaining packets. This is done by
calling virtio_net_tx_complete(). It re-enables notifications and calls
again virtio_net_flush_tx().

But it seems if virtio_net_flush_tx() reaches the tx_burst value all
the queue is not flushed and no new notification is sent to reschedule
virtio_net_tx_bh(). Nothing re-start to flush the queue and remaining packets
are stuck in the queue.

To fix that, detect in virtio_net_tx_complete() if virtio_net_flush_tx()
has been stopped by tx_burst and if yes reschedule the bottom half
function virtio_net_tx_bh() to flush the remaining packets.

This is what is done in virtio_net_tx_bh() when the virtio_net_flush_tx()
is synchronous, and completely by-passed when the operation needs to be
asynchronous.

RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC

Do we need to reschedule the function in the other case managed
in virtio_net_tx_bh() and by-passed when the completion is asynchronous?


I am guessing no.

     /* If less than a full burst, re-enable notification and flush
      * anything that may have come in while we weren't looking.  If
      * we find something, assume the guest is still active and reschedule */
     virtio_queue_set_notification(q->tx_vq, 1);
     ret = virtio_net_flush_tx(q);
     if (ret == -EINVAL) {
         return;
     } else if (ret > 0) {
         virtio_queue_set_notification(q->tx_vq, 0);
         qemu_bh_schedule(q->tx_bh);
         q->tx_waiting = 1;
     }

RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC RFC

Fixes: a697a334b3c4 ("virtio-net: Introduce a new bottom half packet TX")
Cc: alex.williamson@redhat.com
Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Looks ok superficially

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Jason, your area.

---
  hw/net/virtio-net.c | 13 ++++++++++++-
  1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index e9f696b4cfeb..1fbf2f3e19a7 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2526,6 +2526,7 @@ static void virtio_net_tx_complete(NetClientState *nc, 
ssize_t len)
      VirtIONet *n = qemu_get_nic_opaque(nc);
      VirtIONetQueue *q = virtio_net_get_subqueue(nc);
      VirtIODevice *vdev = VIRTIO_DEVICE(n);
+    int ret;

      virtqueue_push(q->tx_vq, q->async_tx.elem, 0);
      virtio_notify(vdev, q->tx_vq);
@@ -2534,7 +2535,17 @@ static void virtio_net_tx_complete(NetClientState *nc, ssize_t len)
      q->async_tx.elem = NULL;

      virtio_queue_set_notification(q->tx_vq, 1);
-    virtio_net_flush_tx(q);
+    ret = virtio_net_flush_tx(q);
+    if (q->tx_bh && ret >= n->tx_burst) {
+        /*
+         * the flush has been stopped by tx_burst
+         * we will not receive notification for the
+         * remainining part, so re-schedule
+         */
+        virtio_queue_set_notification(q->tx_vq, 0);
+        qemu_bh_schedule(q->tx_bh);
+        q->tx_waiting = 1;
+    }

Do we need to fix the case of tx timer or it doesn't suffer from this issue?


I think tx_timer suffers the same issue.

I'm going to have a look to the tx timer fix.

It seems worst with tx timer as we have the tx_burst problem even in the synchronous case as the timer is not rearmed.

Thanks,
Laurent




reply via email to

[Prev in Thread] Current Thread [Next in Thread]