
From: Amit Shah
Subject: Re: [Qemu-devel] [Bug 1066055] Re: Network performance regression with vde_switch
Date: Thu, 1 Nov 2012 20:42:50 +0530

On (Thu) 01 Nov 2012 [14:04:11], Michael S. Tsirkin wrote:
> On Thu, Nov 01, 2012 at 02:49:18PM +0530, Amit Shah wrote:
> > On (Tue) 23 Oct 2012 [14:55:03], Stefan Hajnoczi wrote:
> > > On Mon, Oct 22, 2012 at 06:50:00AM -0700, Edivaldo de Araujo Pereira 
> > > wrote:
> > > > I didn't take enough time to understand the code, so unfortunately I 
> > > > fear there is not much I could do to solve the problem, apart from 
> > > > trying your suggestions. But I'll try to spend a little more time on 
> > > > it, until we find a solution.
> > > 
> > > I've thought a little about how to approach this.  Amit, here's a brain
> > > dump:
> > > 
> > > The simplest solution is to make virtqueue_avail_bytes() use the old
> > > behavior of stopping early.
> > > 
> > > However, I wonder if we can actually *improve* performance of existing
> > > code by changing virtio-net.c:virtio_net_receive().  The intuition is
> > > that calling virtio_net_has_buffers() (internally calls
> > > virtqueue_avail_bytes()) followed by virtqueue_pop() is suboptimal
> > > because we're repeatedly traversing the descriptor chain.
> > > 
> > > We can get rid of this repetition.  A side-effect of this is that we no
> > > longer need to call virtqueue_avail_bytes() from virtio-net.c.  Here's
> > > how:
> > > 
> > > The common case in virtio_net_receive() is that we have buffers and they
> > > are large enough for the received packet.  So to optimize for this case:
> > > 
> > > 1. Take the VirtQueueElement off the vring but don't increment
> > >    last_avail_idx yet.  (This is essentially a "peek" operation.)
> > > 
> > > 2. If there is an error or we drop the packet because the
> > >    VirtQueueElement is too small, just bail out and we'll grab the same
> > >    VirtQueueElement again next time.
> > > 
> > > 3. When we've committed filling in this VirtQueueElement, increment
> > >    last_avail_idx.  This is the point of no return.
> > > 
> > > Essentially we're splitting pop() into peek() and consume().  Peek()
> > > grabs the VirtQueueElement but does not increment last_avail_idx.
> > > Consume() simply increments last_avail_idx and maybe the EVENT_IDX
> > > optimization stuff.
> > > 
> > > Whether this will improve performance, I'm not sure.  Perhaps
> > > virtio_net_has_buffers() pulls most descriptors into the CPU's cache and
> > > following up with virtqueue_pop() is very cheap already.  But the idea
> > > here is to avoid the virtio_net_has_buffers() call because we'll find
> > > out soon enough when we try to pop :).
> > 
> > This sounds doable -- adding mst for comments.
> > 
> > > Another approach would be to drop virtio_net_has_buffers() but continue
> > > to use virtqueue_pop().  We'd keep the same VirtQueueElement stashed in
> > > VirtIONet across virtio_net_receive() calls in the case where we drop
> > > the packet.  I don't like this approach very much though because it gets
> > > tricky when the guest modifies the vring memory, resets the virtio
> > > device, etc across calls.
> > 
> > Right.
> > 
> > Also, save/load will become slightly complicated in both these
> > cases, but it might be worth it.
> > 
> > Michael, can you comment pls?
> > 
> 
> It will also complicate switching to/from vhost-net.
> 
> If this patch helps serial but degrades speed for -net, I'm inclined
> to simply make serial and net use different codepaths.

There's an opportunity for optimisation here; let's not discount it so
quickly.  The reporter also pointed out a ~20% difference with
tun/tap, so eliminating the call to virtqueue_avail_bytes() can help
overall.
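
To make the peek()/consume() split concrete, here's a rough, untested
sketch in plain C.  The types and the vq_peek()/vq_consume() names are
simplified stand-ins, not the real QEMU structures or API; it only
models the control flow Stefan described above.

  /* Simplified, self-contained model of the peek/consume idea.
   * Not real QEMU code: no descriptor chains, no EVENT_IDX, no
   * guest memory mapping -- just the last_avail_idx handling. */

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  #define QUEUE_SIZE 8

  typedef struct {
      size_t  len;                 /* capacity of this guest buffer */
      uint8_t data[2048];
  } Element;

  typedef struct {
      Element  ring[QUEUE_SIZE];
      uint16_t avail_idx;          /* advanced by the "guest" */
      uint16_t last_avail_idx;     /* consumed up to here by the "device" */
  } VirtQueueModel;

  /* peek(): return the next available element without consuming it.
   * Repeated calls return the same element until vq_consume(). */
  static Element *vq_peek(VirtQueueModel *vq)
  {
      if (vq->last_avail_idx == vq->avail_idx) {
          return NULL;             /* no buffers available */
      }
      return &vq->ring[vq->last_avail_idx % QUEUE_SIZE];
  }

  /* consume(): point of no return -- advance last_avail_idx past the
   * element returned by the last vq_peek(). */
  static void vq_consume(VirtQueueModel *vq)
  {
      vq->last_avail_idx++;
  }

  /* Receive path in the proposed style: no separate has_buffers/
   * avail_bytes pass over the descriptors before popping. */
  static int model_receive(VirtQueueModel *vq, const uint8_t *buf,
                           size_t size)
  {
      Element *elem = vq_peek(vq);

      if (!elem) {
          return 0;                /* no buffers; retry when guest adds some */
      }
      if (size > elem->len) {
          /* Bail out without consuming: the same element is picked up
           * again on the next call, as in step 2 above. */
          return 0;
      }

      memcpy(elem->data, buf, size);
      vq_consume(vq);              /* commit only after filling succeeded */
      return (int)size;
  }

  int main(void)
  {
      VirtQueueModel vq = {0};
      uint8_t pkt[64] = {0};

      /* "Guest" posts two receive buffers, the second deliberately
       * too small for the packet. */
      vq.ring[0].len = sizeof(vq.ring[0].data);
      vq.ring[1].len = 16;
      vq.avail_idx = 2;

      printf("first packet:  %d bytes accepted\n",
             model_receive(&vq, pkt, sizeof(pkt)));
      printf("second packet: %d bytes accepted\n",
             model_receive(&vq, pkt, sizeof(pkt)));
      return 0;
  }

The key property is that an early return before vq_consume() leaves
last_avail_idx untouched, so a dropped packet just means the same
buffer is picked up on the next call.  That is also where save/load
and the vhost-net handover need care: a peeked-but-unconsumed element
must not be lost across migration or a switch of backends.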

                Amit


