Re: [Qemu-devel] Proposed patch: huge RX speedup for hw/e1000.c

On Wed, May 30, 2012 at 11:39 PM, Jan Kiszka <address@hidden> wrote:

Please keep CCs.

On 2012-05-30 23:23, Luigi Rizzo wrote:
>> On Wed, May 30, 2012 at 10:23:11PM +0200, Luigi Rizzo wrote:
> ...
>>> The problem was fixed by the following one-line addition to
>>> hw/e1000.c :: e1000_mmio_write(), to wake up the qemu mainloop so it
>>> can check whether some buffers have become available.
>>>
>>> --- hw/e1000.c.orig 2012-02-17 20:45:39.000000000 +0100
>>> +++ hw/e1000.c 2012-05-30 20:01:52.000000000 +0200
>>> @@ -919,6 +926,7 @@
>>>          DBGOUT(UNKNOWN, "MMIO unknown write addr=0x%08x,val=0x%08"PRIx64"\n",
>>>                 index<<2, val);
>>>      }
>>> +    qemu_notify_event();
>>>  }
>>>
>>>  static uint64_t
>>>
>>> With this fix, the read throughput reaches 1 Mpps, matching the write
>>> speed. Now the system becomes CPU-bound, but this opens the way to
>>> more optimizations in the emulator.
>>>
>>> The same problem seems to exist on other network drivers, e.g.
>>> hw/rtl8139.c and others. The only one that seems to get it
>>> right is virtio-net.c
>>>
>>> I think it would be good if this change could make it into
>>> the tree.
>>>
>>> [Note 1] Netmap ( http://info.iet.unipi.it/~luigi/netmap )
>>> is an efficient mechanism for packet I/O that bypasses
>>> the network stack and provides protected access to the
>>> network adapter from userspace.
>>> It works especially well on top of qemu because the
>>> kernel needs only to trap a single register access
>>> for each batch of packets.
>>>
>>> [Note 2] the custom backend is a virtual local ethernet
>>> called VALE, implemented as a kernel module on the host,
>>> that extends netmap to implement communication
>>> between virtual machines.
>>> VALE is extremely efficient, currently delivering about
>>> 10 Mpps with 60-byte frames, and 5 Mpps with 1500-byte frames.
>>> The 1 Mpps rates I mentioned are obtained between qemu instances
>>> running in userspace on FreeBSD (no kernel acceleration whatsoever)
>>> and using VALE as a communication mechanism.
>>
>> "Custom backend" == you patched QEMU? Or what backend are you using?
>>
>> This sounds a lot like [1] and suggests that you are either a) using
>> slirp in a version that doesn't contain that fix yet (before 1.1-rcX) or
>> b) wrote a backend that suffers from a similar bug.
>>
>> Jan
>>
>> [1] http://thread.gmane.org/gmane.comp.emulators.qemu/144433
>
> my custom backend is the one in [Note 2] above.
> It replaces the -net pcap/user/tap/socket option which defines
> how qemu communicates with the host network device.

Any code to share? It's hard to discuss just concepts.

you can take the freebsd image from the netmap page in my link and run it
in qemu, and then run the pkt-gen program in the image in either
send or receive mode. But this is overkill, as you have described the
problem exactly in your post: when the guest reads the packets from
the emulated device (e1000 in my case, but most of them have the
problem) it fails to wake up the thread blocked in main_loop_wait().

I am unclear on the terminology (what is frontend and what is backend?)
but it is the guest side that has to wake up the qemu process: the file
descriptor talking to the host side (tap, socket, bpf ...) has already
fired its events and the only thing it could do is cause a busy wait
if it keeps passing a readable file descriptor to select.
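
The wake-up issue described above can be illustrated outside of qemu.
The following is a minimal sketch, not qemu code: it assumes a POSIX
system and uses a self-pipe as a stand-in for whatever
qemu_notify_event() does internally. The names notify_init(),
notify_event() and main_loop_wait_once() are hypothetical and exist
only for this illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical self-pipe: [0] is watched by the loop, [1] is written to wake it. */
static int notify_fds[2];

static int notify_init(void)
{
    if (pipe(notify_fds) < 0)
        return -1;
    /* Non-blocking, so neither draining nor repeated notifications can block. */
    if (fcntl(notify_fds[0], F_SETFL, O_NONBLOCK) < 0)
        return -1;
    if (fcntl(notify_fds[1], F_SETFL, O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/* Called by the "guest side", e.g. after a register write frees RX buffers. */
static void notify_event(void)
{
    char c = 0;

    if (write(notify_fds[1], &c, 1) < 0) {
        /* Pipe full: a wake-up is already pending, nothing more to do. */
    }
}

/*
 * One iteration of a select()-based loop.
 * Returns 1 if it was woken by a notification, 0 if the timeout expired.
 */
static int main_loop_wait_once(long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    char buf[64];

    assert(timeout_ms >= 0);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    FD_ZERO(&rfds);
    FD_SET(notify_fds[0], &rfds);
    if (select(notify_fds[0] + 1, &rfds, NULL, NULL, &tv) <= 0)
        return 0;

    /* Drain all pending notification bytes so the next wait can sleep again. */
    while (read(notify_fds[0], buf, sizeof(buf)) > 0)
        ;
    return 1;
}
```

Without a call to notify_event(), main_loop_wait_once() sleeps for the
whole timeout even though the guest may have made buffers available in
the meantime; with it, the same call returns right away.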

I thought your slirp.c patch was also on the same "side" as e1000.c

cheers

luigi

>
> The problem is not in my module, but rather in the emulation
> device exposed to the guest, and I presume this is the same thing
> you fixed in the "slirp" patch.
> I checked the git version http://git.qemu.org/qemu.git
> and most guest-side devices have the same problem,
> only virtio-net does the notification.

And that is most likely wrong. The bug I cited was not a front-end issue
but clearly a backend one. It lacked kicking of the io-thread once
its queue state changed in a way that was not reported otherwise (via
some file descriptor the io-thread is subscribed to). If your backend
creates such states as well, it has to fix it similarly.
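
As a rough model of the rule stated above (a state change that no file
descriptor reports must be followed by an explicit kick), consider the
sketch below. It is an assumption-laden toy, not the slirp fix or any
qemu API: struct rx_queue and the rxq_*() helpers are hypothetical, the
kick is reduced to a counter, and kicking only on the transition from
"cannot receive" to "can receive" is a design choice of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical receive queue; "kicks" counts wake-ups instead of issuing them. */
struct rx_queue {
    unsigned free_bufs; /* buffers the receiver can still fill */
    unsigned kicks;     /* number of wake-ups requested so far */
};

static bool rxq_can_receive(const struct rx_queue *q)
{
    assert(q);
    return q->free_bufs > 0;
}

/* Deliver one packet; fails when no buffer is free, which stalls the sender. */
static bool rxq_deliver(struct rx_queue *q)
{
    if (!rxq_can_receive(q))
        return false;
    q->free_bufs--;
    return true;
}

/*
 * Hand n buffers back to the queue. If the queue was stalled, nobody is
 * polling for this change, so a kick is requested explicitly.
 */
static void rxq_replenish(struct rx_queue *q, unsigned n)
{
    bool was_stalled = !rxq_can_receive(q);

    q->free_bufs += n;
    if (was_stalled && n > 0)
        q->kicks++;
}
```

In a real event loop the counter increment would be replaced by the
actual wake-up call; the point of the model is only where that call has
to sit relative to the state change.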

Again, discussing this abstractly is not very efficient.

Jan

From:	Luigi Rizzo
Subject:	Re: [Qemu-devel] Proposed patch: huge RX speedup for hw/e1000.c
Date:	Wed, 30 May 2012 23:55:16 +0200