
Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net


From: Jason Wang
Subject: Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net
Date: Thu, 5 Nov 2020 11:46:18 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0


On 2020/11/4 5:31 PM, Daniel P. Berrangé wrote:
On Wed, Nov 04, 2020 at 10:07:52AM +0800, Jason Wang wrote:
On 2020/11/3 6:32 PM, Yuri Benditovich wrote:

On Tue, Nov 3, 2020 at 11:02 AM Jason Wang <jasowang@redhat.com> wrote:


     On 2020/11/3 2:51 AM, Andrew Melnychenko wrote:
     > Basic idea is to use eBPF to calculate and steer packets in TAP.
     > RSS (Receive Side Scaling) is used to distribute network packets to
     > guest virtqueues by calculating a packet hash.
     > eBPF RSS allows us to use RSS with vhost TAP.
     >
     > This set of patches introduces the usage of eBPF for packet steering
     > and RSS hash calculation:
     > * RSS (Receive Side Scaling) is used to distribute network packets to
     >   guest virtqueues by calculating a packet hash
     > * eBPF RSS is expected to be faster than the already existing
     >   'software' implementation in QEMU
     > * Additionally, support is added for the usage of RSS with vhost
     >
     > Supported kernels: 5.8+
     >
     > Implementation notes:
     > The Linux TAP TUNSETSTEERINGEBPF ioctl is used to set the eBPF program.
     > eBPF support is added to QEMU directly through a system call; see
     > bpf(2) for details.
     > The eBPF program is part of QEMU and is presented as an array of bpf
     > instructions.
     > The program can be recompiled with the provided Makefile.ebpf (adjust
     > 'linuxhdrs' as needed), although this is not required to build QEMU
     > with eBPF support.
     > Changes are added to virtio-net and vhost; primarily, eBPF RSS is used,
     > with 'software' RSS used for hash population and as a fallback option.
     > For vhost, the hash population feature is not reported to the guest.
     >
     > Please also see the documentation in PATCH 6/6.
     >
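
For illustration, attaching an already-loaded program to a TAP fd with the
TUNSETSTEERINGEBPF ioctl mentioned in the notes above could look roughly
like this minimal sketch (the helper name and error handling are invented
for the sketch, not taken from the patch series):

#include <sys/ioctl.h>
#include <linux/if_tun.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Illustrative only: attach a loaded eBPF steering program to a TAP fd. */
static int tap_attach_steering_bpf(int tap_fd, int prog_fd)
{
    /* The ioctl takes a pointer to the program fd; a negative fd
     * detaches any previously attached steering program. */
    if (ioctl(tap_fd, TUNSETSTEERINGEBPF, &prog_fd) != 0) {
        fprintf(stderr, "TUNSETSTEERINGEBPF: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}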
     > I am sending these patches as RFC to initiate discussion and get
     > feedback on the following points:
     > * Fallback when eBPF is not supported by the kernel


     Yes, and it could also be a lack of CAP_BPF.
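
A minimal probe for that fallback decision could look like the sketch
below: it tries to load a trivial program with bpf(2) and treats any
failure (e.g. ENOSYS on an old kernel, or EPERM without CAP_BPF or
CAP_SYS_ADMIN) as "use the existing 'software' RSS". The function name
is hypothetical, not from the patch:

#include <linux/bpf.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

static bool ebpf_available(void)
{
    /* Trivial two-instruction program: return 0. */
    struct bpf_insn insns[] = {
        { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
        { .code = BPF_JMP | BPF_EXIT },
    };
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (uintptr_t)insns;
    attr.insn_cnt = sizeof(insns) / sizeof(insns[0]);
    attr.license = (uintptr_t)"GPL";

    int fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    if (fd < 0) {
        return false;  /* no bpf(2), or insufficient privileges */
    }
    close(fd);
    return true;
}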


     > * Live migration to a kernel that doesn't have eBPF support


     Is there anything here that needs special treatment?

A possible case: rss=on, vhost=on, source system with kernel 5.8
(everything works) -> destination system with kernel 5.6 (bpf does not
work); the adapter functions, but steering does not use the proper queues.

Right, I think we need to disable vhost on the destination.




     > * Integration with current QEMU build


     Yes, a question here:

     1) Any reason for not using libbpf, e.g. it has been shipped with some
     distros?


We intentionally do not use libbpf, as it is present only on some distros.
We can switch to libbpf, but this will disable bpf if libbpf is not
installed.

That's better I think.
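
For comparison, the libbpf route being discussed could look roughly like
the following sketch; the object file name "rss.bpf.o" and program name
"tun_rss_steering" are placeholders for illustration, not names from the
patch:

#include <bpf/libbpf.h>

static int load_rss_prog_fd(void)
{
    struct bpf_object *obj = bpf_object__open_file("rss.bpf.o", NULL);

    if (libbpf_get_error(obj)) {
        return -1;
    }
    if (bpf_object__load(obj)) {
        bpf_object__close(obj);
        return -1;
    }

    struct bpf_program *prog =
        bpf_object__find_program_by_name(obj, "tun_rss_steering");
    if (!prog) {
        bpf_object__close(obj);
        return -1;
    }
    /* The caller would hand this fd to TUNSETSTEERINGEBPF. */
    return bpf_program__fd(prog);
}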


     2) It would be better if we could avoid shipping bytecode



This creates new dependencies: llvm + clang + ...
We would prefer shipping the bytecode, with the ability to regenerate it
if the prerequisites are installed.

It's probably ok if we treat the bytecode as a kind of firmware.
That is explicitly *not* OK for inclusion in Fedora. They require that
BPF is compiled from source, and rejected my suggestion that it could
be considered a kind of firmware and thus be exempt from the
build-from-source requirement.


Please refer to what was done in DPDK:

http://git.dpdk.org/dpdk/tree/doc/guides/nics/tap.rst#n235

I don't think what is proposed here is any different.

It's still bytecode that lives in an array.
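
Concretely, "bytecode that lives in an array" means something like the
following sketch, in the spirit of DPDK's tap_bpf_insns.h; the
two-instruction stub stands in for the real generated RSS program:

#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Normally generated by clang/llvm once and checked in as a header. */
static const struct bpf_insn rss_insns[] = {
    { .code = BPF_ALU64 | BPF_MOV | BPF_K, .dst_reg = BPF_REG_0 },
    { .code = BPF_JMP | BPF_EXIT },
};

static int load_rss_bytecode(void)
{
    union bpf_attr attr;

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (uintptr_t)rss_insns;
    attr.insn_cnt = sizeof(rss_insns) / sizeof(rss_insns[0]);
    attr.license = (uintptr_t)"GPL";

    /* On success this returns a program fd for TUNSETSTEERINGEBPF. */
    return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
}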



But in the long run it's still worthwhile to consider that the QEMU source
is used for development, and llvm/clang should be a common requirement for
generating eBPF bytecode for the host.
So we need to do this right straight away, before this merges.


Yes.

Thanks



Regards,
Daniel



