qemu-devel

Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net


From: Daniel P . Berrangé
Subject: Re: [RFC PATCH 0/6] eBPF RSS support for virtio-net
Date: Wed, 4 Nov 2020 12:04:15 +0000
User-agent: Mutt/1.14.6 (2020-07-11)

On Wed, Nov 04, 2020 at 01:49:05PM +0200, Yuri Benditovich wrote:
> On Wed, Nov 4, 2020 at 4:08 AM Jason Wang <jasowang@redhat.com> wrote:
> 
> >
> > On 2020/11/3 6:32 PM, Yuri Benditovich wrote:
> > >
> > >
> > > On Tue, Nov 3, 2020 at 11:02 AM Jason Wang <jasowang@redhat.com
> > > <mailto:jasowang@redhat.com>> wrote:
> > >
> > >
> > >     On 2020/11/3 2:51 AM, Andrew Melnychenko wrote:
> > >     > Basic idea is to use eBPF to calculate and steer packets in TAP.
> > >     > RSS (Receive Side Scaling) is used to distribute network packets
> > >     > to guest virtqueues by calculating packet hash.
> > >     > eBPF RSS allows us to use RSS with vhost TAP.
> > >     >
> > >     > This set of patches introduces the usage of eBPF for packet
> > >     > steering and RSS hash calculation:
> > >     > * RSS (Receive Side Scaling) is used to distribute network
> > >     > packets to guest virtqueues by calculating packet hash
> > >     > * eBPF RSS is supposed to be faster than the already existing
> > >     > 'software' implementation in QEMU
> > >     > * Additionally, support is added for the usage of RSS with vhost
> > >     >
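
As a rough illustration of the hash-and-steer step described in the list
above (this is not the code from the series): RSS implementations commonly
compute a Toeplitz hash over the packet's address and port fields and use
its low bits to index an indirection table of queue numbers. The sketch
below assumes that scheme. The function names, the power-of-two table
length, and the use of the sample key published in Microsoft's RSS
documentation are illustrative assumptions, not taken from the patches.

```c
#include <stddef.h>
#include <stdint.h>

/* Sample 40-byte key from Microsoft's RSS documentation (assumption:
 * a real device would use whatever key the guest driver programs). */
const uint8_t rss_sample_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash: for every set bit of the input (MSB first), XOR in the
 * 32-bit window of the key that starts at that bit position.
 * Requires key_len >= 4; key bits past the end are treated as zero. */
uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                       const uint8_t *data, size_t data_len)
{
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    uint32_t hash = 0;

    for (size_t i = 0; i < data_len; i++) {
        /* Key byte that supplies the bits entering the window. */
        uint8_t next = (i + 4 < key_len) ? key[i + 4] : 0;

        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit)) {
                hash ^= window;
            }
            /* Slide the window one bit to the right along the key. */
            window = (window << 1) | ((uint32_t)(next >> bit) & 1u);
        }
    }
    return hash;
}

/* Pick a receive queue from the hash. table_len must be a power of two
 * (hypothetical helper; the table contents are chosen by the guest). */
uint16_t rss_select_queue(uint32_t hash, const uint16_t *table,
                          size_t table_len)
{
    return table[hash & (uint32_t)(table_len - 1)];
}
```

For a TCP/IPv4 packet the input would typically be the source address,
destination address, source port and destination port in network byte
order. Hashing the single byte 0x80 returns the first 32 bits of the key,
because only the first key window is selected.
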
> > >     > Supported kernels: 5.8+
> > >     >
> > >     > Implementation notes:
> > >     > The Linux TAP TUNSETSTEERINGEBPF ioctl is used to set the eBPF
> > >     > program.
> > >     > eBPF support is added to QEMU directly through a system call,
> > >     > see bpf(2) for details.
> > >     > The eBPF program is part of QEMU and is presented as an array
> > >     > of bpf instructions.
> > >     > The program can be recompiled with the provided Makefile.ebpf
> > >     > (need to adjust 'linuxhdrs'),
> > >     > although this is not required to build QEMU with eBPF support.
> > >     > Changes are added to virtio-net and vhost; eBPF RSS is used as
> > >     > the primary implementation.
> > >     > 'Software' RSS is used in the case of hash population and as a
> > >     > fallback option.
> > >     > For vhost, the hash population feature is not reported to the
> > >     > guest.
> > >     >
> > >     > Please also see the documentation in PATCH 6/6.
> > >     >
> > >     > I am sending these patches as an RFC to initiate discussion
> > >     > and get feedback on the following points:
> > >     > * Fallback when eBPF is not supported by the kernel
> > >
> > >
> > >     Yes, and it could also be a lack of CAP_BPF.
> > >
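
To make the fallback condition concrete, here is a minimal sketch of how
a program might probe at runtime whether bpf(2) program loading is usable
at all, covering both an unsupported kernel and missing privileges. It
loads a trivial "return 0" socket-filter program through the raw system
call. The names probe_bpf_prog_load() and choose_rss_backend() are
hypothetical and do not come from the series; a real check would load the
actual steering program rather than a stub.

```c
#include <errno.h>
#include <linux/bpf.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

enum rss_backend { RSS_BACKEND_EBPF, RSS_BACKEND_SOFTWARE };

/* Try to load a two-instruction program ("r0 = 0; exit") via bpf(2).
 * Returns 0 on success or a negative errno value on failure, e.g.
 * -ENOSYS without bpf(2) or -EPERM without sufficient privileges. */
int probe_bpf_prog_load(void)
{
    struct bpf_insn insns[2];
    union bpf_attr attr;
    long fd;

    memset(insns, 0, sizeof(insns));
    insns[0].code = BPF_ALU64 | BPF_MOV | BPF_K;   /* r0 = 0 */
    insns[0].dst_reg = BPF_REG_0;
    insns[0].imm = 0;
    insns[1].code = BPF_JMP | BPF_EXIT;            /* exit */

    memset(&attr, 0, sizeof(attr));
    attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
    attr.insns = (uint64_t)(uintptr_t)insns;
    attr.insn_cnt = 2;
    attr.license = (uint64_t)(uintptr_t)"GPL";

    fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));
    if (fd < 0) {
        return -errno;
    }
    close((int)fd);
    return 0;
}

/* Map the probe result to a backend: any failure means software RSS. */
enum rss_backend choose_rss_backend(int probe_result)
{
    return probe_result == 0 ? RSS_BACKEND_EBPF : RSS_BACKEND_SOFTWARE;
}
```

A caller would run choose_rss_backend(probe_bpf_prog_load()) once at
device setup and keep the software path available for the failure case.
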
> > >
> > >     > * Live migration to the kernel that doesn't have eBPF support
> > >
> > >
> > >     Is there anything that needs special treatment here?
> > >
> > > Possible case: rss=on, vhost=on, source system with kernel 5.8
> > > (everything works) -> dest. system 5.6 (bpf does not work), the
> > > adapter functions, but all the steering does not use proper queues.
> >
> >
> > Right, I think we need to disable vhost on dest.
> >
> >
> Is it acceptable to disable vhost at the time of migration?
> 
> 
> > >
> > >
> > >
> > >     > * Integration with current QEMU build
> > >
> > >
> > >     Yes, a question here:
> > >
> > >     1) Any reason for not using libbpf? E.g., it is already shipped with
> > >     some distros.
> > >
> > >
> > > We intentionally do not use libbpf, as it is present only on some distros.
> > > We can switch to libbpf, but this will disable bpf if libbpf is not
> > > installed
> >
> >
> > That's better I think.
> >
> 
> We think the preferred way is to have the eBPF code built into QEMU (not
> distributed as a separate file).
> 
> Our initial idea was to not use libbpf because that:
> 1. Does not create an additional dependency at build time or at
> run-time
> 2. Gives us a smaller footprint for the loadable eBPF blob inside QEMU
> 3. Does not add too much code to QEMU
> 
> We can switch to libbpf, in this case:
> 1. Presence of dynamic library is not guaranteed on the target system

Again, if a distro or user wants to use this feature in
QEMU, they should be expected to build the library.

> 2. Static library is large

QEMU doesn't support static linking for system emulators.  It may
happen to work at times, but there are no expectations in this respect.

> 3. libbpf uses eBPF ELF which is significantly bigger than just the array
> of instructions (Maybe we can reduce the ELF to some suitable size
> and still have it built in)
> 
> Please let us know whether you still think libbpf is better and why.

It looks like both the Clang and GCC compilers for BPF are moving towards
a world where they use BTF to get compile once, run everywhere portability
for the compiled bytecode. IIUC, libbpf is what is responsible for
processing the BTF data when loading it into the running kernel. This
all looks like a good thing in general.

If we introduce BPF to QEMU without using libbpf, and then later decide
we absolutely need libbpf features, it creates an upgrade and backward
compatibility issue for existing deployments. It is better to use libbpf
right from the start, so we're set up to take full advantage of what it
offers long term.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



