Re: [PATCH 3/5] qmp: Added the helper stamp check.
From: Andrew Melnichenko
Subject: Re: [PATCH 3/5] qmp: Added the helper stamp check.
Date: Wed, 22 Mar 2023 15:26:59 +0200
Hi all,
I've researched the issue a bit. The approach of passing the eBPF blob
to Libvirt and loading it there looks promising.
Overall, a possible solution looks like this:
* Libvirt checks the virtio-net properties and determines that eBPF
steering may be required.
* Libvirt requests the eBPF blob through QMP.
* Libvirt loads the blob for virtio-net and passes the resulting eBPF
fds to QEMU.
I think it's a good idea to pass only the eBPF blob, without
additional meta-information. Most of the meta-information we need can
be retrieved from the eBPF blob itself, and the only open question is
the order in which the fds are passed to QEMU.
I propose passing them in the order they appear in the blob itself, like
"virtio-net-pci,ebpf_rss_fds=<prog>,<map1>,<map2>,<map3>...".
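To make the proposed ordering concrete, here is a minimal Python sketch of how a management application might assemble that property string. The function name and the fd numbers are purely illustrative, and "ebpf_rss_fds" is only the property proposed above, not an existing QEMU option.

```python
def format_ebpf_rss_fds(prog_fd, map_fds):
    """Build the proposed device string: program fd first, then map fds
    in the order the maps appear in the blob (hypothetical helper)."""
    fds = [prog_fd, *map_fds]
    # Reject anything that cannot be a valid open file descriptor.
    if any(not isinstance(fd, int) or fd < 0 for fd in fds):
        raise ValueError("file descriptors must be non-negative integers")
    return "virtio-net-pci,ebpf_rss_fds=" + ",".join(str(fd) for fd in fds)


# Example with made-up fd numbers.
print(format_ebpf_rss_fds(20, [21, 22, 23]))
# prints: virtio-net-pci,ebpf_rss_fds=20,21,22,23
```

Keeping the program fd first and the maps in blob order means QEMU would not need any extra naming scheme to tell the fds apart.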
Also, I think it's a good idea to make a "general" QMP request for
eBPF blobs. Something like "request_ebpf <arg>" (e.g. "request_ebpf
virtio-net-rss").
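As a rough illustration of what such a request could look like on the wire, here is a Python sketch using the usual QMP JSON framing ("execute"/"arguments", with a "return" or "error" reply). The command name comes from the proposal above; the "id" argument, the "object" reply field, and the base64 encoding of the blob are assumptions made only for this example.

```python
import base64
import json


def build_request(blob_id):
    """Serialize the hypothetical request_ebpf command as a QMP message."""
    return json.dumps({"execute": "request_ebpf",
                       "arguments": {"id": blob_id}})


def parse_reply(reply_text):
    """Extract the blob bytes from a reply, assuming a base64 'object' field."""
    reply = json.loads(reply_text)
    if "error" in reply:
        raise RuntimeError(reply["error"].get("desc", "QMP error"))
    return base64.b64decode(reply["return"]["object"])


# Simulated reply carrying only the 4-byte ELF magic, not a real BPF object.
fake_reply = json.dumps(
    {"return": {"object": base64.b64encode(b"\x7fELF").decode("ascii")}})

print(build_request("virtio-net-rss"))
print(parse_reply(fake_reply))
```

The management application would then hand the decoded bytes to libbpf and cache them for later use.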
I'll prepare new RFC patches. If you have questions or something to
discuss, please let me know.
On Thu, Mar 2, 2023 at 12:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
> > On Wed, Mar 01, 2023 at 03:53:47PM +0100, Toke Høiland-Jørgensen wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >>
> >> > On Tue, Feb 28, 2023 at 11:21:56PM +0100, Toke Høiland-Jørgensen wrote:
> >> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> >>
> >> >> > On Tue, Feb 28, 2023 at 08:01:51PM +0100, Toke Høiland-Jørgensen wrote:
> >> >> >> Daniel P. Berrangé <berrange@redhat.com> writes:
> >> >> >>
> >> >> >> Just to interject a note on this here: the skeleton code is mostly a
> >> >> >> convenience feature used to embed BPF programs into the calling
> >> >> >> binary. It is perfectly possible to just have the BPF object file
> >> >> >> itself reside directly in the file system and just use the regular
> >> >> >> libbpf APIs to load it. Some things get a bit more cumbersome (mostly
> >> >> >> setting values of global variables, if the BPF program uses those).
> >> >> >>
> >> >> >> So the JSON example above could just be a regular compiled-from-clang
> >> >> >> BPF object file, and the management program can load that, inspect
> >> >> >> its contents using the libbpf APIs and pass the file descriptors on
> >> >> >> to Qemu. It's even possible to embed version information into this so
> >> >> >> that Qemu can check if it understands the format and bail out if it
> >> >> >> doesn't - just stick a version field in the configuration map as the
> >> >> >> first entry :)
> >> >> >
> >> >> > If all you have is the BPF object file is it possible to interrogate
> >> >> > it to get a list of all the maps, and get FDs associated for them ?
> >> >> > I had a look at the libbpf API and wasn't sure about that, it seemed
> >> >> > like you had to know the required maps upfront ? If it is possible
> >> >> > to auto-discover everything you need, solely from the BPF object file
> >> >> > as input, then just dealing with that in isolation would feel simpler.
> >> >>
> >> >> It is. You load the object file, and bpf_object__for_each_map() lets you
> >> >> discover which maps it contains, with the different bpf_map__*() APIs
> >> >> telling you the properties of that map (and you can modify them too
> >> >> before loading the object if needed).
> >> >>
> >> >> The only thing that's not in the object file is any initial data you
> >> >> want to put into the map(s). But except for read-only maps that can be
> >> >> added by userspace after loading the maps, so you could just let Qemu do
> >> >> that...
> >> >>
> >> >> > It occurs to me that exposing the BPF program as data rather than
> >> >> > via binary will make it more practical to integrate this into
> >> >> > KubeVirt's architecture. In their deployment setup both QEMU and
> >> >> > libvirt are running unprivileged inside a container. For any advanced
> >> >> > networking
> >> >> > a completely separate component creates the TAP device and passes it
> >> >> > into the container running QEMU. I don't think that the separate
> >> >> > precisely matched helper binary would be something they can use, but
> >> >> > it might be possible to expose a data file providing the BPF program
> >> >> > blob and describing its maps.
> >> >>
> >> >> Well, "a data file providing the BPF program blob and describing its
> >> >> maps" is basically what a BPF .o file is. It just happens to be encoded
> >> >> in ELF format :)
> >> >>
> >> >> You can embed it into some other data structure and have libbpf load it
> >> >> from a blob in memory as well as from the filesystem, though; that is
> >> >> basically what the skeleton file does (notice the big character string
> >> >> at the end, that's just the original .o file contents).
> >> >
> >> > Ok, in that case I'm really wondering why any of this helper program
> >> > stuff was proposed. I recall the rationale was that it was impossible
> >> > for an external program to load the BPF object on behalf of QEMU,
> >> > because it would not know how to do that without QEMU specific
> >> > knowledge.
> >>
> >> I'm not sure either. Were there some bits that initially needed to be set
> >> before the program was loaded (read-only maps or something)? Also,
> >> upstream does encourage the use of skeletons for embedding into
> >> applications, so it's not an unreasonable thing to start with if you
> >> don't have the kind of deployment constraints that Qemu does in this
> >> case.
> >>
> >> > It looks like we can simply expose the BPF object blob to mgmt apps
> >> > directly and get rid of this helper program entirely.
> >>
> >> I believe so, yes. You'd still need to be sure that the BPF object file
> >> itself comes from a trusted place, but hopefully it should be enough to
> >> load it from a known filesystem path? (Sorry if this is a stupid
> >> question, I only have a fuzzy idea of how all the pieces fit together
> >> here).
> >
> > It could be from a well known location on the filesystem, but might
> > be better to make it possible to query it from QMP, which is mostly
> > safe *provided* you've not yet started guest CPUs running. It could
> > be queried at startup and then cached for future use.
>
> Right, I don't have a strong opinion about the exact mechanism, just
> wanted to convey a general "loading an untrusted BPF program is bad"
> kind of vibe ;)
>
> -Toke
>