Re: Making QEMU easier for management tools and applications

From: Daniel P . Berrangé
Subject: Re: Making QEMU easier for management tools and applications
Date: Thu, 23 Jan 2020 10:27:05 +0000
User-agent: Mutt/1.12.1 (2019-06-15)

On Wed, Jan 22, 2020 at 05:42:10PM -0500, John Snow wrote:
> On 12/24/19 8:00 AM, Daniel P. Berrangé wrote:
> > Based on experience in libvirt, this is an even larger job than (4),
> > as the feature set here is huge.  Much of it directly ties into the
> > config problem, as to deal with SELinux / namespace setup the code
> > needs to understand what resources to provide access to. This
> > requires a way to express 100% coverage of all QEMU configuration
> > in use & analyse it to determine what resources it implies. So this
> > ties strongly into QAPI-ification completion.
> Is it totally bonkers to suggest that QEMU provide a method of digesting
> a given configuration and returning a configuration object that a
> standalone jailer can use?
> So we have a QEMU manager, the generic jailer, and QEMU. QEMU and the
> manager cooperate to produce the jailing configuration, and the jailer
> does what we ask it to.

It isn't clear what you mean by "QEMU" here. If this is QEMU, the system
emulator process, then this is the untrustworthy part of the stack,
so the jailer must not use any data that QEMU is providing. In fact
during startup the jailer does its work before QEMU even exists.

There are aspects to the confinement that use / rely on knowledge that
QEMU doesn't normally have, or are expressed in a different way to that
which QEMU uses, or need to take a different implementation approach to
that which QEMU normally has.

For networking, for example, from QEMU's config POV, there's just a
TAP file descriptor. There are then a huge number of ways in which
that TAP FD has been connected to the network in the host that are
invisible to QEMU: plain bridge, openvswitch bridge, macvtap device,
all with varying configs. Knowledge of this is relevant to the manager
process and the jailer but irrelevant to QEMU.

When configuring disks we have technical issues. For example we need
to identify the full backing chain and grant the appropriate permissions
on this. Even if there was a libqemublock.so, libvirt would not use this
because the QEMU storage code design is not reliable & minimal enough.
For example to just query the backing file, QEMU opens the qcow2 and
parses all the data about it, building up L1/L2 tables, and other
data structures involved. It is trivial to create qcow2 files which
result in both memory and CPU denial of service merely from opening
the file.  Libvirt's approach to this is minimalist, just having a
data table of offsets to the key fields in each file format. So we
can extract the backing file & its format without reading anything
else from the disk.
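
To illustrate the offset-table idea, here is a minimal sketch in Python.
It is not libvirt's actual code. The table layout and the names FORMATS
and probe_backing_file are made up for illustration; the only facts taken
from the qcow2 format are the magic bytes, the big-endian
backing_file_offset at byte 8, the backing_file_size at byte 16, and the
1023 byte limit on the name. The backing format lives in a header
extension and is deliberately not handled here.

```python
import struct

# Hypothetical per-format table: magic bytes plus the offsets of the only
# header fields we are willing to read. Only qcow2 is listed here.
FORMATS = {
    "qcow2": {
        "magic": b"QFI\xfb",
        "backing_offset_at": 8,   # big-endian u64: where the name is stored
        "backing_size_at": 16,    # big-endian u32: length of the name
        "header_len": 20,         # bytes needed to cover the fields above
    },
}

MAX_BACKING_NAME = 1023  # limit stated by the qcow2 format


def probe_backing_file(path):
    """Return (format, backing_file or None); (None, None) if unrecognised."""
    with open(path, "rb") as f:
        for name, fmt in FORMATS.items():
            f.seek(0)
            header = f.read(fmt["header_len"])
            if len(header) < fmt["header_len"]:
                continue
            if not header.startswith(fmt["magic"]):
                continue
            (offset,) = struct.unpack_from(">Q", header, fmt["backing_offset_at"])
            (size,) = struct.unpack_from(">I", header, fmt["backing_size_at"])
            if offset == 0:
                return name, None  # image has no backing file
            if size > MAX_BACKING_NAME:
                raise ValueError("backing file name exceeds format limit")
            # One bounded read; no L1/L2 tables or other metadata are touched.
            f.seek(offset)
            raw = f.read(size)
            if len(raw) != size:
                raise ValueError("image truncated before backing file name")
            return name, raw.decode("utf-8", errors="surrogateescape")
    return None, None
```

The point of the design is that the work done is bounded by two small
reads, whatever the image claims about itself, so a malicious file
cannot drive up memory or CPU use during probing.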

When configuring chardevs there is a choice of how to do it - we
could just pass the UNIX socket path in, or we could create the
UNIX socket ourselves & pass in the pre-opened FD. Both are equally
functional from QEMU's POV and the end user's POV, but passing a
pre-opened FD is more convenient for libvirt's needs as it allows
for race-free startup synchronization between libvirt & QEMU, or
rather QMP.  The different options here, though, place different
demands on the jailer, because extra steps are needed when passing
a pre-opened FD to get the SELinux labelling right. QEMU doesn't
know which approach the mgmt app will want to take, so we can't
ask QEMU how the jailer should be configured - the mgmt app needs
to make that decision.
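
The pre-opened FD variant can be sketched roughly as below. This is an
illustration under assumptions, not libvirt's implementation: the
function names and the "charmonitor" id are hypothetical, the exact
-chardev option spelling may differ between QEMU versions, and the
SELinux labelling steps mentioned above are omitted entirely.

```python
import os
import socket


def open_monitor_socket(path):
    """Create, bind and listen on a UNIX socket before QEMU is started."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    if os.path.exists(path):
        os.unlink(path)
    sock.bind(path)
    sock.listen(1)
    # Python marks new FDs non-inheritable; the child process needs this one.
    sock.set_inheritable(True)
    return sock


def chardev_args(sock, chardev_id="charmonitor"):
    """Build QEMU arguments that refer to the already listening socket."""
    fd = sock.fileno()
    return [
        "-chardev", f"socket,id={chardev_id},fd={fd},server=on,wait=off",
        "-mon", f"chardev={chardev_id},mode=control",
    ]
```

Because the socket is already listening before QEMU is executed, the
mgmt app can connect() to the path straight away and never has to poll
for the socket to appear. When launching via subprocess, the FD would
also need to be listed in pass_fds so that it stays open in the child.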

Essentially we have 2 configuration formats - the high level one
that the mgmt app layer uses & the low level one that QEMU uses.
The component in the stack which maps between the two config
formats is the one that has the knowledge to configure the
jailer. This isn't QEMU. It is whatever is immediately above QEMU,
currently libvirt, but it could be anything conceptually equivalent
to the role libvirt's QEMU driver impl fills.
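
A toy sketch of that mapping layer, to make the shape concrete. All of
the class and function names here (Disk, Nic, JailPlan, translate) are
hypothetical, and the sketch assumes backing files only ever need read
access, which is a simplification. The point is only that one pass over
the high level config yields both the QEMU arguments and the jailer's
resource list, and that some inputs (the backing chain, the host network
backend) never appear in what QEMU is given.

```python
from dataclasses import dataclass, field


@dataclass
class Disk:
    path: str
    backing_chain: list  # resolved by the manager, never by asking QEMU
    readonly: bool = False


@dataclass
class Nic:
    tap_fd: int
    host_backend: str  # e.g. "bridge", "openvswitch", "macvtap"


@dataclass
class JailPlan:
    read_write: list = field(default_factory=list)
    read_only: list = field(default_factory=list)
    inherited_fds: list = field(default_factory=list)


def translate(disks, nics):
    """Map the high level config to (qemu_args, jail_plan)."""
    qemu_args = []
    jail = JailPlan()
    for i, disk in enumerate(disks):
        opts = f"file={disk.path},if=virtio,id=disk{i}"
        if disk.readonly:
            opts += ",readonly=on"
        qemu_args += ["-drive", opts]
        target = jail.read_only if disk.readonly else jail.read_write
        target.append(disk.path)
        # Assumption: backing images are only ever read.
        jail.read_only.extend(disk.backing_chain)
    for i, nic in enumerate(nics):
        # QEMU only sees a TAP FD; host_backend stays with the manager.
        qemu_args += ["-netdev", f"tap,id=net{i},fd={nic.tap_fd}"]
        jail.inherited_fds.append(nic.tap_fd)
    return qemu_args, jail
```

For example, translate([Disk("/img/top.qcow2", ["/img/base.qcow2"])],
[Nic(7, "macvtap")]) gives QEMU one -drive and one -netdev argument,
while the jail plan additionally lists /img/base.qcow2 as read-only,
a path that is absent from the QEMU command line.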

|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
