qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for cap


From: Erik Skultety
Subject: Re: [Qemu-devel] AMD SEV's /dev/sev permissions and probing QEMU for capabilities
Date: Wed, 30 Jan 2019 09:06:30 +0100
User-agent: Mutt/1.10.1 (2018-07-13)

On Tue, Jan 29, 2019 at 06:40:08PM +0000, Daniel P. Berrangé wrote:
> On Tue, Jan 29, 2019 at 05:15:42PM +0100, Erik Skultety wrote:
> > On Wed, Jan 23, 2019 at 03:02:28PM +0000, Singh, Brijesh wrote:
> > >
> > >
> > > On 1/23/19 7:36 AM, Daniel P. Berrangé wrote:
> > > > On Wed, Jan 23, 2019 at 02:33:01PM +0100, Erik Skultety wrote:
> > > >> On Wed, Jan 23, 2019 at 01:24:13PM +0000, Daniel P. Berrangé wrote:
> > > >>> On Wed, Jan 23, 2019 at 02:22:12PM +0100, Erik Skultety wrote:
> > > >>>> On Wed, Jan 23, 2019 at 01:10:42PM +0000, Daniel P. Berrangé wrote:
> > > >>>>>
> > > >>>>> It still looks like we can solve this entirely in libvirt by just 
> > > >>>>> giving
> > > >>>>> the libvirt capabilities probing code CAP_DAC_OVERRIDE. This would 
> > > >>>>> make
> > > >>>>> libvirt work for all currently released SEV support in kernel/qemu.
> > > >>>>
> > > >>>> Sure we can, but that would make libvirt the only legitimate user of 
> > > >>>> /dev/sev
> > > >>>> and everything else would require the admin to change the permissions
> > > >>>> explicitly so that other apps could at least retrieve the platform 
> > > >>>> info, if
> > > >>>> it was intended to be for public use?
> > > >>>> Additionally, we'll still get shot down by SELinux because svirt_t 
> > > >>>> wouldn't be
> > > >>>> able to access /dev/sev by default.
> > > >>>
> > > >>> That's separate form probing and just needs SELinux policy to define
> > > >>> a new  sev_device_t type and grant svirt_t access to it.
> > > >>
> > > >> I know, I misread "we can solve this entirely in libvirt" then, I 
> > > >> thought you
> > > >> the SELinux part was included in the statement, my bad then. Still, 
> > > >> back to the
> > > >> original issue, we could technically do both, libvirt would have run 
> > > >> qemu with
> > > >> CAP_DAC_OVERRIDE and we keep working with everything's been released 
> > > >> for
> > > >> SEV in kernel/qemu and for everyone else, systemd might add 0644 for 
> > > >> /dev/sev,
> > > >> that way, everyone's happy, not that I'd be a fan of libvirt often 
> > > >> having
> > > >> to work around something because projects underneath wouldn't backport 
> > > >> fixes to
> > > >> all the distros we support, thus leaving the dirty work to us.
> > > >
> > > > Setting 0644 for /dev/sev looks unsafe to me unless someone can show
> > > > where the permissions checks take place for the many ioctls that
> > > > /dev/sev allows, such that only SEV_PDH_CERT_EXPORT or 
> > > > SEV_PLATFORM_STATUS
> > > > is allowed when /dev/sev is opened by a user who doesn't have write
> > > > permissions.
> > > >
> > >
> > > Agree its not safe to do 0644.
> > >
> > > Currently, anyone who has access to /dev/sev (read or write) will be
> > > able to execute SEV platform command. In other words there is no
> > > permission check per command basis. I must admit that while developing
> > > the driver I was under assumption that only root will ever access the
> > > /dev/sev device hence overlooked it. But now knowing that others may
> > > also need to access the /dev/sev, I can submit patch in kernel to do
> > > per command access control.
> > >
> > > Until then, can we follow Daniel's recommendation to elevate privilege
> > > of the probing code?
> >
> > So, sorry for not coming back earlier, but I'm still fighting a permission
> > issue when opening the /dev/sev device and I honestly don't know what's
> > happening. If you apply the patch bellow (or attachment) and you run 
> > libvirtd
> > with LIBVIRT_LOG_OUTPUTS=1:file:<your_log_file> env, you should see 
> > something
> > like this in the logs:
> >
> > warning : virExec:778 : INHERITABLE CAPS: 'dac_override'
> > warning : virExec:781 : EFFECTIVE CAPS: 'dac_override'
> > warning : virExec:784 : PERMITTED CAPS: 'dac_override'
> > warning : virExec:787 : BOUNDING CAPS: 'dac_override'
> >
> > ...and that is right before we issue execve. Yet, if you debug further into 
> > the
> > QEMU process right after execve, it doesn't report any capabilities at all, 
> > so
> > naturally it'll still get permission denied. Is there something I'm missing
> > here?
>
> Arrrrggghh.  We're missing the painful thing that I re-learn the hard way
> every time we play with capabilities in libvirt.....
>
> *Even* if you have set INHERITABLE capabilities, they are discarded
> on execve(), unless the binary you are exec'ing has file capabilities
> set that match those you are trying to give it. Of course QEMU binaries
> (and almost all binaries) lack such file capabilities.

Thanks for ^this bit which helped me understand the bits below. When I read the
man page yesterday the first question was, okay, how do I figure out whether
the file capabilities bit is set? Well, use xattrs...which didn't return
anything, so I was puzzled what exactly it should look like, but now that you
explained that most binaries actually lack the file capabilities, I see the
issue clearly :).

>
> The capabilities man page does in fact explain this
>
> [quote]
>         P’(permitted) = (P(inheritable) & F(inheritable)) |
>                         (F(permitted) & cap_bset)
>
>         P’(effective) = F(effective) ? P’(permitted) : 0
>
> where
>
>         P         denotes the value of a thread capability set before the 
> execve(2)
>
>         P'        denotes the value of a thread capability set after the 
> execve(2)
>
>         F         denotes a file capability set
>
> If ... real user ID of the process is 0 (root) then the file inheritable
> and permitted sets are defined to be all ones
> [/quote]
>
> So this means your code works perfectly *if*  we are executing
> QEMU as root, but if we are executing QEMU as non-root then
> all our work is for nothing.
>
> Now I have actually intentionally mis-quoted from an old man
> page here. In Linux 4.3 the kernel devs invented a new set of
> "ambient" capabilities [1] which allow programs to deal with this
> insanely annoying behaviour.

Right, now the ambient stuff makes sense, because at first I didn't understand
what the hell that was about after reading the man page.

>
> [quote]
>         P'(ambient)     = (file is privileged) ? 0 : P(ambient)
>
>         P'(permitted)   = (P(inheritable) & F(inheritable)) |
>                           (F(permitted) & cap_bset) | P'(ambient)
>
>         P'(effective)   = F(effective) ? P'(permitted) : P'(ambient)
> [/quote]
>
> so if we added CAP_DAC_OVERRIDE to the ambient capabilities set,
> then your patch would work.
>
> We can do that with this patch:
>
> diff --git a/src/util/virutil.c b/src/util/virutil.c
> index 5251b66454..be83e91fee 100644
> --- a/src/util/virutil.c
> +++ b/src/util/virutil.c
> @@ -1601,6 +1601,12 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t 
> *groups, int ngroups,
>          goto cleanup;
>      }
>
> +    for (i = 0; i <= CAP_LAST_CAP; i++) {
> +        if (capBits & (1ULL << i)) {
> +            prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0);
> +        }
> +    }

I'll squash this in and give it another test.

> +
>      ret = 0;
>   cleanup:
>      return ret;
>
>
> though, we need a #ifdef check for existance of PR_CAP_AMBIENT
>
> > An alternative question I've been playing ever since we exchanged the last 
> > few
> > emails is that can't we wait until the ioctls are compared against 
> > permissions
> > in kernel so that upstream libvirt (and downstream too for that matter) 
> > doesn't
> > have to work around it and stick with that workaround for eternity?
>
> IIUC, the SEV feature has already shipped with distros, so we'd effectively
> be saying that what we already shipped is unusable to libvirt. This doesn't
> feel like a desirable story to me.

It was, but it never worked, it always has been broken in this way. When we
were merging this upstream, we had a terrible shortage of machines and we had
to share, so the first person to provision the machine had already taken care
of the permissions in order to test so that led to this issue having been
overlooked until now. If it ever worked as expected and then we broke it, then
any fix from our side would make sense but otherwise I believe we should fix
this bottom up.

Thanks,
Erik



reply via email to

[Prev in Thread] Current Thread [Next in Thread]