From: Blue Swirl
Subject: Re: [Qemu-devel] [RFC] [PATCHv2 2/2] Adding basic calls to libseccomp in vl.c
Date: Tue, 3 Jul 2012 19:15:55 +0000

On Mon, Jul 2, 2012 at 6:05 PM, Corey Bryant <address@hidden> wrote:
>
>
> On 06/28/2012 03:49 PM, Blue Swirl wrote:
>>
>> On Wed, Jun 27, 2012 at 9:25 PM, Anthony Liguori <address@hidden> wrote:
>>>
>>> On 06/21/2012 03:04 AM, Avi Kivity wrote:
>>>>
>>>>
>>>> On 06/19/2012 09:58 PM, Blue Swirl wrote:
>>>>>>>
>>>>>>>
>>>>>>> At least qemu-ifup/down scripts, migration exec and smbd have been
>>>>>>> mentioned. Only the system calls made by smbd (for some version of
>>>>>>> it) can be known. The user could specify arbitrary commands for the
>>>>>>> others; those could be assumed to use some common (large) subset of
>>>>>>> system calls, but I think the security value would be close to zero
>>>>>>> then.
>>>>>>
>>>>>>
>>>>>>
>>>>>> We're not trying to protect against the user, but against the guest.
>>>>>> If we assume the user wrote those scripts with care so they cannot be
>>>>>> exploited by the guest, then we are okay.
>>>>>
>>>>>
>>>>>
>>>>> My concern was that first we could accidentally filter a system call
>>>>> in a way that changes the script or executable behavior, much like the
>>>>> sendmail + capabilities bug, and then a guest could trigger running
>>>>> this script/executable and exploit the changed behavior.
>>>>
>>>>
>>>>
>>>> Ah, I see.  I agree this is dangerous.  We should probably disable exec
>>>> if we seccomp.
>>>
>>>
>>>
>>> There's no great place to jump into this thread so I guess I'll do it
>>> here.
>>>
>>> There is absolutely no doubt that white-listing syscalls that we
>>> currently use provides an improvement in security.
>>>
>>> We need to assume:
>>>
>>> 1) QEMU is run as an unprivileged user
>>>
>>> 2) QEMU is already heavily restricted by SELinux
>>>
>>> In this case, seccomp() is not being used to replace MAC or DAC.  It's
>>> supplementing both of them by additionally filtering out syscalls that
>>> may have unknown kernel exploits in them.  That's all this initial
>>> effort is about. Since its scope is so limited, we can simply enable it
>>> unconditionally too.
>>
>>
>> I don't think the scope is limited in a safe way. What is the set of
>> system calls that can't ever cause problems to any possible ifup/down
>> scripts, migration exec helpers and various versions of smbd?
>>
>> For example, unlink() is missing. What if the ifup/down script needs
>> it for lock file cleanup? ftruncate()? Every socket syscall, in case
>> the libc uses LDAP to access user information?
>>
>> I think we can't define the safe set, except 'allow all'. I'd propose
>> one of the following to avoid breakage:
>>
>> 1. Allow all system calls for the initial patch, refactor later to
>> reduce the set. Useless until refactored.
>
>
> One thing I like about starting with a known subset of syscalls used by QEMU
> is that it forces us to expand the whitelist if we come across more syscalls
> that QEMU uses.

Finding out what QEMU uses is the relatively easy part. Finding out
what the external helpers might use seems to be impossible.

>
> An issue with this approach is that if seccomp kills QEMU for using a
> disallowed syscall, I don't think we know what syscall it is.  (At least, I
> don't think it is accessible anywhere.)  This is good for security but makes
> it hard for developers who are debugging.
>
> Would it make sense to have the ability to configure QEMU in either:
> 1) seccomp kill mode (this is what the existing patches do), or
> 2) seccomp debug mode?
>
> In debug mode we could trap on the failing syscall (using SCMP_ACT_TRAP),
> determine the syscall value, and issue an error message that displays the
> syscall value.

I think that would be nice, and it would remain useful after any refactoring.
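
A rough sketch of what the reporting half of such a debug mode might
look like, assuming the filter's default action is switched from kill
to SCMP_ACT_TRAP so that the kernel delivers SIGSYS instead of killing
the process. The function names and the message text are made up for
illustration and are not part of the patch; installing the filter
itself is left out.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#ifndef SYS_SECCOMP
#define SYS_SECCOMP 1   /* si_code the kernel sets for seccomp SIGSYS */
#endif

/* Hypothetical helper: build "seccomp: blocked syscall <nr>\n" without
 * stdio, because it is called from a signal handler. Returns the message
 * length, or 0 if buf is too small. */
size_t format_syscall_report(int nr, char *buf, size_t len)
{
    static const char prefix[] = "seccomp: blocked syscall ";
    char digits[12];
    size_t ndigits = 0, plen = sizeof(prefix) - 1, pos;
    unsigned int v = nr < 0 ? 0u - (unsigned int)nr : (unsigned int)nr;

    /* Collect decimal digits in reverse order. */
    do {
        digits[ndigits++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    if (nr < 0)
        digits[ndigits++] = '-';

    if (len < plen + ndigits + 2)   /* room for newline and NUL */
        return 0;
    memcpy(buf, prefix, plen);
    pos = plen;
    while (ndigits > 0)
        buf[pos++] = digits[--ndigits];
    buf[pos++] = '\n';
    buf[pos] = '\0';
    return pos;
}

/* SIGSYS handler: print the offending syscall number and exit. */
static void sigsys_handler(int sig, siginfo_t *info, void *ucontext)
{
    char msg[64];
    size_t n;

    (void)sig;
    (void)ucontext;
    if (info->si_code != SYS_SECCOMP)
        return;                     /* not generated by a seccomp filter */
    n = format_syscall_report(info->si_syscall, msg, sizeof(msg));
    if (n > 0) {
        ssize_t r = write(STDERR_FILENO, msg, n);
        (void)r;
    }
    _exit(1);
}

/* Hypothetical entry point: call before loading the seccomp filter.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int install_sigsys_reporter(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigsys_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSYS, &sa, NULL);
}
```

The handler only reports the raw syscall number; mapping it back to a
name would be up to whoever reads the message. Note that the whitelist
would also have to allow whatever the handler itself uses (here write()
and the exit syscall), otherwise the report never gets out.
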

>
> The emulator() function here gives an idea of how this could be done:
> https://lkml.org/lkml/2012/4/12/449
>
>
>>
>> 2. Don't enable seccomp mode by default; when it is enabled, forbid
>> execve(). Limits functionality when enabled, no security benefit if
>> not enabled.
>>
>> 3. Before enabling seccomp, fork a helper process without restrictions
>> that is used to launch other programs. Needs some work.
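
To make option 3 a bit more concrete, here is a minimal sketch under
some assumptions: the helper is forked before the filter is loaded, it
only knows a fixed table of commands taken from the command line, and
the restricted process asks for a command by index over a socketpair,
so that a compromised QEMU cannot request arbitrary programs. All
names (Helper, helper_start, helper_run, helper_stop) are hypothetical,
and the seccomp setup itself is left out.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct {
    pid_t pid;  /* helper process */
    int fd;     /* our end of the socketpair */
} Helper;

/* Runs in the unrestricted child: wait for an index, run that command
 * through the shell, send back its exit status (or -1). */
static void helper_loop(int fd, const char *const *cmds, uint32_t ncmds)
{
    uint32_t index;

    while (recv(fd, &index, sizeof(index), 0) == (ssize_t)sizeof(index)) {
        int32_t code = -1;

        if (index < ncmds) {    /* reject anything outside the table */
            int status = system(cmds[index]);
            if (status != -1 && WIFEXITED(status))
                code = WEXITSTATUS(status);
        }
        if (send(fd, &code, sizeof(code), MSG_NOSIGNAL)
            != (ssize_t)sizeof(code))
            break;
    }
    _exit(0);                   /* parent closed its end, or I/O error */
}

/* Fork the helper. Must be called before the seccomp filter is loaded.
 * Returns 0 on success, -1 on failure. */
int helper_start(Helper *h, const char *const *cmds, uint32_t ncmds)
{
    int sv[2];
    pid_t pid;

    if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (pid == 0) {
        close(sv[0]);
        helper_loop(sv[1], cmds, ncmds);    /* does not return */
    }
    close(sv[1]);
    h->pid = pid;
    h->fd = sv[0];
    return 0;
}

/* Ask the helper to run command number `index`. Returns the command's
 * exit status, or -1 on any error. Only needs sendto/recvfrom-style
 * syscalls, so it can be called after the filter is in place. */
int helper_run(const Helper *h, uint32_t index)
{
    int32_t code;

    if (send(h->fd, &index, sizeof(index), MSG_NOSIGNAL)
        != (ssize_t)sizeof(index))
        return -1;
    if (recv(h->fd, &code, sizeof(code), 0) != (ssize_t)sizeof(code))
        return -1;
    return code;
}

/* Close the channel and reap the helper. */
void helper_stop(Helper *h)
{
    close(h->fd);
    waitpid(h->pid, NULL, 0);
}
```

With something like this the restricted process never needs execve()
itself, and the helpers' own syscalls are not subject to the whitelist.
Passing file descriptors to the helper (e.g. the tap fd for the ifup
script) is not covered here and would need extra work.
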
>>
>>>
>>> After we have this initial support, then we can look at a -sandbox
>>> option. This option could prevent things like open()/execve(), but that
>>> will come at a cost of features.
>>>
>>> I think the reasonable thing to do for -sandbox is to basically focus on
>>> the set of syscalls that QEMU would use if it were launched under
>>> libvirt. We should obviously make improvements (things like -blockdev)
>>> to make this even more restrictive.
>>>
>>> Who knows, maybe we end up having multiple types of sandboxes.  A
>>> '-sandbox libvirt' and a '-sandbox user' where the latter is focused on
>>> the typical usage of an unprivileged user.
>>>
>>> But this is all stuff that can come later.  We solve a big problem by
>>> just getting the initial whitelist support in.
>>
>>
>> Fully agree, but we'd have to agree about what is a safe initial
>> whitelist.
>>
>>>
>>> Regards,
>>>
>>> Anthony Liguori
>>>
>>>
>>>>
>>>>>>
>>>>>> We have decomposed qemu to some extent, in that privileged operations
>>>>>> happen in libvirt.  So the modes make sense - qemu has no idea whether
>>>>>> a privileged management system is controlling it or not.
>>>>>
>>>>>
>>>>>
>>>>> So with -seccomp, libvirt could tell QEMU that for example open(),
>>>>> execve(), bind() and connect() will never be needed?
>>>>
>>>>
>>>>
>>>> Yes.
>>>>
>>>
>>
>
> --
> Regards,
> Corey
>
>


