qemu-devel

Re: [Qemu-devel] [PATCHv2 3/4] Support for "double whitelist" filters


From: Corey Bryant
Subject: Re: [Qemu-devel] [PATCHv2 3/4] Support for "double whitelist" filters
Date: Mon, 05 Nov 2012 09:39:46 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:16.0) Gecko/20121009 Thunderbird/16.0



On 11/02/2012 06:14 PM, Paul Moore wrote:
On Friday, November 02, 2012 06:00:29 PM Corey Bryant wrote:
On 11/02/2012 05:29 PM, Paul Moore wrote:
On Tuesday, October 23, 2012 03:55:31 AM Eduardo Otubo wrote:
This patch includes a second whitelist right before the main loop. It's
a smaller and more restricted whitelist, excluding execve() among many
others.

v2: * ctx changed to main_loop_ctx
    * seccomp_on now inside ifdef
    * open syscall added to the main_loop whitelist

Signed-off-by: Eduardo Otubo <address@hidden>

Unfortunately qemu.org seems to be down for me today so I can't grab the
latest repo to review/verify this patch (some of my comments/assumptions
below may be off) but I'm a little confused, hopefully you guys can help
me out, read below ...

The first call to seccomp_install_filter() will set up a whitelist for the
syscalls that have been explicitly specified, all others will hit the
default action TRAP/KILL.  The second call to seccomp_install_filter()
will add a second whitelist for another set of explicitly specified
syscalls, all others will hit the default action TRAP/KILL.

That's correct.  The goal was to have a 2nd list that is a subset of the
1st list, and also not include execve() in the 2nd list.  At this point
though, since it's late in the release, we've expanded the 2nd list to
be the same as the 1st with the exception of execve() not being in the
2nd list.

The problem occurs when the filters are executed in the kernel when a
syscall is executed.  On each syscall the first filter will be executed
and the action will either be ALLOW or TRAP/KILL, next the second filter
will be executed and the action will either be ALLOW or TRAP/KILL; since
the kernel always takes the most restrictive (lowest integer action
value) action when multiple filters are specified, I think your double
whitelist value is going to have some inherent problems.

That's something I hadn't thought of.  But TRAP and KILL won't exist
together in our whitelists, and our 2nd whitelist is a subset of the
1st.  So do you think there would still be problems?

It doesn't really matter if the default action is TRAP and/or KILL; the point
is that if you use a second whitelist after an initial whitelist the effective
seccomp filter is going to be only the syscalls you explicitly allowed in the
second whitelist.  When using multiple seccomp filters on a process, all
filters are executed for each syscall and the most restrictive action of all
the filters is the action that the kernel takes.
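
To make the rule above concrete, here is a minimal sketch in plain C that only models the behaviour; it does not call the kernel or libseccomp. The three action values match the constants in <linux/seccomp.h>, but the struct, the function names, and the short syscall lists are hypothetical and chosen purely for illustration. They are not QEMU's actual whitelists.

```c
#include <stddef.h>
#include <string.h>

/* Action values as in <linux/seccomp.h>; a lower value is more restrictive. */
#define RET_KILL  0x00000000u
#define RET_TRAP  0x00030000u
#define RET_ALLOW 0x7fff0000u

/* Hypothetical model of one whitelist filter: listed syscalls are allowed,
 * anything else gets the filter's default action. */
struct whitelist {
    const char *const *allowed;
    size_t count;
    unsigned int default_action;
};

/* Action a single filter returns for one syscall. */
unsigned int run_filter(const struct whitelist *f, const char *syscall)
{
    for (size_t i = 0; i < f->count; i++) {
        if (strcmp(f->allowed[i], syscall) == 0)
            return RET_ALLOW;
    }
    return f->default_action;
}

/* Every installed filter runs on every syscall; the lowest value wins. */
unsigned int effective_action(const struct whitelist *filters, size_t n,
                              const char *syscall)
{
    unsigned int result = RET_ALLOW;
    for (size_t i = 0; i < n; i++) {
        unsigned int action = run_filter(&filters[i], syscall);
        if (action < result)
            result = action;
    }
    return result;
}

/* Illustrative lists only (assumption): the second omits execve. */
static const char *const init_syscalls[] = { "read", "write", "open", "execve" };
static const char *const main_loop_syscalls[] = { "read", "write", "open" };

const struct whitelist demo_filters[] = {
    { init_syscalls, sizeof init_syscalls / sizeof init_syscalls[0], RET_KILL },
    { main_loop_syscalls,
      sizeof main_loop_syscalls / sizeof main_loop_syscalls[0], RET_KILL },
};
```

With only the first filter installed, effective_action(demo_filters, 1, "execve") yields RET_ALLOW. Once both are installed, effective_action(demo_filters, 2, "execve") yields RET_KILL, so the effective set is the intersection of the two lists.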

Don't get me wrong, I like the idea of progressively restricting QEMU, but if
you are going to load multiple seccomp filters into the kernel, you almost
certainly only want the first whitelist filter to be the union of all the
seccomp filters you intend to load, with all subsequent filters being blacklists
which progressively remove syscalls which are allowed by the initial
whitelist.
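
The whitelist-then-blacklist layering described above can be sketched with the same kind of model, again in plain C with no kernel or libseccomp calls. The rule and filter structs, the function names, and the syscall lists are hypothetical. The assumption is that a blacklist is simply a filter whose default action is ALLOW and whose explicit rules deny.

```c
#include <stddef.h>
#include <string.h>

/* Action values as in <linux/seccomp.h>; a lower value is more restrictive. */
#define RET_KILL  0x00000000u
#define RET_ALLOW 0x7fff0000u

/* Hypothetical model: a filter is a list of per-syscall rules plus a default. */
struct rule {
    const char *syscall;
    unsigned int action;
};

struct filter {
    const struct rule *rules;
    size_t count;
    unsigned int default_action;
};

/* Action a single filter returns for one syscall. */
unsigned int filter_action(const struct filter *f, const char *syscall)
{
    for (size_t i = 0; i < f->count; i++) {
        if (strcmp(f->rules[i].syscall, syscall) == 0)
            return f->rules[i].action;
    }
    return f->default_action;
}

/* All stacked filters run; the most restrictive (lowest) action is taken. */
unsigned int stacked_action(const struct filter *stack, size_t n,
                            const char *syscall)
{
    unsigned int result = RET_ALLOW;
    for (size_t i = 0; i < n; i++) {
        unsigned int action = filter_action(&stack[i], syscall);
        if (action < result)
            result = action;
    }
    return result;
}

/* Illustrative rules only (assumption): a permissive initial whitelist... */
static const struct rule init_rules[] = {
    { "read", RET_ALLOW }, { "write", RET_ALLOW },
    { "open", RET_ALLOW }, { "execve", RET_ALLOW },
};
/* ...followed by a blacklist that removes execve before the main loop. */
static const struct rule main_loop_rules[] = {
    { "execve", RET_KILL },
};

const struct filter demo_stack[] = {
    { init_rules, sizeof init_rules / sizeof init_rules[0], RET_KILL },
    { main_loop_rules,
      sizeof main_loop_rules / sizeof main_loop_rules[0], RET_ALLOW },
};
```

In this model the second filter never has to repeat the syscalls that stay allowed. It names only what is being taken away, so a syscall missed during testing cannot be denied by accident in the later stage.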


That's what we're doing, though. The first whitelist is a union of all subsequent filters. Of course, there's only one subsequent filter at this point. But the idea is to start out with a large whitelist for initialization and then tighten it up before the main loop, when presumably fewer syscalls are needed.

My concern is getting the two whitelists correct. We keep uncovering new syscalls as we test.

I might suggest an initial, fairly permissive
whitelist followed by a follow-on blacklist if you want to disable certain
syscalls.

I have to admit I'm nervous about this at this point in QEMU 1.3.  It's
getting late in the cycle and we'd hoped to get this in earlier.  A more
permissive whitelist is probably going to be the only way we'll
successfully turn -sandbox on by default at this point in QEMU 1.3.

That's fine, I just wanted to point out that I think the multiple whitelist
approach is going to have some inherent problems.


Are you thinking there will be problems with the current two-whitelist approach, or are you thinking there would be problems in the future if we continued restricting the QEMU process with further whitelists? If you mean the latter, then I understand your point since QEMU is a single process that requires a certain subset of syscalls.

I'm thinking once the two whitelists are in place, we can move on to restricting syscall parameters in the existing whitelists where it makes sense, and then look into your original decomposition approach, where parts of QEMU are run in separate threads/processes, which would allow much tighter seccomp restriction.

What do you think?

--
Regards,
Corey Bryant



