qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Virtio-fs] [PATCH for-5.1 3/3] virtiofsd: probe unshare(CLONE_FS) a


From: Vivek Goyal
Subject: Re: [Virtio-fs] [PATCH for-5.1 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error
Date: Thu, 23 Jul 2020 09:56:03 -0400

On Thu, Jul 23, 2020 at 01:50:35PM +0100, Daniel P. Berrangé wrote:
> On Thu, Jul 23, 2020 at 01:46:11PM +0100, Stefan Hajnoczi wrote:
> > On Wed, Jul 22, 2020 at 06:03:28PM +0100, Daniel P. Berrangé wrote:
> > > On Wed, Jul 22, 2020 at 02:02:06PM +0100, Stefan Hajnoczi wrote:
> > > > An assertion failure is raised during request processing if
> > > > unshare(CLONE_FS) fails. Implement a probe at startup so the problem can
> > > > be detected right away.
> > > > 
> > > > Unfortunately Docker/Moby does not include unshare in the seccomp.json
> > > > list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always
> > > > include unshare (e.g. podman is unaffected):
> > > > https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json
> > > > 
> > > > Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the
> > > > default seccomp.json is missing unshare.
> > > > 
> > > > Cc: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
> > > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > > > ---
> > > >  tools/virtiofsd/fuse_virtio.c | 13 +++++++++++++
> > > >  1 file changed, 13 insertions(+)
> > > > 
> > > > diff --git a/tools/virtiofsd/fuse_virtio.c 
> > > > b/tools/virtiofsd/fuse_virtio.c
> > > > index 3b6d16a041..ebeb352514 100644
> > > > --- a/tools/virtiofsd/fuse_virtio.c
> > > > +++ b/tools/virtiofsd/fuse_virtio.c
> > > > @@ -949,6 +949,19 @@ int virtio_session_mount(struct fuse_session *se)
> > > >  {
> > > >      int ret;
> > > >  
> > > > +    /*
> > > > +     * Test that unshare(CLONE_FS) works. fv_queue_worker() will need 
> > > > it. It's
> > > > +     * an unprivileged system call but some Docker/Moby versions are 
> > > > known to
> > > > +     * reject it via seccomp when CAP_SYS_ADMIN is not given.
> > > > +     */
> > > > +    ret = unshare(CLONE_FS);
> > > > +    if (ret == -1 && errno == EPERM) {
> > > > +        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_FS) failed with EPERM. 
> > > > If "
> > > > +                "running in a container please check that the 
> > > > container "
> > > > +                "runtime seccomp policy allows unshare.\n");
> > > > +        return -1;
> > > > +    }
> > > > +
> > > 
> > > This describes the unshare() call as a "probe" and a "test", but that's
> > > misleading IMHO. A "probe" / "test" implies that after it has completed,
> > > there's no lingering side-effect, which isn't the case here.
> > > 
> > > This is actively changing the process' namespace environment in the
> > > success case, and not putting it back how it was originally.
> > > 
> > > May be this is in fact OK, but if so I think the commit message and
> > > comment should explain/justify what its fine to have this lingering
> > > side-effect.
> > > 
> > > If we want to avoid the side-effect then we need to fork() and run
> > > unshare() in the child, and use a check of exit status of the child
> > > to determine the result.
> > 
> > Thanks for pointing this out. I'll add a comment explaining that
> > virtiofsd is single-threaded at this point. No other threads share the
> > file system attributes so the call has no observable side-effects.
> 
> Also, if we do an unshare() here, do we still need the unshare() later
> on in the code ?

I think so. That unshare() later is to isolate one thread from other.

Thanks
Vivek




reply via email to

[Prev in Thread] Current Thread [Next in Thread]