qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] virtiofsd: Use clone() and not unshare(), support non-root


From: Colin Walters
Subject: [PATCH] virtiofsd: Use clone() and not unshare(), support non-root
Date: Fri, 01 May 2020 14:25:48 -0400
User-agent: Cyrus-JMAP/3.3.0-dev0-351-g9981f4f-fmstable-20200421v1

I'd like to make use of virtiofs as part of our tooling in
https://github.com/coreos/coreos-assembler
Most of the code runs as non-root today; qemu also runs as non-root.
We use 9p right now.

virtiofsd's builtin sandboxing effectively assumes it runs as
root.

First, change the code to use `clone()` and not `unshare()+fork()`.

Next, automatically use `CLONE_NEWUSER` if we're running as non root.

This is similar logic to that in https://github.com/containers/bubblewrap
(Which...BTW, it could make sense for virtiofs to depend on bubblewrap
 and re-exec itself rather than re-implementing the containerization
 itself)

Signed-off-by: Colin Walters <address@hidden>
---
 tools/virtiofsd/passthrough_ll.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c
index 4c35c95b25..468617f6d6 100644
--- a/tools/virtiofsd/passthrough_ll.c
+++ b/tools/virtiofsd/passthrough_ll.c
@@ -2530,6 +2530,21 @@ static void print_capabilities(void)
     printf("}\n");
 }
 
+/* Copied from bubblewrap */
+static int
+raw_clone(unsigned long flags, void *child_stack)
+{
+#if defined(__s390__) || defined(__CRIS__)
+  /*
+   * On s390 and cris the order of the first and second arguments
+   * of the raw clone() system call is reversed.
+   */
+    return (int) syscall(__NR_clone, child_stack, flags);
+#else
+    return (int) syscall(__NR_clone, flags, child_stack);
+#endif
+}
+
 /*
  * Move to a new mount, net, and pid namespaces to isolate this process.
  */
@@ -2547,14 +2562,15 @@ static void setup_namespaces(struct lo_data *lo, struct 
fuse_session *se)
      * an empty network namespace to prevent TCP/IP and other network
      * activity in case this process is compromised.
      */
-    if (unshare(CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET) != 0) {
-        fuse_log(FUSE_LOG_ERR, "unshare(CLONE_NEWPID | CLONE_NEWNS): %m\n");
-        exit(1);
+    int clone_flags = SIGCHLD | CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWNET;
+    /* If we're non root, we need a new user namespace */
+    if (getuid() != 0) {
+        clone_flags |= CLONE_NEWUSER;
     }
 
-    child = fork();
+    child = raw_clone(clone_flags, NULL);
     if (child < 0) {
-        fuse_log(FUSE_LOG_ERR, "fork() failed: %m\n");
+        fuse_log(FUSE_LOG_ERR, "clone() failed: %m\n");
         exit(1);
     }
     if (child > 0) {
-- 
2.24.1




reply via email to

[Prev in Thread] Current Thread [Next in Thread]