guix-devel

Re: shepherd service works on host but fails inside system container


From: Vladilen Kozin
Subject: Re: shepherd service works on host but fails inside system container
Date: Wed, 22 Mar 2023 12:14:07 +0000

I now have a hypothesis as to what's happening. Could someone confirm or disprove it, and maybe suggest a solution or point at existing workarounds?

Host and container share the exact same kernel, which unsurprisingly is already running, so the failure quoted below has nothing to do with kernel modules or settings. FWIW, I figured out a way to find where the kernel modules reside:

$ sudo dmesg | grep -i "kernel command line"
which shows where current system is inside the store and relevant /lib/modules will be under it. We could then --expose=/gnu/store/hash-system/lib/modules=/lib/modules if we wanted to.
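A minimal sketch of the same lookup without dmesg, assuming the kernel command line carries a `gnu.system=` parameter pointing at the system's store directory (that parameter name is an assumption about how the system was booted; older setups may spell it differently). `system_from_cmdline` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the value of every gnu.system= argument found in a kernel
# command line string passed as $1. Prints nothing if there is none.
system_from_cmdline() {
    # Intentionally unquoted: split the command line on whitespace.
    for arg in $1; do
        case $arg in
            gnu.system=*) printf '%s\n' "${arg#gnu.system=}" ;;
        esac
    done
}

# On a running Linux system, read the live command line.
if [ -r /proc/cmdline ]; then
    system_from_cmdline "$(cat /proc/cmdline)"
fi
```

If this prints a store path, that is the directory to look under for the modules mentioned above.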

The real problem, IIUC, is with capabilities. The notion of "container" can be misleading and evokes thoughts of "vm", when in practice it's just a process with some isolation applied to it. So, presently I'm guessing the container's Shepherd may be PID 1 inside its isolated environment, but from the host's POV it is just a process, and one that, unlike our host's shepherd, may lack certain capabilities and privileges to e.g. create new devices or load kernel modules on request. In the sense of https://man7.org/linux/man-pages/man7/capabilities.7.html maybe?
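One rough way to probe this guess is to compare the effective capability mask of a process on the host with that of a process inside the container. This is a minimal sketch, assuming Linux's /proc/self/status exposes a `CapEff:` line and using the bit numbers from capabilities(7) (CAP_NET_ADMIN = 12, CAP_SYS_MODULE = 16, CAP_MKNOD = 27). `has_cap` is a hypothetical helper. A caveat: inside a user namespace, a set bit only grants the capability over resources that namespace owns, so a full mask alone would not prove the operation is allowed.

```shell
#!/bin/sh
# has_cap MASK_HEX BIT: print "yes" if bit BIT is set in the
# hexadecimal capability mask MASK_HEX, otherwise print "no".
has_cap() {
    mask=$(( 0x$1 ))
    if [ $(( (mask >> $2) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# Report on the current process; run once on the host and once in the container.
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ {print $2}' /proc/self/status)
    cap_eff=${cap_eff:-0}
    echo "CapEff: $cap_eff"
    echo "CAP_NET_ADMIN (12):  $(has_cap "$cap_eff" 12)"
    echo "CAP_SYS_MODULE (16): $(has_cap "$cap_eff" 16)"
    echo "CAP_MKNOD (27):      $(has_cap "$cap_eff" 27)"
fi
```

Running it as root on the host and again from a shell inside the container would show whether the masks differ. If they look identical, the namespace ownership caveat above is the next thing to suspect.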

Am I on the right track? But then, how does one test services like this, which may require the ability to modify devices etc.? Have we "outgrown" the container and ought to use `guix system vm` for such services? Or is there a way to bless the container's shepherd with the necessary capabilities? If not from the `guix system container` command line, then perhaps by dropping down to the underlying programmatic interface, i.e. whatever `guix system container` ends up calling to containerize a system?

Thanks


On Wed, 22 Mar 2023 at 10:20, Vladilen Kozin <vladilen.kozin@gmail.com> wrote:
Hello guix.

I put together a tailscale system service that's meant to start a tailscale daemon managed by the system shepherd, that is to say that my `tailscaled-service-type` specifies `(service-extension shepherd-root-service-type tailscaled-shepherd-service)`, where `tailscaled-shepherd-service` creates a `shepherd-service` with (provision '(tailscaled)) and (requirement '(networking)).

I tested it by lowering to store via `shepherd-service-file` and then loading the generated script via `sudo herd load root ...`. This works fine and the daemon starts without a problem.

Next, I try to spawn tailscaled as part of my OS definition:
(services (cons* (service tailscaled-service-type (tailscaled-configuration)) %base-services))
;; tried %desktop-services too

To test, we create a container:
sudo guix system -K -L /home/vlad/Code/fullmeta-guix/channel container os.scm --network --expose=/dev/net=/dev

Earlier runs had it complaining that /dev/net/tun was missing, so I exposed that. Dunno if that's how I'm supposed to handle this. Now, /var/log/messages shows:

Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48 Linux kernel version: 5.18.10
Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48 is CONFIG_TUN enabled in your kernel? `modprobe tun` failed with:
Mar 22 09:38:48 twgter shepherd[1]: [tailscaled] 2023/03/22 09:38:48 wgengine.NewUserspaceEngine(tun "tailscale0") error: tstun.New("tailscale0"): operation not permitted

I feel like maybe I'm missing some kernel modules, but I would've expected host and container to share the kernel, so I dunno. In fact, when I randomly attempted adding (kernel-arguments (cons* "CONFIG_TUN=m" %default-kernel-arguments)) to my OS definition, the resulting script hash came out the same, which tells me containers don't even look at these kernel params when generating a script.

Any guesses as to why this works under host but not inside container?

Relatedly, does anyone have a nicer workflow they use to define and test shepherd services? Such containerization was the next step in testing the service and would've been OK were it not for the above failure, but the initial indirection of lowering to the store and then running `sudo herd load root ...` is a bit too involved and "indirect" for my liking as well. Does anyone have an improved way of developing shepherd services?

Thanks!
--
Best regards
Vlad Kozin


--
Best regards
Vlad Kozin
