bug-guix
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#35550: Installer: wpa_supplicant fails to start


From: Ludovic Courtès
Subject: bug#35550: Installer: wpa_supplicant fails to start
Date: Mon, 06 May 2019 00:21:26 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)

Ludovic Courtès <address@hidden> skribis:

> So why is ‘wpa-supplicant’ marked as failing to start on the first
> attempt?
>
> The only reason I can think of is if ‘read-pid-file’ from (shepherd
> service) returns immediately and returns #f instead of a number.  That
> can actually happen if the PID file exists but is empty (or contains
> garbage).

I’ve produced an ISO with the patch below and ran it on the bare metal
to get confirmation (too bad the bug doesn’t show in QEMU :-/).  Indeed,
‘read-pid-file’ for /var/run/wpa_supplicant.pid systematically reads the
empty string the first time the ‘wpa-supplicant’ service is started.

(After that, if we kill the process and try to restart the service, the
problem doesn’t show up.)

diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
index 53437b6..e21492e 100644
--- a/modules/shepherd/service.scm
+++ b/modules/shepherd/service.scm
@@ -717,9 +717,12 @@ otherwise return the number that was read (a PID)."
   (let loop ()
     (catch 'system-error
       (lambda ()
-        (string->number
-         (string-trim-both
-          (call-with-input-file file get-string-all))))
+        (define str
+          (call-with-input-file file get-string-all))
+
+        (local-output (l10n "read-pid-file ~s -> ~s")
+                      file str)
+        (string->number (string-trim-both str)))
       (lambda args
         (let ((errno (system-error-errno args)))
           (if (= ENOENT errno)
With the second patch below, I confirm that the ‘wpa-supplicant’ starts
correctly.  We can see in /var/log/messages that ‘read-pid-file’ first
reads the empty string from /var/run/wpa_supplicant.pid, then tries
again, and gets a valid PID on the second attempt.

It’s surprising that the timing is always like that, and only on the
bare metal, but that’s the way it is.

It’d be great if you could do some testing with the patch below.  Then I
guess we’ll push a Shepherd release with this fix.

I wonder if this could also explain
<https://issues.guix.info/issue/30993>.

Thanks,
Ludo’.

diff --git a/gnu/packages/admin.scm b/gnu/packages/admin.scm
index dfc3467bf8..e1dd248679 100644
--- a/gnu/packages/admin.scm
+++ b/gnu/packages/admin.scm
@@ -188,7 +188,8 @@ and provides a \"top-like\" mode (monitoring).")
                                   version ".tar.gz"))
               (sha256
                (base32
-                "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))))
+                "1ys2w83vm62spr8bx38sccfdpy9fqmj7wfywm5k8ihsy2k61da2i"))
+              (patches (search-patches "shepherd-debug.patch"))))
     (build-system gnu-build-system)
     (arguments
      '(#:configure-flags '("--localstatedir=/var")))
diff --git a/gnu/packages/patches/shepherd-debug.patch 
b/gnu/packages/patches/shepherd-debug.patch
new file mode 100644
index 0000000000..2fd97cc578
--- /dev/null
+++ b/gnu/packages/patches/shepherd-debug.patch
@@ -0,0 +1,43 @@
+diff --git a/modules/shepherd/service.scm b/modules/shepherd/service.scm
+index 53437b6..bef8f42 100644
+--- a/modules/shepherd/service.scm
++++ b/modules/shepherd/service.scm
+@@ -715,21 +715,28 @@ number.  Return #f if FILE was not created or does not 
contain a number;
+ otherwise return the number that was read (a PID)."
+   (define start (current-time))
+   (let loop ()
++    (define (retry)
++      (and (< (current-time) (+ start max-delay))
++           (begin
++             ;; FILE does not exist yet, so wait and try again.
++             ;; XXX: Ideally we would yield to the main event loop
++             ;; and/or use inotify.
++             (sleep 1)
++             (loop))))
++
+     (catch 'system-error
+       (lambda ()
+-        (string->number
+-         (string-trim-both
+-          (call-with-input-file file get-string-all))))
++        (define str
++          (call-with-input-file file get-string-all))
++
++        (local-output (l10n "read-pid-file ~s -> ~s")
++                      file str)
++        (or (string->number (string-trim-both str))
++            (retry)))
+       (lambda args
+         (let ((errno (system-error-errno args)))
+           (if (= ENOENT errno)
+-              (and (< (current-time) (+ start max-delay))
+-                   (begin
+-                     ;; FILE does not exist yet, so wait and try again.
+-                     ;; XXX: Ideally we would yield to the main event loop
+-                     ;; and/or use inotify.
+-                     (sleep 1)
+-                     (loop)))
++              (retry)
+               (apply throw args)))))))
+ 
+ (define* (exec-command command

reply via email to

[Prev in Thread] Current Thread [Next in Thread]