From: megane
Subject: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error
Date: Fri, 30 Nov 2018 16:52:58 +0200
User-agent: mu4e 1.0; emacs 25.1.1

Hi,
Here's another version that crashes quickly with "very high
probability".
(cond-expand
  (chicken-5 (import (chicken base))
             (import (chicken time))
             (import srfi-18))
  (else (import chicken)
        (use srfi-18)))

(define m (make-mutex))

(print "@@ " (current-thread) " " "lock")
(mutex-lock! m)

(define t (current-milliseconds))
(define (get-tosleep)
  (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))

(thread-start!
 (make-thread (lambda ()
                ;; (thread-sleep! .01)
                (print "@@ " (current-thread) " " "lock")
                (let lp ()
                  (when (not (mutex-lock! m (get-tosleep)))
                    (thread-yield!)
                    (lp)))
                (print "@@ " (current-thread) " " "unlock")
                (mutex-unlock! m))))
(print "@@ " (current-thread) " " "sleep")
(thread-sleep! (get-tosleep))
(print "@@ " (current-thread) " " "unlock")
(mutex-unlock! m)
(thread-yield!)
(thread-sleep! .01)
(print "All ok!!")
--- typical output of a failing execution:
$ stdbuf -oL -eL ./t |& cat -n
1 @@ #<thread: primordial> lock
2 #<thread: primordial>: locking #<mutex>
3 @@ #<thread: primordial> sleep
4 #<thread: primordial> blocks for timeout 933.0
5 ==================== scheduling, current: #<thread: primordial>, ready: (#<thread: thread1>)
6 timeout: #<thread: primordial> -> 933.0 (now: 904)
7 switching to #<thread: thread1>
8 @@ #<thread: thread1> lock
9 #<thread: thread1>: locking #<mutex>
10 #<thread: thread1> blocks for timeout 933.0
11 #<thread: thread1> sleeping on mutex mutex0
12 ==================== scheduling, current: #<thread: thread1>, ready: ()
13 timeout: #<thread: primordial> -> 933.0 (now: 904)
14 timeout: #<thread: primordial> -> 933.0 (now: 934)
15 timeout expired for #<thread: primordial>
16 unblocking: #<thread: primordial>
17 timeout: #<thread: thread1> -> 933.0 (now: 934)
18 timeout expired for #<thread: thread1>
19 unblocking: #<thread: thread1>
20 switching to #<thread: primordial>
21 @@ #<thread: primordial> unlock
22 #<thread: primordial>: unlocking mutex0
23
24 Error: (mutex-unlock) Internal scheduler error: unknown thread state
25 #<thread: thread1>
26 ready
27
28 Call history:
29
30 t.scm:27: chicken.base#print
31 t.scm:28: get-tosleep
32 t.scm:15: chicken.time#current-milliseconds
33 t.scm:15: scheme#floor
34 t.scm:15: scheme#/
35 t.scm:28: srfi-18#thread-sleep!
36 t.scm:29: srfi-18#current-thread
37 t.scm:29: chicken.base#print
38 t.scm:30: srfi-18#mutex-unlock! <--
(Line 15 comes from an extra debug message. To get it, add
  (dbg "timeout expired for " tto)
to the true branch of
  (if (>= now tmo1) ; timeout reached?
in ##sys#schedule.)
--- The issue
mutex-unlock! assumes that a thread freed from the mutex's waiting
list cannot be in the 'ready state.
The output above shows how a thread waiting on a mutex can end up in
the 'ready state anyway.
line 2: The mutex is locked by primordial thread (pt)
line 4: The pt goes to sleep until 933.0
line 7: As the pt goes to sleep thread1 is scheduled to run
line 10: thread1 tries to lock the mutex, but sets a timeout that
happens to be at time 933.0
lines 12-14: Both threads asleep, time advances to 934
lines 15-16: pt gets put on the ready list
lines 17-19: thread1 gets put on the ready list
line 20: pt starts running
lines 21-22: pt executes mutex-unlock! while thread1 is ready to run
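Lines 13-19 of the trace can be pictured with a toy model of the timeout scan. This is only a sketch, not the real ##sys#schedule code: the names `timeouts` and `expired` are made up for illustration, and the list layout is an assumption. It shows why two sleepers with the same deadline are both moved to the ready list in one pass, before either of them runs.

```scheme
(import (chicken base))

;; Hypothetical timeout list: (deadline . thread-name), sorted by deadline.
(define timeouts '((933.0 . primordial) (933.0 . thread1)))

;; Collect every thread whose deadline has been reached at time NOW,
;; mirroring the (>= now tmo1) test mentioned above.
(define (expired now entries)
  (let loop ((es entries) (ready '()))
    (if (and (pair? es) (>= now (caar es)))
        (loop (cdr es) (cons (cdar es) ready))
        (reverse ready))))

(print (expired 904 timeouts))  ; nothing has expired yet
(print (expired 934 timeouts))  ; both threads expire in the same pass
```

In the model, thread1 becomes ready while it is still on the mutex's waiting list, which is the state mutex-unlock! does not expect.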
--- A fix
Just allow the 'ready state for threads in mutex-unlock!
In the patch I arbitrarily call ##sys#schedule after removing a thread
from the list, but I think doing nothing would work equally well.
Is this a correct fix?
Sorry, I can't help with that one..
One possible problem: there may be other threads on the waiting list,
while the thread that gets removed is not going to lock the mutex.
There are 3 threads in this scenario, A, B and C.
* A locks mutex
* A sleeps until t
* B tries to lock mutex until t
* C tries to lock mutex
* A and B are woken up at t
* A unlocks mutex, frees B
* B is scheduled to run as per the patch
* B finds out about the timeout, gives up and starts doing something else
* Now thread C is waiting on the mutex but no-one is going to free it!
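The scenario above could be tried with something like the following sketch for CHICKEN 5. It has not been run against the patch. The names `deadline`, `b` and `c` and the 1 second timeout on C are illustrative assumptions, and sharing one time object between thread-sleep! and mutex-lock! is only an attempt to make the timeouts of A and B expire together. What it prints depends on the srfi-18 version and on timing, so no expected output is given.

```scheme
(import (chicken base) srfi-18)

(define m (make-mutex 'm))
(mutex-lock! m)                        ; A (the primordial thread) locks the mutex

;; One absolute deadline shared by A's sleep and B's lock attempt (assumption:
;; this makes both timeouts expire in the same scheduler pass).
(define deadline
  (seconds->time (+ (time->seconds (current-time)) 0.03)))

(define b
  (thread-start!
   (make-thread
    (lambda ()
      ;; B waits for the mutex only until the deadline, then gives up.
      (if (mutex-lock! m deadline)
          (begin (print "B: got the mutex") (mutex-unlock! m))
          (print "B: timed out, doing something else")))
    'B)))

(define c
  (thread-start!
   (make-thread
    (lambda ()
      ;; C uses a generous timeout instead of waiting forever, so the
      ;; script terminates even if nobody ever wakes C up.
      (if (mutex-lock! m 1)
          (begin (print "C: got the mutex") (mutex-unlock! m))
          (print "C: gave up after 1 s, nobody handed over the mutex")))
    'C)))

(thread-sleep! deadline)               ; A sleeps until the deadline
(mutex-unlock! m)                      ; A unlocks; B is first on the waiting list
(thread-join! b)
(thread-join! c)
```

If C reports that it gave up, the unlock removed B from the waiting list without anyone passing the mutex on to C.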
diff -r 25ced70261b2 5/srfi-18/srfi-18.scm
--- a/5/srfi-18/srfi-18.scm Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/srfi-18.scm Fri Nov 30 16:26:19 2018 +0200
@@ -420,6 +420,7 @@
((blocked sleeping)
(##sys#setslot wt 11 #f)
(##sys#add-to-ready-queue wt))
+ ((ready) (##sys#schedule))
(else
(##sys#error 'mutex-unlock "Internal scheduler error: unknown thread state"
wt wts))) ) )
diff -r 25ced70261b2 5/srfi-18/tests/issue-1564.scm
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/5/srfi-18/tests/issue-1564.scm Fri Nov 30 16:26:19 2018 +0200
@@ -0,0 +1,32 @@
+(cond-expand
+ (chicken-5 (import (chicken base))
+ (import (chicken time))
+ (import srfi-18))
+ (else (import chicken)
+ (use srfi-18)))
+
+(define m (make-mutex))
+
+(print "@@ " (current-thread) " " "lock")
+(mutex-lock! m)
+
+(define t (current-milliseconds))
+(define (get-time-to-sleep)
+ (/ (floor (* 1000 (- (+ t .030) (current-milliseconds)))) 1000))
+
+(thread-start!
+ (make-thread (lambda ()
+ (print "@@ " (current-thread) " " "lock")
+ (let lp ()
+ (when (not (mutex-lock! m (get-time-to-sleep)))
+ (thread-yield!)
+ (lp)))
+ (print "@@ " (current-thread) " " "unlock")
+ (mutex-unlock! m))))
+(print "@@ " (current-thread) " " "sleep")
+(thread-sleep! (get-time-to-sleep))
+(print "@@ " (current-thread) " " "unlock")
+(mutex-unlock! m)
+(thread-yield!)
+(thread-sleep! .01)
+(print "All ok!!")
diff -r 25ced70261b2 5/srfi-18/tests/run.scm
--- a/5/srfi-18/tests/run.scm Fri Nov 30 14:40:00 2018 +0200
+++ b/5/srfi-18/tests/run.scm Fri Nov 30 16:26:19 2018 +0200
@@ -1,5 +1,6 @@
(import (compile-file))
+(load "issue-1564.scm")
(load "simple-thread-test.scm")
(load "mutex-test.scm")