bug#36609: 27.0.50; Possible race-condition in threading implementation

From: Pip Cet
Subject: bug#36609: 27.0.50; Possible race-condition in threading implementation
Date: Sat, 13 Jul 2019 14:37:25 +0000

On Fri, Jul 12, 2019 at 7:57 PM Eli Zaretskii <address@hidden> wrote:
> > I'm now convinced that there simply is no safe way to call select()
> > from two threads at once when using glib.
> I hope not, although GTK with its idiosyncrasies caused a lot of pain
> for the threads implementation in Emacs.

Well, I think we're going to have to do one or more of the following:

- have a race condition
- access glib-locked data from the "wrong" thread (another Emacs thread)
- release the glib lock from the "wrong" thread (another Emacs thread)

Of these, the second is the best alternative, I think: we simply grab
the g_main_context lock globally, acting for all Emacs threads, and
the last thread to require it releases it when it leaves xg_select. As
long as there's at least one thread in the critical section of
xg_select, we hold the lock, but access to the context isn't
necessarily from the thread which locked it.
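To make that concrete, here is a minimal sketch of the counting scheme, not the attached patch. All names are hypothetical, and context_acquire/context_release are stand-ins for wherever the real code would call g_main_context_acquire and g_main_context_release. The sketch assumes glib tolerates a release from a thread other than the one that acquired, which is the open question in this thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is an illustration only.  */

/* Protects the counter below.  */
static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Number of Emacs threads currently inside the critical section.  */
static int threads_in_select = 0;

/* Stand-ins for the real glib context lock, instrumented so the
   behavior can be observed.  */
static bool context_held = false;
static int acquire_calls = 0;
static int release_calls = 0;

static void
context_acquire (void)
{
  context_held = true;
  acquire_calls++;
}

static void
context_release (void)
{
  context_held = false;
  release_calls++;
}

/* The first thread to enter takes the context lock on behalf of
   all Emacs threads.  */
void
enter_select_section (void)
{
  pthread_mutex_lock (&counter_mutex);
  if (threads_in_select++ == 0)
    context_acquire ();
  pthread_mutex_unlock (&counter_mutex);
}

/* The last thread to leave releases it, even if a different
   thread did the acquiring.  */
void
leave_select_section (void)
{
  pthread_mutex_lock (&counter_mutex);
  assert (threads_in_select > 0);
  if (--threads_in_select == 0)
    context_release ();
  pthread_mutex_unlock (&counter_mutex);
}
```

Each thread would call enter_select_section before touching the context and leave_select_section once it is done. The lock is then taken and dropped exactly once per overlapping group of threads, regardless of which thread happens to be first in or last out.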

> > I think our options are hacking around it and hoping nothing breaks
> > (this is what the attached patch does; it releases the main context
> > glib lock from the wrong thread soon "after" the other thread called
> > select, but there's actually no way to ensure that "after" is
> > accurate), or rewriting things so we have a single thread that does
> > all the select()ing.
> Hmm... how would this work with your patch?  Suppose one thread calls
> xg_select, acquires the Glib lock, sets its holding_glib_lock flag,
> then releases the global Lisp lock and calls pselect.  Since the
> global Lisp lock is now up for grabs, some other Lisp thread can
> acquire it and start running.

And when it starts running, it releases the Glib lock.

> If that other thread then calls
> xg_select, it will hang forever trying to acquire the Glib lock,
> because the first thread that holds it is stuck in pselect.

The first thread no longer holds the Glib lock; it was released when
we switched threads.

> I know very little about GTK and the Glib context lock, but AFAIR we
> really must treat that lock as a global one, not a thread-local one.
> So I think it's okay for one thread to take the Glib lock, and another
> to release it, because Glib just wants to know whether the "rest of
> the program" has it, it doesn't care which thread is that which holds
> the lock.

Okay, that sounds like option #2 above. The attached patch exposes
glib externals to the generic code, but it appears to work. If you
think the approach is okay, I'll move the glib-specific parts to
xgselect.c.

Attachment: glib-hack-002.diff
Description: Text Data
