[Libunwind-devel] [patch] Fix for race in dwarf_find_save_locs

From: Paul Pluzhnikov
Subject: [Libunwind-devel] [patch] Fix for race in dwarf_find_save_locs
Date: Fri, 20 Nov 2009 15:45:09 -0800


The attached test case demonstrates a race in dwarf_find_save_locs:
we get 'rs' from the cache (T1), unlock the cache (T2), and then use 'rs'
to "step" (T3) while 'rs' still points *inside* the cache.

If another thread evicts 'rs' from the cache between T2 and T3, we "step"
using stale data (and fail).

To trigger the bug, you must have many distinct code addresses (which is
why the test case is large) and several threads doing unwinds simultaneously.

Here are the symptoms I see at crash:

(gdb) run
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/".
[New Thread 0x40800950 (LWP 8534)]
[New Thread 0x41001950 (LWP 8535)]
Program received signal SIGABRT, Aborted.
[Switching to Thread 0x43806950 (LWP 8540)]
0x00007ffff7894095 in raise () from /lib/
(gdb) up 2
#2  0x000000000040080d in foo_6 () at foo.c:72
72          abort ();
(gdb) p n
$1 = 2
(gdb) p/a buf[0]
$2 = 0x401f9f <bar+54>
(gdb) p/a buf[1]
$3 = 0xe8ffffe892e8ffff    <<< bogus!

I see two ways to fix this: hold the cache locked while doing apply_reg_state
(fix#1), or make a local copy of 'rs' (fix#2).
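
To make the two options concrete, here is a minimal sketch on a toy cache.
It is not the libunwind code and not the contents of the attached patches:
reg_state, rs_cache, cache_put, apply, step_locked and step_copy are all
hypothetical names invented for illustration, and apply() merely stands in
for apply_reg_state. The only point is where the unlock happens relative
to the use of 'rs'.

```c
#include <pthread.h>

#define CACHE_SLOTS 4

/* Hypothetical stand-in for a cached register state. */
typedef struct {
    unsigned long ip;       /* code address this entry describes (0 = empty) */
    unsigned long cfa_off;  /* toy payload consumed by the "step" */
} reg_state;

/* Hypothetical direct-mapped cache guarded by one mutex. */
typedef struct {
    pthread_mutex_t lock;
    reg_state slots[CACHE_SLOTS];
} rs_cache;

static rs_cache cache = { PTHREAD_MUTEX_INITIALIZER, {{0, 0}} };

/* Insert an entry, overwriting (evicting) whatever shares its slot.
   Use nonzero ip values; ip 0 marks an empty slot in this toy. */
void cache_put(unsigned long ip, unsigned long cfa_off)
{
    pthread_mutex_lock(&cache.lock);
    reg_state *rs = &cache.slots[ip % CACHE_SLOTS];
    rs->ip = ip;
    rs->cfa_off = cfa_off;
    pthread_mutex_unlock(&cache.lock);
}

/* Toy stand-in for apply_reg_state: compute the next CFA from 'rs'. */
unsigned long apply(const reg_state *rs, unsigned long cfa)
{
    return cfa + rs->cfa_off;
}

/* Fix #1 pattern: keep the cache locked while 'rs' is in use, so no
   other thread can evict the entry in the middle of the step. */
int step_locked(unsigned long ip, unsigned long *cfa)
{
    int ret = -1;
    pthread_mutex_lock(&cache.lock);
    const reg_state *rs = &cache.slots[ip % CACHE_SLOTS];
    if (rs->ip == ip) {
        *cfa = apply(rs, *cfa);
        ret = 0;
    }
    pthread_mutex_unlock(&cache.lock);
    return ret;
}

/* Fix #2 pattern: copy the entry while locked, then step from the
   private copy after unlocking. A later eviction cannot affect it. */
int step_copy(unsigned long ip, unsigned long *cfa)
{
    reg_state copy;
    int hit;
    pthread_mutex_lock(&cache.lock);
    const reg_state *rs = &cache.slots[ip % CACHE_SLOTS];
    hit = (rs->ip == ip);
    if (hit)
        copy = *rs;          /* struct copy made under the lock */
    pthread_mutex_unlock(&cache.lock);
    if (!hit)
        return -1;
    *cfa = apply(&copy, *cfa);
    return 0;
}
```

In this sketch, step_locked serializes every step behind the mutex, while
step_copy pays for one struct copy per lookup but lets the step itself run
unlocked. The buggy pattern would be step_copy without the copy: unlock
first, then call apply() through the pointer into the cache.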

In profiling my executables, apply_reg_state consumes the most cycles,
so I personally prefer fix#2, though it is more expensive for single-threaded

Either fix makes the test case work, and produces no regressions on

Paul Pluzhnikov

Attachment: libunwind-fix-race-20091120-1.txt
Description: Text document

Attachment: foo.c
Description: Text Data

Attachment: libunwind-fix-race-20091120-2.txt
Description: Text document
