coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] sort: fix two race conditions reported by valgrind.


From: Pádraig Brady
Subject: Re: [PATCH] sort: fix two race conditions reported by valgrind.
Date: Tue, 14 Jan 2014 02:39:07 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 01/14/2014 02:08 AM, Pádraig Brady wrote:
> On 01/14/2014 12:49 AM, Shayan Pooya wrote:
>>
>>
>>
>> On Mon, Jan 13, 2014 at 7:31 PM, Pádraig Brady <address@hidden 
>> <mailto:address@hidden>> wrote:
>>
>>     On 01/14/2014 12:22 AM, Shayan Pooya wrote:
>>     > Valgrind reported two race conditions when I ran sort on a small file.
>>     > Both of them seem to be legitimate.
>>     > ---
>>     >  src/sort.c | 4 ++--
>>     >  1 file changed, 2 insertions(+), 2 deletions(-)
>>     >
>>     > diff --git a/src/sort.c b/src/sort.c
>>     > index 3380be6..e6658e6 100644
>>     > --- a/src/sort.c
>>     > +++ b/src/sort.c
>>     > @@ -3354,8 +3354,8 @@ queue_insert (struct merge_node_queue *queue, 
>> struct merge_node *node)
>>     >    pthread_mutex_lock (&queue->mutex);
>>     >    heap_insert (queue->priority_queue, node);
>>     >    node->queued = true;
>>     > -  pthread_mutex_unlock (&queue->mutex);
>>     >    pthread_cond_signal (&queue->cond);
>>     > +  pthread_mutex_unlock (&queue->mutex);
> 
> valgrind is not flagging the above for me?

Though according to http://valgrind.org/docs/manual/drd-manual.html
valgrind should be flagging this as it mentions this is a usually a mistake 
because...

"Sending a signal to a condition variable while no lock is held on the mutex 
associated with the condition variable.
This is a common programming error which can cause subtle race conditions and 
unpredictable behavior.
There exist some uncommon synchronization patterns however where it is safe to 
send a signal without holding a lock on the associated mutex"

Ah, if I bump up the size of the file to `seq 1000000` I see the error:
"Probably a race condition: condition variable 0xffeffffb0 has been signaled but
the associated mutex 0xffeffff88 is not locked by the signalling thread."

With your full patch applied but with the larger input I'm now getting lots of:

==15443== Thread 3:
==15443== Conflicting store by thread 3 at 0x0541a7d8 size 8
==15443==    at 0x4084EB: sortlines (sort.c:3432)
==15443==    by 0x408907: sortlines_thread (sort.c:3571)
==15443==    by 0x4C2BDDB: ??? (in 
/usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==15443==    by 0x4E44F32: start_thread (in /usr/lib64/libpthread-2.18.so)
==15443==    by 0x514EEAC: clone (in /usr/lib64/libc-2.18.so)
==15443== Address 0x541a7d8 is at offset 280 from 0x541a6c0. Allocation context:
==15443==    at 0x4C2938D: malloc (in 
/usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==15443==    by 0x40F9F8: xmalloc (xmalloc.c:41)
==15443==    by 0x40422F: main (sort.c:3223)
==15443== Other segment start (thread 2)
==15443==    at 0x4C2E843: pthread_mutex_lock (in 
/usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==15443==    by 0x408323: sortlines (sort.c:3313)
==15443==    by 0x4088BF: sortlines (sort.c:3618)
==15443==    by 0x408907: sortlines_thread (sort.c:3571)
==15443==    by 0x4C2BDDB: ??? (in 
/usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==15443==    by 0x4E44F32: start_thread (in /usr/lib64/libpthread-2.18.so)
==15443==    by 0x514EEAC: clone (in /usr/lib64/libc-2.18.so)
==15443== Other segment end (thread 2)
==15443==    at 0x4E4E600: __errno_location (in /usr/lib64/libpthread-2.18.so)
==15443==    by 0x4117B2: memcoll0 (memcoll.c:39)
==15443==    by 0x40FE08: xmemcoll0 (xmemcoll.c:71)
==15443==    by 0x407CBB: compare (sort.c:2731)
==15443==    by 0x4083EC: sortlines (sort.c:3418)
==15443==    by 0x4088BF: sortlines (sort.c:3618)
==15443==    by 0x408907: sortlines_thread (sort.c:3571)
==15443==    by 0x4C2BDDB: ??? (in 
/usr/lib64/valgrind/vgpreload_drd-amd64-linux.so)
==15443==    by 0x4E44F32: start_thread (in /usr/lib64/libpthread-2.18.so)
==15443==    by 0x514EEAC: clone (in /usr/lib64/libc-2.18.so)

Hopefully that's a false positive.
I'll do some more investigation tomorrow.

thanks,
Pádraig.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]