octave-maintainers
Thread-safe reference counting (modified dim-vector.h)


From: Jarno Rajahalme
Subject: Thread-safe reference counting (modified dim-vector.h)
Date: Wed, 26 May 2010 01:39:41 -0700

I have included a modified dim-vector.h that does what was proposed below. 
However, it seems other changes are needed to avoid subtle race conditions. I 
have tried to address these issues, but more testing may be needed. See the 
attached .diff file.

Attachment: dim-vector-atomic-gnu.h
Attachment: dim-vector-atomic-gnu.diff


The attached dim-vector-atomic-gnu.h uses GNU builtins if ATOMIC_GNU is 
defined, OpenMP atomic if ATOMIC_OPENMP is defined, and "no atomic" if neither 
is defined.
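
As a rough sketch of how such a three-way selection might look (this is not the code in the attached header; refcount_inc and refcount_dec_and_test are made-up names, and the "atomic capture" form needs OpenMP 3.1, which is newer than what GCC 4.5 supports):

```cpp
#include <cassert>

// Hypothetical helpers, NOT the contents of dim-vector-atomic-gnu.h.
// They only illustrate switching between the three modes.
#if defined (ATOMIC_GNU)

// GNU __sync builtins (GCC 4.1 and later).
inline void refcount_inc (int& count)
{
  __sync_add_and_fetch (&count, 1);
}

// Returns true when the count has dropped to zero.
inline bool refcount_dec_and_test (int& count)
{
  return __sync_sub_and_fetch (&count, 1) == 0;
}

#elif defined (ATOMIC_OPENMP)

inline void refcount_inc (int& count)
{
#pragma omp atomic
  ++count;
}

// Assumption: OpenMP 3.1 "atomic capture" is available, so the
// decrement and the read of the new value form one atomic step.
inline bool refcount_dec_and_test (int& count)
{
  int now;
#pragma omp atomic capture
  now = --count;
  return now == 0;
}

#else

// "No atomic": plain increments and decrements, not thread-safe.
inline void refcount_inc (int& count)
{
  ++count;
}

inline bool refcount_dec_and_test (int& count)
{
  return --count == 0;
}

#endif
```

The point of returning the test result from the decrement itself is that only the thread that actually brought the count to zero should delete the shared data.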

Without atomic constructs, the test code below fails either with a malloc 
error or at the assert() just before the delete in dim-vector-atomic-gnu.h. 
With either ATOMIC_GNU or ATOMIC_OPENMP the test completes without problems. 
The GNU builtins seem to be a bit faster, but either variant increases the 
execution (CPU) time about 15x.

This can be tested with this simple c++ code:

main.cc:

#include <octave/config.h>
#include <omp.h>
#include "dim-vector-atomic-gnu.h"

const dim_vector& dims(void); // in a different compilation unit

int main (void)
{
  int result = 0; // must be initialized: the reduction adds to this value

#pragma omp parallel for reduction (+: result)
  for (int i=0; i < 10000000; i++) {
    const dim_vector dv = dims(); // clone
//    const dim_vector& dv = dims(); // reference
    result += dv(1);
  }
  return result;
}


sub.cc:

#include <octave/config.h>
#include "dim-vector-atomic-gnu.h"

const dim_vector& dims(void)
{
  static const dim_vector dv(1,1);
  return dv;
}

Compile with:

g++ -fopenmp -DATOMIC_GNU -I../include/octave-3.3.51+ -O3 main.cc sub.cc -lgomp -o dimtest

- you may need to modify the octave include directory
- I tested with GCC 4.5
- try with and without -DATOMIC_GNU
- try with either // clone or // reference line
 - in my testing, without atomic, and when cloning, the code will reliably fail
 - taking a reference, it always works, and the atomic code (even if used) is 
not executed within the loop.
- use "time ./dimtest"
 - There is about 24x overhead for creating the local object and atomically 
fiddling with the reference counter.

The dims () above, when its result is cloned, behaves the same as 
octave_base_scalar::dims (), so the above can happen in real Octave code as 
well. In Octave the corresponding function is virtual and therefore cannot be 
inlined, so the compiler has no chance to optimize the reference counting away.

However, by taking a reference, the reference counting does not kick in even if 
dims() is virtual and the code above becomes a lot faster.
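
To make the clone vs. reference difference concrete, here is a small single-threaded sketch. It uses a made-up stand-in class (shared_dims, get_dims and counter_updates are hypothetical names, not Octave's dim_vector) and simply counts how often the reference counter is touched:

```cpp
#include <cassert>

// Counts every increment and decrement of a reference counter.
static long counter_updates = 0;

// Hypothetical stand-in for dim_vector: a handle to a shared,
// reference-counted representation.  Plain (non-atomic) counting.
class shared_dims
{
  struct rep { int count; int d[2]; };
  rep *r;

  shared_dims& operator = (const shared_dims&); // not needed here

public:
  shared_dims (int a, int b) : r (new rep)
  {
    r->count = 1;
    r->d[0] = a;
    r->d[1] = b;
  }

  // Cloning shares the rep and bumps the counter.
  shared_dims (const shared_dims& o) : r (o.r)
  {
    ++r->count;
    ++counter_updates;
  }

  ~shared_dims ()
  {
    ++counter_updates;
    if (--r->count == 0)
      delete r;
  }

  int operator () (int i) const { return r->d[i]; }
};

const shared_dims& get_dims ()
{
  static const shared_dims dv (1, 1);
  return dv;
}

// Counter updates caused by n iterations that clone the object.
long updates_when_cloning (int n)
{
  long before = counter_updates;
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      const shared_dims dv = get_dims (); // clone
      sum += dv (1);
    }
  (void) sum;
  return counter_updates - before;
}

// Counter updates caused by n iterations that only take a reference.
long updates_when_referencing (int n)
{
  long before = counter_updates;
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      const shared_dims& dv = get_dims (); // reference
      sum += dv (1);
    }
  (void) sum;
  return counter_updates - before;
}
```

Each cloning iteration touches the counter twice (once in the copy constructor, once in the destructor), while the reference version never touches it. With atomic counting, those two updates per iteration are where the extra cost goes.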

Regards,

  Jarno


On May 25, 2010, at 12:31 PM, ext Jarno Rajahalme wrote:

> 
> On May 25, 2010, at 2:23 AM, ext Jaroslav Hajek wrote:
>> 
>> In the simplest setup, counter increments and decrements should be
>> decorated by #pragma omp atomic.
>> I think we could have a set of macros, similar to Py_INCREF et al, for
>> doing this.
> 
> Would this use a global lock? That would be rather bad, I think.
> 
> GCC 4.1 and above has builtins for atomic increment / decrement 
> (__sync_sub_and_fetch () etc.). This site has a very good intro to the topic:
> 
> http://golubenco.org/2007/06/14/atomic-operations/
> 
> Looks pretty straightforward. Apparently ICC implements these too, so it is 
> not GCC only.
> 
>       Jarno
> 
> 

