On 01/09/2016 at 17:42, Rik wrote:
On 09/01/2016 03:38 AM, address@hidden wrote:
> Subject: A question about NDArray, xelem and operator ()
> From: Julien Bect <address@hidden>
> Date: 09/01/2016 01:40 AM
> To: octave-maintainers <address@hidden>
>
> Hi all,
>
> While I was working on the GSL package the following question arose:
>
> what is the difference between using A(i) and A.xelem(i) when I need to read a value from an array A in an oct-file?
>
> (A is an NDArray and i of type octave_idx_type)
>
> @++
> Julien
Julien,

Generally you should use the operator syntax A(i) within an
oct-file to access an element. The xelem() method is for raw
access to the underlying data with absolutely no checking
whatsoever (no bounds checking for indices outside the size of the
Array, no check for a reference count > 1, i.e., that someone
else holds a shared copy of this data). This is okay in the Octave
core when we want higher performance AND can guarantee that it is
safe to use. Dynamically linked oct-files have great potential
for destabilizing Octave, since we have no control over how
careful the programmer is with memory references, etc.
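
To make the risk with shared copies concrete, here is a minimal sketch of the distinction described above. ToyArray is a hypothetical stand-in invented for illustration, not Octave's Array<T>, and the method names checked() and raw() are assumptions chosen to mirror the "checked" versus "raw" access paths; Octave's real implementation differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Toy copy-on-write array. Hypothetical stand-in, NOT Octave's Array<T>.
class ToyArray
{
public:

  explicit ToyArray (std::size_t n, double val = 0.0)
    : m_rep (std::make_shared<std::vector<double>> (n, val))
  { }

  std::size_t numel () const { return m_rep->size (); }

  // Checked access: validates the index, then detaches from any shared
  // copy before handing out a writable reference.
  double& checked (std::size_t i)
  {
    if (i >= numel ())
      throw std::out_of_range ("ToyArray: index out of range");

    if (m_rep.use_count () > 1)
      m_rep = std::make_shared<std::vector<double>> (*m_rep);

    return (*m_rep)[i];
  }

  // Raw access: no index check and no detach, so a write through this
  // reference is visible in every copy that shares the storage.
  double& raw (std::size_t i) { return (*m_rep)[i]; }

  // Read-only access for inspecting the contents.
  double get (std::size_t i) const { return (*m_rep)[i]; }

private:

  std::shared_ptr<std::vector<double>> m_rep;
};
```

In this toy, after "ToyArray b = a;", writing through a.raw (0) also changes b, whereas writing through a.checked (1) first gives a its own private storage and leaves b untouched.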

If you intend to iterate over every element in an array and
possibly change it, then there is a slight performance hit (~10%)
from using a straight for loop:

  for (octave_idx_type i = 0; i < x.numel (); i++)
    x(i) += 1;

In that case, you are better off getting a pointer to the actual
storage with data() and working on that. But optimization should
always come after code correctness. The code above is easy to
understand, and most of the time 10% is not the problem.

For an example of the optimized approach, see the map() function
in Array.h, which calls a supplied function (such as cos, sin,
etc.) on every element. It has the additional optimization of
processing four elements per loop iteration, which limits the
overhead of calling octave_quit () for every single element in the
array.

  //! Apply function fcn to each element of the Array<T>.  This function
  //! is optimized with a manually unrolled loop.
  template <typename U, typename F>
  Array<U>
  map (F fcn) const
  {
    octave_idx_type len = numel ();

    const T *m = data ();

    Array<U> result (dims ());
    U *p = result.fortran_vec ();

    octave_idx_type i;
    for (i = 0; i < len - 3; i += 4)
      {
        octave_quit ();

        p[i] = fcn (m[i]);
        p[i+1] = fcn (m[i+1]);
        p[i+2] = fcn (m[i+2]);
        p[i+3] = fcn (m[i+3]);
      }

    octave_quit ();

    for (; i < len; i++)
      p[i] = fcn (m[i]);

    return result;
  }
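
For experimenting with the unrolling pattern outside of Octave, the same loop structure can be sketched over std::vector. This is only an illustration under stated assumptions: map_unrolled, poll_interrupt and poll_count are hypothetical names, and poll_interrupt merely counts calls where the real code would call octave_quit ().

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for octave_quit (): it only counts how often the
// loop would poll for a pending interrupt.
static std::size_t poll_count = 0;

inline void poll_interrupt () { ++poll_count; }

// Apply fcn to every element of in, polling once per block of four
// elements instead of once per element.
template <typename U, typename T, typename F>
std::vector<U>
map_unrolled (const std::vector<T>& in, F fcn)
{
  const std::size_t len = in.size ();

  const T *m = in.data ();

  std::vector<U> result (len);
  U *p = result.data ();

  std::size_t i = 0;

  // Main loop.  The index is unsigned here, so the bound is written as
  // i + 3 < len rather than i < len - 3 to avoid wrap-around when len < 3.
  for (; i + 3 < len; i += 4)
    {
      poll_interrupt ();

      p[i] = fcn (m[i]);
      p[i+1] = fcn (m[i+1]);
      p[i+2] = fcn (m[i+2]);
      p[i+3] = fcn (m[i+3]);
    }

  poll_interrupt ();

  // Tail loop for the remaining 0 to 3 elements.
  for (; i < len; i++)
    p[i] = fcn (m[i]);

  return result;
}
```

With ten input elements, the main loop runs twice and the tail loop handles the last two elements, so the poll function is called three times rather than ten.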

Thank you, Rik, for the explanation. (Remark: perhaps some of this
could go into the doxygen documentation somewhere?)

For the gsl package I have chosen the optimized version, since it
is used in a generic wrapper (the overhead might be negligible
for some heavy special functions, but not so negligible for faster
ones).

The use case is very similar to the map function above, so I don't
think it is "risky".

@++
Julien