octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #56752] Performance slowdown from version 3.2.


From: Carlo de Falco
Subject: [Octave-bug-tracker] [bug #56752] Performance slowdown from version 3.2.4 through to current dev branch
Date: Fri, 13 Sep 2019 05:11:24 -0400 (EDT)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:69.0) Gecko/20100101 Firefox/69.0

Follow-up Comment #44, bug #56752 (project octave):

[comment #43 comment #43:]
> The shallow copy of the array isn't the real problem.  It's that even if
there is no other copy and the reference count is always 1, the usual array
indexing operations for non-const instances of our copy-on-write classes must
always check the reference count.  The NoAlias class provides a way to
override that choice, but it is risky because we have to know that there will
never be a copy.  If we know that, then I think it makes sense to just use
xelem explicitly.
> 
> Another way to avoid the reference count checks is to use const
copy-on-write objects wherever possible.  We probably don't follow that advice
as much as we should in the Octave sources.
> 

@jwe I understand the purpose of NoAlias, but in this context the copy IS the
issue.

The pattern that clang++ warned about in Sparse.cc can be summarized as
follows



template<class ArrayType> ArrayType foo ( ... ) {

 NoAlias<ArrayType> r (d);

 for (i = 0; i < r.numel (); ++i) {
   r(i) = foobar ();
 }

 return r;
}



Here the use of NoAlias does indeed serve its purpose of optimizing access to
elements of r by avoiding unneeded checks.

But it has an unwanted side effect due to the fact that the return value is
not 
of type ArrayType but of a derived type.

This prevents the compiler for optimizing away the copy of the temporary
object r
when it is returned.

If one wanted to maintain the NoAlias wrapper and avoid the copy, then the
compiler
suggest using 


 return std::move (r);


Using a const arry here makes no sense as the purpose of the function is to
fill its elements.

My solution in https://hg.savannah.gnu.org/hgweb/octave/rev/e1968e40e43f was
instead to 
allocate r with the base type directly and to use ArrayType::xelem () for
faster access 
to its elements



template<class ArrayType> ArrayType foo ( ... ) {

 ArrayType r (d);

 for (i = 0; i < r.numel (); ++i) {
   r.xelem (i) = foobar ();
 }

 return r;
}




In comment #41 Rik was wondering whether the same change would need to be done
in other
places where NoAlias is used. In comment #42 I was guessing that when the
NoAlias<ArrayType> 
is not returned by value itself but passed to the octave_value constructor
there is not an
automatic optimization that the compiler can do.




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?56752>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]