octave-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Octave-bug-tracker] [bug #55909] Performance of hist


From: Rik
Subject: [Octave-bug-tracker] [bug #55909] Performance of hist
Date: Fri, 31 May 2019 14:46:39 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko

Update of bug #55909 (project octave):

                  Status:                    None => Patch Reviewed         

    _______________________________________________________

Follow-up Comment #6:

Okay, now I can spot the difference.  Here are test results


################################################################################
Current Implementation (N = 10)
1e6: 0.028456
1e4: 0.000546
1e3: 0.000537
1e2: 0.000283

################################################################################
New Implementation (N = 10)
1e6: 0.027834
1e4: 0.000546
1e3: 0.000539
1e2: 0.000302

################################################################################
Current Implementation (N = 30)
1e6: 0.153529
1e4: 0.001585
1e3: 0.001599
1e2: 0.000714

################################################################################
New Implementation (N = 30)
1e6: 0.022552
1e4: 0.000548
1e3: 0.000541
1e2: 0.000399


As noted, for small N there is no difference in performance.  That's actually
good, the changes haven't disrupted anything.  For large N, the new algorithm
is 2-7X faster.

I made some additional modifications for review.  Instead of isequal() I used
strcmp because it is faster (even though this only occurs once in input
validation).  Second, I think it is worthwhile to detect equally-spaced bins
even when they aren't vectors.  I added code to do that as well which shifts a
specification of N like "linspace (1,30,30)" in to using the faster
algorithm.


+      x_is_range = strcmp (typeinfo (x), "range");
+      if (! x_is_range)
+        diffs = diff (x);
+        if (all (diffs == diffs(1)))
+          x_is_range = true;
+        endif
+      endif


Should the variable "x_is_range" be renamed to be more reflective of what it
means, which is that the bins are equally spaced?

The use of in-place operators is a simple performance improvement so I made
this change as well


-    freq = freq .* norm ./ sum (! isnan (y));
+    freq .*= norm ./ sum (! nanidx);


Finally, I also updated certain BIST tests because the underlying code has
changed.


@@ -257,9 +286,9 @@ endfunction
 %! assert (nn, n);
 %! assert (xx, x);
 %!
-%! ## test again with N > 30 because there's a special case for it
-%! [n, x] = hist (y, 35);
-%! [nn, xx] = hist (uint8 (y), 35);
+%! ## test again with N > 26 because there's a special case for it
+%! [n, x] = hist (y, 30);
+%! [nn, xx] = hist (uint8 (y), 30);
 %! assert (nn, n);
 %! assert (xx, x);
 
@@ -271,9 +300,9 @@ endfunction
 %! assert (nn, n);
 %! assert (xx, x);
 %!
-%! ## test again with N > 30 because there's a special case for it
-%! [n, x] = hist (y, 35);
-%! [nn, xx] = hist (logical (y), 35);
+%! ## test again with N > 26 because there's a special case for it
+%! [n, x] = hist (y, 30);
+%! [nn, xx] = hist (logical (y), 30);
 %! assert (nn, n);
 %! assert (xx, x);


I fleshed out the changeset to have a commit message and to cite Michael
Leitner as the author.  See the attached bug55909.cset.


(file #47008)
    _______________________________________________________

Additional Item Attachment:

File name: bug55909.cset                  Size:4 KB
    <https://savannah.gnu.org/file/bug55909.cset?file_id=47008>



    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?55909>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]