[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Octave-bug-tracker] [bug #55909] Performance of hist
From: |
Rik |
Subject: |
[Octave-bug-tracker] [bug #55909] Performance of hist |
Date: |
Fri, 31 May 2019 14:46:39 -0400 (EDT) |
User-agent: |
Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko |
Update of bug #55909 (project octave):
Status: None => Patch Reviewed
_______________________________________________________
Follow-up Comment #6:
Okay, now I can spot the difference. Here are test results
################################################################################
Current Implementation (N = 10)
1e6: 0.028456
1e4: 0.000546
1e3: 0.000537
1e2: 0.000283
################################################################################
New Implementation (N = 10)
1e6: 0.027834
1e4: 0.000546
1e3: 0.000539
1e2: 0.000302
################################################################################
Current Implementation (N = 30)
1e6: 0.153529
1e4: 0.001585
1e3: 0.001599
1e2: 0.000714
################################################################################
New Implementation (N = 30)
1e6: 0.022552
1e4: 0.000548
1e3: 0.000541
1e2: 0.000399
As noted, for small N there is no difference in performance. That's actually
good, the changes haven't disrupted anything. For large N, the new algorithm
is 2-7X faster.
I made some additional modifications for review. Instead of isequal() I used
strcmp because it is faster (even though this only occurs once in input
validation). Second, I think it is worthwhile to detect equally-spaced bins
even when they aren't vectors. I added code to do that as well which shifts a
specification of N like "linspace (1,30,30)" in to using the faster
algorithm.
+ x_is_range = strcmp (typeinfo (x), "range");
+ if (! x_is_range)
+ diffs = diff (x);
+ if (all (diffs == diffs(1)))
+ x_is_range = true;
+ endif
+ endif
Should the variable "x_is_range" be renamed to be more reflective of what it
means, which is that the bins are equally spaced?
The use of in-place operators is a simple performance improvement so I made
this change as well
- freq = freq .* norm ./ sum (! isnan (y));
+ freq .*= norm ./ sum (! nanidx);
Finally, I also updated certain BIST tests because the underlying code has
changed.
@@ -257,9 +286,9 @@ endfunction
%! assert (nn, n);
%! assert (xx, x);
%!
-%! ## test again with N > 30 because there's a special case for it
-%! [n, x] = hist (y, 35);
-%! [nn, xx] = hist (uint8 (y), 35);
+%! ## test again with N > 26 because there's a special case for it
+%! [n, x] = hist (y, 30);
+%! [nn, xx] = hist (uint8 (y), 30);
%! assert (nn, n);
%! assert (xx, x);
@@ -271,9 +300,9 @@ endfunction
%! assert (nn, n);
%! assert (xx, x);
%!
-%! ## test again with N > 30 because there's a special case for it
-%! [n, x] = hist (y, 35);
-%! [nn, xx] = hist (logical (y), 35);
+%! ## test again with N > 26 because there's a special case for it
+%! [n, x] = hist (y, 30);
+%! [nn, xx] = hist (logical (y), 30);
%! assert (nn, n);
%! assert (xx, x);
I fleshed out the changeset to have a commit message and to cite Michael
Leitner as the author. See the attached bug55909.cset.
(file #47008)
_______________________________________________________
Additional Item Attachment:
File name: bug55909.cset Size:4 KB
<https://savannah.gnu.org/file/bug55909.cset?file_id=47008>
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?55909>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/