[Octave-bug-tracker] [bug #52809] interpreter performance is slow on dev


From: Dan Sebald
Subject: [Octave-bug-tracker] [bug #52809] interpreter performance is slow on development branch
Date: Fri, 5 Jan 2018 16:24:52 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0

Follow-up Comment #7, bug #52809 (project octave):

Here's a breakdown of the results after applying the patch:

In the following there's nothing on the statement list, so it is no surprise
that there's no change, except when the evaluation of the inner for-loop
comes into play, where there seems to be about a 40% improvement.


octave:1> for lim_p = 0:6
>   lim1 = 10^lim_p;
>   lim2 = 10^(6-lim_p);
>   a = 1; b = 1; t0=tic; for i=1:lim1; for j=1:lim2; end; end; t1=toc(t0); t1
> end

BEFORE PATCH     AFTER PATCH
t1 =  0.18503    t1 =  0.18796
t1 =  0.18548    t1 =  0.18670
t1 =  0.18704    t1 =  0.18585
t1 =  0.19305    t1 =  0.18805
t1 =  0.24489    t1 =  0.21950
t1 =  0.75854    t1 =  0.53110
t1 =  5.8472     t1 =  3.5870 
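
As a check on the "about 40%" figure, the per-row reduction can be computed directly from the table. This is only a sketch: the numbers are copied from the two columns above, and the variable names (before, after, reduction_pct) are mine, not anything from the patch. The last row (lim_p = 6, where the inner for-loop is started 10^6 times) comes out a little under 40%, and the first few rows are within noise.

```octave
% Timings in seconds, copied from the empty-loop table (rows lim_p = 0..6).
before = [0.18503 0.18548 0.18704 0.19305 0.24489 0.75854 5.8472];
after  = [0.18796 0.18670 0.18585 0.18805 0.21950 0.53110 3.5870];

% Relative reduction in elapsed time per row, in percent.
% A negative value means the patched run was slower for that row.
reduction_pct = 100 * (before - after) ./ before;

for k = 1:numel(before)
  printf("lim_p = %d: %6.2f%%\n", k - 1, reduction_pct(k));
end
```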


In the following there are no variables, just a constant.  There is about a
60% improvement when the evaluation of the inner for-loop isn't a factor
(probably because nothing is done any longer to save variable memory), and
again about a 40% improvement when the inner for-loop evaluation comes into
play.


octave:2> for lim_p = 0:6
>   lim1 = 10^lim_p;
>   lim2 = 10^(6-lim_p);
>   a = 1; b = 1; t0=tic; for i=1:lim1; for j=1:lim2; 1; end; end; t1=toc(t0); t1
> end

BEFORE PATCH     AFTER PATCH
t1 =  1.0119     t1 =  0.43711
t1 =  1.0167     t1 =  0.43716
t1 =  1.0118     t1 =  0.43754
t1 =  1.0166     t1 =  0.44116
t1 =  1.0708     t1 =  0.47755
t1 =  1.6114     t1 =  0.83763
t1 =  6.9960     t1 =  4.3774 
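
One rough way to separate the two effects is to treat each row as 10^6 inner iterations plus lim1 outer iterations, and subtract rows. The sketch below assumes that simple additive model (my assumption, not something measured), uses timings copied from this table and from the first row of the empty-loop table, and the names stmt_us and outer_us are made up for illustration.

```octave
n = 1e6;  % total inner iterations in every row

% Seconds, copied from the tables, as [before, after].
empty_first = [0.18503, 0.18796];  % empty body, lim1 = 1
const_first = [1.0119,  0.43711];  % body "1;",   lim1 = 1
const_last  = [6.9960,  4.3774];   % body "1;",   lim1 = 10^6

% Assumed: the extra time over the empty loop is the cost of evaluating "1;".
stmt_us = 1e6 * (const_first - empty_first) / n;

% Assumed: the extra time of the last row over the first row is the cost of
% the additional outer iterations, each of which starts an inner for-loop.
outer_us = 1e6 * (const_last - const_first) / (n - 1);

printf("statement cost (us):       before %.3f, after %.3f\n", stmt_us);
printf("outer iteration cost (us): before %.3f, after %.3f\n", outer_us);
printf("reduction (%%):             statement %.1f, outer %.1f\n", ...
       100 * (1 - stmt_us(2) / stmt_us(1)), ...
       100 * (1 - outer_us(2) / outer_us(1)));
```

Under this model the statement cost alone drops by more than the 60% seen in the raw rows, because the raw rows still include the unchanged cost of the empty loop.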


The next case is when there is some variable evaluation.  Now there is only
about 40 to 45% improvement (rather than 60%) when the inner for-loop
evaluation is not dominant.  (Still a bit under 40% improvement when it is
dominant.)


octave:3> for lim_p = 0:6
>   lim1 = 10^lim_p;
>   lim2 = 10^(6-lim_p);
>   a = 1; b = 1; t0=tic; for i=1:lim1; for j=1:lim2; a=b; end; end; t1=toc(t0); t1
> end

BEFORE PATCH     AFTER PATCH
t1 =  3.1692     t1 =  1.7681
t1 =  3.1592     t1 =  1.8832
t1 =  3.1674     t1 =  1.8906
t1 =  3.1748     t1 =  1.8875
t1 =  3.2181     t1 =  1.9335
t1 =  3.7610     t1 =  2.2881
t1 =  9.0173     t1 =  5.7604 


And the next is again with variable evaluation, but this time a matrix
assignment.  The relative improvement is pretty much the same as in the
previous example.  But compare the first column of the previous example with
the first column of this example, and then do the same for the second
columns.  (I believe I have those numbers correct.  At least I
double-checked.)  Before the patch, the scalar assignment used slightly less
CPU time than the matrix assignment, by about one or two percent.  One would
think that is logical--more memory movement, more CPU usage (although small
compared to the evaluator).  However, after the patch this relationship has
reversed: the scalar assignment appears to take from a fraction of a percent
up to about 9% more CPU time, which is counter-intuitive.  I do see though
that the patch breaks this up into scalar and "multi" cases, so that might
explain the difference.


octave:4> for lim_p = 0:6
>   lim1 = 10^lim_p;
>   lim2 = 10^(6-lim_p);
>   a = 1; b = ones(10000); t0=tic; for i=1:lim1; for j=1:lim2; a=b; end; end; t1=toc(t0); t1
> end

BEFORE PATCH     AFTER PATCH
t1 =  3.2118     t1 =  1.6255
t1 =  3.2141     t1 =  1.7589
t1 =  3.2061     t1 =  1.7608
t1 =  3.2156     t1 =  1.7673
t1 =  3.2567     t1 =  1.8127
t1 =  3.7914     t1 =  2.1563
t1 =  9.1665     t1 =  5.7264 
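
To make the scalar/matrix reversal easier to see, the two tables can be compared row by row.  This is a sketch using only the timings printed above; the variable names are mine, and single runs like these may well carry a percent or so of noise.

```octave
% Seconds, copied from the scalar (a=b, b=1) and matrix (a=b, b=ones(10000))
% tables, rows lim_p = 0..6.
scalar_before = [3.1692 3.1592 3.1674 3.1748 3.2181 3.7610 9.0173];
scalar_after  = [1.7681 1.8832 1.8906 1.8875 1.9335 2.2881 5.7604];
matrix_before = [3.2118 3.2141 3.2061 3.2156 3.2567 3.7914 9.1665];
matrix_after  = [1.6255 1.7589 1.7608 1.7673 1.8127 2.1563 5.7264];

% Percent by which the scalar case is slower than the matrix case.
% Negative values mean the scalar case is faster.
diff_before = 100 * (scalar_before ./ matrix_before - 1);
diff_after  = 100 * (scalar_after  ./ matrix_after  - 1);

% One row per lim_p: column 1 is before the patch, column 2 is after.
disp([diff_before; diff_after]');
```

Every row is negative before the patch and positive after it, so the reversal is at least consistent across the whole sweep rather than confined to one measurement.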


In summary:

1) About 40% reduction in CPU usage where for-loop evaluation dominates.
2) About 60% reduction in CPU usage for a statement with no variables,
probably because memory management is no longer needed during evaluation.
3) A peculiar but not too significant difference between scalar and matrix
assignment during looping.

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?52809>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



