help-octave
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: help with vectorization of update rule


From: James Sherman Jr.
Subject: Re: help with vectorization of update rule
Date: Thu, 21 Oct 2010 12:21:22 -0400

On Thu, Oct 21, 2010 at 11:48 AM, grg <address@hidden> wrote:


Søren Hauberg wrote:
>
> Hi,
>
> You can do something like this
>
>         ## Problem parameters
>         A = 2;
>         T = 10;
>         x0 = rand (20, 1);
>
>         ## Loop implementation
>         x = x0;
>         for it=1:T
>             x(:,it+1)=A*x(:,it);
>         endfor
>
>         ## Vectorised implementation
>         y = repmat (x0, 1, T+1);
>         y = repmat (x0, 1, T+1) .* repmat (A.^(0:T), rows (x0), 1);
>
>         ## Check (should return 1)
>         isequal (x, y)
>
> Søren
>
>
>
>

hi Søren,

Thanks a lot for your prompt response.

I tried your code and checked the execution times:

       ## Problem parameters
       A = 2;
       T = 10;
       x0 = rand (20, 1);

       tic
       ## Loop implementation
       x = x0;
       for it=1:T
           x(:,it+1)=A*x(:,it);
       endfor
       toc

       tic
       ## Vectorised implementation
       y = repmat (x0, 1, T+1);
       y = repmat (x0, 1, T+1) .* repmat (A.^(0:T), rows (x0), 1);
       toc

       ## Check (should return 1)
       isequal (x, y)


The output was:

   octave:11> test_vec
   Elapsed time is 0.00022 seconds.
   Elapsed time is 0.0016 seconds.
   ans =  1
   octave:12>

So, in this case the vectorized version of the code is significantly slower
than the for loop.  Is this normal?

Thanks again

Giorgio

I believe that it is because the vectorization in this case takes many more multiplications and additions than the non-vectorized version.  Your initial implementation takes one matrix multiplication (s by s times s by 20) per "it" (T of them in this case).

The vectorized version essentially calculates the matrix exponential at every time instant independently of each other, so for say it = 3, it calculates A^3 then multiplies it by x0 (2 multiplications of s by s, then one of s by s times s by 20).  Then for it = 4, it calculates A^4 then multiplies it by x0. (3 multiplications of s by s, then one of s by s times s by 20)   So you have many more calculations to perform.

Thats my semi-informed guess of why the vectorization in this case takes much longer.
 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]