Re: Fwd: 'for' loop vectorization
From: David Bateman
Subject: Re: Fwd: 'for' loop vectorization
Date: Thu, 25 Oct 2007 19:04:22 +0200
User-agent: Thunderbird 1.5.0.7 (X11/20060921)
Francesco Potorti` wrote:
> This mail does not contain references to previous mails on the subject,
> as I am writing without being subscribed to this list.
>
> I do not know if this piece of information can add anything to the
> discussion, but I can make two observations. First is that I managed to
> obtain a humble 10% improvement over triu.m by vectorising it like this:
>
> octave> n=5000; a=ones(n,n);
> octave> function z=vtriu(z) n=rows(z); z((1:n)'*ones(1,n)>ones(n,1)*(1:n))=0;
> end
> octave> t=cputime;b=triu(a);cputime-t
> ans = 1.7081
> octave> t=cputime;c=vtriu(a);cputime-t
> ans = 1.5081
> octave> all(all(b==c))
> ans = 1
>
> The idea is that I build a logical mask of the matrix entries that lie
> below the diagonal, and then I zero them.
> Maybe some variation on this concept could make triu and tril faster.
>
> Second is an observation on the proposed cumulative max function using
> triu: it requires space proportional to n^2 for a vector of length n, so
> it is not obvious that it is worth optimising, because it cannot be used
> for big vectors, that is, vectors of length n such that n^2 does not fit
> into memory.
>
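The memory point in the second observation can be illustrated with a small sketch. The triu-based cumulative max itself is not reproduced in this message, so `cummax_triu` below is only an assumed reconstruction of its general shape, and `cummax_loop` is a hypothetical linear-memory alternative. Both names are made up for illustration.

```octave
1;  % script-file marker so the functions below can be defined in a script

% Assumed shape of a triu-based cumulative max (not the original code).
% It builds an n-by-n temporary, so storage grows as n^2.
function m = cummax_triu (v)
  n = numel (v);
  % Column j of M holds v(1:j) on and above the diagonal, zeros below.
  M = triu (v(:) * ones (1, n));
  % Entries below the diagonal must never win the max, so make them -Inf.
  M(tril (ones (n), -1) == 1) = -Inf;
  m = max (M, [], 1);            % 1-by-n row of running maxima
end

% One-pass alternative that needs only O(n) storage.
function m = cummax_loop (v)
  m = v(:).';
  for k = 2:numel (m)
    m(k) = max (m(k-1), m(k));
  end
end

v = [3 -1 4 1 -5 9 2];
disp (isequal (cummax_triu (v), cummax_loop (v)))   % prints 1
```

For a vector of length n the first version allocates several n-by-n double matrices, which is the limitation described above. The loop trades vectorisation for linear memory.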
Here is what I see with your code on my machine for your function and
the oct-file version I sent (which was in the directory devel/).
octave:1> n=3000; a=ones(n,n);
octave:2> function z=vtriu(z) n=rows(z);
z((1:n)'*ones(1,n)>ones(n,1)*(1:n))=0; end
octave:3> t=cputime;a=triu(a);cputime-t
ans = 0.98785
octave:4> t=cputime;b=vtriu(a);cputime-t
ans = 0.92986
octave:5> cd devel
octave:6> t=cputime;c=triu(a);cputime-t
ans = 0.23496
octave:7> all(all(a==b))
ans = 1
octave:8> all(all(a==c))
ans = 1
That is roughly a fourfold speed-up for the oct-file version.
D.
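
As a footnote to the thread, the one-line `vtriu` can be unpacked into separate steps. This is a minimal sketch for a square matrix. The name `vtriu_steps` and the intermediates `r`, `c` and `below` are introduced here for illustration and are not part of the original posts.

```octave
1;  % script-file marker so the function below can be defined in a script

function z = vtriu_steps (z)
  n = rows (z);                 % assumes z is square, as in the thread
  r = (1:n)' * ones (1, n);     % r(i,j) = i, the row index of each entry
  c = ones (n, 1) * (1:n);      % c(i,j) = j, the column index of each entry
  below = r > c;                % logical mask, true strictly below the diagonal
  z(below) = 0;                 % zero those entries, keeping the upper triangle
end

a = magic (4);
disp (isequal (vtriu_steps (a), triu (a)))   % prints 1
```

Note that `r` and `c` are full n-by-n double matrices, so this approach allocates two extra temporaries of the same size as the input plus the mask. That overhead may be part of why the vectorised m-file gains only a little over triu.m.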