octave-maintainers

Re: slow LU decomposition? for octave 4 for windows


From: Tatsuro MATSUOKA
Subject: Re: slow LU decomposition? for octave 4 for windows
Date: Wed, 1 Jun 2016 17:40:56 +0900 (JST)

----- Original Message -----

> From: Tatsuro MATSUOKA 
> To: siko1056, octave-maintainers
> Cc: 
> Date: 2016/6/1, Wed 07:50
> Subject: Re: slow LU decomposition? for octave  4 for windows
> 
> ----- Original Message -----
> 
>>  From: siko1056 
>>  To: octave-maintainers
>>  Cc: 
>>  Date: 2016/5/31, Tue 21:07
>>  Subject: Re: slow LU decomposition? for octave  4 for windows
>> 
>>  Dear Tatsuro MATSUOKA,
>> 
>>  Do you also see this performance issue with the vectorized test script in
>>  all your versions? Without for-loops, I see similarly good results on my PC
>>  with Octave 4.0.1 and dev.
>> 
>>  ------------------
>>  more off 
>> 
>>  Num=1000;
>>  rand('seed',1); 
>>  A=rand(Num)-0.5; 
>>  rand('seed',2); 
>>  B=rand(Num,ItNum)-0.5; 
>>  [L U P]=lu(A); 
>>  % 
>>  disp('Simple left division'); 
>>  tic; 
>>  x=A\B;  % vectorized
>>  toc; 
>>  x1=x(:,end); 
>>  % 
>>  disp('LU decomposition'); 
>>  tic; 
>>  c=P*B; y=L\c; x=U\y;  % vectorized
>>  toc; 
>>  x2=x(:,end); 
>>  id=1:Num; 
>>  plot(id,x1, 'o1',id,x2, '+2'); 
>>  ------------------
>> 
>>  That would puzzle me. For-loops are slow (Rik gave an excellent talk
>>  about this at OctConf 2015, http://wiki.octave.org/OctConf_2015), and
>>  maybe older versions of Octave handled loops in another way?!
>> 
>>  Regards,
>>  Kai
>> 
> Kai
> 
> Thank you for the vectorized test script.
> 
> One correction:
> 
> Num=1000;
> |
> V
> Num=1000; ItNum=10;
> 
> 
> 
> ************************************ 
> Octave-3.2.4 mingw 
>>>  lutest_v 
> 
> Simple left division
> Elapsed time is 0.114076 seconds.
> LU decomposition
> Elapsed time is 0.00600397 seconds.
> ************************************ 
> Octave-3.6.4 msvc 
> 
> 
>>>  lutest_v
> Simple left division
> Elapsed time is 0.11 seconds.
> LU decomposition
> Elapsed time is 0.00399995 seconds.
> ************************************ 
> Octave-3.6.4 mingw 
> 
> 
>>>  lutest_v
> Simple left division
> Elapsed time is 0.0790549 seconds.
> LU decomposition
> Elapsed time is 0.00300312 seconds.
> ************************************ 
> Octave-3.8.2 mingw 
> 
> 
>>>  lutest_v
> Simple left division
> Elapsed time is 0.096066 seconds.
> LU decomposition
> Elapsed time is 0.00300384 seconds.
> ************************************ 
> Octave-4.0.0 mingw (32 bit) 
> 
> 
>>>  lutest_v
> Simple left division
> Elapsed time is 0.07305 seconds.
> LU decomposition
> Elapsed time is 0.01701 seconds.
> ************************************ 
> Octave-4.0.2 mingw (32 bit) 
> 
> Simple left division
> Elapsed time is 0.060039 seconds.
> LU decomposition
> Elapsed time is 0.0180109 seconds.
> ************************************ 
> Octave-4.0.2 mingw (64 bit) 
> 
>>>  lutest 
> 
> Simple left division
> Elapsed time is 0.0500329 seconds.
> LU decomposition
> Elapsed time is 0.00900698 seconds.
> ************************************ 
> octave-4.1.0+ mingw (64bit) 
> (hg clone on May 28, 2016) 
> 
>>>  lutest 
> 
> Simple left division
> Elapsed time is 0.213143 seconds.
> LU decomposition
> Elapsed time is 0.0230169 seconds.
> ************************************
> 
> 
> As you said, there is a slow-loop issue.
> However, even when vectorized, LU decomposition in version 4 on Windows is
> slower than in version 3.
> Surprisingly, the results on 4.1.0+ are the worst. This is a bad situation and 
> should be improved.
> 
> Should I file this as a bug?
> 
> Tatsuro 


Before filing this in the bug tracker, here are test results on Ubuntu 14.04 64-bit
(Athlon X2, not a fast machine).


% lutest_v.m
more off 
 
Num=1000; ItNum=10;
rand('seed',1); 
A=rand(Num)-0.5; 
rand('seed',2); 
B=rand(Num,ItNum)-0.5; 
[L U P]=lu(A); 
% 
disp('Simple left division'); 
tic; 
x=A\B;  % vectorized
toc; 
x1=x(:,end); 
% 
disp('LU decomposition'); 
tic; 
c=P*B; y=L\c; x=U\y;  % vectorized
toc; 
x2=x(:,end); 
id=1:Num; 
plot(id,x1, 'o1',id,x2, '+2'); 

% end of lutest_v.m
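
The script above compares the two solutions only visually through the plot. A numerical cross-check could look like the following minimal sketch. It is not part of the original test script: the smaller size n = 200 and the variable names x_direct, x_lu, rel_diff and rel_res are assumptions chosen for illustration, and no output values are claimed here.

```octave
% Sketch: check that left division and the LU-based solve agree.
% n and k are assumed values, smaller than Num=1000 to keep the run short.
n = 200; k = 10;
rand('seed', 1);  A = rand(n) - 0.5;
rand('seed', 2);  B = rand(n, k) - 0.5;

x_direct = A \ B;              % simple left division

[L, U, P] = lu(A);             % factorization with P*A = L*U
x_lu = U \ (L \ (P * B));      % permute, then two triangular solves

% Relative difference between the two solutions (Frobenius norm).
rel_diff = norm(x_direct - x_lu, 'fro') / norm(x_direct, 'fro');
% Relative residual of the LU-based solution.
rel_res  = norm(A * x_lu - B, 'fro') / norm(B, 'fro');

printf('relative difference: %g\n', rel_diff);
printf('relative residual:   %g\n', rel_res);
```

If both numbers are close to machine precision, the timing comparison is at least comparing equivalent results.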


************************************

3.8.1 from Ubuntu repository 
>> lutest_v
Simple left division
Elapsed time is 0.396069 seconds.
LU decomposition
Elapsed time is 0.0427001 seconds.
************************************
4.0.0 built myself (gcc 4.8.4) 
>> lutest_v
Simple left division
Elapsed time is 0.378708 seconds.
LU decomposition
Elapsed time is 0.071897 seconds.
************************************

4.0.2 built myself (gcc 4.8.4) 
>> lutest_v
Simple left division
Elapsed time is 0.377592 seconds.
LU decomposition
Elapsed time is 0.0599658 seconds.
************************************

4.1.0+ built myself (gcc 4.8.4) (cloned 2016-06-01 JST)
>> lutest_v
Simple left division
Elapsed time is 0.378928 seconds.
LU decomposition
Elapsed time is 0.0592132 seconds.
************************************


For simple left division, the differences are within tolerance.
The LU decomposition is slowest on 4.0.0 and fastest on 3.8.1,
but the differences are small compared with those on Windows.
In addition, the slowness of simple left division observed on 4.1.0+ does not
appear on Ubuntu 14.04 64-bit.

I think the slowness of LU decomposition in Octave 4 on Windows is not acceptable.

I would like to see test results and opinions from other people.
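
For anyone who wants to repeat the measurement, a single tic/toc of a few milliseconds can be noisy, so a repeated run may give steadier numbers. The following is only a sketch under assumptions: the repeat count nrep and the names t_div, t_lu and t_fact are hypothetical and not part of lutest_v.m. It also times the factorization itself, which lutest_v.m performs outside the timed section.

```octave
% Sketch: repeat each measurement and report median and minimum times.
Num = 1000; ItNum = 10;
nrep = 10;                         % assumed number of repetitions
rand('seed', 1);  A = rand(Num) - 0.5;
rand('seed', 2);  B = rand(Num, ItNum) - 0.5;

t_div  = zeros(nrep, 1);           % A\B
t_fact = zeros(nrep, 1);           % lu(A) alone
t_lu   = zeros(nrep, 1);           % solve with precomputed L, U, P

for r = 1:nrep                     % loop over repetitions only
  tic;  x = A \ B;                  t_div(r)  = toc;
  tic;  [L, U, P] = lu(A);          t_fact(r) = toc;
  tic;  x = U \ (L \ (P * B));      t_lu(r)   = toc;
end

printf('Octave %s\n', version());
printf('left division: median %.6f s, min %.6f s\n', median(t_div),  min(t_div));
printf('lu(A):         median %.6f s, min %.6f s\n', median(t_fact), min(t_fact));
printf('LU solve:      median %.6f s, min %.6f s\n', median(t_lu),   min(t_lu));
```

Reporting the Octave version together with the timings should make results from different machines easier to compare.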

Regards

******************************************
Tatsuro MATSUOKA

Department of Chemical Engineering
Nagoya University, Japan
******************************************


