From: Alois Schlögl
Subject: Re: [OctDev] Question on performance, coding style and competitive software
Date: Thu, 23 Apr 2009 11:18:54 +0200
User-agent: Thunderbird 2.0.0.21 (X11/20090318)
Jaroslav Hajek wrote:
> On Wed, Apr 22, 2009 at 4:18 PM, Alois Schlögl <address@hidden> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> David Bateman wrote:
>>> Alois Schlögl wrote:
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>>
>>>> As some of you might know, my other pet project besides Octave, is
>>>> BioSig http://biosig.sf.net. BioSig is designed in such a way that it
>>>> can be used with both, Matlab and Octave. Mostly for performance reason,
>>>> we cannot abandon support for Matlab [1,2]. Octave is a viable
>>>> alternative in case the computational performance is not important. In
>>>> order to decide on the future strategy of BioSig, I hope to get answers
>>>> on the following questions:
>>>>
>>>> 1) Core development of Octave:
>>>> At the meeting of Octave developer in 2006, the issue was raised
>>>> that the Octave is about 4 to 5 times slower than Matlab [1]. (I
>>>> repeated the tests just recently, the results are attached below, and
>>>> show a difference of factors up to 13, average ~5) This issue is most
>>>> relevant for large computational jobs, where it makes a difference
>>>> whether a specific task takes 1 day or 5 days. Is anyone working to
>>>> address this problem? Is there any hope that the performance penalty
>>>> becomes smaller or will go away within a reasonable amount of time ?
>>>>
>>> Its hard to tell what the source of your speed issues are.. The flippant
>>> response would be that with a JIT in octave then yes we could be as
>>> fast, we just need someone to write it. I suspect something will be done
>>> here in the future. The recent changes of John to have an evaluator
>>> class and his statement of adding a profiler in Octave 3.4 mean that the
>>> machinery needed to add a JIT will be in place.
>>
>> Good to know that someone is working on this. However, as far as I
>> understand its currently not possible to estimate when the performance
>> penalty is expected to be nullified.
>>
>
> Agreed. And I am more pessimistic than David about JIT in near future,
> unless someone gets funding for that (maybe via GSoC or something).
>
>>> However looking at your wackerman its not clear to me that its your
>>> for-loop that is taking all of the time in Octave. If it is have you
>>> considered rewriting
>>>
>>> for k = 1:size(m2,1),
>>>   if all(finite(m2(k,:))),
>>>     L = eig(reshape(m2(k,:), [K,K]));
>>>     L = L./sum(L);
>>>     if all(L>0)
>>>       OMEGA(k) = -sum(L.*log(L));
>>>     end;
>>>   end;
>>> end;
>>>
>>> with something like
>>>
>>> rows_m2 = size(m2, 1);
>>> m3 = permute (reshape (m2, [rows_m2, K, K]), [2, 3, 1]);
>>> idx = all (finite (m2), 1);
>>> t = cellfun (@(x) eig(x), mat2cell (m3 (:, :, idx), K, K, ones(1,
>>> rows_m2)),
>>> 'UniformOutput', false);
>>> t = cellfun (@(x) - sum (x .* log (x)),
>>> cellfun (@(x) x ./ sum(x), 'UniformOutput', false));
>>> t(iscomplex(t)) = NaN;
>>> OMEGA(idx) = t;
>>>
>>> The code above is of course untested. But in the latest tip that should
>>> be much faster for Octave as Jaroslav optimized cellfun recently
>>
>> Using Jaroslav's code and some modifications (diag of 300000 element
>> vector was just too large)
>
> In 3.1.5x this is no longer an issue, because diagonal matrices are
> optimized. In 3.0.x, I think you can use "dmult" to do the row
> scaling. Or the outer product trick you do below, but the diag
> expression is both more readable and faster in 3.1.5x. Sorry, I just
> tend to think in terms of the development version :)
Thanks for the solution. The problem was with Matlab. You might ask why
I bother you with this; it's just that I do not want to ignore Matlab users.
>
>> rows_m2 = size(m2, 1);
>> m3 = permute (reshape (m2, [rows_m2, K, K]), [2, 3, 1]);
>> idx = all (isfinite (m2), 2);
>> t = cellfun (@eig, mat2cell (m3 (:, :, idx), K, K, ones(1, sum(idx))), 'UniformOutput', false);
>> t = [t{:}];
>> idx2 = all(t>0);
>> t = t(:,idx2) ./ [ones(K,1) * sum(t(:,idx2))];
>> t = -sum (t .* log (t));
>> idx = find(idx);
>> OMEGA(idx(idx2)) = t;
>>
>> the performance increases for Octave from 82.9 to 15.2 s. Thanks.
>> (The program slowed down on Matlab from 13.0 to 66.15 s, though).
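For anyone who wants to compare the two variants on their own machine, here is a minimal, self-contained sketch. It is not the BioSig code: m2 is synthetic, K and N are arbitrary, and each row is assumed to hold a symmetric positive definite K-by-K matrix, so that all eigenvalues are real and positive. The sketch keeps the leading minus sign of the loop version, so that both variants return the same values.

```octave
% Sketch only: synthetic data, names K, N, OMEGA1, OMEGA2 are illustrative.
K = 3;
N = 200;
m2 = zeros (N, K*K);
for k = 1:N
  A = randn (K);
  S = A*A';
  S = (S + S')/2 + K*eye (K);   % exactly symmetric, positive definite
  m2(k,:) = S(:)';
end
m2(5,1) = NaN;                  % one invalid row to exercise the finite check

% Loop version
OMEGA1 = NaN (N, 1);
for k = 1:N
  if all (isfinite (m2(k,:)))
    L = eig (reshape (m2(k,:), [K, K]));
    L = L ./ sum (L);
    if all (L > 0)
      OMEGA1(k) = -sum (L .* log (L));
    end
  end
end

% cellfun version
OMEGA2 = NaN (N, 1);
idx = all (isfinite (m2), 2);
m3 = permute (reshape (m2, [N, K, K]), [2, 3, 1]);
t = cellfun (@eig, mat2cell (m3(:,:,idx), K, K, ones (1, sum (idx))), 'UniformOutput', false);
t = [t{:}];                     % K-by-n matrix, one column of eigenvalues per valid row
idx2 = all (t > 0);
t = t(:,idx2) ./ (ones (K, 1) * sum (t(:,idx2)));
idx = find (idx);
OMEGA2(idx(idx2)) = -sum (t .* log (t));

max_diff = max (abs (OMEGA1(idx) - OMEGA2(idx)));
```

Wrapping each half in tic/toc gives the timing comparison; the absolute numbers will of course depend on the interpreter version and on N.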
>
> That's quite surprising. Are you sure you didn't leave the "diag"
> expression in the Matlab test? I don't see why it should get that
> slower...
>
>> I'm not sure how this technique can be used for the other functions
>> (aar, findclassifier).
>
> Maybe a different technique will work, I haven't yet looked. There are
> of course also codes that can't be vectorized.
>
>> Memory usage is also an issue.
>
> Not that much, I hope. Also note 3.1.5x does manage memory more
> efficiently, apparently even more efficiently than Matlab. Anyway
> Octave (and Matlab) is not really a good tool for memory-critical
> applications, because the COW mechanism is very ill-suited for such
> applications. You definitely want references or pointers if you need
> to keep memory low.
>
>>>> 2) Coding style:
>>>> Octave understands a superset of commands compared to matlab, and it
>>>> seems the current policy is to enforce the "octave style" and make the
>>>> use of toolboxes incompatible for a use with Matlab. Is not it sensible
>>>> to write platform-neutral applications ? Specifically, is not it in our
>>>> own interest (wider usage make the code more robust) that matlab users
>>>> are not "forced" to buy additional toolboxes but can use open source
>>>> toolboxes e.g. from octave-forge?
>>>>
>>> I'd personally consider that up to the toolboxes author. Using texinfo
>>> in the help string makes the Octave help string "nicer".... I however
>>> don't think a policy should be made that toolboxes on octave-forge
>>> should be matlab compatible..
>>>
>>
>> I know its up to the toolbox authors. I'm not sure that every author is
>> aware of this. In case someone wants to modify some functions from
>> octave-forge/main for the use with matlab, and make it available to
>> others, what is the proper procedure for this (a) if he is the original
>> author and the function is already in octave-forge/main (b) if he wants
>> to modify an existing function from some other author ?
>
> If he wants to keep that function in the package, then (obviously) he
> should follow the package's policy (determined by author or
> maintainer). If he just wants to share it on his own, then he should
> feel free to do any changes he wishes, as long as he honors the GPL.
>
>> The texinfo is the minor problem, because the function is still usable
>> even if the documentation is not properly displayed.
>> The main issues are the incompatible syntax like
>> - comments: # vs. %
>> - end vs. endif-endfor-endwhile-endfunction etc.,
>> - single quote vs. double quote
>> - negation operator: ! vs ~
>> which make it impossible to use most octave toolboxes in Matlab
>>
>> BTW, what are the arguments in favor of using octave-only coding style ?
>>
>
> comments: # is much more common. % is, AFAIK, recognized only by
> Octave and other Matlabish software and TeX.
> Also, on UNIX # allows to use the #! mechanism and thus make
> executable octave scripts.
There are all kinds of comment styles (//, /* */), and because shell and
Octave scripts are two different things, this is important.
Cases using the shebang mechanism would certainly need some attention.
However, among all m-files at octave-forge, only
octave-forge/main/info-theory/doc/info-theory.m
uses the shebang mechanism.
>
> specific end blocks: they catch typing errors more easily, and the
> code is more easily parsable for both humans and computers. I also
> consider it an extremely bad idea that "end" is likewise used in index
> expressions. I think Cleve Moler (or whoever designed it) must have
> been drinking that night.
The idea of the end-operator is also used in other languages (Python,
etc.), so I guess it's not completely insane. After some initial
reluctance, I have found the end-operator very useful.
>
> quotes: again, double quotes are somewhat more standard, in particular
> in the C-derived world. more importantly, ""s allow things like \n,
> \t.
Octave does not claim to be compatible with C, but with Matlab. \n and \t
can also be used with single quotes, in Octave as well as in Matlab, when
the string is passed to printf-style functions such as sprintf or fprintf.
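To make the difference concrete, here is a small sketch. It runs in Octave only as written, because the first line uses a double-quoted string; the variable names are just for illustration.

```octave
% Octave: double-quoted strings expand escapes, single-quoted ones do not.
dq = "a\tb";             % 3 characters: a, TAB, b
sq = 'a\tb';             % 4 characters: a, backslash, t, b
fq = sprintf ('a\tb');   % sprintf expands the escape in its format string
n_dq = length (dq);
n_sq = length (sq);
same = strcmp (dq, fq);  % true: both contain a real TAB
```

So the single-quoted form gives the same result as long as the string goes through sprintf or fprintf, and that form is accepted by both interpreters.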
>
> negation - this is purely syntactic sugar, AFAIK, again for
> compatibility with the C world.
This "syntactic sugar" is part of the issue and could be easily avoided.
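As an illustration of what I mean by platform-neutral code, here is a minimal sketch that stays within the subset both interpreters accept: % comments, plain end, single-quoted strings and ~ for negation. The data and variable names are made up for the example.

```octave
% Written in the subset that both Octave and Matlab accept.
x = [1 NaN 3 4];
valid = ~isnan (x);            % '~' rather than '!'
total = 0;
for k = 1:numel (x)
  if valid(k)
    total = total + x(k);
  end                          % 'end' rather than 'endif'
end                            % 'end' rather than 'endfor'
msg = sprintf ('sum of %d valid values: %g', sum (valid), total);
disp (msg);
```

Nothing is lost in Octave by writing it this way, and the file can be used unchanged in Matlab.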
>
>>>> 3) Scope of Octave and Octave-Forge:
>>>> Open source software has its own merit, but sometimes also other factors
>>>> (e.g. additional costs in hardware, energy supply and cooling systems,
>>>> energy efficiency = "green computing") need to be considered. Given the
>>>> fact that octave-core is currently slower for some tasks, it is worth
>>>> considering to use proprietary mat-engine. The question is whether
>>>> Octave and Octave-forge should provide support of toolboxes for matlab
>>>> users too, or whether these users should go somewhere else? What do you
>>>> think ?
>>>>
>>> I'm not sure how this point differs from your second point.. Again to me
>>> its up to the toolboxes/packages author to decide whether they want
>>> matlab compatibility or not. If a toolbox is compatible I see no issue
>>> sending matlab users to octave-forge for code..
>>>
>> Yes, the question is closely related to the previous one. Of course, if
>> the toolbox is compatible to matlab, there is no problem for the matlab
>> users. Unfortunately, most toolboxes (all in Octave and
>> octave-forge/main and most of octave-forge/extra) are using the
>> octave-only coding style.
>>
>> This seems to suggest that a fork is necessary in order to make the
>> toolboxes applicable for matlab users. Is there an alternative ?
>>
>
> You can try to explain to the developers why making the packages
> Matlab-compatible is worth their effort.
> Maybe you'll succeed, at least with some of them.
>
It would be nice if developers aiming at compatibility between Octave
and Matlab could feel at home here.
I also looked at David's suggestion to use oct2mat. Line 188,
gsub("[\\]$","...");
caused this error:
awk: /home/schloegl/matlab/oct2mat/oct2mat: line 188: regular expression
compile failed (bad class -- [], [^] or [)
When I removed the line, the problem was gone. Does anyone have a proper
substitute for this line?
Cheers,
Alois