Re: Speed up 'unique'
From: Jaroslav Hajek
Subject: Re: Speed up 'unique'
Date: Thu, 18 Dec 2008 20:45:09 +0100

On Thu, Dec 18, 2008 at 1:12 PM, Daniel J Sebald <address@hidden> wrote:
> Jaroslav Hajek wrote:
>>
>> On Wed, Dec 17, 2008 at 11:52 AM, Daniel J Sebald
>> <address@hidden> wrote:
>>
>>> Attached is a patch that tweaks unique.m to cut the time of the
>>> following test by 4:
>>>
>>> function testfunc3(m, varargin)
>>> x = [[1:20] [20:-1:1]];
>>> tstart = cputime();
>>> for i=1:5000
>>> [y i j] = unique(x);
>>> end
>>> cputime() - tstart
>>> end
>>>
>>> The reason for the improvement is that all the testing for the options
>>> is avoided if no options are present. (I.e., avoids three calls to
>>> 'strmatch' and a couple conditionals.) Also, 'prepad' really isn't a
>>> necessary call, as I see it.
>>>
>>> If one looks closely at the handling of options, I added a recursive
>>> call to make sure the options are unique. Otherwise, something like
>>> this fails:
>>>
>>> unique([1:10], 'first', 'first')
>>>
>>> The recursion looks funny, but because of the added check on the number
>>> of arguments, it is not an infinite loop.
>>>
>>> Dan
>>>
>>
>>
>> I don't object to this change; but if you intend to use `unique' as a
>> time-critical function, perhaps rewriting it in C++ would be even
>> better...?
>
> Could do that. In any case, the convention in most scripts seems to be
> checking nargin before testing options. It's cleaner that way.
>
> All routines are time critical when working with big data files.

Well, my experience is close to the opposite. When the data grows large
in a real application, usually only a few spots show up as bottlenecks
in a profile. I'd restate this as "all routines *may* be time-critical".
In any case, I tend to agree that any of Octave's built-in or library
functions is worth optimizing.

> Always
> program as efficiently as possible. Basically, Octave/Matlab isn't too much
> of a loss over C if done efficiently, but extraneous string tests really take
> a hit.
>
Here, I think the main problem is that the strmatch function used to
parse the options is an m-function. Although Octave's library
m-functions bypass (by default) timestamp checks, calling an
m-function is significantly slower than calling a built-in or dld
function.
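
To illustrate the idea (checking the argument count before doing any
string tests), here is a minimal sketch. It is not the code from the
patch or from unique.m: the helper name parse_unique_opts is made up
for this example, and it compares strings with the built-in strcmp
instead of strmatch.

```octave
1;  % mark this as a script so the function below can be defined inline

% Hypothetical helper (not part of Octave): parses the "first"/"last"
% options and returns one flag for each.
function [optfirst, optlast] = parse_unique_opts (varargin)
  optfirst = false;
  optlast = false;

  % Fast path: with no option arguments, skip every string comparison.
  if (nargin == 0)
    return;
  end

  for k = 1:nargin
    opt = varargin{k};
    if (! ischar (opt))
      error ("parse_unique_opts: options must be strings");
    end
    % strcmp is a built-in, so no m-function is called here.
    if (strcmp (opt, "first"))
      optfirst = true;
    elseif (strcmp (opt, "last"))
      optlast = true;
    else
      error ("parse_unique_opts: unknown option '%s'", opt);
    end
  end

  if (optfirst && optlast)
    error ("parse_unique_opts: 'first' and 'last' cannot be combined");
  end
end
```

With this layout a call without options returns before any string is
compared, and a repeated option such as ("first", "first") just sets
the same flag twice, so no recursion is needed to handle it.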

Anyway, I applied this patch.

Thanks,

--
RNDr. Jaroslav Hajek
computing expert
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz