[Octave-bug-tracker] [bug #50416] dir command is very slow with large di
From: Rik
Subject: [Octave-bug-tracker] [bug #50416] dir command is very slow with large directories
Date: Tue, 28 Feb 2017 12:34:25 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:43.0) Gecko/20100101 Firefox/43.0
Update of bug #50416 (project octave):
Priority: 5 - Normal => 3 - Low
Status: None => Confirmed
_______________________________________________________
Follow-up Comment #1:
I think the first thing to do is try to optimize the m-file, and only if
necessary convert it to C++.
The way to optimize is to run the profiler. I tried:
  profile on
  x = dir ('/usr/bin')
  profile off
  profexport ('profdata')
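The same data can also be inspected at the prompt without exporting HTML. This is a minimal sketch, assuming any readable directory will do (tempdir is used here only so it runs anywhere; it is not the directory from the test above):

```octave
profile on;
x = dir (tempdir ());        # stand-in directory, not the '/usr/bin' run quoted above
profile off;

info = profile ("info");     # struct holding the flat and hierarchical profiler data
profshow (info, 10);         # print the ten functions with the largest self time
```

profshow gives only the flat view; the hierarchical breakdown still needs profexport or profexplore.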
As expected, the runtime results follow the Pareto principle or 80/20 rule
where most of the slowdown is caused by just a small number of function calls.
According to the hierarchical results in profdata:
  Function    Total (s)   Self (s)   Calls
  dir             3.012      0.610       1
  display         0.000      0.000       1
  profile         0.000      0.000       1
And if I explore the dir function itself:
  Function    Total (s)   Self (s)   Calls
  fullfile        1.005      0.254    2901
  fileparts       0.841      0.295    2901
  datenum         0.455      0.380    2901
  lstat           0.033      0.033    2901
  localtime       0.027      0.027    2901
  strftime        0.020      0.020    2901
  readdir         0.007      0.007       1
  stat            0.006      0.006     578
  S_ISDIR         0.002      0.002    2902
  S_ISLNK         0.002      0.002    2901
  binary +        0.002      0.002    5802
  binary <        0.001      0.001    2902
The first three functions (1.005 + 0.841 + 0.455 = 2.301) explain 2.3/3.0 =
77% of the delay.
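For reference, the share can be recomputed from the unrounded figures in the tables. A small sketch, with variable names of my own choosing:

```octave
total_dir = 3.012;                  # total time for dir, from the first table
top3 = [1.005, 0.841, 0.455];       # fullfile, fileparts, datenum totals
share = sum (top3) / total_dir;     # fraction of dir's runtime spent in those three
printf ("top three: %.3f s of %.3f s (%.1f%%)\n", sum (top3), total_dir, 100 * share);
```

With the unrounded values this comes out at about 76.4%; the 77% figure corresponds to the rounded ratio 2.3/3.0.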
Looking in dir.m I see a loop (bad) that uses fullfile:
  for i = 1:nf
    flst{i} = fullfile (fn, flst{i});
  endfor
Since the directory portion 'fn' is always being prepended, it would be faster
to use strcat here:
  flst = strcat ([fn filesep], flst);
If I do that, the run time decreases from 3.012 to 1.922 seconds, a reduction
of about 36%, which is definitely the right direction.
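A quick way to check that the two forms agree, and to time them on a synthetic list, is sketched below. The directory and the generated file names are made up, and fn is assumed to have no trailing separator:

```octave
fn = "/usr/bin";                    # hypothetical directory, no trailing separator
flst = arrayfun (@(k) sprintf ("file%04d", k), 1:2901, ...
                 "UniformOutput", false);   # synthetic file names

tic;
looped = cell (size (flst));
for i = 1:numel (flst)
  looped{i} = fullfile (fn, flst{i});       # one fullfile call per entry
endfor
t_loop = toc;

tic;
vectorized = strcat ([fn filesep], flst);   # single call over the whole cell array
t_vec = toc;

assert (isequal (looped, vectorized));      # both forms must produce identical names
printf ("loop: %.3f s, strcat: %.3f s\n", t_loop, t_vec);
```

The timings will differ from the profiler numbers above, since the profiler adds its own overhead per call.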
This was only a quick test. You might need to convert 'fn' to a proper
directory using fullfile just once and then use strcat, but the result is
basically sound. Since fullfile (fn, filesep) should already leave a trailing
separator, fn can then be passed to strcat as is:
  fn = fullfile (fn, filesep);
  flst = strcat (fn, flst);
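A sketch of normalizing the directory once, assuming fullfile leaves exactly one trailing separator when its last argument is filesep. The names prefix and full, and the sample inputs, are mine:

```octave
fn = "/usr/bin";                    # hypothetical directory
flst = {"awk", "ls", "sed"};        # hypothetical file names

prefix = fullfile (fn, filesep);    # called once, outside any loop
full = strcat (prefix, flst);       # prefix already ends in a separator

## Sanity checks: one trailing separator on the prefix, none doubled in the results.
assert (prefix(end) == filesep);
assert (isempty (strfind (full{1}, [filesep filesep])));
```

Under that assumption the prefix needs no extra filesep before it is handed to strcat.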
After that, the next thing would be to optimize how fileparts and datenum are
used since they are the other two large slowdowns. The fileparts call might be
replaced with a call to regexp; datenum might be replaced by the direct
calculation of the datenum since you know the exact format and input. You
could look inside datenum.m and see which calculation mode is actually being
used and then just perform that one.
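Both ideas can be sketched roughly as follows. This assumes that only the name plus extension is needed from fileparts, that paths use "/" as the separator, and that the timestamp is available as separate year, month (1..12), day, hour, minute and second values. The day-count arithmetic is my own sketch, not copied from datenum.m, so it checks itself against datenum:

```octave
## Part 1: name plus extension without fileparts.
paths = {"/usr/bin/awk", "/usr/bin/foo.tar.gz", "relative.txt"};  # hypothetical inputs
names = regexprep (paths, '^.*/', '');    # greedy match drops everything up to the last "/"

for i = 1:numel (paths)                   # cross-check against fileparts
  [~, nm, ext] = fileparts (paths{i});
  assert (strcmp (names{i}, [nm ext]));
endfor

## Part 2: serial date number computed directly (Gregorian calendar assumed).
y = 2017; mo = 2; d = 28; h = 12; mi = 34; s = 25;   # example timestamp
monthstart = [306 337 0 31 61 92 122 153 184 214 245 275];  # day offsets, year starting in March
yy = y + ceil ((mo - 14) / 12);           # count Jan and Feb as part of the previous year
dd = d + monthstart(mo) + 60;
dd += 365*yy + floor (yy/4) - floor (yy/100) + floor (yy/400);  # leap-year rule
direct = dd + (h + (mi + s/60)/60)/24;    # add the fraction of the day

assert (abs (direct - datenum (y, mo, d, h, mi, s)) < 1e-8);
```

Whether this is actually faster inside dir.m would still need to be confirmed with the profiler.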
_______________________________________________________
Reply to this item at:
<http://savannah.gnu.org/bugs/?50416>