bug#29921: O(n^2) performance of rm -r
From: Jim Meyering
Subject: bug#29921: O(n^2) performance of rm -r
Date: Mon, 1 Jan 2018 13:07:49 -0800
On Mon, Jan 1, 2018 at 8:40 AM, Pádraig Brady <address@hidden> wrote:
> tag 29921 notabug
> close 29921
> stop
>
> On 31/12/17 21:07, Niklas Hambüchen wrote:
>> Hello,
>>
>> in commit
>>
>> rm -r: avoid O(n^2) performance for a directory with very many entries
>> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=24412edeaf556a
>>
>> it says that `rm -r`
>>
>> "now displays linear performance, even when operating on million-entry
>> directories on ext3 and ext4"
>>
>> I found this not to be the case on my systems running ext4 on Linux 4.9
>> on a WD 4TB spinning disk, coreutils rm 8.28.
>> As reported on https://bugs.python.org/issue32453#msg309303, I got:
>>
>> nfiles real user sys
>>
>> 100000 0.51s 0.07s 0.43s
>> 200000 2.46s 0.15s 0.89s
>> 400000 10.78s 0.26s 2.21s
>> 800000 44.72s 0.58s 6.03s
>> 1600000 180.37s 1.06s 10.70s
>>
>> Each 2x increase of number of files results in 4x increased deletion
>> time, making for a clean O(n^2) quadratic curve.
>>
>> I'm testing this with:
>>
>> set -e
>> rm -rf dirtest/
>> echo 100000 && (mkdir dirtest && cd dirtest && seq 1 100000 | xargs touch) && time rm -r dirtest/
>> echo 200000 && (mkdir dirtest && cd dirtest && seq 1 200000 | xargs touch) && time rm -r dirtest/
>> echo 400000 && (mkdir dirtest && cd dirtest && seq 1 400000 | xargs touch) && time rm -r dirtest/
>> echo 800000 && (mkdir dirtest && cd dirtest && seq 1 800000 | xargs touch) && time rm -r dirtest/
>> echo 1600000 && (mkdir dirtest && cd dirtest && seq 1 1600000 | xargs touch) && time rm -r dirtest/
>>
>>
>> On another system, Ubuntu 16.04, coreutils rm 8.25, with Linux 4.10 on
>> ext4, I get:
>>
>> nfiles real user sys
>>
>> 100000 0.94s 0.06s 0.516s
>> 200000 2.94s 0.10s 1.060s
>> 400000 10.88s 0.30s 2.508s
>> 800000 34.60s 0.48s 4.912s
>> 1600000 203.87s 0.99s 11.708s
>>
>> Also quadratic.
>>
>> Same machine on XFS:
>>
>> nfiles real user sys
>>
>> 100000 3.37s 0.04s 0.98s
>> 200000 7.20s 0.06s 2.03s
>> 400000 22.52s 0.16s 5.11s
>> 800000 53.31s 0.37s 11.46s
>> 1600000 200.83s 0.76s 22.41s
>>
>> Quadratic.
>>
>> What is the complexity of `rm -r` supposed to be?
>> Was it really linear in the past as the patch describes, and this is a
>> regression?
>> Can we make it work linearly again?
>
> Note the patch mentioned above sorts by inode order,
> which is usually a win, avoiding random access across the device.
>
> The user and sys components of the above are largely linear,
> so the increasing delay is waiting on the device.
> I presume that's coming from increased latency effects
> as the cached operations are sync'd more to the device
> with increasing numbers of operations.
> There are many Linux kernel knobs for adjusting
> caching behaviour in the I/O scheduler.
>
> To confirm I did a quick check on a ramdisk (/dev/shm)
> to minimize the latency effects, and that showed largely
> linear behaviour in the real column also.
>
> Note if you want to blow away a whole tree quite often,
> it may be more efficient to recreate the file system
> on a separate (loopback) device.
Thanks for the report.
As Pádraig suggests, that looks like it may well represent an
opportunity to improve cache tuning. All the same, I wanted to confirm
that the heuristic of sorting by inode number is still valuable and
ran this test:
I rebuilt "rm" with lib/fts.c changed to effectively disable that
sorting (i.e., never sort, instead of "sort when there are at least
10,000 entries"):
-# define FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD 10000
+# define FTS_INODE_SORT_DIR_ENTRIES_THRESHOLD 100000000
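For illustration, the effect of the heuristic that threshold controls
(unlinking directory entries in ascending inode order rather than
readdir order) can be approximated from the shell. This is only a
sketch, assuming GNU find, sort, cut, and xargs; the "demo" directory
is a stand-in, not part of the original test:

```shell
# Approximate fts's inode-sort pass from the shell: list entries with
# their inode numbers, sort numerically by inode, and unlink in that
# order to reduce random access on ext3/ext4.
mkdir demo && (cd demo && seq 1 1000 | xargs touch)
find demo -maxdepth 1 -type f -printf '%i %p\0' |
  sort -z -n |
  cut -z -d ' ' -f 2- |
  xargs -0 rm --
rmdir demo
```

On ext3/ext4 with hashed directories (dir_index), readdir order
correlates poorly with inode order, which is why the sort pays off for
very large directories.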
Then I ran this test on a Fedora 27, 4.14.6-300.fc27.x86_64 system
using an ext4-formatted spinning disk (Samsung HD204UI). Note that
"time" here is GNU time, and %e prints elapsed seconds:
$ i=50000; while :; do i=$[i*2]; case $i in 16*) break;; esac; mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" /x/cu/src/rm -rf x; done
0.49 100000
16.14 200000
170.98 400000
523.55 800000
As you can see, removing that heuristic incurs a very large penalty.
Rerunning those examples with stock rm shows the quadratic behavior you noted.
$ i=100000; while :; do mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; case $i in 8*) break;; esac; i=$[i*2]; done
0.48 100000
2.59 200000
12.69 400000
35.61 800000
$ i=100000; while :; do mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; case $i in 8*) break;; esac; i=$[i*2]; done
0.48 100000
2.52 200000
11.53 400000
35.86 800000
$ i=100000; while :; do mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; case $i in 8*) break;; esac; i=$[i*2]; done
0.48 100000
2.55 200000
11.49 400000
38.86 800000
Even on an SSD (same system as above, with a 3-yr-old SSD), I see a
super-linear trend of O(N^1.5):
$ i=100000; while :; do mkdir x; (cd x && seq $i|xargs touch); env time -f "%e $i" env rm -rf x; case $i in 32*) break;; esac; i=$[i*2]; done
0.48 100000
1.26 200000
3.90 400000
10.56 800000
28.13 1600000
58.39 3200000
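The exponent behind such an estimate can be read straight off the
table: each time nfiles doubles, log2(t2/t1) gives the local scaling
exponent (~1.0 is linear, ~2.0 quadratic). A quick awk sketch over the
SSD numbers above:

```shell
# For each doubling of nfiles, print the local scaling exponent
# log2(t2/t1) computed from consecutive elapsed times.
printf '%s\n' '0.48 100000' '1.26 200000' '3.90 400000' \
              '10.56 800000' '28.13 1600000' '58.39 3200000' |
  awk 'NR > 1 { printf "%.2f\n", log($1 / t) / log(2) } { t = $1 }'
```

This prints 1.39, 1.63, 1.44, 1.41, 1.05: mostly around 1.4-1.6,
consistent with the rough O(N^1.5) reading.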
Note also that there is a test (tests/rm/ext3-perf.sh) that is
intended to cover this situation. However, back then, I made the error
of using an absolute threshold, and with subsequent general system
performance increases, it has become next to useless. We should
probably invest in updating that "expensive" test to use relative
comparisons, as done in grep's tests/long-pattern-perf
(https://git.savannah.gnu.org/cgit/grep.git/tree/tests/long-pattern-perf)
and tests/mb-non-UTF8-performance.
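A relative comparison in that spirit might look roughly like the
following. This is only a sketch, not the actual ext3-perf.sh: the
entry count and the 8x slack factor are illustrative assumptions, and
it assumes GNU date (%N nanoseconds):

```shell
#!/bin/sh
# Relative performance check: time "rm -rf" at N and 2N entries and
# fail if elapsed time grows far beyond the ~2x a linear rm would show.
n=20000
elapse_ms() {                      # wall-clock milliseconds of "rm -rf x"
  start=$(date +%s%N)
  rm -rf x
  echo $(( ($(date +%s%N) - start) / 1000000 ))
}
mkdir x && (cd x && seq "$n" | xargs touch)
t1=$(elapse_ms)
mkdir x && (cd x && seq $((n * 2)) | xargs touch)
t2=$(elapse_ms)
# Generous slack so noisy machines still pass; quadratic growth at
# large N would blow well past an 8x ratio per doubling.
if [ "$t2" -gt $(( (t1 + 1) * 8 )) ]; then
  echo "rm -r scaled superlinearly: ${t1}ms -> ${t2}ms" >&2
  exit 1
fi
echo "ok: ${t1}ms -> ${t2}ms"
```

Because the bound is a ratio between two runs on the same machine, it
does not go stale as hardware gets faster, unlike an absolute
time threshold.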
Yes, I was measuring elapsed wall-clock time back when I made those
changes to rm and fts. So I do agree that this is a problem, but I
doubt we can do anything in the coreutils to fix it.