du counting files in hard-linked directories twice
From: Michael Schwarz
Subject: du counting files in hard-linked directories twice
Date: Thu, 27 Aug 2009 18:09:11 +0200
Hello everybody,
The problem, which I consider a bug, is that in some circumstances du
counts the same files multiple times when they appear inside
hard-linked directories.
I have tested this with GNU coreutils 7.4 (MacPorts) and 7.5 (vanilla)
on Mac OS X 10.5.8 on x86-64. I ran all tests on an HFS+ volume
formatted as case-insensitive.
Judging from du.c, the problem seems to be that du only inserts a file
into its table of visited files if its link count is greater than
one (du.c, lines 513-517):
    if (skip
        || (!opt_count_all
            && ! S_ISDIR (sb->st_mode)
            && 1 < sb->st_nlink
            && ! hash_ins (sb->st_ino, sb->st_dev)))
Replacing the 1 in the condition above with a 0 solves the problem,
because every non-directory is then recorded in the table regardless
of its link count. It would probably be a much better idea, though, to
insert hard-linked directories into the hash table instead. This would
also prevent du from counting the size of directory inodes twice.
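To illustrate the effect of that one-character change, here is a minimal, self-contained sketch. It is not code from du.c: `seen_set`, `seen_insert`, `entry`, `total_size` and the `min_links` parameter are hypothetical names invented for this example, and the fixed-size array merely stands in for du's real hash table. Directory sizes are ignored; only the skip decision for non-directories is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SEEN_CAPACITY 64

/* Hypothetical stand-in for du's hash table: a small set of
   (device, inode) pairs searched linearly. */
struct seen_set {
    uint64_t dev[SEEN_CAPACITY];
    uint64_t ino[SEEN_CAPACITY];
    size_t count;
};

/* Returns true if the pair was not in the set before this call,
   false if it was already present. This mirrors how the quoted
   condition uses the result of hash_ins. If the set is full, the
   pair is reported as new but not stored (good enough for a sketch). */
static bool seen_insert(struct seen_set *s, uint64_t ino, uint64_t dev)
{
    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++)
        if (s->dev[i] == dev && s->ino[i] == ino)
            return false;
    if (s->count < SEEN_CAPACITY) {
        s->dev[s->count] = dev;
        s->ino[s->count] = ino;
        s->count++;
    }
    return true;
}

/* One object encountered during a directory walk. */
struct entry {
    uint64_t dev, ino;
    unsigned nlink;
    bool is_dir;
    uint64_t size;
};

/* Sums the sizes of the walked entries. min_links = 1 models the
   quoted condition (1 < st_nlink); min_links = 0 models the
   suggested change (0 < st_nlink). */
static uint64_t total_size(const struct entry *e, size_t n, unsigned min_links)
{
    struct seen_set seen = { .count = 0 };
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        bool skip = !e[i].is_dir
                    && min_links < e[i].nlink
                    && !seen_insert(&seen, e[i].ino, e[i].dev);
        if (!skip)
            total += e[i].size;
    }
    return total;
}
```

Feeding it the walk from the second test (the same 1 MiB file with link count 1, reached once via top-1/dir and once via top-2/dir) yields 2 MiB with `min_links = 1` and 1 MiB with `min_links = 0`, matching the behaviour described above.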
Appended is a simple script that runs three tests:
* One where there is a single file with two links.
* One where there is a directory with two links containing one file.
* One where there is a directory with two links containing two links
to a single file.
After the script are its output on my machine and the output of
ls -ilR for reference.
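The inode numbers in that listing are what show that two names refer to the same object. For readers who would rather check this programmatically, the following is a small sketch using POSIX stat(); the helper name `same_file` is made up for this example and is not part of coreutils.

```c
#include <sys/stat.h>

/* Returns 1 if both paths resolve to the same (device, inode) pair,
   0 if they refer to different objects, and -1 if either stat()
   call fails. */
static int same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;
    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}
```

On the setup described here, calling it with top-1/dir and top-2/dir from the second test should return 1, since both names carry the same inode number in the listing.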
In all three tests there is exactly one file, 1 MiB in size, yet in
the second test (and only there) du reports a total size of 2 MiB.
I hope you agree that this is a bug in du and that I did not miss a
problem on my side in evaluating this. I will gladly help in any way I
can to fix this problem, e.g. by testing patches for people who do not
have a setup that can be used to reproduce it.
Thank you
Michael
$ cat test.sh
#! /usr/bin/env bash
set -e -o pipefail
# Test with a single file with two links.
(
echo 'Test with file hard links:'
rm -rf 'test-d1-f2'
mkdir 'test-d1-f2'
cd 'test-d1-f2'
mkdir 'top-1'
mkdir 'top-2'
dd bs=1k count=1k < /dev/zero > 'top-1/file' 2> /dev/null
link 'top-1/file' 'top-2/file'
du -h
echo
)
# Test with a directory with two links containing a single file with
# one link.
(
echo 'Test with directory hard links:'
rm -rf 'test-d2-f1'
mkdir 'test-d2-f1'
cd 'test-d2-f1'
# Because of a restriction in HFS+ or Mac OS, we need to put the
# two directory entries for the hard-linked directory into separate
# directories.
mkdir 'top-1'
mkdir 'top-2'
mkdir 'top-1/dir'
link 'top-1/dir' 'top-2/dir'
dd bs=1k count=1k < /dev/zero > 'top-1/dir/file' 2> /dev/null
du -h
echo
)
# Test with a directory with two links containing a single file with
# two links.
(
echo 'Test with file and directory hard links:'
rm -rf 'test-d2-f2'
mkdir 'test-d2-f2'
cd 'test-d2-f2'
mkdir 'top-1'
mkdir 'top-2'
mkdir 'top-1/dir'
link 'top-1/dir' 'top-2/dir'
dd bs=1k count=1k < /dev/zero > 'top-1/dir/file-1' 2> /dev/null
link 'top-1/dir/file-1' 'top-1/dir/file-2'
du -h
echo
)
$ ./test.sh
Test with file hard links:
1.0M ./top-1
0 ./top-2
1.0M .
Test with directory hard links:
1.0M ./top-1/dir
1.0M ./top-1
1.0M ./top-2/dir
1.0M ./top-2
2.0M .
Test with file and directory hard links:
1.0M ./top-1/dir
1.0M ./top-1
0 ./top-2/dir
0 ./top-2
1.0M .
$ ls -ilR
.:
total 4
12382224 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 test-d1-f2
12382230 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 test-d2-f1
12382237 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 test-d2-f2
12382194 -rwxr-xr-x 1 michi michi 1207 Aug 27 18:06 test.sh
./test-d1-f2:
total 0
12382225 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382226 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2
./test-d1-f2/top-1:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file
./test-d1-f2/top-2:
total 1024
12382227 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file
./test-d2-f1:
total 0
12382231 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382232 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2
./test-d2-f1/top-1:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir
./test-d2-f1/top-1/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file
./test-d2-f1/top-2:
total 0
12382233 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 dir
./test-d2-f1/top-2/dir:
total 1024
12382236 -rw-r--r-- 1 michi michi 1048576 Aug 27 18:06 file
./test-d2-f2:
total 0
12382238 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-1
12382239 drwxr-xr-x 3 michi michi 102 Aug 27 18:06 top-2
./test-d2-f2/top-1:
total 0
12382240 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 dir
./test-d2-f2/top-1/dir:
total 2048
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-1
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-2
./test-d2-f2/top-2:
total 0
12382240 drwxr-xr-x 4 michi michi 136 Aug 27 18:06 dir
./test-d2-f2/top-2/dir:
total 2048
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-1
12382243 -rw-r--r-- 2 michi michi 1048576 Aug 27 18:06 file-2
$