bug-tar
Regression on file deduplication with -T


From: Dominique Martinet
Subject: Regression on file deduplication with -T
Date: Mon, 15 Jun 2020 14:49:16 +0200

Hi,

We've got interesting users that run a find to build a list of files
(without filtering directories), then create a tar archive from that
list (without --no-recursion).

They noticed that after upgrading from 1.26 to 1.30 (el7->el8), the
generated archive got much, much bigger.

I told them to add --no-recursion, because, well, obviously that'll
help, but I believe this is a bug as -T (--files-from) says "The names
read are handled the same way as command line arguments." and that
apparently isn't the case here (see final example).


Romain (in cc) bisected this to 26538c9bfc5fd ("Reduce memory consuption
when handling the -T option."), so the regression dates back to exactly 1.27.
Thanks!


Here's a trivial reproducer:
$ mkdir -p A/B/C/D
$ truncate -s 10M A/B/C/D/foo
$ find A > list
$ tar cf tar -T list
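
To check a given tar without reading the whole listing, the reproducer can be
wrapped in a small script that counts how often foo is stored in full versus
as a hard link. This is only a sketch: the temporary directory, the out.tar
name and the 1M file size are illustrative choices, not part of the original
report, and it assumes GNU tar and coreutils.

```shell
#!/bin/sh
# Sketch: count full copies vs. hard-link entries in an archive built with -T.
set -eu
work=$(mktemp -d)              # scratch directory (illustrative)
cd "$work"
mkdir -p A/B/C/D
truncate -s 1M A/B/C/D/foo     # smaller than the 10M original, for speed
find A > list
tar cf out.tar -T list
# In "tar tv" output a regular file line starts with "-", a hard link with "h".
full=$(tar tvf out.tar | grep -c '^-' || true)
links=$(tar tvf out.tar | grep -c '^h' || true)
echo "full copies: $full, hard links: $links"
```

Going by the listings in this report, an unaffected tar should store a single
full copy and the rest as hard links, while an affected one stores every
occurrence in full.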


With tar 1.26 (el7):
$ tar tvf tar
drwxrwxr-x user/grp    0 2020-06-15 10:39 A/
drwxrwxr-x user/grp    0 2020-06-15 10:39 A/B/
drwxrwxr-x user/grp    0 2020-06-15 10:39 A/B/C/
drwxrwxr-x user/grp    0 2020-06-15 10:40 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:40 A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:39 A/B/
drwxrwxr-x user/grp        0 2020-06-15 10:39 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:40 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:40 A/B/C/D/foo link to A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:39 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:40 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:40 A/B/C/D/foo link to A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:40 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:40 A/B/C/D/foo link to A/B/C/D/foo
hrw-rw-r-- user/grp        0 2020-06-15 10:40 A/B/C/D/foo link to A/B/C/D/foo

With tar 1.30 (el8) or 1.32 (Fedora 32):
$ tar tvf tar
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo

For comparison, passing the same names as command line arguments still
deduplicates on newer tar as well, e.g.
$ tar cf tar A A/B A/B/C A/B/C/D A/B/C/D/foo
$ tar tvf tar
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp 0 2020-06-15 10:21 A/B/C/D/
-rw-rw-r-- user/grp 10485760 2020-06-15 10:12 A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:12 A/B/C/D/foo link to A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:12 A/B/C/D/foo link to A/B/C/D/foo
drwxrwxr-x user/grp        0 2020-06-15 10:21 A/B/C/D/
hrw-rw-r-- user/grp        0 2020-06-15 10:12 A/B/C/D/foo link to A/B/C/D/foo
hrw-rw-r-- user/grp        0 2020-06-15 10:12 A/B/C/D/foo link to A/B/C/D/foo
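
For reference, the --no-recursion workaround mentioned earlier would look
roughly like this on the same tree. Again a sketch under assumptions: the
scratch directory, the norec.tar name and the 1M file size are made up for
illustration. The option is placed before -T so that it applies to the names
read from the list.

```shell
#!/bin/sh
# Sketch: build the archive from the find list without recursing into directories.
set -eu
work=$(mktemp -d)              # scratch directory (illustrative)
cd "$work"
mkdir -p A/B/C/D
truncate -s 1M A/B/C/D/foo
find A > list
# --no-recursion comes before -T so it covers every name read from the list.
tar cf norec.tar --no-recursion -T list
entries=$(tar tf norec.tar | wc -l)
echo "entries: $entries"
```

Each name from the list should then be archived exactly once, so neither
hard links nor repeated copies of foo are needed.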


Cheers,
-- 
Dominique Martinet