Hi Aiyion,
On 22.07.2022 at 10:13, Aiyion.Prime <help-tar@aiyionpri.me> wrote:
Good morning everyone,
For a few years now I thought I knew my way around tar, but yesterday evening
I learned I was wrong about that:
I'm archiving a directory structure that contains large redundant files.
onepath/readme
onepath/binaryblob13
anotherpath/readme
anotherpath/binaryblob13
I don't know your complete workflow, so I can only give a vague idea:
Assuming you are using symlinks in the above structure:
• instead of archiving the complete directories recursively, create a list of
files to be saved for `tar`: first all symlinks (as symlinks), then all real
files
• on extraction, `--occurrence=1` makes `tar` stop at the first match
• in case it's a symlink, remove the extracted symlink file and extract the
real file it points to with the name of the symlink file
This should speed up the processing.
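
The steps above could be sketched roughly as follows. This is only a minimal
illustration under several assumptions that are not from your mail: GNU tar
(for `--occurrence`), GNU coreutils `realpath` (for `-m` and `--relative-to`),
per-file symlinks rather than a symlinked directory, and a made-up helper
`extract_one` plus made-up names `demo.tar`, `members.txt`, `src` and `out`.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical layout: the blob is stored once, the second path is a symlink.
mkdir -p "$work/src/onepath" "$work/src/anotherpath" "$work/out"
printf 'blob-data\n' > "$work/src/onepath/binaryblob13"
printf 'readme\n' > "$work/src/onepath/readme"
printf 'readme\n' > "$work/src/anotherpath/readme"
ln -s ../onepath/binaryblob13 "$work/src/anotherpath/binaryblob13"

# Member list for tar: all symlinks first, then all real files.
(
  cd "$work/src"
  find onepath anotherpath -type l | sort
  find onepath anotherpath -type f | sort
) > "$work/members.txt"

# No -h here, so symlinks are archived as symlinks.
tar -C "$work/src" -cf "$work/demo.tar" -T "$work/members.txt"

# extract_one ARCHIVE MEMBER DESTDIR (hypothetical helper)
extract_one() {
  archive=$1 member=$2 dest=$3

  # Stop reading the archive at the first match.
  tar -C "$dest" -xf "$archive" --occurrence=1 "$member"

  if [ -L "$dest/$member" ]; then
    # Translate the link target into a member name relative to the archive root.
    target=$(readlink "$dest/$member")
    real=$(realpath -m --relative-to="$dest" \
             "$dest/$(dirname "$member")/$target")

    # Replace the symlink with the content of the real file.
    # Note: writing via -O does not preserve the original mode or mtime.
    rm "$dest/$member"
    tar -xOf "$archive" --occurrence=1 "$real" > "$dest/$member"
  fi
}

extract_one "$work/demo.tar" anotherpath/binaryblob13 "$work/out"
cat "$work/out/anotherpath/binaryblob13"
```

Because the symlinks sit at the front of the archive, the first `tar` call
returns quickly; only a member that turns out to be a symlink costs a second
pass over the archive.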
-- Reuti
I cannot change the paths, as this is to be fed to a package manager that
requires them.
To avoid an archive twice the size of `binaryblob13`, I thought I could use
symlinks or hard links together with the `-h` flag at creation.
So archiving this:
onepath/
secondpath -> onepath/
using
tar --sort=name --owner=0 --group=0 --numeric-owner -chvf normal_sized.tar \
    secondpath onepath ${mtime}
That would work like a charm if said package manager extracted the whole
tar file.
This is what it does though:
tar xf $tar_file secondpath/binaryblob13
That works fine if I extract files from the directory that was referenced first
in the creation command (secondpath in the case above), but it returns an error
for the directory archived later: tar tries to create a hard link on disk
pointing to what would have been the previously extracted file. As that file
does not exist, I've got a problem.
I'd like to avoid extracting all binaryblob13 references beforehand only to
have the link I extract point to something valid.
Is there a flag to tell tar "I don't care if you have to search the archive
twice, but extract the original file instead of creating an (invalid) hard link"?
I realize that's unusable for actual tape records, but maybe someone has a hint
for me here.
Thanks in advance and have a nice morning,
Aiyion