Timothy Beryl Grahek
Re: [Lzip-bug] Tarlz 0.4: Use of 'ustar' format instead of 'posix'; question about future of Tarlz utility
Mon, 4 Jun 2018 19:41:10 -0700
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0
Thanks. I won't use the pax format then.
Upon doing some additional reading out of curiosity, I noticed the
following important paragraphs regarding the 'pax' format:
"The concept of a global extended header (/typeflag/ *g*) was
controversial. If this were applied to an archive being recorded on
magnetic tape, a few unreadable blocks at the beginning of the tape
could be a serious problem; a utility attempting to extract as many
files as possible from a damaged archive could lose a large percentage
of file header information in this case. However, if the archive were on
a reliable medium, such as a CD-ROM, the global extended header offers
considerable potential size reductions by eliminating redundant
information. Thus, the text warns against using the global method for
unreliable media and provides a method for implanting global information
in the extended header for each file, rather than in the /typeflag/ *g*
records."
"No facility for data translation or filtering on a per-file basis is
included because the standard developers could not invent an interface
that would allow this in an efficient manner. If a filter, such as
encryption or compression, is to be applied to all the files, it is more
efficient to apply the filter to the entire archive as a single file.
The standard developers considered interfaces that would invoke a shell
script for each file going into or out of the archive, but the system
overhead in this approach was considered to be too high."
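
The difference between the two placements described in the first quoted paragraph can be sketched with Python's standard tarfile module (an assumption for illustration; this is not how tarlz or GNU Tar are implemented). The helper names build() and first_typeflag() are hypothetical. Keywords passed to the archive as a whole are written once in a typeflag g record at the start, while keywords attached to a member are written in a typeflag x record directly before that member:

```python
import io
import tarfile


def build(global_headers=None, per_file_headers=None):
    """Build a one-member pax archive in memory and return its bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT,
                      pax_headers=global_headers) as tar:
        data = b"hello\n"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(data)
        if per_file_headers:
            # Keywords stored with this member only (typeflag x record).
            info.pax_headers = dict(per_file_headers)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def first_typeflag(archive_bytes):
    """Return the typeflag byte (offset 156) of the first 512-byte header."""
    return archive_bytes[156:157]


# One global record at the start of the archive.
global_style = build(global_headers={"comment": "demo"})
# The same keyword repeated in the extended header of the file itself.
per_file_style = build(per_file_headers={"comment": "demo"})

print(first_typeflag(global_style), first_typeflag(per_file_style))
```

In the global style, every member depends on that single leading record being readable; in the per-file style, the information travels with each member at the cost of repeating it.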
Certainly the pax format must be changed or abandoned.
Perhaps it is a good idea to contact the authors of the 'pax' format and
propose that it is worth their time to put more emphasis on data
preservation. What do you think? I am extremely willing to do this, if
you think this is possible.
Nevertheless, I have discovered that the GNU Tar authors are interested
in the 'pax' format, according to the following link:
https://www.gnu.org/software/tar/manual/html_chapter/tar_8.html But they
may have some way in mind to deal with the fact that extended records
are not protected. It seems unclear. That calls into question the
legitimacy of the 'gnu' format for data protection, since they seem
certain to abandon this format. Do you know how we will find out what
they do with files larger than 8 GB and file names longer than 256
characters? Let us hope that they keep data preservation in mind.
In the event that the GNU format also seems unreliable, it may be wise
to stick with 'ustar' only; the restrictions of the format aren't
unreasonable, given that it is a safe archiving format. In other words,
a limit of 8 GB for a single file and 256 characters for a file name
can be reasonably accommodated. Yet perhaps there is another archive
format that we are overlooking; if not, I wonder whether a new archive
format must be invented that does not commit the oversight that the
'pax' format seems to have committed. Whatever keeps data preservation
in mind and happens to be humanly possible is best. But I am certainly
eager to know what the future holds, regardless of the outcome.
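
The 8 GB and 256-character figures follow from the fixed field widths of the ustar header: a 12-byte size field holding 11 octal digits, a 100-byte name field, and a 155-byte prefix field joined to the name by a '/'. A minimal sketch of a pre-flight check follows, assuming UTF-8 encoded names and ignoring special cases such as directory names with a trailing slash; the function names are hypothetical and do not come from tarlz:

```python
USTAR_NAME_LEN = 100       # bytes available in the 'name' field
USTAR_PREFIX_LEN = 155     # bytes available in the 'prefix' field
USTAR_MAX_SIZE = 8**11 - 1  # largest value of 11 octal digits (8 GiB - 1)


def split_ustar_name(path):
    """Return (prefix, name) as bytes if the path fits, else None."""
    raw = path.encode("utf-8")
    if len(raw) <= USTAR_NAME_LEN:
        return b"", raw
    # Try every '/' as the point where prefix and name are divided.
    for i in range(len(raw)):
        if raw[i:i + 1] == b"/":
            prefix, name = raw[:i], raw[i + 1:]
            if (0 < len(prefix) <= USTAR_PREFIX_LEN
                    and 0 < len(name) <= USTAR_NAME_LEN):
                return prefix, name
    return None


def fits_ustar(path, size):
    """True if a file with this path and size needs no format extension."""
    return 0 <= size <= USTAR_MAX_SIZE and split_ustar_name(path) is not None


# Longest possible path: 155-byte prefix + '/' + 100-byte name = 256 bytes.
longest = "d" * 155 + "/" + "f" * 100
print(fits_ustar(longest, USTAR_MAX_SIZE))
print(fits_ustar("f" * 101, 0))  # no '/' to split at, so it cannot fit
```

Note that the 256-byte maximum is only reachable when a '/' happens to fall where the split is needed; a single path component longer than 100 bytes never fits.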
Thank you for your time regarding this matter. Please feel free to take
your time; I know you are very busy.
Agreed. Any tar format used by tarlz must be safe by itself. Remember
that tarlz can also create uncompressed archives.
Yes, safe uncompressed archives are extremely important; otherwise, it
is best not to archive at all.