
[Bug-tar] Recursive Archival


From: TRANS
Subject: [Bug-tar] Recursive Archival
Date: Wed, 19 Jul 2006 11:41:25 -0400

Hello,

I recently came up with a new idea for an archive format that I call
"recursive archival" or "inverse tar". I created a simple
implementation of this in Ruby called Rock. (If you like you can read
about it here: http://rock.rubyforge.org/overview.html). Originally I
had some additional ideas about using the format for Ruby packages.
But on further consideration, I think the idea is too "fundamental" to
justify such a project and really would make more sense as simply an
additional feature of tar itself.

The format is simple. Let's label the files it creates .rtar to
distinguish them from normal .tar files (perhaps an internal marker
would do that instead in actual implementation?). So given an example
directory:

 mydir/
   foo.txt
   subdir/
     bar.txt

Archiving it inversely/recursively would give:

 mydir.rtar/
   foo.txt
   subdir.rtar/
     bar.txt

I use / here to mean "rtar file contains". So you can see what I mean
by recursive archive. Every file and subdir is first archived before
becoming part of its parent directory's archive. Adding compression
is even better:

 mydir.rtar.gz/
   foo.txt.gz
   subdir.rtar.gz/
     bar.txt.gz
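
To make the layout concrete, here is a minimal sketch in Python of how
such an archive could be built. It is not the Rock implementation; the
function name pack_recursive and the .rtar.gz / .gz member suffixes are
assumptions taken from the listing above. It handles only regular files
and directories and drops metadata such as permissions and timestamps.

```python
import gzip
import io
import os
import tarfile


def pack_recursive(path):
    """Return gzip-compressed bytes for `path`, packed recursively.

    A regular file becomes its gzip-compressed contents. A directory
    becomes a gzip-compressed tar whose members are the already-packed
    children, so every level is archived before its parent.
    """
    if os.path.isfile(path):
        with open(path, "rb") as f:
            return gzip.compress(f.read())

    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(os.listdir(path)):
            child = os.path.join(path, name)
            data = pack_recursive(child)  # pack the child first
            # Hypothetical naming: subdirs get .rtar.gz, files get .gz
            suffix = ".rtar.gz" if os.path.isdir(child) else ".gz"
            info = tarfile.TarInfo(name=name + suffix)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return gzip.compress(buf.getvalue())
```

Writing the result of pack_recursive("mydir") to a file named
mydir.rtar.gz would give the structure shown in the listing.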

I did some basic analysis, and the compressed sizes are often slightly
smaller than regular tar.gz files past about 100K. I was also
surprised to see that the speed difference in decompression wasn't as
great as I expected --it performs fairly well.

The good thing about this format is that it allows selective
decompression of the contents --what I call "cherry picking" the
archive. For instance, if a directory had many subdirs, one need only
decompress one of them to get to files within it --not the whole
archive. At the very least this has application in software packaging
where package metadata could be extracted independent of source code
--that was my use case and why I thought of it in the first place.
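
As a rough illustration of cherry picking, the Python sketch below pulls
a single file out of a compressed recursive archive. The function name
cherry_pick and the .rtar.gz / .gz member naming are assumptions for
illustration, not part of Rock or tar. Only the archives along the
requested path are decompressed; sibling members are left untouched.

```python
import gzip
import io
import tarfile


def cherry_pick(blob, parts):
    """Return the bytes of one file from a recursively packed archive.

    `blob` is the gzip-compressed outer archive and `parts` is the path
    inside it, e.g. ["subdir", "bar.txt"].
    """
    for i, part in enumerate(parts):
        is_last = i == len(parts) - 1
        # Hypothetical naming: subdirs end in .rtar.gz, files in .gz
        member = part + (".gz" if is_last else ".rtar.gz")
        with tarfile.open(fileobj=io.BytesIO(gzip.decompress(blob))) as tar:
            # Siblings at this level are skipped and stay compressed.
            blob = tar.extractfile(member).read()
    return gzip.decompress(blob)
```

For the example directory, cherry_pick(data, ["subdir", "bar.txt"])
would open mydir.rtar.gz and subdir.rtar.gz but never decompress
foo.txt.gz.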

Given that this format is really just a "rearrangement" of a
traditional tar, it seems to me it would be best simply as an added
feature of tar itself rather than a whole separate entity. Do you
think support for recursive layout could be added to tar?

Thanks,
Trans.

P.S. I wouldn't be surprised if someone came up with this idea before.
But doing some basic googling did not turn anything up.