[Bug-tar] Making recovery from corrupted tarballs more reliable
From: Marc Aurele La France
Subject: [Bug-tar] Making recovery from corrupted tarballs more reliable
Date: Thu, 28 Oct 2004 10:33:28 -0600 (MDT)
Hi.
I was recently faced with the task of recovering as much as I could out of a
corrupted tarball of some 600GB in size. Attached are changes to GNU tar that
greatly facilitated this process. The changes are described below. Do with
these as you see fit.
The changes are not the full story, however. I ended up writing a small utility
that looked for unaligned headers before piping the stream to tar. I have not
spent the time to incorporate this latter functionality into GNU tar.
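A utility of that kind is not part of the attached diff. As a rough illustration only, the Python sketch below scans a byte stream for the string "ustar" at any offset, treats the 512 bytes starting 257 bytes earlier as a candidate header, and keeps it only if the stored checksum matches. The function names (`header_checksum_ok`, `find_headers`) are hypothetical, and the sketch assumes headers carry a ustar or GNU magic; V7 headers, whose magic field is empty, would not be found this way.

```python
BLOCK = 512          # tar block size in bytes
MAGIC_OFFSET = 257   # offset of the magic field within a header
CHKSUM = slice(148, 156)  # the 8-byte chksum field


def header_checksum_ok(block: bytes) -> bool:
    """Return True if the stored chksum matches the sum of the header bytes."""
    field = block[CHKSUM].strip(b" \0")
    try:
        stored = int(field, 8)
    except ValueError:
        return False
    # The checksum is computed with the chksum field itself taken as 8 spaces.
    computed = sum(block[:148]) + 8 * 0x20 + sum(block[156:BLOCK])
    return stored == computed


def find_headers(data: bytes):
    """Yield byte offsets of plausible headers, block-aligned or not."""
    pos = data.find(b"ustar")
    while pos != -1:
        start = pos - MAGIC_OFFSET
        if start >= 0 and start + BLOCK <= len(data):
            if header_checksum_ok(data[start:start + BLOCK]):
                yield start
        pos = data.find(b"ustar", pos + 1)
```

An offset that is not a multiple of 512 marks a header that has slipped off the block grid. One way to use the result is to drop the bytes before such a header so that the stream handed to tar starts on a block boundary again.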
There is also no guarantee that the contents of all the files I ended up with
are actually valid, given that the tar format does not allow for the
checksumming of file data blocks.
The changes were originally developed against GNU tar 1.14, and only affect
list.c. I have verified that they apply cleanly to today's cvs HEAD at
Savannah.
The changes affect the following:
1) I found read_header() to be much too lax about what it considers to be a
   valid header before making decisions based on that header. The changes
   cause read_header() to return HEADER_FAILURE early on (after the
   HEADER_ZERO_BLOCK check) if either of two additional checks fails:
   a) The header must contain a valid magic field. Potential breakage here is
      that this check assumes the magic field is a null string in V7 tarballs.
   b) The header's chksum field must be formatted as GNU tar produces it.
      Potential breakage here, of course, is that other tar implementations
      might produce a differently formatted chksum.
2) The remainder of this diff affects the --block-number option:
   a) Print the block number on the same unit (stdlis or stderr) as the message
      to be thus prefixed;
   b) Prefix the block number to more of list.c's messages;
   c) Change the block number printed to be the number of the very first block
      relevant to the current filename, rather than the number of the block
      current when the message is produced.
   The only "odd" behaviour I've noticed so far with these --block-number
   changes is an occasional double prefix when stdout & stderr are directed to
   the same unit.
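The two header checks in item 1 can be sketched as follows. This is a Python illustration, not the C code from the attached diff. It assumes the accepted magic values are POSIX ustar, old GNU, and an all-NUL field for V7, and that the chksum layout GNU tar produces is six octal digits, a NUL, and a space; both assumptions should be verified against the tar sources. `classify_header` and the string constants are hypothetical stand-ins for read_header() and its return codes.

```python
import re

# Stand-ins for the return codes named in the text (assumed here as strings).
HEADER_ZERO_BLOCK = "HEADER_ZERO_BLOCK"
HEADER_FAILURE = "HEADER_FAILURE"
HEADER_SUCCESS = "HEADER_SUCCESS"

# Magic (6 bytes at offset 257) plus version (2 bytes at offset 263).
KNOWN_MAGICS = (
    b"ustar\x0000",  # POSIX ustar: "ustar\0" followed by version "00"
    b"ustar  \x00",  # old GNU: "ustar", two spaces, NUL
    b"\x00" * 8,     # V7: field assumed to be NUL-filled
)

# Assumed chksum layout: six octal digits, a NUL, then a space.
CHKSUM_RE = re.compile(rb"[0-7]{6}\x00 ")


def classify_header(block: bytes) -> str:
    """Apply the early sanity checks to one 512-byte header block."""
    if len(block) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    if not any(block):
        return HEADER_ZERO_BLOCK
    # Check a): the magic field must hold a recognised value.
    if block[257:265] not in KNOWN_MAGICS:
        return HEADER_FAILURE
    # Check b): the chksum field must have the expected layout.
    if not CHKSUM_RE.fullmatch(block[148:156]):
        return HEADER_FAILURE
    # A real reader would go on to verify the checksum value itself.
    return HEADER_SUCCESS
```

Rejecting a block on layout alone, before any field is decoded, keeps a damaged block from being treated as a header with a bogus size, which is the kind of decision the stricter checks are meant to prevent.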
Please reply should you have concerns regarding these changes.
Thanks.
Marc.
+----------------------------------+-----------------------------------+
| Marc Aurele La France | work: 1-780-492-9310 |
| Computing and Network Services | fax: 1-780-492-1729 |
| 352 General Services Building | email: address@hidden |
| University of Alberta +-----------------------------------+
| Edmonton, Alberta | |
| T6G 2H1 | Standard disclaimers apply |
| CANADA | |
+----------------------------------+-----------------------------------+
XFree86 developer and VP. ATI driver and X server internals.
tar-20041028.diff.gz
Description: Binary data