
Re: [Help-tar] --listed-incremental and tape full (tar 1.20 for now)


From: Jakob Bohm
Subject: Re: [Help-tar] --listed-incremental and tape full (tar 1.20 for now)
Date: Thu, 30 Dec 2010 17:29:41 +0100
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.2.13) Gecko/20101207 Thunderbird/3.1.7

(Sorry for the even longer quote, but this details some useful experiments that might become regression tests for a solution)

On 30-12-2010 00:55, Ersek, Laszlo wrote:
(Sorry for the long quote, I'd like to keep the context.)

On Wed, 29 Dec 2010, Jakob Bohm wrote:

I am having a problem with the behavior of --listed-incremental when the tape of a single-volume archive fills up.

To avoid the backups on subsequent days growing ever larger, I want the incremental index to list the files that were actually backed up before the tape filled up. However, currently I seem to get an index saying that nothing was backed up on the full tape (it's uncompressed LTO-4, i.e. 800 GiB per tape).

One thing I have not tried (because it would create a multi-volume archive with the second and later volumes missing) would be to specify "--tape-length 800000000000 --info-script=/bin/false". However, given the observed handling of a full tape, I suspect that doing this would have the same result as a simple broken output pipe [2].

1. This is on a Debian 5.0.x (Lenny) system with its corresponding version of GNU tar 1.20, but compiling another tar version to make things work would not be much of a problem.

2. I am piping the output from tar through a double-buffering program, similar to the classic "buffer" program, in order to reduce shoe-shining in the tape drive caused by the disk being slower than the tape. The buffer is configured to accumulate about 792 MiB of archive data at a time, write lots of 100 KiB tape blocks continuously in a burst at tape speed, then sleep until the buffer is full again. Thus a full tape technically manifests itself as a broken output pipe rather than a real disk-full error code, in case that makes any difference to tar.

3. This is a production system; I really need the backup to work 5 days a week, 52 weeks a year. As each backup run may take up to 18 hours (2 hours lead time + 1 minute per burst), I cannot do much full-scale experimenting.

4. This is all done by a cron job, no human interaction possible.
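For what it's worth, the reblocking half of the pipeline in note 2 can be sketched with stock tools. This is only a stand-in: dd plays the role of the custom double-buffer program (it reblocks but does not double-buffer), /dev/null plays the tape drive, and the data set is made up for illustration:

```shell
# Stand-in for the pipeline in note 2: tar emits 10 KiB records (-b 20),
# dd reblocks them into 100 KiB writes.  /dev/null replaces the tape
# device, dd replaces the custom double-buffer program.
set -e
dir=$(mktemp -d); cd "$dir"
mkdir data && echo hello > data/file.txt

tar -b 20 -c -f - data | dd obs=100K of=/dev/null status=none
```

The real setup differs in that the buffer program absorbs ~792 MiB before writing, so the drive streams in bursts instead of trickling.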


I tried to check the situation that I think you would be in if the disk could keep up with the tape:

- --listed-incremental
- tape full signalled with -1/ENOSPC on write()
- single volume

I created 10 files of '\0', 2 MB each (zf0 .. zf9). I created a 5.5 MB ext2 filesystem image and loop-mounted it (/mnt/tmp). Then I tried to create the single-volume archive, without any pre-existing metadata file.
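The fixture could be recreated along these lines (a sketch: the mke2fs and loop-mount steps need a real system and root, so they are shown as comments; the names zf0..zf9 and img follow the listing below):

```shell
# Recreate the test fixture: ten 2 MB files of '\0' plus a 5.5 MB
# filesystem image to serve as the "small tape".
set -e
dir=$(mktemp -d)
cd "$dir"

for i in 0 1 2 3 4 5 6 7 8 9; do
    dd if=/dev/zero of=zf$i bs=1M count=2 status=none
done

dd if=/dev/zero of=img bs=1K count=5632 status=none   # 5.5 MB
# mke2fs -F img                      # format as ext2 (no root needed)
# sudo mount -o loop img /mnt/tmp    # loop-mount (root needed)

ls -goh zf? img
```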

  $ rpm -q tar
  tar-1.23-7.fc14.x86_64

  $ ls -goh zf? img
  -rw-------. 1 5.5M 2010-12-29 23:53:21 +0100 img
  -rw-------. 1 2.0M 2010-12-29 23:43:13 +0100 zf0
  -rw-------. 1 2.0M 2010-12-29 23:43:13 +0100 zf1
  -rw-------. 1 2.0M 2010-12-29 23:43:13 +0100 zf2
  -rw-------. 1 2.0M 2010-12-29 23:48:02 +0100 zf3
  -rw-------. 1 2.0M 2010-12-29 23:48:02 +0100 zf4
  -rw-------. 1 2.0M 2010-12-29 23:48:02 +0100 zf5
  -rw-------. 1 2.0M 2010-12-29 23:52:00 +0100 zf6
  -rw-------. 1 2.0M 2010-12-29 23:52:00 +0100 zf7
  -rw-------. 1 2.0M 2010-12-29 23:52:00 +0100 zf8
  -rw-------. 1 2.0M 2010-12-29 23:52:00 +0100 zf9

  $ tar -c -v -f /mnt/tmp/z.tar --listed-incremental=z.snar zf*
  zf0
  zf1
  zf2
  tar: /mnt/tmp/z.tar: Wrote only 2048 of 10240 bytes
  tar: Error is not recoverable: exiting now

  $ ls -goh z.snar
  -rw-------. 1 0 2010-12-30 00:04:15 +0100 z.snar

That is, without -M, the level 0 backup fails when it encounters ENOSPC, and the metadata file remains empty even though two files had been written completely.

I removed the partial tar file and the empty snar file, and repeated the above with -M. At each prompt I simply removed the last volume.

  $ tar -M -c -v -f /mnt/tmp/z.tar --listed-incremental=z.snar zf*
  zf0
  zf1
  zf2
  Prepare volume #2 for `/mnt/tmp/z.tar' and hit return:
  ./GNUFileParts.10717/zf2.1
  zf3
  zf4
  zf5
  Prepare volume #3 for `/mnt/tmp/z.tar' and hit return:
  ./GNUFileParts.10717/zf5.2
  zf6
  zf7
  Prepare volume #4 for `/mnt/tmp/z.tar' and hit return:
  ./GNUFileParts.10717/zf7.3
  zf8
  zf9

  $ ls -goh z.snar
  -rw-------. 1 36 2010-12-30 00:08:56 +0100 z.snar

I did this in order to end up with a complete metadata file. Now I tried to update it, again by backing up to a single-volume file (level 1):

  $ rm /mnt/tmp/z.tar
  $ touch zf[5-8]
  $ cp z.snar z.snar.bak

  $ tar -c -v -f /mnt/tmp/z1.tar --listed-incremental=z.snar zf*
  zf5
  zf6
  zf7
  tar: /mnt/tmp/z1.tar: Wrote only 2048 of 10240 bytes
  tar: Error is not recoverable: exiting now

  $ cmp z.snar z.snar.bak

Thus the snar file was not updated, even though two files were archived. I removed the partial single-volume level 1 archive and retried the above (against the unchanged snar file) by backing up to a multi-volume level 1 archive. Again, I "changed" volumes by "rm".

  $ tar -M -c -v -f /mnt/tmp/z1.tar --listed-incremental=z.snar zf*
  zf5
  zf6
  zf7
  Prepare volume #2 for `/mnt/tmp/z1.tar' and hit return:
  ./GNUFileParts.10767/zf7.1
  zf8

  $ cmp z.snar.bak  z.snar
  z.snar.bak z.snar differ: char 23, line 2

This time the metadata file was updated correctly.

Up to this point, I believe the experiment has shown that without -M, the snar file won't be updated if tar runs out of space, even if you omit the buffering application and write directly to the tape. So adding --tape-length=XXXX (which implies -M) can't put you in a worse situation than you're presently in. The question is whether passing -M and then refusing to provide blank media would create a usable snar file (for level 0) or update it (for higher levels):

  $ rm z.snar* /mnt/tmp/z1.tar

  $ tar -M -c -v -f /mnt/tmp/z.tar --listed-incremental=z.snar zf*
  zf0
  zf1
  zf2
  Prepare volume #2 for `/mnt/tmp/z.tar' and hit return: q
  tar: No new volume; exiting.

  tar: WARNING: Archive is incomplete
  tar: Error is not recoverable: exiting now

  $ ls -goh z.snar
  -rw-------. 1 0 2010-12-30 00:21:39 +0100 z.snar

The answer is negative. You really need to complete the entire backup process to get a usable index, independently of whether you write to a tape or a pipe, and of whether the archive is single- or multi-volume.

Ok, I suspected this.

For future tests with simulated buffering, just run "tar -b 20 -f - [options and args] | dd bs=10240 count=[somesmallsize] of=/dev/null"
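Wrapped into a self-contained check, that suggestion might look like this (file sizes and the dd count are arbitrary; dd exiting after 100 records breaks the pipe, simulating the full tape):

```shell
# Simulate "tape full" on a level 0 run: tar writes ~5 MB, but the
# downstream dd accepts only 100 records of 10240 bytes before exiting,
# so tar dies on a broken pipe partway through.
set -e
dir=$(mktemp -d); cd "$dir"
mkdir src
for i in 0 1 2 3 4; do
    dd if=/dev/zero of=src/zf$i bs=1M count=1 status=none
done

tar -b 20 -c --listed-incremental=z.snar -f - -C src . \
    2>/dev/null | dd bs=10240 count=100 of=/dev/null status=none || true

ls -goh z.snar 2>/dev/null || echo "z.snar missing or empty"
```

If the behaviour observed above holds, z.snar stays empty (or absent) after the failed run.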

I don't know about the structure of the snar file, but if it contains a single timestamp (for example the start or the end of the most recent backup, or the highest mtime encountered during the backup), then it really may be updatable only after all files have been backed up. Otherwise it could advance past the mtime of a file tar intended to back up but missed because there was not enough space.

At least in 1.20, the .snar file is a list of individual file names and some related timestamps. Simply omitting some file names tells the next tar run that those files have not been backed up yet; I have tested that, and I actually manipulate .snar files to provoke this effect.

So writing a .snar file indicating that only some files were (completely) backed up should be trivial. Timestamps for incompletely processed directories should be written as (real mtime minus 2 seconds) in the .snar file, so the next run will re-enumerate the directory and back up its metadata again.
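A sketch of such a manipulation, assuming the format-2 layout described in the GNU tar manual's snapshot-file appendix: a "GNU tar-<version>-2" header line, then NUL-separated fields — the backup time (sec, nsec) followed by one record per directory (nfs, mtime_sec, mtime_nsec, dev, ino, name, then a list of dumpdir entries closed by an empty field). The helper name and the exact field order are my reading of that appendix, not verified against 1.20:

```python
def age_snar_dirs(data: bytes, delta: int = 2) -> bytes:
    """Rewind every directory mtime in a format-2 snar by `delta` seconds.

    Assumed layout (GNU tar manual, snapshot-file appendix): header line,
    then NUL-separated fields -- backup time (sec, nsec), followed by
    per-directory records: nfs, mtime_sec, mtime_nsec, dev, ino, name,
    dumpdir entries, and an empty field closing each record.
    """
    header, _, body = data.partition(b"\n")
    fields = body.split(b"\0")
    i = 2                      # skip the global backup timestamp (sec, nsec)
    while i + 5 < len(fields):
        fields[i + 1] = str(int(fields[i + 1]) - delta).encode()  # mtime_sec
        i += 6                 # past nfs, sec, nsec, dev, ino, name
        while i < len(fields) and fields[i] != b"":
            i += 1             # skip the dumpdir entries
        i += 1                 # past the empty field closing the record
    return header + b"\n" + b"\0".join(fields)
```

The global backup timestamp at the start of the file is left alone; only the per-directory mtimes are aged, matching the "mtime minus 2 seconds" idea above.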

(I never found myself in the vicinity of a tape drive; caveat emptor.)

How unusual, not even in a museum?

Anyway, LTO drives are very much like the 1/2" cartridges used when GNU was young (and like the big old Ampex/Tandberg 1/2" tape drives used as movie props), except for some minor details (like Quantum buying up the rights to the old DLT format, so the rest of the industry had to create a new, slightly different format called Linear Tape-Open, and then the larger capacity, of course).

They have variable-length physical blocks on the tape and generally support all of the mt(1) operations in actual hardware, using the Linux kernel's default "st" driver (or its equivalent on other free and non-free OSes). Speed is high enough to max out the drive's own dedicated SCSI adapter. The hardware programming documentation boils down to one sentence: "this drive is compliant with the standard SCSI command set for tape drives".

Once recorded, a tape cartridge is a robust square lump of plastic that can withstand more abuse than a portable hard drive. Some manufacturers promise 30 years of data retention, meaning that a tape written in 1980 using the first BSD UNIX should still be readable if you can find a compatible tape drive and hook it to a current GNU system. Tandberg is still a player in this market. DEC sold their tape division to Quantum years before becoming part of HP (which has its own tape division), so to read an old DEC tape, get the proper drive from Quantum.



