rdiff-backup-users
Re: [rdiff-backup-users] Proposal to fix long filenames


From: Chris Wilson
Subject: Re: [rdiff-backup-users] Proposal to fix long filenames
Date: Sat, 12 Nov 2005 12:25:09 +0000 (GMT)

Hi Ben,

Time to fix rdiff-backup's oldest bug?  In this message I'll describe
the problem and one way to fix it.

Great idea! Just a few small points:

There are three different ways I can see this happening:
[...]
3)  The source filesystem supports longer filenames than the
   destination.  In this case the mirror file may be too long to
   write even without any quoting.  I've never heard of this actually
   happening.

There is a fourth case: where the destination path is nested deeper in the destination filesystem than the source path is in its own. For example, I back up many machines' root directories (/) into /mnt/backup/<machine-name>/rdiff on my backup servers. In this case, both the original filename and the increments may be too long to back up.

The increments in the rdiff-backup-data directory also have "rdiff-backup-data/" prepended to their names, so the increment names may be too long as well.

The mirror_metadata file could have two additional optional fields,
called "MirrorFilename" and "IncrementFilename".  If MirrorFilename is
set, rdiff-backup reads the mirror file from the
rdiff-backup-data/long_filename_data/<mirror filename> file, instead
of from the normal location in the mirror directory.

The MirrorFilename field seems like a good idea in principle, but it means that the mirror files are not located in their usual place in the mirror filesystem. I don't think that's a good thing, as it makes it significantly harder to examine or restore the latest version "by hand", and to compute the disk space it uses.

Similarly, if IncrementFilename is set, increment data will not be
read from rdiff-backup-data/increments/<whatever>.<suffix> but from
rdiff-backup-data/long_filename_data/<increment filename>.<suffix>

Similarly, it means that some increments are not where we expect them to be.
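
To make the lookup rule in the quoted proposal concrete, here is a rough sketch of how a reader of the metadata might resolve the two locations. It is only an illustration: the record is modelled as a plain dict with a hypothetical "path" key, the function names are made up, and only the MirrorFilename and IncrementFilename field names and the long_filename_data directory come from the proposal itself.

```python
import posixpath

DATA_DIR = "rdiff-backup-data"
LONG_DIR = "long_filename_data"  # directory name taken from the proposal


def mirror_location(record, mirror_root):
    """Return where the mirror copy of this record would be read from.

    `record` is a hypothetical dict standing in for one metadata entry.
    """
    alias = record.get("MirrorFilename")
    if alias:
        # Optional field set: the file lives under long_filename_data.
        return posixpath.join(mirror_root, DATA_DIR, LONG_DIR, alias)
    # Field absent: the file is at its normal place in the mirror.
    return posixpath.join(mirror_root, record["path"])


def increment_location(record, mirror_root, suffix):
    """Return where an increment with the given suffix would be read from."""
    alias = record.get("IncrementFilename")
    if alias:
        return posixpath.join(mirror_root, DATA_DIR, LONG_DIR,
                              alias + "." + suffix)
    return posixpath.join(mirror_root, DATA_DIR, "increments",
                          record["path"] + "." + suffix)
```

The sketch also shows the objection above: as soon as either field is set, the file can no longer be found by walking the mirror tree alone.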

The alternate filenames would have boring but plentiful names like
1, 2, etc.

I'd like to propose a compromise:

rdiff-backup figures out the longest possible filename and deepest possible path for itself when examining the filesystem capabilities.
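
As a rough idea of what that probing could look like on a POSIX system, the sketch below asks the operating system for the limits with os.pathconf. The function name and the "budget" calculation are my own illustration, not anything rdiff-backup does today, and the lengths are counted in characters for simplicity where a real check would count encoded bytes.

```python
import os


def destination_limits(dest_dir):
    """Return (name_max, path_max, budget) for a destination directory.

    name_max: longest single filename the filesystem accepts.
    path_max: longest full path the system accepts.
    budget:   what is left of path_max for paths below dest_dir.
    Unix only: os.pathconf is not available on Windows.
    """
    name_max = os.pathconf(dest_dir, "PC_NAME_MAX")
    path_max = os.pathconf(dest_dir, "PC_PATH_MAX")
    # The destination prefix (for example /mnt/backup/<machine-name>/rdiff)
    # eats into the path limit; the extra 1 is the separator after it.
    budget = path_max - len(os.path.abspath(dest_dir)) - 1
    return name_max, path_max, budget
```

The budget is what matters for the fourth case above: the deeper the destination prefix, the less room is left for the source paths.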

If, during backup, any path or file to be written to the destination exceeds those lengths, it's truncated near the maximum length and a number is appended. The relevant IncrementFilename or MirrorFilename directive is written to the metadata at the same time. So for example:


        /a/really/long/path/on/a/short/path/file/system

might become

        /a/really/long/path/on/a/short/path/file/s~1

and if the filesystem's limits are so short that directories must be renamed as well, then keep at least the first character of each one:

        /a/really/long/p~1/on/a/s~1/p~1/f~1/s~1

and if that's not enough, then just replace the directory names with numbers:

        /1/1/1/1/1/1/1/1/1/1

and if that's not enough, I don't know what else you can do! :-) Shoot the admin, perhaps.
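
The three fallback stages above could be sketched roughly as follows. This is only an illustration under some loud assumptions: shorten_path is a made-up name, only the total path length is checked (not the per-component limit), lengths are counted in characters, and the appended number is always 1, so collisions between shortened names are not handled. A real implementation would have to pick the next free number in each directory and record the result in the metadata.

```python
def shorten_path(path, max_path):
    """Shorten an absolute path to at most max_path characters."""
    parts = path.strip("/").split("/")

    def joined(components):
        return "/" + "/".join(components)

    if len(joined(parts)) <= max_path:
        return joined(parts)

    # Stage 1: truncate only the final component and append "~1".
    excess = len(joined(parts)) - max_path
    last = parts[-1]
    keep = len(last) - excess - len("~1")
    if keep >= 1:
        return joined(parts[:-1] + [last[:keep] + "~1"])

    # Stage 2: working from the right, cut each long component down to
    # its first character plus "~1" until the whole path fits.
    short = list(parts)
    for i in range(len(short) - 1, -1, -1):
        if len(short[i]) > 3:
            short[i] = short[i][0] + "~1"
        if len(joined(short)) <= max_path:
            return joined(short)

    # Stage 3: replace every component with a number.
    numbered = ["1"] * len(parts)
    if len(joined(numbered)) <= max_path:
        return joined(numbered)

    raise ValueError("path cannot be shortened to %d characters" % max_path)
```

With the 47-character example path, limits of 44, 39 and 20 characters reproduce the three shortened paths shown above, and anything below 20 raises the error.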

Originally I thought that the fix for long filenames might somehow be
integrated into a scheme to detect and compress renamed files.  But
now I doubt any renaming scheme is forthcoming.

That's a pity, since I think it would now be really easy: just make a hash table of the SHA-1 checksums of the files in the mirror, and compare the checksum of each newly added file against it, to see if it's a duplicate or a moved file. This avoids having to transfer the file again.
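
A minimal sketch of that idea, assuming the checksums are computed by walking the mirror directly (a real implementation would more likely keep them in the metadata rather than re-reading every file). The function names are hypothetical.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def index_mirror(mirror_root):
    """Map SHA-1 digest -> relative path of one mirror file with that content."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(mirror_root):
        # Skip rdiff-backup's own data directory at the top of the mirror.
        if dirpath == mirror_root and "rdiff-backup-data" in dirnames:
            dirnames.remove("rdiff-backup-data")
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            index.setdefault(sha1_of(full), os.path.relpath(full, mirror_root))
    return index


def find_existing_copy(new_file, index):
    """Return the mirror path with identical content, or None if there is none."""
    return index.get(sha1_of(new_file))
```

If find_existing_copy returns a path, the new file is either a duplicate or a moved file, and its content could be taken from the mirror instead of being transferred again.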

Cheers, Chris.
--
_ ___ __     _
 / __/ / ,__(_)_  | Chris Wilson <0000 at qwirx.com> - Cambs UK |
/ (_/ ,\/ _/ /_ \ | Security/C/C++/Java/Perl/SQL/HTML Developer |
\ _/_/_/_//_/___/ | We are GNU-free your mind-and your software |
