Re: [rdiff-backup-users] feature requests and notes


From: Andrew K. Bressen
Subject: Re: [rdiff-backup-users] feature requests and notes
Date: Sat, 01 Nov 2003 23:30:12 -0500
User-agent: Gnus/5.1002 (Gnus v5.10.2) XEmacs/21.4 (Informed Management, darwin)

Ben Escoto <address@hidden> writes:
> Yes I see the duplication of effort but not how to eliminate it.  What
> do you have in mind?  Wouldn't trying to put this into rdiff-backup
> just make things more complicated and error-prone?

I was thinking of adding a bit more stuff to the metadata file, adding
an option to rdiff-backup to place a second copy of the metadata file
in some specified location to be used for the new functionality, and a
set of CLI utilities to perform operations on the spare metadata file.

Looking at an rdiff-backup metadata file, I see filetype, size, 
modtime, uid, gid, mode (permissions), and device number (if any) 
being stored. I assume that if I were running 0.13.x on a filesystem
that supported ACLs and other EAs (extended attributes), those would be
in there too, along with inode number and ctime.

I'm assuming rdiff-backup doesn't store atime or any hashes of the files. 

A tripwire or integrit database is basically one of those metadata
files with a bit more stuff in it: atime and several different hashes
of the file. So, if those things get added, to make a file integrity
checker, all that's needed is a program that compares the actual
filesystem to the metadata file and reports differences. It does need
a bit of logic to deal with a list of allowed differences it shouldn't
complain about.
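
A rough sketch of what such a checker could look like, in Python since
that is what rdiff-backup is written in. Everything here is an
assumption for illustration: the records are a plain dict keyed by path,
not rdiff-backup's real metadata format, and the field names (including
"sha256") and the `ignore` table of allowed differences are hypothetical.

```python
import hashlib
import os
import stat


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def current_record(path):
    """Build a record for `path` as it exists on disk right now."""
    st = os.lstat(path)
    rec = {
        "size": st.st_size,
        "mtime": int(st.st_mtime),
        "uid": st.st_uid,
        "gid": st.st_gid,
        "mode": stat.S_IMODE(st.st_mode),
    }
    if stat.S_ISREG(st.st_mode):
        rec["sha256"] = file_digest(path)
    return rec


def check(records, ignore=None):
    """Compare recorded metadata with the live filesystem.

    records: dict mapping path -> recorded fields (hypothetical layout).
    ignore:  dict mapping path -> set of field names allowed to differ.
    Returns a list of (path, field, recorded_value, actual_value).
    """
    ignore = ignore or {}
    report = []
    for path, recorded in sorted(records.items()):
        if not os.path.lexists(path):
            report.append((path, "missing", None, None))
            continue
        actual = current_record(path)
        allowed = ignore.get(path, set())
        for field, old in recorded.items():
            if field in allowed:
                continue
            new = actual.get(field)
            if new != old:
                report.append((path, field, old, new))
    return report
```

A log file, for example, would get an entry like
`{"/var/log/messages": {"size", "mtime", "sha256"}}` in `ignore`, so only
ownership or permission changes on it would be reported.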

Making a file locator requires a program that searches the
metadata file, and which perhaps is setuid and can tell which files
the invoking user is allowed to see info about. It would be nice if it
could act substantially like find(1) but with less heinous syntax.
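
As a sketch of the locator idea, assuming the metadata has already been
parsed into a dict of path -> fields (a hypothetical layout, not the
real file format). The visibility rule is a deliberate simplification I
made up for the example: root sees everything, owners see their own
files, and everyone sees world-readable entries. A real tool would also
have to consider group membership and the permissions on parent
directories.

```python
import fnmatch
import stat


def visible_to(record, uid):
    """Simplified policy: may user `uid` see info about this entry?"""
    if uid == 0:
        return True
    if record["uid"] == uid:
        return True
    return bool(record["mode"] & stat.S_IROTH)


def locate(records, pattern, uid, min_size=None):
    """Return sorted paths matching a shell-style pattern that `uid` may see.

    records:  dict mapping path -> {"uid": ..., "mode": ..., "size": ...}
    pattern:  fnmatch-style glob; note that '*' also matches '/'.
    min_size: optional lower bound on file size, in bytes.
    """
    hits = []
    for path, rec in records.items():
        if not fnmatch.fnmatchcase(path, pattern):
            continue
        if min_size is not None and rec["size"] < min_size:
            continue
        if visible_to(rec, uid):
            hits.append(path)
    return sorted(hits)
```

A setuid wrapper would pass the real uid of the invoking user (for
instance from `os.getuid()`) as `uid`.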

To make a duplicate file finder, sort the metadata file on a hash
and report entries that share one. Make an option to exclude 0 length files.
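
The sort-and-report approach might look roughly like this. The record
layout and the "sha256" field name are assumptions for the example;
nothing like them is in the metadata file today.

```python
from itertools import groupby


def find_duplicates(records, skip_empty=True):
    """Sort entries by hash and report every hash seen more than once.

    records: dict mapping path -> {"size": ..., "sha256": ...}
             (hypothetical layout).
    Returns a list of (digest, [paths]) tuples, ordered by digest.
    """
    entries = [
        (rec["sha256"], path)
        for path, rec in records.items()
        if "sha256" in rec and not (skip_empty and rec["size"] == 0)
    ]
    entries.sort()  # equal digests end up adjacent

    duplicates = []
    for digest, group in groupby(entries, key=lambda e: e[0]):
        paths = [path for _, path in group]
        if len(paths) > 1:
            duplicates.append((digest, paths))
    return duplicates
```

Zero-length files all hash to the same value, which is why skipping
them is the default here.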

It would be nice if these programs could deal with multiple metadata
files. Then one could easily do a duplicate file find or locate across
multiple systems.
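
One possible way to handle several metadata files at once is to key
every entry by (host, path) before searching. This is only a sketch:
the per-host dicts, the host names, and the "sha256" field are all
hypothetical.

```python
from collections import defaultdict


def merge(per_host):
    """Combine per-host record dicts into one dict keyed by (host, path)."""
    merged = {}
    for host, records in per_host.items():
        for path, rec in records.items():
            merged[(host, path)] = rec
    return merged


def shared_across_hosts(merged, digest_field="sha256"):
    """Return digest -> sorted [(host, path)] for content on 2+ hosts.

    Entries without a digest and zero-length files are skipped.
    """
    by_digest = defaultdict(set)
    for (host, path), rec in merged.items():
        digest = rec.get(digest_field)
        if digest is not None and rec.get("size", 0) > 0:
            by_digest[digest].add((host, path))
    return {
        digest: sorted(locations)
        for digest, locations in by_digest.items()
        if len({host for host, _ in locations}) > 1
    }
```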

I'm assuming that people writing this stuff could reuse some
of rdiff-backup's modules/objects/code but that they would be separate
programs so as to minimize the complexity/error-proneness impact on
rdiff-backup itself. I suspect rdiff-backup would be the most efficient
place to put the hash generation since it has to read everything at
some point anyway, but if this seems like a performance or complexity
issue then perhaps the hashmaking could be done separately from the backup
pass and/or live in a different utility.
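
To illustrate why hashing during the backup pass should be cheap: the
digest can be updated from the same chunks that are being read and
written anyway. This is a generic sketch, not how rdiff-backup actually
copies files; the function name and the choice of SHA-256 are my own
assumptions.

```python
import hashlib


def copy_and_hash(src, dst, chunk_size=65536):
    """Copy a binary stream to another and return its SHA-256 hex digest.

    The data is read exactly once; each chunk feeds both the hash and
    the destination.
    """
    h = hashlib.sha256()
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)
        dst.write(chunk)
    return h.hexdigest()
```

With two open binary files, `digest = copy_and_hash(infile, outfile)`
gives the hash at the cost of the extra CPU time only, with no second
read of the data.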

  --akb



