[rdiff-backup-users] Wiki additions...V2


From: listserv . traffic
Subject: [rdiff-backup-users] Wiki additions...V2
Date: Thu, 26 Mar 2009 15:55:27 -0700

Update: Re-read what I wrote - a few changes.
Also, included some notes that I didn't intend to include.

If you intend to proof, proof this second version...

---

Question:
What happens when I have to restore a file? 

Answer:
If the file hasn't changed between the current backup and the version you 
want, then you simply need to copy the current file. Nothing more has to be 
done.

However, if the file has changed, RDiff-Backup takes the current version of the 
file and uses the stored meta-data to work out how to apply every reverse 
difference file that leads back to the date you requested, provided a version 
for that date exists. 

For RDiff-Backup to be successful in restoring the file, it has to have all 
three parts: 
1) The current version of the file as it existed when RDiff-Backup was last 
run. 
2) The meta-data that tells the system if/when/how to apply the reverse diffs.
3) All the reverse diffs themselves. 

Reverse diffs have to be applied in the reverse order they were made (i.e. the 
newest reverse diff is applied first, then the next newest, and so on, back to 
the oldest reverse diff that applies to the version of the file requested).
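
The ordering rule above can be sketched with a toy model. This is only an 
illustration, not RDiff-Backup's real on-disk format: the "file" is a list of 
lines, each hypothetical reverse diff is a dict of {line index: older text}, 
and the "meta-data" is just the list of session dates that have a diff. The 
name restore() is made up for this example.

```python
def restore(current_lines, metadata, reverse_diffs, target):
    """Roll a toy 'file' back to the session dated `target`.

    current_lines: newest version of the file, as a list of lines.
    metadata:      session dates (ISO strings) that have a reverse diff.
    reverse_diffs: {session date: {line index: older text}} (toy format).
    """
    if target not in metadata:
        raise LookupError(f"no version recorded for {target}")

    version = list(current_lines)
    # Newest diff first, then each older one, stopping at the target date.
    needed = sorted((d for d in metadata if d >= target), reverse=True)
    for session in needed:
        if session not in reverse_diffs:
            # One missing link breaks the whole chain.
            raise LookupError(f"reverse diff for {session} is missing")
        for index, older_text in reverse_diffs[session].items():
            version[index] = older_text
    return version


current = ["a3", "b2"]
metadata = ["2009-03-10", "2009-03-20"]
diffs = {
    "2009-03-20": {0: "a2"},
    "2009-03-10": {0: "a1", 1: "b1"},
}
print(restore(current, metadata, diffs, "2009-03-10"))
```

Deleting the "2009-03-20" entry from diffs makes the restore to 2009-03-10 
fail, even though the oldest diff itself is intact.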

Question: 
Does the system have to restore all the reverse diffs for a file? What if there 
are dozens or even hundreds? 
What if only one is broken, is the whole process of "restoring" the file broken?

Answer:
Yes, the system has to apply all the reverse diffs that apply to the "version" 
of the file you requested. If there are 200 reverse diffs between the current 
file and the version you want, because the file changed in 200 rdiff-backup 
sessions since then, it will have to apply all 200. If any of the three parts 
of the system (current file, meta-data, or reverse diffs) is missing, the 
process will break, and you won't get your file. 

(There are ways to attempt to manually salvage the file, but these are far 
outside the scope of this document. Suffice it to say, that if any of the parts 
(file/meta-data/rdiffs) needed are missing, RDiff-Backup isn't going to be able 
to restore it automatically and all bets are off. You'll be in deep weeds and 
if you're lucky you might be able to get parts of your data back. Perhaps if 
you're really super lucky and the missing reverse diffs overlap others *and* 
you can finagle the restore process, you might get everything back. Or, if it's 
just not your day, you won't get jack, you'll get fired, your dog will bite you 
and you'll get rabies...)

Question:
Isn't it dangerous to have to rely on all those reverse diffs, especially when 
they're being applied serially, and every single one of the reverse diffs has 
to apply properly and in order to get back to the version I want?

Answer:
Yes, it is "dangerous" - though every definition of dangerous depends on your 
perspective. (Just ask a BASE jumper about what's considered dangerous.) The 
design decision was to only keep the differences with no intermediate snapshots 
of files. (Also, due to limitations in the rsync libraries, it's impossible to 
merge rdiffs, which might otherwise let us reduce the number of independent 
reverse diffs we have to apply.) 

While we're certainly not trying to convince you to use RDiff-Backup and agree 
with our reasoning on what's best and/or reasonable, we think reasonable 
trade-offs were made on managing the resources used vs the advantages of 
redundancy.

Question:
OK, I like most of what I hear, but how can I be sure the whole system retains 
its integrity? Is there a way to test all the parts of the system and make 
sure they all work, and work properly? For example, can I have the system "self 
test" the archive and let me know if any parts of it fail?

Answer: 
Certainly. The "--verify-at-time xyz" switch is your friend. In essence, this 
switch does a full restore and check of the file as of the time specified in 
"xyz." In brief, it takes the current version of the file, then uses the 
meta-data and the applicable reverse diffs to roll the file back to date xyz. 
It then recalculates the SHA-1 hash of the re-created file and compares it 
with the SHA-1 hash that was stored for this file when it was backed up on the 
date that corresponds to "xyz."
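
The comparison step described above can be sketched as follows. This is a 
minimal illustration using Python's standard hashlib, not RDiff-Backup's own 
code; the function names and the sample data are made up for the example.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def matches_recorded_hash(restored: bytes, recorded_sha1: str) -> bool:
    """True if the re-created file hashes to the digest stored at backup time."""
    return sha1_of(restored) == recorded_sha1.lower()


# Digest recorded when the (hypothetical) file was backed up.
recorded = sha1_of(b"contents as of date xyz\n")

# A correct roll-back matches; a single changed byte does not.
print(matches_recorded_hash(b"contents as of date xyz\n", recorded))
print(matches_recorded_hash(b"contents as of date xyZ\n", recorded))
```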

If any part of the process fails or the SHA-1 hashes don't match, rdiff-backup 
will exit with a non-zero exit status. (And it should print errors to the 
console...) 

If meta-data is damaged, and it can't figure out how to apply the rdiffs, you 
should get an error message.
If after rolling the file backward to date xyz, the check-sums don't match, 
you'll get an error.
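
Because a failure shows up as a non-zero exit status, the check is easy to 
script. The sketch below is an assumption-laden example: it assumes 
rdiff-backup is on the PATH and that the switch is invoked as 
"rdiff-backup --verify-at-time <time> <repository>"; the repository path and 
the time string are placeholders, so check them against your installed 
version before relying on this.

```python
import subprocess


def build_verify_command(repository: str, when: str) -> list:
    """Build the argument list for a --verify-at-time run (assumed syntax)."""
    return ["rdiff-backup", "--verify-at-time", when, repository]


def verify_repository(repository: str, when: str) -> bool:
    """Run the verification; True only if rdiff-backup exits with status 0."""
    result = subprocess.run(build_verify_command(repository, when))
    return result.returncode == 0


if __name__ == "__main__":
    # Placeholder path and date: substitute your own repository and time.
    ok = verify_repository("/path/to/backup-repository", "2009-03-01")
    print("archive verified" if ok else "verification FAILED")
```

Run from cron, a wrapper like this can raise an alert whenever the function 
returns False.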

Thus, to test the integrity of every piece of the system, pick a date for "xyz" 
that is at least as old as the oldest rdiff session. This should, by 
necessity, exercise every reverse diff in the repository and all the meta-data.

While a successful result from "--verify-at-time xyz" isn't sufficient to 
ensure that someone hasn't tampered with the rdiff repository in an attempt, 
for example, to modify executable files, it is very strong evidence that 
chance or bad luck hasn't damaged the system. The probability of a random 
SHA-1 collision for the same file is vanishingly small. (i.e. Two very similar 
files having the same SHA-1 checksum but not being equal, by simple chance (not 
malicious design), is exceedingly unlikely.)
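
To put a rough number on "exceedingly unlikely": SHA-1 produces a 160-bit 
digest, so if we assume (as an idealization) that the digest of a damaged file 
is effectively random, the chance that it happens to equal the recorded digest 
is 1 in 2^160. The function name below is made up for this back-of-envelope 
calculation.

```python
SHA1_BITS = 160  # length of a SHA-1 digest in bits


def chance_of_accidental_match(bits: int = SHA1_BITS) -> float:
    """Probability that one specific wrong file hashes to the recorded digest,
    assuming the hash output behaves like a uniformly random value."""
    return 2.0 ** -bits


print(f"{chance_of_accidental_match():.3e}")
```

That works out to roughly 6.8e-49 per damaged file. This says nothing about 
deliberate tampering, which is a separate question.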







