
Re: [rdiff-backup-users] rdiff-backup features?


From: Andrew Ferguson
Subject: Re: [rdiff-backup-users] rdiff-backup features?
Date: Mon, 09 Jul 2007 23:23:01 -0400
User-agent: Thunderbird 1.5.0.12 (Macintosh/20070509)

address@hidden wrote:
> I am imagining the following process, please let me know your thoughts:
> 
> Step A, Backup Sync:
>    1. Write changes as diffs against the current repository into 
> something like $REPO/rdiff-backup-data/scratch/.
>    2. Simultaneously, write changes as rdiffs against the will-be new 
> version for placement in $REPO/rdiff-backup-data/increments/.

What do you do when the link goes down in the middle of Step A? The
files you've already written scratch changes for could have changed by
then, so you'll have to recompare them anyway.
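(To make that concrete: a scratch diff is only trustworthy if the source
file hasn't moved underneath it. In rough Python, with names of my own
invention rather than anything in rdiff-backup, the check on restart
would be something like:)

```python
import os

def scratch_diff_is_stale(source_path, recorded_mtime, recorded_size):
    """Return True if the source file changed after its scratch diff
    was written, meaning the diff can no longer be applied blindly
    and the file must be recompared."""
    st = os.stat(source_path)
    return (st.st_mtime, st.st_size) != (recorded_mtime, recorded_size)
```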

> Step B, Commit of Sync; may be done offline (CRITICAL SECTION):
>    1. Patch all of the current $REPO against the diffs in 
> $REPO/rdiff-backup-data/scratch/
>    2. Move rdiff increments in part A2 into their proper location
>    3. Update metadata upon completion to reflect changes.
> 
> During step B, restores cannot be performed since live data and metadata 
> are being changed.  Some amount of locking should be performed here to 
> keep restores (and backups!) from happening during this step.
> 
> It might also be a good idea to use rdiff-backup's existing rollback 
> methods (--check-destination-dir) to fix the repository if a Python 
> exception is thrown somewhere in the middle of step B (e.g., out of 
> disk space), in order to roll back and try step B again.  Of course if 
> step B continues to fail, one might blow away the scratch directory 
> and resync.  This would certainly be better than blowing away the 
> entire increment tree!
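(As I read the proposal, Step B would look roughly like the sketch below.
All the names here -- commit_sync, staged-increments, commit.lock, the
apply_diff and update_metadata callables -- are mine for illustration,
not real rdiff-backup API; the lock file is one simple way to keep
restores and new backups out of the critical section.)

```python
import os
import shutil

def commit_sync(repo, apply_diff, update_metadata):
    """Sketch of Step B: patch the mirror from scratch diffs, move the
    pre-computed increments into place, then update metadata.  A lock
    file excludes restores/backups while the repository is inconsistent."""
    data_dir = os.path.join(repo, "rdiff-backup-data")
    scratch = os.path.join(data_dir, "scratch")            # diffs from A1
    staged = os.path.join(data_dir, "staged-increments")   # rdiffs from A2
    increments = os.path.join(data_dir, "increments")
    lock = os.path.join(data_dir, "commit.lock")

    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)  # take lock
    try:
        for name in sorted(os.listdir(scratch)):              # B1: patch
            apply_diff(repo, os.path.join(scratch, name))
        for name in sorted(os.listdir(staged)):               # B2: move
            shutil.move(os.path.join(staged, name),
                        os.path.join(increments, name))
        update_metadata(repo)                                 # B3
    finally:
        os.close(fd)
        os.remove(lock)                                       # drop lock
```

If an exception escapes the try block, the lock still gets dropped, and
the rollback/retry logic (--check-destination-dir style) would run
before the next attempt.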

You'd still have to get the mirror back in sync if you blew away the
scratch directory after repeated failures.

I think the cost of increased book-keeping outweighs the benefit.

The current system is fairly simple and works well. Its chief failing
is that if you transfer some big new files and the backup fails, the big
new files are lost. Are there other failings that cost bandwidth
(besides move tracking)?

We can't get around the issue of restarting the file compare --
fortunately the rdiff algorithm makes that fast. (My use case: 20.4 GB
compared in 17 minutes 26.45 seconds, i.e. 19.96 MB/s, much higher than
my link speed.)
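(For the record, the arithmetic behind that figure, taking 1 GB = 1024 MB:)

```python
# 20.4 GB compared in 17 min 26.45 s
seconds = 17 * 60 + 26.45        # 1046.45 s total
megabytes = 20.4 * 1024          # 20889.6 MB
print(round(megabytes / seconds, 2))  # → 19.96 MB/s
```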

Maybe the verdict is we should keep NEW files in a 'scratch area' in
case the link goes down? Then, on the backup after a failed backup,
rdiff-backup could inspect the scratch area for already-transferred new
files and use those as the basis for a diff transfer.
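(A minimal sketch of that lookup -- the scratch-area layout and the
function name are hypothetical, and the real work of computing the
rdiff delta against the found basis is left to the caller:)

```python
import os

def transfer_basis(scratch_dir, rel_path):
    """On the backup after a failed backup, look for an
    already-transferred copy of a new file in the scratch area.
    Return its path to use as the basis for an rdiff-style delta
    transfer, or None to fall back to a full transfer."""
    candidate = os.path.join(scratch_dir, rel_path)
    return candidate if os.path.exists(candidate) else None
```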

Andrew

-- 
Andrew Ferguson - address@hidden
