[rdiff-backup-users] adding --resume back

From: Marco Mariani
Subject: [rdiff-backup-users] adding --resume back
Date: Wed, 03 Sep 2014 17:37:19 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.0


I'm looking at possible ways to add the resume feature to backups (either initial or incremental, ideally both) that were interrupted due to an unreliable network.

AFAIK the feature was added in 2002, with version 0.5.0, and removed 10 months later,
in version 0.11.1, with the CHANGELOG comment

    All "resume" related functionality, like --checkpoint-interval:
    This was complicated to implement, and didn't seem to work all
    that well.

Later, the same feature has been requested a few times, and denied:


> Not that I know of. Because of the complexity of the underlying archive,
> rdiff-backup does not like a failed or interrupted previous backup attempt at
> all and tries to remove one if it finds it. Otherwise the risk would be that
> you corrupt the archive and lose your data history. Although it might be
> possible in theory for rdiff-backup to continue a previously-interrupted
> session, the code to do this doesn't exist and isn't likely to be written.

A more extensive, though speculative, rationale is given in


> As I see it, the problem is that rdiff-backup saves increment files as it
> goes along updating the remote repository. It does this in such a way that
> it can undo the increments if necessary, with --check-destination-dir, but
> I think it might not be able (currently) to:
> * determine which increments have already been applied when restarting the
> backup, and not apply them again; and
> * handle the case where a file that was incremented during the last run
> has subsequently changed and needs to be incremented again (merging
> increments); and
> * handle the case where the increments created so far do not match the log
> file written so far (because the two cannot be updated atomically in
> step).

In my case I can add some constraints and guarantee that no content changes
between a failed backup and a resume.
If I take care of the first and third points above and don't care about increment merging, does the idea of saving snapshots
and reloading them with an explicit --resume option become viable?
Has anybody attempted to do that since February 17, 2002?
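
To make the question more concrete, here is a minimal sketch of the kind of checkpoint such a --resume could rely on, assuming that source content does not change between the failed run and the resume. It is not rdiff-backup code: the function names and the JSON file format are hypothetical, chosen only for illustration. The checkpoint is written to a temporary file and renamed into place, so a reader sees either the old list or the new one, never a half-written file.

```python
import json
import os


def save_checkpoint(path, done_relpaths):
    """Atomically record which relative paths have been fully processed.

    Hypothetical helper: the JSON layout is an assumption, not an
    rdiff-backup format.
    """
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(sorted(done_relpaths), fh)
        fh.flush()
        os.fsync(fh.fileno())  # make sure the data is on disk before the rename
    os.replace(tmp, path)  # atomic replacement of the previous checkpoint


def load_checkpoint(path):
    """Return the set of paths recorded as done, or an empty set if none."""
    try:
        with open(path, encoding="utf-8") as fh:
            return set(json.load(fh))
    except FileNotFoundError:
        return set()


def pending(all_relpaths, done):
    """Paths that still need processing on resume, in their original order."""
    return [p for p in all_relpaths if p not in done]
```

This only covers the first point (knowing what has already been applied). The checkpoint and the increment are still two separate files, so if the checkpoint is written after each increment, an interruption between the two would leave one file's increment unrecorded, and a resume would have to detect that and redo or skip it.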

A different approach has also been proposed:


> Here is what I propose: when regressing a repository prior to a backup,
> rdiff-backup takes all "new" files (files that have been added during the
> failed backup) and moves them to a temporary location inside of the
> rdiff-backup-data folder. Then, when the backup runs, if it encounters a new
> file, it first checks to see if the file exists in this temporary location, and
> if it does, it diffs against that file (or moves it to the target location,
> then diffs; I don't know which would be easier). At the end of any backup run,
> rdiff-backup empties out this folder.
> Thoughts/reactions?

Reactions were basically "use rsync + rdiff-backup", but it's not an option for me, unless there is a way to avoid doubling the required disk space. Hard links won't work.

The second proposal makes sense to me, and seems easier to implement than the checkpoints.
Am I missing something obvious?
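
As a rough illustration of the second proposal, here is a minimal sketch of the three steps it describes: stash the "new" files during regression, reuse them during the next backup, and empty the stash at the end. Everything in it is hypothetical (the function names, the "resume-stash" directory name, and the assumption that the regression step can supply the list of files added by the failed run); none of it is existing rdiff-backup code.

```python
import os
import shutil

# Hypothetical directory name inside rdiff-backup-data; not an existing convention.
STASH_DIRNAME = "resume-stash"


def stash_new_files(mirror_root, data_dir, new_relpaths):
    """Regression step: move files added by the failed run into the stash
    instead of deleting them. Returns the stash directory."""
    stash = os.path.join(data_dir, STASH_DIRNAME)
    for rel in new_relpaths:
        src = os.path.join(mirror_root, rel)
        dst = os.path.join(stash, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)
    return stash


def take_from_stash(stash, mirror_root, rel):
    """Backup step: if a 'new' file was stashed, move it back into place.

    Returns True when a stashed copy was restored, so the caller can diff
    against it instead of transferring the whole file again.
    """
    src = os.path.join(stash, rel)
    if not os.path.isfile(src):
        return False
    dst = os.path.join(mirror_root, rel)
    os.makedirs(os.path.dirname(dst), exist_ok=True)
    shutil.move(src, dst)
    return True


def empty_stash(stash):
    """End of any backup run: discard whatever is left in the stash."""
    shutil.rmtree(stash, ignore_errors=True)
```

The sketch takes the "move to the target location, then diff" variant. A stashed file may be truncated, so the caller would still have to run the normal delta transfer against it rather than trust it as complete.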

I am open to evaluating other backup solutions, but I have some non-negotiable requirements:

 - must support both pull and push
 - must efficiently store big files with binary delta
 - must be open source
 - must work on unreliable networks

This leaves out 99% of the alternatives, and I am willing to implement the last
point for rdiff-backup myself. Suggestions are much appreciated.

