
Re: [Duplicity-talk] Incremental Failures


From: Jeff Singer
Subject: Re: [Duplicity-talk] Incremental Failures
Date: Sun, 29 Mar 2009 13:25:39 -0400

I think I'm going to explore some ideas I have to reduce problems like this next time I get a chance:
1) Using S3's built-in MD5 features (ETags) to check data integrity while uploading.
2) Maybe optionally storing a file listing the MD5s of the backup gzips, then adding a mode that compares that list to Amazon's MD5s as a backup integrity check.
3) Implementing different levels of backups, like dump, so that to minimize the chances of a failed incremental you could do an incremental based on the last full backup once a week or something like that.

If I get the chance, I'll try to implement these; in fact, I haven't looked at the code carefully enough to see whether 1) is already happening.
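
For the record, here's a rough sketch of what 2) could look like. It is written against today's boto3 client purely for illustration (not the library duplicity itself uses), the bucket, prefix and cache directory names are made up, and it only works where the ETag is a plain MD5, i.e. single-part uploads:

import hashlib
import os
import boto3

def local_md5(path, chunk_size=1 << 20):
    # Hex MD5 of a local file, read in chunks so large volumes are fine.
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_against_s3(local_dir, bucket, prefix=""):
    # Report any backup volume whose local MD5 differs from its S3 ETag.
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            local_path = os.path.join(local_dir, os.path.basename(obj["Key"]))
            if not os.path.exists(local_path):
                continue                      # volume not kept locally
            etag = obj["ETag"].strip('"')
            if "-" in etag:
                continue                      # multipart upload: ETag is not a plain MD5
            if local_md5(local_path) != etag:
                print("MISMATCH:", obj["Key"])

# verify_against_s3("/var/tmp/duplicity-cache", "my-backup-bucket", "duplicity/")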

On Fri, Mar 27, 2009 at 3:47 PM, Peter Schuller <address@hidden> wrote:
> weekend). I've read that after a certain number of incremental backups,
> something will get messed up, and they won't be any good anymore, so it's
> recommended that we do full backups around once a month.

It's important to note that of course nothing is *designed* to break;
it's just that, statistically, you are more likely to have a problem
the longer the chain of incremental backups lasts. Reasons for breakage
could include software bugs (although those need not be a function of
chain age), but also things like bad memory or other hardware, or just
bit rot on hard drives.

A one-month recommendation sounds sensible, but there is nothing
special about that interval in the sense that anything suddenly breaks
after a month. I'm not sure how such recommendations tend to be
phrased, but I would combine them with keeping at least two full
backups at all times, so that there is some level of redundancy.
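
To make that concrete, here is a minimal sketch of such a schedule, with a made-up stamp file and placeholder source/target; it assumes duplicity is on the PATH and simply tracks the age of the last full. (Newer duplicity versions can do the same thing for you with --full-if-older-than.)

import os
import subprocess
import time

LAST_FULL_STAMP = "/var/lib/backup/last-full"   # made-up stamp file
MAX_INCR_AGE = 30 * 24 * 3600                   # force a full roughly monthly

def backup(source, target):
    try:
        age = time.time() - os.path.getmtime(LAST_FULL_STAMP)
    except OSError:
        age = None                              # no full backup recorded yet

    mode = "full" if age is None or age > MAX_INCR_AGE else "incremental"
    subprocess.check_call(["duplicity", mode, source, target])

    if mode == "full":
        with open(LAST_FULL_STAMP, "w"):
            pass                                # touch the stamp file

# backup("/home", "s3+http://my-backup-bucket/duplicity")

Keeping at least two full chains around is then just a matter of not pruning too aggressively, e.g. with duplicity's remove-all-but-n-full 2 if your version has it.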

While duplicity is great, it is worth considering that the format is
such that recovery can be difficult even when the corruption is small
(try introducing a random bit into a gzip file...).
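
To see what I mean, here is a self-contained toy in plain Python (nothing duplicity-specific): compress some data, flip a single bit in the middle of the result, and try to read it back.

import gzip
import zlib

original = gzip.compress(b"some backup payload\n" * 10000)

corrupt = bytearray(original)
corrupt[len(corrupt) // 2] ^= 0x01   # flip one bit in the deflate stream

try:
    gzip.decompress(bytes(corrupt))
except (OSError, EOFError, zlib.error) as e:
    # typically a zlib "invalid ..." error or a CRC failure; everything
    # from the damaged block onward is effectively lost
    print("decompression failed:", e)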

> What do other users do in this situation? Is a full backup once every 6
> months and nightly incrementals enough?

One will never be able to provide definite answers to such questions
since it is a matter of probability, and the point at which it becomes
"enough" is a matter of policy.

>  What can we do to reduce the risk of
> corrupted incrementals?

* Test backups frequently for early detection of any systematic problems (see the sketch after this list).

* Run the backup on high-quality hardware (e.g. prefer ECC memory).
Of course, this is rarely practical in home environments.

* Try to find remote storage that you can have high confidence
in. Difficult.

Perhaps more...
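
As a rough sketch of the first point, something like the following run from cron would do. The target URL and source directory are placeholders, and --compare-data only exists in newer duplicity versions (without it, verify compares metadata only).

import subprocess
import sys

TARGET = "s3+http://my-backup-bucket/duplicity"   # placeholder target URL
SOURCE = "/home"                                  # placeholder source dir

def verify():
    # "verify" reads the backup and compares it against the local files;
    # add --compare-data on newer versions to compare file contents too.
    result = subprocess.run(["duplicity", "verify", TARGET, SOURCE])
    if result.returncode != 0:
        print("backup verification FAILED", file=sys.stderr)
    return result.returncode

if __name__ == "__main__":
    sys.exit(verify())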

> Is it likely only to be a single file that is
> irrecoverable if there is a problem with the incrementals?

I'm not as read up on the underlying tar format as I would like, but I
would say that the chances of a problem being localized to a single
file are lower than you might hope in an ideal situation. How's
that for vague :)

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <address@hidden>'
Key retrieval: Send an E-Mail to address@hidden
E-Mail: address@hidden Web: http://www.scode.org


_______________________________________________
Duplicity-talk mailing list
address@hidden
http://lists.nongnu.org/mailman/listinfo/duplicity-talk


