[rdiff-backup-users] Regression optimizations?


From:	Nathan Lewis
Subject:	[rdiff-backup-users] Regression optimizations?
Date:	Wed, 26 Oct 2011 01:20:27 -0500
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.23) Gecko/20110920 Thunderbird/3.1.15

I've been playing with and studying rdiff-backup for about a week, and for the most part it works well for our scenario: keeping a backup mirror that is easy for anyone to access, with incrementals if necessary.

However, it has the rather unfortunate property that when an incremental fails, the next run proceeds to do a regression on the entire mirror. I understand that this is necessary to get the mirror back into a consistent state, but it seems like it could be optimized. Logically, if an incremental fails, 99.999% of the files will still be perfectly fine because the failed incremental didn't touch them in the first place. So why does a regression need to touch every file? Can't a regression look at which files have incrementals that need to be deleted and only regress those files? It seems to spend most of its time in the following loop:


1.  Copy the file in question to a .tmp file
2.  Apply attributes/ACLs to the .tmp file
3.  Rename the .tmp file back to the original file.
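
For reference, here is roughly what that loop amounts to per file, as a Python sketch. This is not rdiff-backup's actual code: the function name `rewrite_via_tmp`, the way the `.tmp` name is built, and the choice of mode and mtime as the only attributes are my assumptions for illustration (ACLs are left out).

```python
import os
import shutil


def rewrite_via_tmp(path, mode, mtime):
    """Hypothetical sketch of the copy / set-attributes / rename cycle."""
    tmp_path = path + ".tmp"  # assumed naming, for illustration only

    # 1. Copy the file data to a temporary file next to the original.
    shutil.copyfile(path, tmp_path)

    # 2. Apply the recorded attributes to the temporary file.
    os.chmod(tmp_path, mode)
    os.utime(tmp_path, (mtime, mtime))

    # 3. Rename the temporary file over the original.
    os.replace(tmp_path, path)
```

The point is that step 1 rewrites every byte of every file, even when only the attributes could possibly differ.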

When there are 400k files in a backup, this actually takes longer than a full backup would. Surely I'm missing some scenario where this is necessary? Couldn't this (extremely common) scenario be detected, and the attributes/ACLs just applied to the original file from the mirror metadata? Why is the .tmp file necessary?
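
What I have in mind instead is something like the sketch below: compare the file's current attributes with the recorded ones and touch the file only if they differ, without copying any data. Again, this is a hypothetical illustration and not rdiff-backup's API; `apply_attrs_in_place` and its parameters are names I made up, and it handles only mode and mtime, not ACLs or ownership.

```python
import os
import stat


def apply_attrs_in_place(path, mode, mtime):
    """Hypothetical: fix up mode/mtime on the existing file, no copy.

    Returns True if anything had to be changed.
    """
    st = os.lstat(path)
    changed = False

    # Only call chmod when the permission bits actually differ.
    if stat.S_IMODE(st.st_mode) != mode:
        os.chmod(path, mode)
        changed = True

    # Only reset the modification time when it differs (whole seconds).
    if int(st.st_mtime) != int(mtime):
        os.utime(path, (st.st_atime, mtime))
        changed = True

    return changed
```

For a file the failed incremental never touched, this should come down to a single `lstat` call and no writes. The one difference I can see is that a rename swaps the file in atomically, while these in-place calls can be interrupted between the two steps.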

This brings up another related question: the attributes are stored in a separate file in the rdiff-backup-data directory, so do they really need to be applied to the mirror? I understand rdiff-backup is trying to make the mirror match the original as closely as possible, but due to filesystem differences the mirror attributes can't really be trusted anyway. I would actually like to override the mirror's attributes and make them read-only so the mirror can't be messed with, or simply tell rdiff-backup not to bother setting attributes on the mirror's files (particularly when regressing).
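
To make the read-only idea concrete, here is a minimal sketch of what I would run over the mirror after a backup. It is a standalone script, not an rdiff-backup option; `make_mirror_read_only` is a made-up name, and skipping only the top-level rdiff-backup-data directory is my assumption about what must stay writable. I have not checked how the next backup or regression reacts to a mirror changed this way.

```python
import os
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def make_mirror_read_only(mirror_root, data_dir="rdiff-backup-data"):
    """Hypothetical: clear write bits below mirror_root, skipping data_dir.

    Returns the number of files and directories that were changed.
    """
    changed = 0
    for dirpath, dirnames, filenames in os.walk(mirror_root):
        # Leave the top-level metadata directory alone (assumption).
        if dirpath == mirror_root and data_dir in dirnames:
            dirnames.remove(data_dir)

        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # chmod would follow the link to its target
            mode = stat.S_IMODE(os.lstat(path).st_mode)
            if mode & WRITE_BITS:
                os.chmod(path, mode & ~WRITE_BITS)
                changed += 1
    return changed
```

The mirror root itself is left untouched; only the entries below it lose their write bits.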

I'm not afraid to go poking around in the source and try to make some changes, but I'd like to discuss any side effects or pitfalls first.


--Nathan
