[rdiff-backup-users] fuzzy match - moved/renamed

From: David
Subject: [rdiff-backup-users] fuzzy match - moved/renamed
Date: Sun, 7 Feb 2016 14:09:30 +0100
User-agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

Hi All,

Are there plans to implement fuzzy matching or a similar algorithm to detect moved/renamed files?

When large files are renamed or moved between folders, rdiff-backup treats them as new files. As a result it transfers large amounts of data and takes a lot of space to store the diffs, e.g. for 12 weeks or so, whilst the data is in fact the very same.

What I would like to suggest is this: when a new file is discovered, calculate its checksum and check whether that checksum already exists in the destination folder (under any subfolder):

a) if the file exists in the destination folder (under a different file name/path) and that file name/path no longer exists in the source, simply rename/move the file;

b) if the file exists in the destination folder (under a different file name/path) and that file name/path _does_ still exist in the source, we have a duplicate of the file and can either hardlink it or create a local copy.

The above approach would avoid transmitting and storing a lot of data for files that have merely been moved between folders.

Deletions should be done at the very end of the process, so that files already stored can be re-used.
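To make the idea concrete, here is a rough Python sketch of the matching step. It is not how rdiff-backup works today, and the names `file_digest`, `index_tree` and `plan_actions` are made up for illustration. It assumes SHA-256 as the checksum and assumes that old destination paths are only removed at the very end, so a stored file stays available for hardlinking until then.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def index_tree(root):
    """Map each relative file path under root to its content digest."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            index[os.path.relpath(full, root)] = file_digest(full)
    return index


def plan_actions(source_index, dest_index):
    """Classify files that are new in the source as move, link or transfer.

    Both arguments map relative path -> digest (hypothetical format).
    """
    # Reverse lookup: digest -> destination paths holding that content.
    by_digest = {}
    for rel, digest in sorted(dest_index.items()):
        by_digest.setdefault(digest, []).append(rel)

    claimed = set()  # old destination paths already matched to a new path
    actions = []
    for rel, digest in sorted(source_index.items()):
        if rel in dest_index:
            continue  # known path; the normal diff handling applies
        candidates = by_digest.get(digest, [])
        # Case a): same content stored under a path that is gone from source.
        gone = [c for c in candidates
                if c not in source_index and c not in claimed]
        if gone:
            claimed.add(gone[0])
            actions.append(("move", gone[0], rel))
        elif candidates:
            # Case b): duplicate content; hardlink or copy locally.
            # Valid because old paths are assumed to be deleted only at the end.
            actions.append(("link", candidates[0], rel))
        else:
            actions.append(("transfer", None, rel))
    return actions
```

Called with the two indexes built by `index_tree` for the source and the destination mirror, `plan_actions` returns a list of `(action, existing_path, new_path)` tuples, and only the `"transfer"` entries would need any data sent over the wire.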

The diffs between backups would then again store only the differences and not full copies of the files.

Does this sound like something which could be implemented in the near future?

This might not be the best place to post this question, but if there is a better backup solution that handles situations like this, please let me know too. I'm looking to keep all the goodies of rdiff-backup, hence rsync with the fuzzy option is not the way to go for me.

