Re: [Duplicity-talk] Encrypted Diffs

From: AJ Weber
Subject: Re: [Duplicity-talk] Encrypted Diffs
Date: Sat, 16 Jan 2010 13:27:27 -0500

Right, there are a lot of details to how this works, but I'll tell you what I _think_ I know. ;)

- Rsync/rdiff hashes the file in blocks, which lets it not only detect whether the file has changed, but also identify which parts of it have changed.
- Duplicity keeps a copy of this set of hashes for the files both on the target server (where you're storing your backups) AND locally on the client, in a cache directory. The local copy is just there to speed up processing, so it doesn't have to request the hash sets from the target server every time. But it _does_ check that the local hash set is up to date as well.

Now one would have to presume that these hashes are created before encrypting the file (on the client, before it's sent as part of the tar/gzip archive) so that they can be compared accurately.

Thus, the heavy lifting for the incremental detection and byte-level diffs is done by the proven rsync/rdiff algorithm, and the results are then encrypted and sent to the backup (target) server.
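To make the idea concrete, here is a minimal sketch of the signature/delta/patch cycle in Python. It is not Duplicity's or librsync's actual code: the function names, the tiny block size, and the use of SHA-256 at every offset are all assumptions made for illustration (real rsync uses a cheap rolling checksum plus a strong hash). The point it shows is that the delta is computed from the plaintext signature alone, so only the resulting delta would need to be encrypted and uploaded.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, chosen only for the demo


def signature(old: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each full block of the old data to its offset."""
    sig = {}
    for start in range(0, len(old) - block + 1, block):
        digest = hashlib.sha256(old[start:start + block]).hexdigest()
        sig.setdefault(digest, start)
    return sig


def delta(sig: dict, new: bytes, block: int = BLOCK) -> list:
    """Describe the new data using only the signature (old data not needed)."""
    ops = []
    literal = bytearray()
    pos = 0
    while pos < len(new):
        chunk = new[pos:pos + block]
        offset = None
        if len(chunk) == block:
            offset = sig.get(hashlib.sha256(chunk).hexdigest())
        if offset is not None:
            if literal:
                # Flush pending new bytes before the copy instruction.
                ops.append(("data", bytes(literal)))
                literal.clear()
            ops.append(("copy", offset))
            pos += block
        else:
            # No known block starts here: keep one byte as literal data
            # and try again at the next offset.
            literal.append(new[pos])
            pos += 1
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old: bytes, ops: list, block: int = BLOCK) -> bytes:
    """Rebuild the new data from the old data plus the delta."""
    out = bytearray()
    for kind, value in ops:
        if kind == "copy":
            out += old[value:value + block]
        else:
            out += value
    return bytes(out)


old = b"aaaabbbbccccdddd"
new = b"aaaaXbbbbccccdddd"
ops = delta(signature(old), new)  # in a backup, ops is what would be encrypted
restored = patch(old, ops)
```

In this example only the single inserted byte ends up as literal data; everything else is a reference to a block the signature already knows about. Restoring requires the previous (decrypted) version plus the delta, which would match the full-plus-incrementals chain that Duplicity stores.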

Again, I am making some educated presumptions here, but I would think I'm pretty close after having looked at the source, results, and cache.

It seems to work great, though I don't like how it sends new signatures (or something -- I have to check) to the target server when there are NO changes. This seems like a waste of disk space (though it's fairly small), bandwidth and time. Not to mention that it clutters the backup server's directories with files that essentially don't have any value. But I have to assume they do... just none that I understand at this point. Ken might be able to enlighten us on that.

Good Luck,

----- Original Message -----
From: "Gabriel Ambuehl" <address@hidden>
To: "Discussion of the backup program duplicity" <address@hidden>
Sent: Saturday, January 16, 2010 12:59 PM
Subject: Re: [Duplicity-talk] Encrypted Diffs

On 16.1.10 Michael Orlitzky wrote:
So, incremental encrypted backups are supposed to be hard. I've searched
through all of the documentation I can find (short of digging into the
source), and haven't been able to locate a good description of how
Duplicity solves the problem. If possible, can someone explain the
sequence of events that takes place when one creates an incremental
encrypted backup?

In particular, the question I'm trying to answer is, "how do we
calculate the difference between two encrypted blobs?". Do GPG/tar
provide random access within an archive? Or does Duplicity just diff
whatever comes out of the stream? Etc.

As far as I understand: Duplicity downloads the encrypted rsync data stored on
the other side (data that, in a normal rsync run, would be generated on the fly
by the rsync on the other end and transmitted) and runs the rsync algorithm
against it. Works surprisingly well.
