|Subject:||Re: [Duplicity-talk] What is the process for creating signature files ? [DOC-2]|
|Date:||Fri, 06 Mar 2009 19:00:08 +0100|
|User-agent:||Thunderbird 184.108.40.206 (Windows/20081209)|
Cyril Russo wrote:
Organization of a backup archive (TAR format)
The backup archives (currently) use the well-known GNU tar format.
When files are scanned on the filesystem for backup (using the rsync algorithm to compute the smallest difference), they are cut into smaller parts, or blocks,
that are then saved in the backup archive. The processing applied to each file (encrypting / diffing / comparing) will be explained in more detail in the next part.
The blocks to be stored come either from a file (in which case we call them fileblocks) or from a signature (in which case we call them sigblocks).
Reading blocks from an existing tar archive is done by the file diffdir.py.
This file declares the following objects:
DirSig (used in rdiffdir)
A simple class used to iterate over the sigblocks.
DirFull, DirFull_WriteSig (used in rdiffdir and duplicity main)
A simple class to store the files' content in tar blocks.
Because it's easier to have common code used everywhere, the process computes the difference between each file found and a virtual empty file (producing a difference equal to the file itself). A similar process is used when the file already exists: the virtual empty file is replaced by the previous version of the file.
The WriteSig version also computes the signature and writes it to the given output file pointer.
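To illustrate the "virtual empty file" idea, here is a minimal sketch, not duplicity's actual code. It assumes a much simplified delta that only matches fixed, aligned blocks by hash (the real rsync algorithm uses rolling checksums and can match at any offset). The names `signature`, `delta`, `patch` and `BLOCK_SIZE` are made up for this example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size (an assumption) so the example is easy to follow


def signature(base: bytes) -> dict:
    """Map the hash of each fixed-size block of `base` to its block index."""
    sig = {}
    for offset in range(0, len(base), BLOCK_SIZE):
        digest = hashlib.sha256(base[offset:offset + BLOCK_SIZE]).hexdigest()
        sig.setdefault(digest, offset // BLOCK_SIZE)
    return sig


def delta(sig: dict, new: bytes) -> list:
    """Describe `new` as copies of blocks known to `sig` plus literal data."""
    ops = []
    for offset in range(0, len(new), BLOCK_SIZE):
        block = new[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest in sig:
            ops.append(("copy", sig[digest]))
        else:
            ops.append(("literal", block))
    return ops


def patch(base: bytes, ops: list) -> bytes:
    """Rebuild the new file from the base file and the delta operations."""
    out = bytearray()
    for kind, arg in ops:
        if kind == "copy":
            out += base[arg * BLOCK_SIZE:(arg + 1) * BLOCK_SIZE]
        else:
            out += arg
    return bytes(out)


# Full backup: the base is the virtual empty file, so every block is literal.
full = delta(signature(b""), b"abcdefgh")

# Incremental backup: the base is the previous version of the file.
incr = delta(signature(b"abcdefgh"), b"abcdXYZWefgh")
```

With an empty base, the delta contains the whole file as literal data, which is why the same code path can serve both full and incremental backups.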
DirDelta (used in rdiffdir and duplicity main, it's the default implementation of DirFull)
This is the actual code computing the difference between the given path's files and the given reference (either nothing, or a previous backup archive).
The process computes both the difference in the file's content and the difference in the file's information (has a file been added, deleted, modified, or left unmodified?).
The file's content goes to the backup archive, while the file's information goes to the signatures.
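The "file's information" part of the comparison can be sketched as follows. This is only an illustration under assumptions: each snapshot is modelled as a plain dict mapping a path to a (size, mtime) tuple, and the function name `classify` is hypothetical. It does not reflect how diffdir.py stores this information.

```python
def classify(previous: dict, current: dict) -> dict:
    """Return the state of every path seen in either snapshot.

    Both arguments map a path to a (size, mtime) tuple (an assumed model).
    """
    states = {}
    for path in sorted(set(previous) | set(current)):
        if path not in previous:
            states[path] = "added"
        elif path not in current:
            states[path] = "deleted"
        elif previous[path] != current[path]:
            states[path] = "modified"
        else:
            states[path] = "unmodified"
    return states


previous = {"a.txt": (10, 100), "b.txt": (20, 100), "c.txt": (30, 100)}
current = {"a.txt": (10, 100), "b.txt": (25, 200), "d.txt": (5, 300)}
states = classify(previous, current)
```

Passing an empty dict as `previous` marks every file as added, which matches the "reference is nothing" case of a full backup.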
The name says it all. It keeps track of the amount read.
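A file wrapper that keeps track of the amount read could look like the sketch below. The class name `CountingReader` and its attribute `bytes_read` are invented for illustration and are not taken from diffdir.py.

```python
import io


class CountingReader:
    """Wrap a file object and count how many bytes have been read from it."""

    def __init__(self, fileobj):
        self.fileobj = fileobj
        self.bytes_read = 0

    def read(self, size=-1):
        data = self.fileobj.read(size)
        self.bytes_read += len(data)  # count what was actually returned
        return data

    def close(self):
        return self.fileobj.close()


reader = CountingReader(io.BytesIO(b"hello world"))
first = reader.read(5)
rest = reader.read()
```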
A read-only file class that computes the signature (using the rsync algorithm) while the file is being read.
The signature computed for each block produces a simple code (depending on the block's state: added, modified, deleted, etc.).
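The idea of computing a signature as a side effect of reading can be sketched like this. It is a simplification under assumptions: a SHA-256 digest per fixed-size block stands in for the real rsync signature, and the names `SignatureReader` and `block_digests` are hypothetical.

```python
import hashlib
import io


class SignatureReader:
    """Read-only wrapper that hashes each full block as the data flows through."""

    def __init__(self, fileobj, block_size=4):
        self.fileobj = fileobj
        self.block_size = block_size
        self.block_digests = []
        self._pending = b""  # bytes read so far that do not yet fill a block

    def read(self, size=-1):
        data = self.fileobj.read(size)
        self._pending += data
        while len(self._pending) >= self.block_size:
            block = self._pending[:self.block_size]
            self._pending = self._pending[self.block_size:]
            self.block_digests.append(hashlib.sha256(block).hexdigest())
        return data

    def close(self):
        # Hash the final, possibly short, block before closing.
        if self._pending:
            self.block_digests.append(hashlib.sha256(self._pending).hexdigest())
            self._pending = b""
        self.fileobj.close()


reader = SignatureReader(io.BytesIO(b"abcdefghij"), block_size=4)
while reader.read(3):  # the caller's read size need not match the block size
    pass
reader.close()
```

The caller reads the file once for its own purposes, and the signature is available afterwards without a second pass over the data.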
TarBlockIter (abstract, private)
This class takes a given (file) iterator as input and returns the matching tar'ed block of the given size at each iteration.
The behaviour depends on the following child classes:
Doesn't read the file, but instead counts the files passed in.
This one returns the tar block from a signature's archive file
This one returns the tar block for the files archive.
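The split between an abstract block iterator and its child classes can be sketched as below. All class names here (`BlockIter`, `CountingBlockIter`, `ContentBlockIter`) are hypothetical, and the input is assumed to be a list of (name, data) pairs rather than real filesystem entries. Only Python's standard `tarfile` module is used to build the tar header.

```python
import tarfile
from abc import ABC, abstractmethod


class BlockIter(ABC):
    """Abstract iterator: walks (name, data) pairs, yields one block per item."""

    def __init__(self, items):
        self.items = iter(items)

    def __iter__(self):
        return self

    def __next__(self):
        name, data = next(self.items)  # StopIteration ends the iteration
        return self.process(name, data)

    @abstractmethod
    def process(self, name, data):
        """Turn one input item into the block to return."""


class CountingBlockIter(BlockIter):
    """Does not build any block; only counts the files passed in."""

    def __init__(self, items):
        super().__init__(items)
        self.count = 0

    def process(self, name, data):
        self.count += 1
        return b""


class ContentBlockIter(BlockIter):
    """Builds a tar member (header plus padded data) for each file."""

    def process(self, name, data):
        info = tarfile.TarInfo(name)
        info.size = len(data)
        header = info.tobuf(format=tarfile.GNU_FORMAT)
        padding = -len(data) % tarfile.BLOCKSIZE  # pad data to 512 bytes
        return header + data + b"\0" * padding


files = [("a.txt", b"hello"), ("b.txt", b"")]
counter = CountingBlockIter(files)
empty_blocks = list(counter)
blocks = list(ContentBlockIter(files))
```

The iteration logic lives once in the base class, and each child class only decides what a "block" is for its kind of archive.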
That's all for this email, again, please spot the errors.
This one doesn't explain anything about splitting the signature files, but, I hope, makes the understanding of the backup process clearer.
I'll continue by explaining the backup algorithm in the next email (if I understand it correctly).
From what I've understood so far, we could hack the Collection code to accept files named both "signature.gpg" and "sig000.gpg" as valid signature files and, in the latter case, start returning the signature archive collection. I still haven't found how to split the signatures during creation, but I hope that will become clear in the next email.
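As a rough sketch of that idea, the filename check could look like the following. It takes the two names quoted above literally, which is an assumption (the real collection code may use different, longer filenames), and `parse_signature_name` is a made-up helper, not an existing duplicity function.

```python
import re

# Accept either the single-file name or a numbered volume such as "sig000.gpg".
SIG_NAME = re.compile(r"^(?:signature|sig(?P<volume>\d+))\.gpg$")


def parse_signature_name(filename):
    """Return None for non-signature files, else ("single", None) or ("volume", n)."""
    match = SIG_NAME.match(filename)
    if match is None:
        return None
    if match.group("volume") is None:
        return ("single", None)
    return ("volume", int(match.group("volume")))


results = [parse_signature_name(name)
           for name in ("signature.gpg", "sig000.gpg", "sig012.gpg", "data.gpg")]
```

A caller could then return the whole signature archive collection whenever the result is a ("volume", n) tuple.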