rdiff-backup-users


From: Dominic Raferd
Subject: Re: [rdiff-backup-users] What are the bottlenecks in --verify? (Or how to speed up verification?)
Date: Thu, 03 Oct 2013 14:52:25 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.0

Verification speed and related issues have been discussed here before. I think you are right that CPU is a bottleneck, because rdiff-backup uses only one core when running a verification.

Verifications do use temporary space, and can use a lot of it, even though the temporary files never seem to be visible in the filesystem. I would advise setting --tempdir explicitly; a different spindle with lots of space (and speed) would be ideal.
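As a rough illustration of the --tempdir advice, here is a minimal Python sketch that assembles such a command line. The function name build_verify_command, the repository path and the scratch path are made up for the example; only the rdiff-backup options --tempdir and --verify-at-time come from this thread.

```python
import shlex


def build_verify_command(repository, when="now", tempdir=None):
    """Return the argument list for one rdiff-backup verification run.

    'repository', 'when' and 'tempdir' are illustrative parameters; the
    helper itself is hypothetical and not part of rdiff-backup.
    """
    cmd = ["rdiff-backup"]
    if tempdir is not None:
        # Point temporary files at a chosen location, e.g. a separate disk.
        cmd += ["--tempdir", tempdir]
    cmd += ["--verify-at-time", when, repository]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths; substitute your own repository and scratch area.
    cmd = build_verify_command(
        "/home/user/backup", when="3D", tempdir="/mnt/scratch/rdiff-tmp"
    )
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(shlex.quote(part) for part in cmd))
```

The sketch only prints the command, so it is safe to try on a machine without rdiff-backup installed.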

You might find it helpful to look at or use my timedicer-verify.sh script: http://www.timedicer.co.uk/programs/help/timedicer-verify.sh.php. It is a wrapper for rdiff-backup --verify-at-time that remembers previous successful verifications and runs multiple verifications concurrently (thus using multiple cores efficiently). It lets you specify a temporary location, which is passed to rdiff-backup as --tempdir. It can also make and use a temporary LVM snapshot as the source, which allows a verification session to continue while the underlying repository is being updated; this only works if your repository/ies are on a logical volume (/root or /home as currently written). As currently written, it also assumes that the repository/ies are all located at /home/*/[here] or /home/*/*/[here].
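To illustrate the general idea of running several verifications at once so that more than one core is busy, here is a minimal Python sketch. It is not how timedicer-verify.sh is implemented (that is a shell script); the function names, the worker count and the example paths are assumptions, and how rdiff-backup's exit status should be interpreted is left to the reader.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def verify_one(repository, when="now", tempdir=None, run=subprocess.call):
    """Run one verification and return (repository, exit_status).

    'run' is injectable so the sketch can be exercised without rdiff-backup.
    """
    cmd = ["rdiff-backup"]
    if tempdir:
        cmd += ["--tempdir", tempdir]
    cmd += ["--verify-at-time", when, repository]
    return repository, run(cmd)


def verify_all(repositories, workers=4, **kwargs):
    """Verify several repositories concurrently.

    Threads are enough here: each thread only waits on a child process,
    and the separate rdiff-backup processes can run on different cores.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda repo: verify_one(repo, **kwargs), repositories)
        return dict(results)


if __name__ == "__main__":
    # Hypothetical repository and scratch paths.
    repos = ["/home/alice/backup", "/home/bob/backup"]
    statuses = verify_all(repos, workers=2, tempdir="/mnt/scratch/rdiff-tmp")
    for repo, status in statuses.items():
        print(repo, "exit status", status)
```

With concurrent runs, the temporary location has to hold the temporary data of all workers at once, so the space and speed of that disk matter more.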

Dominic
--
TimeDicer: Free File Recovery from Whenever

On 26/09/2013 04:12, Thomas Harold wrote:
What are some options for speeding up the verification of past increments?

My guess is that the CPU might be a bottleneck for the SHA1 hash 
calculation, so that's something I would check first.

But how does the verify process work?  Does it reconstruct the file in 
memory, or does it use a temporary directory?

...

It seems that unless I define TMP, TMPDIR, or TEMP as environment 
variables, or pass the --tempdir option to rdiff-backup, where Python 
creates the temporary file depends on the operating system.

http://docs.python.org/2/library/tempfile.html
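A small Python sketch of the lookup described in that documentation: tempfile.gettempdir() consults the TMPDIR, TEMP and TMP environment variables (in that order) before falling back to platform-specific locations, and caches the answer in tempfile.tempdir. The helper name effective_tempdir is made up for this example, and the sketch says nothing about how rdiff-backup itself applies --tempdir.

```python
import os
import tempfile


def effective_tempdir(env_tmpdir=None):
    """Report the directory Python's tempfile module would pick.

    If env_tmpdir is given, TMPDIR is set to it for the duration of the
    call only; the previous environment is restored afterwards.
    """
    saved = os.environ.get("TMPDIR")
    try:
        if env_tmpdir is not None:
            os.environ["TMPDIR"] = env_tmpdir
        # gettempdir() caches its result, so clear the cache before asking.
        tempfile.tempdir = None
        return tempfile.gettempdir()
    finally:
        if saved is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved
        tempfile.tempdir = None


if __name__ == "__main__":
    # With no argument this shows the current default for this machine.
    print(effective_tempdir())
```

Note that a directory named in the environment is only used if it exists and is writable; otherwise Python moves on to the next candidate.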

And if the --verify or --verify-at-time options create lots of temporary 
files, and write/read lots of data to the temporary directory, then I 
should probably move that directory to a separate set of spindles.

