Subject: Re: [rdiff-backup-users] What are the bottlenecks in --verify? (Or how to speed up verification?)
Date: Thu, 03 Oct 2013 14:52:25 +0100
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Thunderbird/24.0
There have been discussions about verification speeds and issues here before. I think you are right that the CPU core is a bottleneck, as rdiff-backup only uses one core when running a verification.
Verifications do use temporary space and can use a lot of it even though the temporary files never seem to be visible in the filesystem. I would advise an explicit --tempdir setting, and a different spindle with lots of space (and speed) would be ideal.
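To make the --tempdir advice concrete, here is a minimal Python sketch that assembles the command line for one verification run. The repository path and scratch directory are hypothetical placeholders, and the helper names are mine, not part of rdiff-backup; only the --tempdir and --verify-at-time options come from rdiff-backup itself.

```python
import subprocess


def build_verify_command(repo, when="now", tempdir=None):
    """Return the argv list for one rdiff-backup verification run.

    repo    -- path to the rdiff-backup repository (hypothetical example)
    when    -- time spec handed to --verify-at-time
    tempdir -- optional scratch directory handed to --tempdir
    """
    cmd = ["rdiff-backup"]
    if tempdir is not None:
        # Point temporary files at a roomy, fast disk instead of the default.
        cmd += ["--tempdir", tempdir]
    cmd += ["--verify-at-time", when, repo]
    return cmd


def run_verify(repo, when="now", tempdir=None):
    """Run the verification and return rdiff-backup's exit status."""
    return subprocess.call(build_verify_command(repo, when, tempdir))


if __name__ == "__main__":
    # Both paths below are placeholders; substitute your own.
    print(" ".join(build_verify_command("/home/alice/backup",
                                        when="3D",
                                        tempdir="/mnt/scratch/rdiff-tmp")))
```

Running the file only prints the command; call run_verify() to execute it.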
You might find it helpful to see or use my timedicer-verify.sh script: http://www.timedicer.co.uk/programs/help/timedicer-verify.sh.php. It is a wrapper for rdiff-backup --verify-at-time that remembers previous successful verifications and runs multiple verifications concurrently (thus using multiple cores efficiently). It lets you specify a temporary location (passed to rdiff-backup as --tempdir). It can also make and use a temporary LVM snapshot as the source, which allows a verification session to continue while the underlying repository is being updated, but this only works if your repository/ies are on a logical volume (/root or /home as currently written). As currently written, it also assumes that the repository/ies are all located at /home/*/[here] or /home/*/*/[here].
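The idea of running several verifications at once can be sketched in a few lines of Python. This is not how timedicer-verify.sh is implemented (that is a shell script); it is only an illustration of the principle, with hypothetical function names, and it assumes that a zero exit status from rdiff-backup means the verification succeeded. Each rdiff-backup run is a separate process, so a small thread pool is enough to keep several cores busy.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def verify_repo(repo, when="now", tempdir=None, runner=subprocess.call):
    """Verify one repository; return (repo, exit_status).

    runner is injectable so the function can be tried without rdiff-backup.
    """
    cmd = ["rdiff-backup"]
    if tempdir:
        cmd += ["--tempdir", tempdir]
    cmd += ["--verify-at-time", when, repo]
    return repo, runner(cmd)


def verify_all(repos, workers=4, **kwargs):
    """Verify several repositories concurrently, one process per repository.

    Returns a dict mapping each repository path to its exit status.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda r: verify_repo(r, **kwargs), repos)
        return dict(results)


if __name__ == "__main__":
    # Dry run with a stand-in runner that prints the command and reports success.
    def fake_runner(cmd):
        print("would run:", " ".join(cmd))
        return 0

    # The repository paths and scratch directory are placeholders.
    print(verify_all(["/home/alice/backup", "/home/bob/backup"],
                     workers=2, tempdir="/mnt/scratch", runner=fake_runner))
```

If all the concurrent runs share one temporary directory, that disk sees the combined I/O of every verification, so the number of workers is worth tuning against the speed of that spindle as well as the core count.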
TimeDicer: Free File Recovery from Whenever
On 26/09/2013 04:12, Thomas Harold wrote:
What are some options for speeding up the verification of past increments? My guess is that the CPU might be a bottleneck for the SHA1 hash calculation, so that's something I would check first. But how does the verify process work? Does it reconstruct the file in memory, or does it use a temporary directory? ... It seems like if I don't have TMP, TMPDIR, or TEMP defined as environment variables, it is operating system dependent on where Python creates the temporary file. Or unless I pass the --tempdir option to rdiff-backup. http://docs.python.org/2/library/tempfile.html And if the --verify or --verify-at-time options create lots of temporary files, and write/read lots of data to the temporary directory, then I should probably move that directory to a separate set of spindles.
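On the question of where Python puts temporary files when nothing is specified: the tempfile module checks the TMPDIR, TEMP and TMP environment variables in that order, then falls back to platform-specific locations. The short sketch below shows which directory Python would pick under a given set of variables. The helper name is hypothetical, and this only shows Python's own default; whether rdiff-backup follows it exactly when --tempdir is absent is an assumption, not something checked here.

```python
import os
import tempfile

_TEMP_VARS = ("TMPDIR", "TEMP", "TMP")


def tempdir_for(env):
    """Return the directory tempfile would choose with only `env` set.

    env maps any of TMPDIR/TEMP/TMP to a directory; the process environment
    and tempfile's cached default are restored afterwards.
    """
    saved_env = {name: os.environ.get(name) for name in _TEMP_VARS}
    saved_default = tempfile.tempdir
    try:
        for name in _TEMP_VARS:
            os.environ.pop(name, None)
        os.environ.update(env)
        tempfile.tempdir = None  # drop the cached value so it is recomputed
        return tempfile.gettempdir()
    finally:
        for name, value in saved_env.items():
            if value is None:
                os.environ.pop(name, None)
            else:
                os.environ[name] = value
        tempfile.tempdir = saved_default


if __name__ == "__main__":
    print("no variables set:", tempdir_for({}))
    scratch = tempfile.mkdtemp()  # stands in for a directory on another spindle
    print("TMPDIR set:      ", tempdir_for({"TMPDIR": scratch}))
    os.rmdir(scratch)
```

A candidate directory is only used if it exists and is writable, so a TMPDIR that points at a missing path is silently skipped.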