[rdiff-backup-users] Re: [librsync-users] more info on 25gig files

From: Ben Escoto
Subject: [rdiff-backup-users] Re: [librsync-users] more info on 25gig files
Date: Thu, 5 May 2005 20:59:37 -0700

>>>>> Donovan Baarda <address@hidden>
>>>>> wrote the following on Fri, 06 May 2005 12:22:16 +1000

> The block size will have a significant impact on speed, as it
> "walks" through the file faster when there are hits. I would
> recommend using a blocksize that is the square-root of the file
> size... ie
> 1M file, 1K blocksize
> 1G file, 32K blocksize
> 4G file, 64K blocksize
> 16G file, 128K blocksize
> 64G file, 256K blocksize
> (this also reduces the probability of blocksum collisions, which,
> though very unlikely, will cause corruption).

Ahh, rdiff-backup chooses the blocksize to be approximately 1/2000th of
the length of the file, with a minimum of 512 bytes (see find_blocksize
in Rdiff.py).  So if the file is 25 gigs, perhaps Clint could try
running rdiff with a blocksize of 13421568 (roughly 25 * 2^30 / 2000).
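
For illustration, the 1/2000th rule can be sketched as below.  This is
not the find_blocksize code from Rdiff.py; the function name is made up,
and rounding down to a multiple of 512 is an assumption chosen because
it reproduces the 13421568 figure for a 25 gig file.

```python
def blocksize_by_fraction(file_len, divisor=2000, minimum=512):
    """Hypothetical helper, not rdiff-backup's own function.

    Returns about file_len / divisor, rounded down to a multiple of
    `minimum` (an assumption), and never less than `minimum`.
    """
    blocks = file_len // (divisor * minimum)
    return max(minimum, blocks * minimum)


# 25 gigs, taking a gig as 2**30 bytes
print(blocksize_by_fraction(25 * 2**30))  # prints 13421568
```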

But I'm not sure how I ever came up with that formula, and probably
there was no sound reasoning behind it.  Should I switch to the
square-root rule, with a minimum blocksize of 512 and the blocksize
always a multiple of 512?  I remember there being some discussion
about this, but I probably never updated rdiff-backup with the correct
function.
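
A minimal sketch of the square-root rule with those two constraints
might look like the following.  The function name is hypothetical, this
is not code from rdiff-backup, and rounding down (rather than to the
nearest multiple) is an assumption.  It needs Python 3.8 or later for
math.isqrt.

```python
import math


def blocksize_by_sqrt(file_len, multiple=512):
    """Hypothetical helper: integer square root of the file length,
    rounded down to a multiple of `multiple`, with `multiple` as the
    minimum blocksize."""
    root = math.isqrt(file_len)
    return max(multiple, (root // multiple) * multiple)


# File sizes from the quoted table, plus the 25 gig case
for label, size in [("1M", 2**20), ("1G", 2**30), ("4G", 2**32),
                    ("16G", 2**34), ("64G", 2**36), ("25G", 25 * 2**30)]:
    print(label, blocksize_by_sqrt(size))
```

For the 25 gig file this gives 163840 (160K), far smaller than the
13421568 produced by the 1/2000th rule.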

> I _think_ rdiff-backup uses an extension module that hooks into
> librsync itself. It is possible the rdiff-backup extension module is
> not correctly compiled with 64 bit support. I suspect the whole
> python interpreter would need to be compiled with 64 bit support.

Yes, that is correct, see _librsyncmodule.c.  If you want to test
rdiff-backup's librsync stuff separately, you may want to check out
python-rdiff, which is a simple port of rdiff to rdiff-backup's
librsync extension module.  You can download it from the rdiff-backup
CVS at:


Note the blocksize there is fixed, so you'll have to edit the top to
test different sizes.  Also I haven't tested it recently, but it looks
like it should still work.

Ben Escoto

