Re: [rdiff-backup-users] Maintenance.

From: Alvin Starr
Subject: Re: [rdiff-backup-users] Maintenance.
Date: Fri, 06 Dec 2013 08:14:12 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.1.0

On 12/05/2013 11:04 PM, Edward Ned Harvey (rdiff-backup) wrote:
>> From: rdiff-backup-users-bounces+rdiff-address@hidden
>> [mailto:rdiff-backup-users-address@hidden] On Behalf Of Alvin

>> Clearly there are hundreds of better ways to back up a sparse file.
>>
>> The point is that I sort of expected rdiff-backup to be as smart as tar
>> and rsync in that respect.
> I certainly haven't had any good experiences backing up (or even copying)
> sparse files with tar.  Yes, I've done it, but it's not supported by
> default (you have to add the --sparse switch), and even with that switch
> I wouldn't call it a good experience.  No matter how you cut it, you have
> to read the entire sparse file (including the empty space); the question
> is whether sparseness is preserved on the destination.  There is
> unfortunately no flag or attribute you can check on a file to see whether
> it's sparse; your only choice is to read every file and optionally apply
> sparseness to the destination.  And since you have no good way to know
> whether the source is sparse, you just unconditionally make every file on
> the destination sparse.
Well, tar was not a stellar example, but it does handle sparse files.

If you stat() a file you get back the apparent size plus the block count
and block size.  Multiply the block count by the block size; if the result
is significantly less than the apparent size, you have a sparse file.
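That check is a one-liner in Python.  A sketch (`is_sparse` is a hypothetical helper name; note that POSIX defines st_blocks in 512-byte units, regardless of st_blksize):

```python
import os

def is_sparse(path, slack=0.9):
    """Heuristic sparse-file check: st_blocks counts 512-byte blocks
    actually allocated on disk (POSIX fixes the unit at 512 bytes,
    independent of st_blksize).  If the allocated bytes fall well
    short of the apparent size, the file has holes."""
    st = os.stat(path)
    return st.st_blocks * 512 < st.st_size * slack
```

The `slack` factor guards against small filesystem-dependent differences between allocation and apparent size on fully dense files.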

It's true that reading a sparse file gets you lots of zero buffers, and compression will make them really small.  Since the OS does not actually touch the disk for a hole, those zero-filled buffers are also fast to read.
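To put a number on that compression point (a quick illustration; exact ratios vary by compressor and level):

```python
import zlib

# A megabyte of zeros, as read() returns when it crosses a hole
# in a sparse file.
data = bytes(1024 * 1024)

compressed = zlib.compress(data)
ratio = len(data) / len(compressed)
# A run of zeros is about as compressible as data gets; the
# ratio here is on the order of a thousand to one.
```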

Always making the destination files sparse may not be a bad idea.
When it comes to the disk write, rdiff-backup could simply skip writing an all-zero block and seek to the next write position instead.
That only works at file creation time, though.
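The write-side trick can be sketched in a few lines of Python (this is an illustrative helper, not rdiff-backup's actual code; `copy_sparse` is a hypothetical name): seek over all-zero blocks instead of writing them, then truncate at the end so a trailing hole still counts toward the file size.

```python
import os

def copy_sparse(src, dst, blocksize=65536):
    """Copy src to dst, seeking over all-zero blocks so the
    destination stays sparse wherever the source data is zeros."""
    zero = bytes(blocksize)
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(blocksize)
            if not buf:
                break
            if buf == zero[:len(buf)]:
                # Leave a hole instead of writing zeros.
                fout.seek(len(buf), os.SEEK_CUR)
            else:
                fout.write(buf)
        # If the file ended in a hole, the seek alone does not
        # extend the file; truncate() at the current position does.
        fout.truncate()
```

Whether the holes survive depends on the destination filesystem; on one that doesn't support sparse files, the copy is still byte-for-byte correct, just fully allocated.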

The side effect is that a restore will occupy less disk than the original in just about every case where files contain runs of zeros.  That could be a problem when someone really does require the file to be pre-allocated at creation time.

> For large sparse files, as suggested, it's much better to back up with a
> tool that recognizes the internal contents of the file: something that
> can read the structure and copy out only the useful parts.  Not to
> mention, if it's a database file, it's important to ensure data
> integrity.  You don't want to be reading byte # 178,343,543,344 with
> 877,344,563,233 to go, and some other process writes to the file, thus
> making all of your work so far invalid.
>
> Or use compression.  Cuz guess what, a large sequence of 0's is highly
> compressible.   ;-)
Any system backup runs the risk of files changing underneath it.
Someone editing a text file while the backup runs risks getting crap in the backed-up copy, which is a good chunk of the reason backups are done during quiet hours.

Either way, a backup is not guaranteed to return "application correct" data; it is only guaranteed to return the data that was in each block on disk when it was read, and 99%+ of the time that is OK.  What a backup should not do is break the target system when the file structure is normal, and sparse files are a normal feature of every Unix OS I know of.

Alvin Starr                   ||   voice: (905)513-7688
Netvel Inc.                   ||   Cell:  (416)806-0133
address@hidden              ||
