rdiff-backup-users

Re: [rdiff-backup-users] How much metadata to store


From: dean gaudet
Subject: Re: [rdiff-backup-users] How much metadata to store
Date: Mon, 2 Dec 2002 15:08:39 -0800 (PST)

On Mon, 2 Dec 2002, Ben Escoto wrote:

> Is anyone suffering because stat() operations on the mirror side are
> taking too long?

i noticed improved performance by enabling noatime,nodiratime in the mount
options for the mirror fs... but this was ages ago with 0.6.x or 0.7.x i
forget which.  these options eliminate disk writes to update the atimes on
files/directories which are accessed -- and directories are considered
accessed by opendir().

i suspect that the real benefit is in not having to traverse the mirror
filesystem to get the filelist...

and even better would be if you could avoid recalculating all the
signatures and retransmitting them.  it seems like you could keep a copy
of the mirror metadata on the mirror and the primary, and use a signature
comparison of the two at the beginning of the backup to speed up the file
selection.  this would help a mirror scale to hundreds of primaries (i
suspect that the code today won't scale because the mirror has to parse
all of its files for every primary it has a mirror of).
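
a rough sketch of the signature comparison idea, in python.  everything here is an assumption for illustration (the function names, using size and mtime as the metadata, sha1 as the digest) -- it is not how rdiff-backup stores or compares anything today.  each side hashes its copy of the mirror metadata; if the two digests match, the primary can pick changed files from its cached copy without the mirror walking its tree.

```python
import hashlib
import os

def snapshot(root):
    """Return {relative path: (size, mtime_ns)} for every non-directory entry under root."""
    meta = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            meta[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return meta

def signature(meta):
    """Hash the snapshot in sorted path order so both sides get the same digest."""
    h = hashlib.sha1()
    for path in sorted(meta):
        size, mtime_ns = meta[path]
        line = f"{path}\0{size}\0{mtime_ns}\n"
        h.update(line.encode("utf-8", "surrogateescape"))
    return h.hexdigest()

def changed_paths(cached, current):
    """Paths that are new, removed, or whose (size, mtime_ns) differ from the cached copy."""
    return sorted(p for p in cached.keys() | current.keys()
                  if cached.get(p) != current.get(p))
```

at the start of a backup the primary would send signature(cached) and the mirror would compare it to the digest of its own stored copy.  on a match, changed_paths(cached, snapshot(primary_root)) gives the file selection locally; on a mismatch you fall back to the full comparison.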


>     About the compressed mirror idea, this doesn't seem to necessarily
> need extra metadata.  If we assumed a file whose mtime was the same
> hasn't changed, we could do this without extra metadata, just by
> setting the mtime of the compressed mirror copy to the original file's
> mtime.  But saving other information does let us compare file size.

fyi -- future filesystems will have nanosecond resolution mtime -- and
will have better guarantees that mtime will change every time there is a
committed file modification.  logging filesystems may already have this
guarantee, i don't want to make any claims about which do and don't
though.

it'd be pretty cool to do a filesystem extension which allows you to store
an md5/sha1 of the file as an extended attribute which is removed whenever
the file is modified :)


>     Also when we start compressing the mirror files, it seems to take
> us away from the whole mirror concept.  Why not then just use
> something like duplicity, which already compresses everything

it sure is convenient to have all the files available in the mirror and to
push the compression/packing problems onto the filesystem.  (*)

i haven't switched to duplicity because i find i need to peruse files in
my mirror frequently enough that duplicity would be too much extra effort.

-dean

(*) i'd even extend this to encryption.  but i'm not sure there are any
really secure encrypted filesystems on free unix yet... on linux, using
the encrypted loopback mount is not secure for large filesystems because
such a filesystem has a vast amount of predictable data (consider that a
typical linux install has about 1GB of exe/lib/etc. data which is easy to
predict), which allows "known-plaintext" attacks against the cipher.



