rdiff-backup-users

Re: [rdiff-backup-users] Proposal to fix long filenames


From: Ben Escoto
Subject: Re: [rdiff-backup-users] Proposal to fix long filenames
Date: Fri, 18 Nov 2005 23:40:45 -0600

>>>>> dean gaudet <address@hidden>
>>>>> wrote the following on Sun, 13 Nov 2005 17:48:11 -0800 (PST)
> you know i'm not so sure the goals conflict...
> 
> even if you wanted to do something like FUSE as you suggest you're
> going to need to generate the already-patched blocks of older files
> ... so you'd probably end up keeping a cache around at the FUSE
> level.
> 
> the only real optimisation i can think of is to store all the deltas
> for a particular object together -- so that you only need to go to
> one place to rebuild whatever ancestor you're interested in.  but
> realistically even if you concatenate them together the filesystem
> isn't generally going to be able to avoid fragmentation...

I thought about this in some detail when I was thinking about how to
redesign duplicity.  (Unfortunately I never ended up having enough
time to actually do that.)

You mention some optimizations we could use to remove some seeks, but
it seems a bigger problem is that, as it is, random access doesn't
even have the right order of complexity.

For instance, listing the directory as it is requires that the entire
mirror_metadata file (which can be huge) be decompressed and parsed.
Reading the last block from an older compressed file may require that
all of the snapshots be decompressed.

Caching could help a bit with this, but I think it's a lost cause when
basic operations that applications think will take constant time
actually take linear time.
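
To make the complexity point concrete, here is a minimal sketch in Python
(not rdiff-backup code) of what "read the last block" costs on a gzip
stream. The function name read_last_block and the sizes are made up for
illustration; the only claim is the general one that a plain gzip stream
has to be decompressed from the start to reach its end.

```python
import gzip
import zlib


def read_last_block(gz_bytes, block_size=4096, chunk_size=1024):
    """Return (last_block, bytes_decompressed) for a gzip stream.

    A plain gzip stream has no index, so reaching the final block means
    decompressing everything that precedes it.
    """
    decomp = zlib.decompressobj(31)  # 31 = 16 + MAX_WBITS: expect a gzip header
    tail = b""
    total = 0
    for start in range(0, len(gz_bytes), chunk_size):
        out = decomp.decompress(gz_bytes[start:start + chunk_size])
        total += len(out)
        tail = (tail + out)[-block_size:]  # keep only the most recent block
    out = decomp.flush()
    total += len(out)
    tail = (tail + out)[-block_size:]
    return tail, total


# Hypothetical 1 MiB file standing in for a compressed snapshot.
payload = bytes(range(256)) * 4096
compressed = gzip.compress(payload)
block, touched = read_last_block(compressed)
```

Here touched equals the full uncompressed size even though only the last
4096 bytes were wanted, which is the linear-time behaviour described
above.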

This being said, someone could take the current rdiff-backup format
and put together a FUSE interface that would be useful for many
applications.  I would just have the FUSE mount take the
mirror_metadata file and rebuild it with traditional directory-inode
pointers, and cache file blocks as you said.
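
A rough sketch of that rebuild step, in Python. The record layout used
here ("File <path>" lines followed by indented attribute lines) is a
simplified assumption about mirror_metadata, and build_directory_index
is a hypothetical helper, not part of rdiff-backup. A real FUSE layer
would also need the attributes and the increment history.

```python
import io
import posixpath
from collections import defaultdict


def build_directory_index(lines):
    """Map each directory path to the sorted names of its direct children.

    Assumes one "File <path>" line per entry; indented attribute lines
    are skipped. This is a single linear pass done once at mount time.
    """
    children = defaultdict(set)
    for line in lines:
        if not line.startswith("File "):
            continue  # attribute line such as "  Type reg"
        path = line[len("File "):].rstrip("\n")
        if path == ".":
            continue  # the backup root has no parent entry
        parent, name = posixpath.split(path)
        children[parent or "."].add(name)
    return {d: sorted(names) for d, names in children.items()}


# Made-up sample in the assumed layout.
SAMPLE = """File .
  Type dir
File etc
  Type dir
File etc/passwd
  Type reg
  Size 1234
File home
  Type dir
File home/ben
  Type dir
File home/ben/notes.txt
  Type reg
  Size 42
"""

index = build_directory_index(io.StringIO(SAMPLE))
```

After the one-time pass, listing a directory is a dictionary lookup
(index["home/ben"]) instead of a scan of the whole metadata file.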

> - my largest backup is throwing away at least 0.6GB disk space just
>   for the tail fragments on all the rdiff-backup-data inodes
>   ... this is for 28 days of increments on a 1.5M inode fs -- there
>   are an additional 0.5M inodes in rdiff-backup-data, of which 0.3M
>   have a non-zero size, and so on avg waste 2048 bytes (4KiB
>   blocks).

Yeah, rdiff-backup uses a lot of small files.  Eventually though most
filesystems should get tail-packing like reiserfs, so it won't be as
big of an issue.
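
The 0.6GB figure quoted above can be checked with a short calculation.
This takes the numbers from the quoted message (0.3M non-empty files,
4KiB blocks) and assumes, as the message does, that the last block of
each file is on average half empty.

```python
BLOCK_SIZE = 4096            # 4KiB filesystem blocks, from the quoted message
nonempty_files = 300_000     # 0.3M non-empty inodes in rdiff-backup-data
avg_waste = BLOCK_SIZE // 2  # assumption: tail block is half empty on average

wasted_bytes = nonempty_files * avg_waste
wasted_gb = wasted_bytes / 10**9   # decimal gigabytes
wasted_gib = wasted_bytes / 2**30  # binary gibibytes

print(f"{wasted_bytes} bytes = {wasted_gb:.2f} GB = {wasted_gib:.2f} GiB")
```

That comes to 614,400,000 bytes, or about 0.61 GB, consistent with the
"at least 0.6GB" estimate.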


-- 
Ben Escoto


