rdiff-backup-users


From: Ben Escoto
Subject: Re: [rdiff-backup-users] Some thoughts and questions for a "large" rdiff-backup setup.
Date: Sat, 16 Aug 2003 01:41:47 -0700

>>>>> "EF" == Erik Forsberg <address@hidden>
>>>>> wrote the following on Thu, 14 Aug 2003 19:49:49 +0200

  EF> Hi!  I'm thinking about using rdiff-backup to implement a backup
  EF> solution for the Academic Computer Society I'm a member
  EF> of. We're talking 500+ members, each with their own home
  EF> directory, a mailserver with mailspool, web server with webdisk
  EF> and a bunch of other disks that need backup. All members have
  EF> Unix shell access on a variety of Hardware/Operating system
  EF> combinations (Solaris, Linux, AIX, HP-UX, Tru64, UNICOS, you
  EF> name it :)  ).
    ...
  EF> My workaround here could be to run rdiff-backup once for each
  EF> user, backing to a directory where only the user can
  EF> read. Comments on this?  I have to admit I'm not *that* happy
  EF> about having to run 500+ rdiff-backups each night, each as its
  EF> own user. Trying to parallelize it could be a nightmare. Better
  EF> ideas?

I have no experience running a system like this, but since no one else
has responded I'll add my 2c.  It seems to be a good idea to run a
separate session for each user.  That way older increments can be
removed on a user-by-user basis.  Also, if there is a problem (for
instance, rdiff-backup apparently had trouble handling sockets on one
of your systems) it will only affect one user.

If rdiff-backup takes 2 seconds of overhead to start a new session,
that would still only be about 17 minutes (1000 seconds) for 500
sessions.  It probably wouldn't be a good idea to run all 500 at the
same time, but running two or three at once may be faster than
running one at a time; it probably depends on the system.
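
A minimal sketch of the idea above: estimate the startup overhead, then
run one session per user with only a few running at once.  The /home
and /backup paths, the function names, and the default of 3 workers are
assumptions for illustration, not anything rdiff-backup prescribes.  The
sketch also ignores running each session as its own user.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def startup_overhead_minutes(sessions, seconds_per_session, workers=1):
    """Estimate total startup overhead in minutes.

    Assumes the overhead divides evenly across workers, which is optimistic.
    """
    return sessions * seconds_per_session / workers / 60


def backup_command(user, home_root="/home", dest_root="/backup"):
    """Build the argument list for one user's session (paths are hypothetical)."""
    return ["rdiff-backup", f"{home_root}/{user}", f"{dest_root}/{user}"]


def run_all(users, workers=3, runner=subprocess.call):
    """Run one session per user, at most `workers` at a time.

    Returns a dict mapping each user to the exit code of that user's session.
    """
    users = list(users)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda user: runner(backup_command(user)), users)
        return dict(zip(users, codes))
```

The runner argument is there so the loop can be tried out with a dummy
function before pointing it at real data.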

Also you may not want to let the users write to the rdiff-backup
directory, since they might mess it up.

  EF> On a different side, wouldn't it be nice if rdiff-backup could
  EF> auto-clean its destination directory when it's getting
  EF> full? That is, it would be nice if you could specify to
  EF> rdiff-backup that it should use up to a specific amount of
  EF> diskspace in the destination directory, and if a new backup
  EF> doesn't fit in, it should try cleaning one day of backups.

  EF> I guess this isn't that easy to implement, especially since you
  EF> don't know how much space a new backup will occupy. Ideas on
  EF> solving this problem?

Yes, if anyone has any ideas about this problem, let me know.  Right
now the only really convenient way to clean up old increments is to
run rdiff-backup with, for instance, --remove-older-than 30D (for 30
days) every once in a while.  There is no option to remove just the
number of increments that would let you complete the current backup
(this would actually be very hard to add).
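
As a sketch, a periodic cleanup job could build the command like this.
The function name and the destination path in the usage line are
hypothetical.  The optional --force flag is included on the assumption
that, as the man page describes, rdiff-backup refuses to remove more
than one increment at a time without it.

```python
def prune_command(dest, age="30D", force=False):
    """Build an rdiff-backup command that removes increments older than `age`."""
    cmd = ["rdiff-backup", "--remove-older-than", age]
    if force:
        # Assumption: needed when several increments would be removed at once.
        cmd.append("--force")
    return cmd + [dest]


# Hypothetical usage: pass the result to subprocess.call() from a cron job.
weekly = prune_command("/backup/alice")
```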

Another thing to watch out for is running out of space.  rdiff-backup
is supposed to fail gracefully (as in, if it runs out of space it acts
as if the current session never existed), but subsequent sessions will
also fail.  I'm not sure what the best way to handle this is, but it
is probably a good idea to check rdiff-backup's exit code and trigger
some warning if it fails to complete (non-zero code).
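
A minimal sketch of such a check, assuming only that a non-zero exit
code means the session did not complete.  The function name and the
warning text are made up for illustration; a real setup would probably
send mail instead of printing to stderr.

```python
import subprocess
import sys


def run_and_check(cmd, runner=subprocess.call, warn=None):
    """Run cmd; return True on exit code 0, otherwise warn and return False."""
    if warn is None:
        # Default warning channel (an assumption): a line on stderr.
        warn = lambda msg: print(msg, file=sys.stderr)
    code = runner(cmd)
    if code != 0:
        warn(f"rdiff-backup failed with exit code {code}: {' '.join(cmd)}")
        return False
    return True
```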

  EF> Another feature that would be really nice on this system would
  EF> be to have .nsr-lookalike-files. For those of you not familiar
  EF> with Networker, .nsr files are a way to specify how to handle a
  EF> specific file, a specific directory or a specific directory and
  EF> all its subdirectories. By putting a file named .nsr in a
  EF> directory, you can for example say that the subdirectory 'trash'
  EF> never should be copied to the backup, or that a specific logfile
  EF> should be ignored. Are there any plans for such functionality in
  EF> rdiff-backup? How hard would it be to implement? (We might be
  EF> able to help, we love Python :)  ).

You could write a script to check for $HOME/.rdiff-backup-excludes.
If it exists, rdiff-backup is run with --exclude-filelist
$HOME/.rdiff-backup-excludes.  This is actually pretty flexible since
includes can also be in an exclude file (see man page).
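
A sketch of what such a wrapper script might do when building the
command for one user.  The function name and the destination layout
are assumptions; only --exclude-filelist and the file name come from
the suggestion above.

```python
import os


def user_backup_command(home, dest, excludes_name=".rdiff-backup-excludes"):
    """Build the rdiff-backup command for one home directory.

    Adds --exclude-filelist only if the user has created the excludes file.
    """
    cmd = ["rdiff-backup"]
    excludes = os.path.join(home, excludes_name)
    if os.path.isfile(excludes):
        cmd += ["--exclude-filelist", excludes]
    return cmd + [home, dest]
```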

  EF> Another idea I had for this setup was to set a quota on the
  EF> amount of backup space a specific user could use, and then let
  EF> the user him/herself choose how many days he/she wants the
  EF> backup to cover by adapting what directories should be copied
  EF> (using a mechanism such as the one above, the
  EF> .nsr-file-lookalike one).

Yes, probably a good idea: let the users decide for themselves (and
give them enough rope to hang themselves)...


-- 
Ben Escoto
