Re: [Duplicity-talk] best practices for handling hard links


From: Marc Evans
Subject: Re: [Duplicity-talk] best practices for handling hard links
Date: Sun, 23 Nov 2014 16:18:15 -0500
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Thunderbird/31.2.0

After some digging around I found the following discussion of hardlinks:

http://wiki.zimbra.com/wiki/Hard_links

The script as written was almost useful to me, though it used what
appears to be a Linux-centric form of the find command. I therefore
changed it to work on FreeBSD as well, and added a new mode that
generates a duplicity-oriented exclude list. The modified hardlinks
script is attached here. Using it, I am able to create
duplicity-maintained backups that, when restored, have files hard
linked as they were at the time the backup was performed. (Yes, there
is a race condition between when the hardlink list is generated and
when the duplicity run completes; that is not currently addressed.)
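
For readers without the attachment, here is a minimal sketch of the
general idea in Python. It is not the attached hardlinks script. The
function names, the choice of the first path in sorted order as the copy
to keep, and the relative-path manifest are all assumptions made for
illustration. The exclude list is meant for duplicity's
--exclude-filelist option; the exact path form that option expects may
depend on the duplicity version, so check it before relying on this.

```python
import os
import stat
from collections import defaultdict


def find_hardlink_groups(root):
    """Return lists of paths (relative to root) that share one inode."""
    by_inode = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with more than one link are candidates.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                rel = os.path.relpath(path, root)
                by_inode[(st.st_dev, st.st_ino)].append(rel)
    # A file whose other links live outside root shows up alone; skip it.
    return sorted(sorted(p) for p in by_inode.values() if len(p) > 1)


def plan(groups):
    """Keep the first path of each group; pair every other path with it."""
    return [(paths[0], extra) for paths in groups for extra in paths[1:]]


def write_exclude_list(root, manifest, out_path):
    """Write one excluded path per line (assumed --exclude-filelist input)."""
    with open(out_path, "w") as fh:
        for _keeper, extra in manifest:
            fh.write(os.path.join(root, extra) + "\n")


def relink(restore_root, manifest):
    """After a restore, recreate each excluded path as a hard link."""
    for keeper, extra in manifest:
        target = os.path.join(restore_root, extra)
        if os.path.lexists(target):
            os.remove(target)
        parent = os.path.dirname(target)
        if parent:
            os.makedirs(parent, exist_ok=True)
        os.link(os.path.join(restore_root, keeper), target)
```

The manifest has to be saved alongside the backup so that relink() can
be run after a restore. Like the attached script, this sketch does
nothing about the race between scanning and the backup run itself.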

I have also found two other scripts useful in managing backups via cron.
Attached are gs-backup.sh, which I invoke daily from cron, and
gs-status.sh, which is optionally invoked at the end of each
gs-backup.sh run. The gs-backup.sh script makes some assumptions about
the layout of what you are backing up, which allows some work to
proceed in parallel when multiple cron invocations run at the same
time. I share them here in case others find them useful, if only as a
template for your own scripts. As written they are Google Cloud
oriented, but they could be modified for other storage backends.
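
One way overlapping cron invocations can share work without colliding
is for each run to claim a different top-level directory through a
non-blocking file lock. The Python sketch below illustrates that
pattern only; it is not how gs-backup.sh works, and the claim_next()
name, the one-lock-per-directory layout, and the lock directory are all
hypothetical.

```python
import fcntl
import os


def claim_next(source_root, lock_dir):
    """Return (name, lock_file) for the first unclaimed directory, or None.

    The caller must keep lock_file open while working on the directory;
    closing it (or exiting) releases the claim.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for name in sorted(os.listdir(source_root)):
        if not os.path.isdir(os.path.join(source_root, name)):
            continue
        lock_file = open(os.path.join(lock_dir, name + ".lock"), "w")
        try:
            # Non-blocking exclusive lock: fails at once if already held.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            lock_file.close()
            continue
        return name, lock_file
    return None
```

This only prevents two runs from working on the same directory at the
same moment. It does not record which directories have already been
backed up, so a real script would need a separate completion marker.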

- Marc

On 11/17/14 7:12 PM, Marc Evans wrote:
> Yeah, I did find this thread:
> 
> http://lists.nongnu.org/archive/html/duplicity-talk/2008-08/msg00044.html
> 
> but alas the zip file it refers to is no longer there, and the wayback
> machine does not seem to have it either. Any pack rats out there, or
> should I create my own?
> 
> It does surprise me that this was a known limitation back in 2008 and
> that it may still exist in the code base today.
> 
> - Marc
> 
> On 11/17/14 1:43 PM, Kenneth Loafman wrote:
>> A while back someone on the mail list wrote a script to help with this. 
>> Some searching may be needed. 
>>
>> ...Ken
>>
>> On Nov 17, 2014 6:12 AM, "Marc Evans" <address@hidden
>> <mailto:address@hidden>> wrote:
>>
>>     Hello,
>>
>>     I am trying to determine the current state of hard link handling in
>>     duplicity. Assuming my belief is correct that hard links are not
>>     preserved by duplicity, I would like to understand what the current
>>     best practices are.
>>
>>     Background: I have about 26TB of raw data that I am backing up to the
>>     cloud via duplicity. Once it is encrypted, etc, it consumes about 48TB
>>     in the cloud, which includes 1 full backup plus daily incrementals
>>     spanning a 1 month period. Within the data are a considerable number
>>     of highly compressed files as well as thousands of hard-linked files.
>>     Experimentation shows that the hard-linked files are getting stored
>>     multiple times, and further that when restored the hard links are not
>>     preserved.
>>
>>     Based on my reading of the mailing list archives my observations seem to
>>     be confirmed, though that is in years-old threads. I see in the code
>>     various pieces that are hard link oriented though. I also see discussion
>>     of special casing hard link handling at duplicity invocation, such that
>>     excludes are used to ensure that only one copy is actually backed up and
>>     a hard link manifest is generated that can be used by scripts to restore
>>     hard links.
>>
>>     Given the above, what is the state of hard link handling and what are
>>     current best practices for dealing with them?
>>
>>     Thanks in advance - Marc
>>
>>     _______________________________________________
>>     Duplicity-talk mailing list
>>     address@hidden <mailto:address@hidden>
>>     https://lists.nongnu.org/mailman/listinfo/duplicity-talk

Attachment: hardlinks
Description: Text document

Attachment: gs-backup.sh
Description: Bourne shell script

Attachment: gs-status.sh
Description: Bourne shell script
