From: Roman Yepishev
Subject: Re: [Duplicity-talk] Recreates and uploads already uploaded volume upon restart?
Date: Sun, 17 May 2015 09:15:47 -0400
Aaand I found exactly why this is happening:
https://bugs.launchpad.net/deja-dup/+bug/487720
Restore fails with "Invalid data - SHA1 hash mismatch"
From bin/duplicity:
"""
# We start one volume back in case we weren't able to finish writing
# the most recent block. Actually checking if we did (via hash) would
# involve downloading the block. Easier to just redo one block.
self.start_vol = max(len(last_backup) - 1, 0)
"""
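The arithmetic in that snippet can be shown standalone (a minimal sketch; `restart_volume` and its list argument are illustrative names, not duplicity's actual types):

```python
def restart_volume(uploaded_volumes):
    """Return the 0-based index of the volume to re-create on restart.

    Duplicity steps one volume back because the last one may have been
    only partially written; verifying it via hash would mean downloading
    it, so it is simply redone.
    """
    return max(len(uploaded_volumes) - 1, 0)

# Two volumes were seen on the backend -> the second one (index 1) is
# re-created and re-uploaded, even if it had completed successfully.
print(restart_volume(["vol1", "vol2"]))  # -> 1
print(restart_volume([]))                # -> 0
```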
Since MediaFire upload operations are atomic (a file is either uploaded
completely or not at all), redoing that block introduces a sizable
overhead when volumes are 500+ MiB.
So at this point I am wondering: should backends be required to
implement atomic uploads, instead of the duplicity core working around
backend issues?
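For comparison, the atomic-upload idea can be approximated with the usual write-to-temp-then-rename pattern. This is a local-filesystem sketch only (`atomic_put` is a hypothetical name, not the duplicity backend API or the MediaFire SDK):

```python
import os
import tempfile

def atomic_put(data: bytes, dest_path: str) -> None:
    """Write data so dest_path either appears complete or not at all.

    Stage into a temp file in the same directory, then rename;
    os.replace() is atomic on POSIX filesystems, so readers never
    observe a half-written file at dest_path.
    """
    directory = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure bytes hit the disk first
        os.replace(tmp_path, dest_path)  # atomic: all-or-nothing
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial staging file
        raise
```

With a guarantee like this, a restarted backup could trust that every volume already present on the server is complete, and resume at the next volume instead of re-creating the last one.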
On Sat, 2015-05-16 at 09:20 -0400, Roman Yepishev wrote:
> Hello,
>
> Using duplicity 0.6.25 on Fedora 21.
>
> I am currently working on a MediaFire backend[0] and I found that when
> a backup fails to upload, duplicity re-creates and re-uploads an already
> uploaded volume. For large volumes this increases the backup time. Since
> the re-created archive is not exactly the same as the one already
> uploaded (I suspect the file order/metadata differs), it does not match
> the checksum on the server, so the server can't just tell the client
> "yeah, I already have that, no need to reupload".
>
> More logs are available at [1], where I reproduced the same issue with
> the file backend, but the interesting lines are:
>
> ---------------------------------------------------------------------
>
> Writing
> /home/rye/tmp/duplicity/1/duplicity-full.20150516T130024Z.vol1.difftar.gpg
> AsyncScheduler: task completed successfully
> Processed volume 1
> ...
> AsyncScheduler: running task synchronously (asynchronicity disabled)
> Writing
> /home/rye/tmp/duplicity/1/duplicity-full.20150516T130024Z.vol2.difftar.gpg
> Processed volume 2
> ----------------------------------------------------------------------
>
> I press Ctrl+C to simulate a backend upload failure, as sometimes
> happens with MediaFire.
>
> ----------------------------------------------------------------------
> ...
> ^CReleasing lockfile <lockfile.linklockfile.LinkLockFile instance at
> 0x7f261c9e25f0>
> INT intercepted...exiting.
>
> -----------------------------------------------------------------------
>
> So I see vol2 in the temporary folder and I expect the subsequent
> upload to start from vol3; however:
>
> ------------------------------------------------------------------
>
> Found primary backup chain with matching signature chain:
> -------------------------
> Chain start time: Sat May 16 09:00:24 2015
> Chain end time: Sat May 16 09:00:24 2015
> Number of contained backup sets: 1
> Total number of contained volumes: 2
> Type of backup set: Time: Num volumes:
> Full Sat May 16 09:00:24 2015 2
> -------------------------
> No orphaned or incomplete backup sets found.
> RESTART: Volumes 2 to 2 failed to upload before termination.
> Restarting backup at volume 2.
> ...
>
> Restarting after volume 1, file 2006/01/01/100_3273.JPG, block 7
> ...
> Writing
> /home/rye/tmp/duplicity/1/duplicity-full.20150516T130024Z.vol2.difftar.gpg
>
> ------------------------------------------------------------------
>
> I was not able to find a relevant bug report on Launchpad or the
> mailing list, so I'd like to find out whether this is expected
> behavior, and if so, why.
>
>
> [0]: https://github.com/roman-yepishev/duplicity-mediafire/
> [1]: http://paste.ubuntu.com/11166057/
> _______________________________________________
> Duplicity-talk mailing list
> address@hidden
> https://lists.nongnu.org/mailman/listinfo/duplicity-talk