duplicity-talk

Re: [Duplicity-talk] Fwd: AssertionError on every attempt


From: Rupert Levene
Subject: Re: [Duplicity-talk] Fwd: AssertionError on every attempt
Date: Tue, 9 Jun 2015 17:54:11 +0100

On 9 June 2015 at 16:56,  <address@hidden> wrote:
> still smells hackish to me. an exception for the deletion can have many 
> causes and doesn't guarantee that there are no more other instances of that 
> file on the backend.

Agreed, using an exception isn't great. Maybe the following would be
better at the top of _put?

        while self.id_by_name(remote_filename) != '':
            self._delete(remote_filename)

> an approach like
> 1. list before upload
> 2. if one or more instances already, delete all
> 3. list again and raise error if there are still instances
> would be more costly but also more secure. no?

Sounds good. Your other email about querying on a specific filename
should greatly reduce the extra cost. FilesList calls can be very
slow, with each call taking several minutes if there are lots of files
in the drive backup folder. Drive seems to throttle such requests to
something like 60 KB/s for me, whereas straight file transfers run at
more than 5 MB/s. So changing id_by_name and _query to avoid a full
FilesList call would be very useful.
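
For what it's worth, here is a minimal sketch of the three steps quoted
above (list, delete all, list again and raise). It is not the backend
code: list_ids and delete_id are hypothetical callables standing in for
a filename-specific lookup and a delete-by-id, and the dict at the
bottom is only an in-memory stand-in for the Drive folder.

```python
class DuplicateCleanupError(Exception):
    """Raised when same-named files survive the delete pass."""


def delete_all_instances(list_ids, delete_id, remote_filename):
    """Delete every file named remote_filename, then verify none remain.

    list_ids(name) returns the ids of all files with that name and
    delete_id(file_id) removes one file; both are hypothetical hooks.
    Returns the number of files deleted.
    """
    # 1. list before upload
    ids = list_ids(remote_filename)
    # 2. delete every instance that was found
    for file_id in ids:
        delete_id(file_id)
    # 3. list again and fail loudly if anything is left
    leftover = list_ids(remote_filename)
    if leftover:
        raise DuplicateCleanupError(
            "%d instance(s) of %r remain after deletion"
            % (len(leftover), remote_filename))
    return len(ids)


# In-memory stand-in for the backup folder, for demonstration only.
fake_folder = {'id1': 'vol1.difftar.gpg',
               'id2': 'vol1.difftar.gpg',
               'id3': 'vol2.difftar.gpg'}


def fake_list_ids(name):
    return [i for i, title in fake_folder.items() if title == name]


def fake_delete_id(file_id):
    del fake_folder[file_id]


removed = delete_all_instances(fake_list_ids, fake_delete_id,
                               'vol1.difftar.gpg')
```

Unlike the while loop, this makes a fixed number of list calls (two) and
cannot spin forever if a delete silently does nothing.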

BTW, FilesList can be made somewhat quicker (and use considerably less
memory) by restricting the fields requested:

            ret = self.drive.ListFile({
                'q': "'" + self.folder + "' in parents",
                'fields': 'items(title,id,fileSize),nextPageToken',
            }).GetList()
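
Combining the two ideas (query on a specific filename, and restrict the
fields), the parameter dict could be built along these lines. This is
only a sketch: build_list_params is a made-up helper name, and the
"title = '...'" clause with backslash-escaped quotes is my reading of
the Drive v2 search syntax, so please check it against the API docs.

```python
def build_list_params(folder_id, title=None):
    """Build the parameter dict for a file listing (hypothetical helper).

    With title set, the query only matches files of that name in the
    folder, so the server does not have to return the whole listing.
    """
    query = "'%s' in parents" % folder_id
    if title is not None:
        # Assumption: backslashes and single quotes need escaping
        # inside a quoted value in the query string.
        escaped = title.replace("\\", "\\\\").replace("'", "\\'")
        query += " and title = '%s'" % escaped
    return {
        'q': query,
        # Only request the fields that are actually used.
        'fields': 'items(title,id,fileSize),nextPageToken',
    }


params = build_list_params('FOLDER_ID', 'duplicity-full.vol1.difftar.gpg')
```

The resulting dict would be passed to self.drive.ListFile(params) in the
same way as the call above.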

> reading
>  
> http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/backends/pydrivebackend.py
> it looks like creating new file instances with the same name is possible by 
> design. reading here
>  
> http://pythonhosted.org/PyDrive/filemanagement.html#upload-and-update-file-content
> suggests that "overwriting" a file would be retrieving the existing file and 
> SetContentFile() on the object again eg. something like
>
> """ overwrite a possibly existing failed upload or create a new file """
> id = self.id_by_name(remote_filename)
> if id:
>   drive_file = self.drive.CreateFile({'id': id})
> else:
>   drive_file = self.drive.CreateFile({'title': remote_filename, 'parents': 
> [{"kind": "drive#fileLink", "id": self.folder}]})
> drive_file.SetContentFile(source_path.name)
> drive_file.Upload()

This looks like a good idea, in conjunction with the approach above:
if there are files with the same filename, first delete all but one
and then update the unique file remaining; otherwise upload a new
file. As an added bonus, I imagine drive would keep revision history
for any updated files.
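
Roughly, the combined logic might look like the sketch below. The four
callables (list_ids, delete_id, update_id, create) are hypothetical
hooks, not existing backend methods, and the dict is just an in-memory
stand-in for Drive so the example runs on its own.

```python
def put_without_duplicates(list_ids, delete_id, update_id, create,
                           name, data):
    """Upload data as name, leaving exactly one file with that name.

    If files with the name exist, all but the first are deleted and
    the survivor is updated in place; otherwise a new file is created.
    Returns the id of the file that holds the data.
    """
    ids = list_ids(name)
    # Delete all but one of the same-named files.
    for extra in ids[1:]:
        delete_id(extra)
    if ids:
        # Update the unique remaining file.
        update_id(ids[0], data)
        return ids[0]
    # Nothing with this name yet: upload a new file.
    return create(name, data)


# In-memory stand-in: file id -> (title, content). Demonstration only.
store = {'a': ('vol1', b'broken'), 'b': ('vol1', b'broken too')}


def list_ids(name):
    return sorted(i for i, (title, _) in store.items() if title == name)


def delete_id(file_id):
    del store[file_id]


def update_id(file_id, data):
    store[file_id] = (store[file_id][0], data)


def create(name, data):
    new_id = 'id%d' % len(store)
    store[new_id] = (name, data)
    return new_id


kept = put_without_duplicates(list_ids, delete_id, update_id, create,
                              'vol1', b'good')
new = put_without_duplicates(list_ids, delete_id, update_id, create,
                             'vol2', b'fresh')
```

In the real backend the update branch would correspond to
CreateFile({'id': id}) followed by SetContentFile() and Upload(), as in
the quoted snippet.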

Rupert

>
> ..ede/duply.net
>
> On 09.06.2015 17:22, Rupert Levene wrote:
>> _delete removes one file at a time, but there can be any number of
>> files with the same name and we need to remove them all.
>>
>> The loop will terminate when all the files are deleted since an
>> exception will be raised in id_by_name.
>>
>> Rupert
>>
>> On 9 June 2015 at 16:13,  <address@hidden> wrote:
>>> why the endless loop? ..ede/duply.net
>>>
>>>
>>> On 09.06.2015 17:09, Rupert Levene wrote:
>>>> How about this?
>>>>
>>>> === modified file 'duplicity/backends/pydrivebackend.py'
>>>> --- duplicity/backends/pydrivebackend.py    2015-05-31 19:14:43 +0000
>>>> +++ duplicity/backends/pydrivebackend.py    2015-06-09 14:40:37 +0000
>>>> @@ -84,6 +84,12 @@
>>>>              return ''
>>>>
>>>>      def _put(self, source_path, remote_filename):
>>>> +        # delete files with same filename to avoid duplicates
>>>> +        while True:
>>>> +            try:
>>>> +                self._delete(remote_filename)
>>>> +            except:
>>>> +                break
>>>>          drive_file = self.drive.CreateFile({'title': remote_filename,
>>>> 'parents': [{"kind": "drive#fileLink", "id": self.folder}]})
>>>>          drive_file.SetContentFile(source_path.name)
>>>>          drive_file.Upload()
>>>>
>>>>
>>>> On 9 June 2015 at 09:49,  <address@hidden> wrote:
>>>>> On 09.06.2015 10:46, Rupert Levene wrote:
>>>>>> Maybe this could be fixed by asking the server to delete the original
>>>>>> upload (since duplicity believes it to be faulty) before reuploading?
>>>>>
>>>>> +1 ..ede/duply.net
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Duplicity-talk mailing list
>>>>> address@hidden
>>>>> https://lists.nongnu.org/mailman/listinfo/duplicity-talk


