rdiff-backup-users
Re: Re[4]: too many lstat() syscalls, therefore too many IOPS


From: EricZolf
Subject: Re: Re[4]: too many lstat() syscalls, therefore too many IOPS
Date: Wed, 12 May 2021 16:53:50 +0000

We know that check-destination-dir is especially slow, slower than the backup. 
IIRC there is even an issue open for the regress speed. It just requires time 
to look into it, and I don't even know if an improvement is possible.

The "no such file or directory" thingy doesn't sound normal, probably worth a 
bug report...

Eric
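
A failed lstat() is not an error in itself: probing a path that may not exist simply returns ENOENT (errno 2), and a program can treat that as an ordinary "not there" answer. It only surfaces as an exception like the OSError quoted below when the caller expected the path to exist. A minimal Python sketch of that distinction follows. It is illustrative only and not taken from rdiff-backup's code; `lstat_or_none` and the file names are hypothetical.

```python
import errno
import os
import tempfile


def lstat_or_none(path):
    """Return os.lstat(path), or None if the path does not exist."""
    try:
        return os.lstat(path)
    except OSError as exc:
        # ENOENT is errno 2, i.e. "No such file or directory".
        if exc.errno == errno.ENOENT:
            return None
        raise  # anything else (e.g. EACCES) is a real problem


with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "a")
    with open(existing, "w") as fh:
        fh.write("aaa\n")
    missing = os.path.join(tmp, "bar")  # hypothetical name, never created

    found = lstat_or_none(existing)     # a stat result
    not_found = lstat_or_none(missing)  # None: the ENOENT was swallowed
    print(found is not None, not_found)
```

Under strace, both calls show up as lstat(); only the second one is counted in the "errors" column.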

On May 12, 2021 2:24:23 PM UTC, Andrei Enshin <and.enshin@gmail.com> wrote:
>
>Okay, I think I can see the reason for this behavior.
>
>Sorry for bothering you with such questions.
>
>We run the backup every 4 hours, and there seems to be a 7200-second
>timeout. That means rdiff-backup gets killed, and then we run it again
>with the `--check-destination-dir` option, which causes very intensive
>disk usage by doing a lot of lstat() calls.
>
>That is my current understanding.
>
>
>It is still unclear to me why `--check-destination-dir` does so many
>lstat() calls and why it fails when checking some directory:
> 
>Exception '[Errno 2] No such file or directory:
>'/some/path/rdiff-backup-data/increments/foo/aa.2021-05-12T04:15:01Z.dir''
>raised of class '<type 'exceptions.OSError'>':
> 
>>Wednesday, 12 May 2021, 23:01 +09:00, from Andrei Enshin <and.enshin@gmail.com>:
>
>>It seems to do a lot of lstat() calls when run with the
>>`--check-destination-dir` option,
>>which is the fallback in case a backup can't be finished. Hm.
>>
>
>>>Wednesday, 12 May 2021, 22:44 +09:00, from Andrei Enshin <and.enshin@gmail.com>:
>>> 
>>>Hi,
>>>
>>>Thank you for the explanation.
>>>
>>>During the backup, rdiff-backup did an lstat() for
>>>/some/path/rdiff-backup-data/increments/foo/bar
>>>which returned ENOENT.
>>>
>>>Does that mean it tried to check some file in increments which is not there?
>>>If it is not in increments, does that mean it was never backed up?
>>>If all of the above is true, why is there still no such file after the
>>>backup is done? Is that expected?
>>>
>>>
>>>I've just played a bit with rdiff-backup on my local machine.
>>>
>>># at /tmp/tmp.jondxmEQDC
>>>$ cat > a
>>>aaa
>>>^D
>>>
>>># at /tmp/tmp.zh49h057dq
>>>$ mkdir bckp
>>>$ rdiff-backup /tmp/tmp.jondxmEQDC bckp/
>>>$ ls bckp/rdiff-backup-data/increments/
>>># empty
>>>
>>>
>>>So after the very first backup there is nothing in increments.
>>>Let's add a new file and run the backup once again:
>>>
>>># at /tmp/tmp.jondxmEQDC
>>>$ cat > b
>>>bbb
>>>^D
>>># at /tmp/tmp.zh49h057dq
>>>$ rdiff-backup /tmp/tmp.jondxmEQDC bckp/
>>>$ ls bckp/rdiff-backup-data/increments/
>>>b.2021-05-12T22:11:02+09:00.missing
>>>
>>>
>>>I can see a record for the new file with a .missing suffix.
>>>
>>>However, the `lstat()` tries to access something which has no such suffix.
>>>What is it trying to access?
>>>
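
Judging only from the two increment names that appear in this thread (`b.2021-05-12T22:11:02+09:00.missing` and `aa.2021-05-12T04:15:01Z.dir`), an increment entry looks like `<name>.<timestamp>.<type>`. The sketch below splits such names apart, which makes it easy to see that the suffix-less path from the lstat() call is not an increment entry of this shape. The regular expression and the `parse_increment` helper are assumptions made for illustration, not rdiff-backup's own parsing logic.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the names in this thread: name.timestamp.type
INCREMENT_RE = re.compile(
    r"(?P<name>.+)\."
    r"(?P<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2}))\."
    r"(?P<kind>[A-Za-z.]+)"
)


def parse_increment(filename):
    """Return (name, aware datetime, kind), or None if the name doesn't fit."""
    match = INCREMENT_RE.fullmatch(filename)
    if match is None:
        return None
    # fromisoformat() in older Python versions does not accept a trailing "Z".
    stamp = datetime.fromisoformat(match.group("time").replace("Z", "+00:00"))
    return match.group("name"), stamp, match.group("kind")


for entry in (
    "b.2021-05-12T22:11:02+09:00.missing",
    "aa.2021-05-12T04:15:01Z.dir",
    "bar",  # the suffix-less name from the failing lstat()
):
    print(entry, "->", parse_increment(entry))
```

What the suffix-less path stands for is not something this sketch can answer; it only shows that the name does not follow the pattern of the entries listed by `ls`.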
>>> 
>>>>Wednesday, 12 May 2021, 20:02 +09:00, from Eric L. Zolf <ewl+rdiffbackup@lavar.de>:
>>>> 
>>>>Hi,
>>>>
>>>>first, I don't see anything surprising in what you describe, so all
>>>>normal AFAICT.
>>>>
>>>>Second, rdiff-backup needs to check each source file/directory and each
>>>>target, compare them and then copy (or not), so if you have some 2300
>>>>files to back up, that would sound about right. If the target or the
>>>>source file doesn't exist, the lstat() would give an error.
>>>>
>>>>If the files are small or don't have changes, the lstat() calls happen a
>>>>lot and nothing much else; this is typical random access. It gives a much
>>>>different access pattern than the copying of bigger files, where more
>>>>sequential access is typically done to read/write the file's data.
>>>>
>>>>There is no real way to improve the situation, rdiff-backup goes as fast
>>>>as it can and I personally don't know an I/O-equivalent of "nice" (and
>>>>if you limit the I/O, the backup will be even slower).
>>>>
>>>>You could try the --no-fsync option to improve speed:
>>>>
>>>>   --fsync, --no-fsync [opt] do (or not) often sync the file system
>>>>(_not_ doing it is faster but can be dangerous)
>>>>
>>>>And, yes, the `rdiff-backup-data/increments` directory is used by
>>>>rdiff-backup to keep track of file and directory changes.
>>>>
>>>>Hope this helps,
>>>>Eric
>>>>
>>>>On 12/05/2021 07:10, Andrei Enshin via Any discussion of rdiff-backup wrote:
>>>>>
>>>>> Hi rdiff-backup folks,
>>>>>
>>>>> Recently, during backups I see IOPS spike up to 500, which exhausts
>>>>> the limit of the VM, so the backup process takes very long.
>>>>> I've straced it a bit, and what I see is many failed lstat() syscalls:
>>>>> % time     seconds  usecs/call     calls    errors syscall
>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>>  42.71    0.040247           9      4608      1420 lstat
>>>>>  35.41    0.033370          12      2860           getdents
>>>>>   9.41    0.008865           6      1431           open
>>>>>   4.63    0.004363           3      1430           close
>>>>>   4.03    0.003797           3      1431           fstat
>>>>>   3.75    0.003536           2      1417           getuid
>>>>>   0.04    0.000039          39         1           unlink
>>>>>   0.01    0.000013           1         9           read
>>>>> ------ ----------- ----------- --------- --------- ----------------
>>>>> 100.00    0.094230                 13187      1420 total
>>>>> It seems rdiff-backup checks for the existence of some file/dir:
>>>>> 10:13:16 lstat("/some/path/rdiff-backup-data/increments/foo/bar", 0x7ffd832fa810) = -1 ENOENT (No such file or directory) <0.000020>
>>>>> After the backup is done, there is still no such file.
>>>>> It seems the /rdiff-backup-data/increments/ part of the path is some
>>>>> kind of "config" for rdiff-backup, and it probably tries to find
>>>>> something there but can't?
>>>>>
>>>>> What might be wrong in my setup? What would you recommend checking
>>>>> to solve the issue, if it is an issue at all?
>>>>>
>>>>> ---
>>>>> Best Regards,
>>>>> Andrei Enshin
>>>>> 
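
To put the strace numbers above in proportion, the summary rows can be parsed and a few ratios computed. This is a rough Python sketch that hard-codes the figures quoted in the original message; `parse_summary` is a hypothetical helper, and the header, separator and "total" rows are left out on purpose because their column layout differs.

```python
# Data rows copied from the quoted `strace -c` summary.
STRACE_ROWS = """\
 42.71    0.040247           9      4608      1420 lstat
 35.41    0.033370          12      2860           getdents
  9.41    0.008865           6      1431           open
  4.63    0.004363           3      1430           close
  4.03    0.003797           3      1431           fstat
  3.75    0.003536           2      1417           getuid
  0.04    0.000039          39         1           unlink
  0.01    0.000013           1         9           read
"""


def parse_summary(text):
    """Map syscall name -> dict(calls, errors, seconds) for each data row."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6:      # row with an "errors" value
            _pct, seconds, _usecs, calls, errors, name = fields
        elif len(fields) == 5:    # "errors" column left blank
            _pct, seconds, _usecs, calls, name = fields
            errors = "0"
        else:
            continue
        rows[name] = {
            "calls": int(calls),
            "errors": int(errors),
            "seconds": float(seconds),
        }
    return rows


rows = parse_summary(STRACE_ROWS)
total_calls = sum(row["calls"] for row in rows.values())
lstat = rows["lstat"]
lstat_fail_ratio = lstat["errors"] / lstat["calls"]
lstat_call_share = lstat["calls"] / total_calls

print(f"total calls:        {total_calls}")
print(f"lstat share:        {lstat_call_share:.1%}")
print(f"lstat failure rate: {lstat_fail_ratio:.1%}")
```

With these figures, roughly a third of all syscalls are lstat(), and somewhat under a third of those fail; every error in the summary comes from lstat().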
>>> 
>>> 
>>>---
>>>Best Regards,
>>>Andrei Enshin
>>>  
>
>
>>---
>>Best Regards,
>>Andrei Enshin
>
> 
> 
>---
>Best Regards,
>Andrei Enshin
> 

