[rdiff-backup-users] rdiff-backup 'hangs' on certain file

From: Danilo Godec
Subject: [rdiff-backup-users] rdiff-backup 'hangs' on certain file
Date: Sat, 15 May 2010 12:42:12 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv: Gecko/20100317 Thunderbird/3.0.4


Recently my 'rdiff-backup' developed a weird problem: it 'hangs'
when backing up a certain file on one server. The other 40+ servers are
OK; it's just that one, and even there it has only been happening since
May 3rd.

I can 'cat' the file on the originating server, and I can also 'scp' it
to the backup server with no problem and no error. However, when
'rdiff-backup' gets to this file, it just 'hangs' and does nothing.
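
To go one step beyond 'cat' and 'scp' finishing without errors, the two copies could be compared block by block. The following is only a rough sketch, assuming a modern Python 3 is available on whichever machine holds both copies; the function names and the 64 KiB block size are made up for illustration and are not part of rdiff-backup.

```python
import hashlib

def block_digests(path, block_size=65536):
    """Return a list of (offset, sha256 hex digest), one entry per block."""
    digests = []
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append((offset, hashlib.sha256(block).hexdigest()))
            offset += len(block)
    return digests

def first_difference(a, b):
    """Return the offset of the first differing block, or None if identical."""
    for (off_a, dig_a), (_, dig_b) in zip(a, b):
        if dig_a != dig_b:
            return off_a
    if len(a) != len(b):
        # One list is a prefix of the other: report where the shorter one ends.
        longer = a if len(a) > len(b) else b
        return longer[min(len(a), len(b))][0]
    return None
```

Running block_digests() on the original and on the 'scp' copy and passing both results to first_difference() should return None if the copies really match.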

On the backup server I see the file 'rdiff-backup.tmp.22397', which seems
to be a partially transferred copy of the original file (524288 bytes vs.
785592 bytes for the original).
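
One detail in those sizes may be worth noting: 524288 bytes is exactly 512 KiB, so the transfer appears to stop on a round block boundary rather than at an arbitrary offset. The small Python 3 check below only does the arithmetic on the two sizes quoted above; the variable and function names are illustrative.

```python
partial = 524288   # size of rdiff-backup.tmp.22397 on the backup server
original = 785592  # size of the file on the originating server

def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and n & (n - 1) == 0

kib = partial // 1024
missing = original - partial

print(f"partial = {kib} KiB, power of two: {is_power_of_two(partial)}")
print(f"still missing: {missing} bytes")
```

This says nothing about the cause by itself; it only suggests the stall lines up with some buffer or block size rather than with the file's contents.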

If I 'strace' the 'python' process on the backup server, I get this:

> # strace -p 7343
> Process 7343 attached - interrupt to quit
> read(5, ^C <unfinished ...>
> Process 7343 detached

If I 'strace' the 'ssh' process, I get this:

> # strace -p 7344
> Process 7344 attached - interrupt to quit
> select(7, [3 4], [], NULL, NULL^C <unfinished ...>
> Process 7344 detached

And that's all; nothing else happens even if I leave 'strace' attached
for 30 minutes.

And if I 'strace' the 'python' process on the originating server, I get this:

> # strace -p 20518
> Process 20518 attached - interrupt to quit
> read(3,  <unfinished ...>
> strace: ptrace(PTRACE_CONT,1,133): Input/output error
> Process 20518 detached

After that the process state in 'ps' changes from 'Ss' to 'Ts'
(stopped). I can change it back to 'Ss' with 'kill -CONT', but it still
doesn't do anything.
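
For readers less familiar with the 'ps' STAT column, here is a small Python 3 sketch that spells out the two codes mentioned above. The mappings follow the process state codes listed in the Linux ps man page; the helper name describe_state is made up for illustration.

```python
# Primary process states as listed in the Linux ps man page.
PRIMARY = {
    "D": "uninterruptible sleep",
    "R": "running or runnable",
    "S": "interruptible sleep",
    "T": "stopped",
    "Z": "zombie",
}

# Additional flag characters that may follow the primary state.
MODIFIERS = {
    "<": "high priority",
    "N": "low priority",
    "L": "pages locked in memory",
    "s": "session leader",
    "l": "multi-threaded",
    "+": "foreground process group",
}

def describe_state(stat):
    """Translate a ps STAT string such as 'Ss' into words."""
    parts = [PRIMARY.get(stat[0], "unknown state")]
    parts += [MODIFIERS.get(c, "unknown flag") for c in stat[1:]]
    return ", ".join(parts)

for stat in ("Ss", "Ts"):
    print(stat, "=", describe_state(stat))
```

So 'Ss' is a session leader sleeping in an interruptible wait, which fits a process blocked in read(), and 'Ts' is the same process in the stopped state.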

The weird thing is that it ALWAYS happens on the same file, but there is
seemingly nothing wrong with that particular file...

Any ideas? What else can I try to get more clues?


PS: The backup server runs OpenSuSE 11.1 (32-bit); the 'backed-up'
server runs CentOS release 5.2 (64-bit). The rdiff-backup version on
both is 1.2.8. I also tried removing 'rdiff-backup-data' to start all
over, but it didn't help.
