[rdiff-backup-users] rdiff-backup bug (triggered by FUSE)


From: Yoav
Subject: [rdiff-backup-users] rdiff-backup bug (triggered by FUSE)
Date: Tue, 22 Nov 2005 04:05:15 +0200 (IST)

Hi,

I'd like to report a bug in rdiff-backup that causes it to fail when used with FUSE-based (user-mode) filesystems. I encountered it when trying to back up into an encrypted directory using encfs (which runs over FUSE).

The bug is reproducible on all versions I tested, including the stable
1.0.2 and the development 1.1.2. The transcript corresponds to 1.0.1, but the effect is identical.

A few months ago, another user reported a similar problem with rdiff-backup
failing to work on encfs, but it wasn't persistent.  I have a transcript
that reproduces this bug reliably, and I also figured out what causes it,
but I'm not sure how to fix it without breaking rdiff-backup.

The problem occurs when rdiff-backup backs up a tree in which a previously
existing directory has been deleted since the last backup.  The reason is
that rdiff-backup keeps file descriptors open for the files that were in
the deleted directory after unlinking them, and tries to rmdir the
directory while those descriptors are still open.  On some filesystems
this behavior may be acceptable, but on FUSE the rmdir fails, because
unlinking an open file merely renames it to a hidden name until it is
closed.  rdiff-backup calls rmdir right after unlinking, and aborts
because the rmdir fails (the directory is not empty).
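
To illustrate the pattern described above, here is a minimal Python sketch.  It is not rdiff-backup's code: the function names are made up for this example, and it assumes a POSIX system where unlinking an open file is allowed.  Run against an ordinary local filesystem the first function is expected to succeed; run against a FUSE mount in its default mode it should fail with "Directory not empty", as in the transcript below.

```python
import errno
import os
import tempfile


def rmdir_while_fd_open(parent):
    """Unlink a file while it is still open, then rmdir its directory.

    Returns None if the rmdir succeeded, otherwise the errno of the failure.
    (Hypothetical helper, for illustration only.)
    """
    d = os.path.join(parent, "dir1")
    os.mkdir(d)
    path = os.path.join(d, "file2")
    with open(path, "wb"):
        # The descriptor is still open here.  On FUSE (default mode) the
        # unlink only renames the file to a hidden .fuse_hidden* entry.
        os.unlink(path)
        try:
            os.rmdir(d)
        except OSError as exc:
            return exc.errno
    return None


def rmdir_after_close(parent):
    """Close the file first, then unlink it and rmdir its directory."""
    d = os.path.join(parent, "dir2")
    os.mkdir(d)
    path = os.path.join(d, "file2")
    with open(path, "wb"):
        pass  # the descriptor is closed when the block ends
    os.unlink(path)
    os.rmdir(d)
    return not os.path.exists(d)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        err = rmdir_while_fd_open(tmp)
        print("rmdir with open fd:", "ok" if err is None else errno.errorcode[err])
        print("rmdir after close:", "ok" if rmdir_after_close(tmp) else "failed")
```

To try it against encfs, pass a directory inside the mounted filesystem instead of the temporary directory.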

FUSE has a mount-option to unlink opened files rather than rename them
upon unlinking, but this option is not recommended by FUSE developers
because it triggers some other problems.  I tried this option anyway,
and it caused rdiff-backup to break elsewhere, so I went back to FUSE's
default mode.

Even if some FUSE mount-option could solve this problem, it should probably
be solved at the rdiff-backup level, since rdiff-backup prides itself on
working on many filesystems, and keeping an unlinked file open while removing
the directory under it is not allowed on many systems.  In any case, I could
not figure out why rdiff-backup needs to keep these descriptors open after
unlinking.

Below is a full transcript that shows how to reproduce the bug.
After showing it with the original rdiff-backup, I rerun it with a modified
version that spawns a shell after rmdir fails, and show what the directory
looks like before rdiff-backup releases the file (upon exit).
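
The modified version itself is not included in this message.  As a rough idea of that kind of debugging aid, a hypothetical helper could look like the sketch below.  The name sleepy_rmdir is taken from the debug output in the transcript, but its signature, the delay, and the report callback are assumptions made for this example, and the shell-spawning step is left out.

```python
import os
import time


def sleepy_rmdir(path, delay=1.0, report=print):
    """Try to rmdir; on failure fsync the directory, sleep, and retry once.

    Returns True if the directory was removed, False otherwise.
    (Hypothetical sketch, not the actual patch used for the transcript.)
    """
    try:
        os.rmdir(path)
        return True
    except OSError:
        report("DEBUG: before fsync", os.listdir(path))

    # Force the directory to be synced; some filesystems do not support
    # fsync on a directory, so a failure here is ignored.
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.fsync(fd)
        finally:
            os.close(fd)
    except OSError:
        pass

    time.sleep(delay)
    report("DEBUG: after sleep", os.listdir(path))
    try:
        os.rmdir(path)
        return True
    except OSError:
        report("DEBUG: still failed", os.listdir(path))
        return False
```

If the directory still holds a hidden entry after the retry, the leftover name shows up in the last report, which is the situation the transcript shows.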

I hope I provided enough information to resolve this bug.  If I can help
in any other way, please let me know.

        Yoav


The transcript:

example:/tmp/test# ### Creating the encfs ###
example:/tmp/test# encfs /tmp/test/.backup /tmp/test/backup
The directory "/tmp/test/.backup" does not exist. Should it be created? (y,n) y
The directory "/tmp/test/backup" does not exist. Should it be created? (y,n) y
Creating new encrypted volume.
Please choose from one of the following options:
 enter "x" for expert configuration mode,
 enter "p" for pre-configured paranoia mode,
 anything else, or an empty line will select standard mode.
?>

Standard configuration selected.

Configuration finished.  The filesystem to be created has
the following properties:
Filesystem cipher: "ssl/blowfish", version 2:1:1
Filename encoding: "nameio/block", version 3:0:1
Key Size: 160 bits
Block Size: 512 bytes
Each file contains 8 byte header with unique IV data.
Filenames encoded using IV chaining mode.

Now you will need to enter a password for your filesystem.
You will need to remember this password, as there is absolutely
no recovery mechanism.  However, the password can be changed
later using encfsctl.

New Encfs Password:
Verify Encfs Password:
example:/tmp/test# df backup/
Filesystem           1K-blocks      Used Available Use% Mounted on
encfs                  7872888   6314664   1158304  85% /tmp/test/backup
example:/tmp/test# ### Creating the source tree ###
example:/tmp/test# mkdir source
example:/tmp/test# touch source/file1
example:/tmp/test# mkdir source/dir1
example:/tmp/test# touch source/dir1/file2
example:/tmp/test# ### Creating the initial (full) backup ###
example:/tmp/test# rdiff-backup source/ backup/
example:/tmp/test# ls backup/
dir1  file1  rdiff-backup-data
example:/tmp/test# ### Removing dir1 and triggering the bug ###
example:/tmp/test# rm -rf source/dir1
example:/tmp/test# rdiff-backup source/ backup/
Traceback (most recent call last):
  File "/usr/bin/rdiff-backup", line 23, in ?
    rdiff_backup.Main.Main(sys.argv[1:])
  File "/usr/lib/python2.3/site-packages/rdiff_backup/Main.py", line 284, in Main
    take_action(rps)
  File "/usr/lib/python2.3/site-packages/rdiff_backup/Main.py", line 254, in take_action
    elif action == "backup": Backup(rps[0], rps[1])
  File "/usr/lib/python2.3/site-packages/rdiff_backup/Main.py", line 304, in Backup
    backup.Mirror_and_increment(rpin, rpout, incdir)
  File "/usr/lib/python2.3/site-packages/rdiff_backup/backup.py", line 51, in Mirror_and_increment
    DestS.patch_and_increment(dest_rpath, source_diffiter, inc_rpath)
  File "/usr/lib/python2.3/site-packages/rdiff_backup/backup.py", line 230, in patch_and_increment
    ITR.Finish()
  File "/usr/lib/python2.3/site-packages/rdiff_backup/rorpiter.py", line 251, in Finish
    to_be_finished.end_process()
  File "/usr/lib/python2.3/site-packages/rdiff_backup/backup.py", line 575, in end_process
    self.base_rp.rmdir()
  File "/usr/lib/python2.3/site-packages/rdiff_backup/rpath.py", line 808, in rmdir
    self.conn.os.rmdir(self.path)
OSError: [Errno 39] Directory not empty: 'backup/dir1'
Exception exceptions.TypeError: "'NoneType' object is not callable" in <bound method GzipFile.__del__ of <gzip open file 'backup/rdiff-backup-data/file_statistics.2005-11-22T01:48:39+02:00.data.gz', mode 'wb' at 0x56c88ba0 0x56bcd12c>> ignored
Exception exceptions.TypeError: "'NoneType' object is not callable" in <bound method GzipFile.__del__ of <gzip open file 'backup/rdiff-backup-data/error_log.2005-11-22T01:48:39+02:00.data.gz', mode 'wb' at 0x56c88720 0x56bcef4c>> ignored
Exception exceptions.TypeError: "'NoneType' object is not callable" in <bound method GzipFile.__del__ of <gzip open file 'backup/rdiff-backup-data/mirror_metadata.2005-11-22T01:48:39+02:00.snapshot.gz', mode 'wb' at 0x56c88760 0x56bcddec>> ignored
example:/tmp/test#
example:/tmp/test# ### at this point, I modified rdiff-backup (in another window) to fsync,sleep,retry, and show dir content on every step. after failure, rdiff-backup will spawn a shell before exiting. ###
example:/tmp/test#
example:/tmp/test# rdiff-backup source/ backup/
Previous backup seems to have failed, regressing destination now.
DEBUG: before fsync
['.fuse_hidden0000003a00000002']
DEBUG: Forced fsync (sleepy_rmdir)
DEBUG: after sleep
['.fuse_hidden0000003a00000002']
DEBUG: still failed.  Spawning a shell.
['.fuse_hidden0000003a00000002']
example:/tmp/test#
example:/tmp/test# ### We're now in a child shell while rdiff-backup is still up and running. ###
example:/tmp/test# ls -la backup/dir1/
total 8
drwxr-xr-x    2 root     root         4096 Nov 22 01:50 .
drwxr-xr-x    4 root     root         4096 Nov 22 01:47 ..
-rw-r--r--    1 root     root            0 Nov 22 01:47 .fuse_hidden0000003a00000002
example:/tmp/test# exit
exit
example:/tmp/test# ### Now rdiff-backup has exited ###
example:/tmp/test# ls -la backup/dir1/
total 8
drwxr-xr-x    2 root     root         4096 Nov 22 01:52 .
drwxr-xr-x    4 root     root         4096 Nov 22 01:48 ..
example:/tmp/test#
example:/tmp/test# ### At this point, the hidden file no longer exists because rdiff-backup exited and doesn't hold the fd of the unlinked file open anymore. ###




