Re: Atomicity idea.
30 Jan 2003 17:48:33 -0800
Derek Robert Price <address@hidden> wrote in message news:<address@hidden>...
> Kaz Kylheku wrote:
> >Crash recovery can be performed by simple shell scripts, which look
> >for the presence of these directory names and decide what to blow away
> >and what to keep.
> Better yet, always perform the operation on the duplicate tree. Then
> the duplicate copies can always be blown away when it is discovered
> their supervising process is missing.
True; even if the transaction completed fully in the duplicate just
before the interruption, you can blow it away.
However, this adds more complexity to CVS itself, because it has to
shunt its repository access over to the duplicate. I'd rather add a
little complexity to the simple shell scripts that do the crash
recovery.
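A recovery script of that sort might look like the following minimal
sketch. The "<module>,backup-<pid>" naming, the repository root, and
the use of the supervising process's pid are all assumptions made up
for illustration, not anything CVS actually does:

```shell
#!/bin/sh
# Hypothetical crash-recovery sketch.  Assumes each commit first copies
# the module to "<module>,backup-<pid>" (pid = the supervising cvs
# process) and then works on the original; the naming is illustrative.
recover() {
    root=$1
    for bak in "$root"/*,backup-*; do
        [ -d "$bak" ] || continue
        pid=${bak##*,backup-}          # pid embedded in the name
        orig=${bak%,backup-*}          # path of the live module
        if kill -0 "$pid" 2>/dev/null; then
            continue                   # supervisor alive: leave it be
        fi
        # Supervisor is gone, so the original may be half-modified.
        # Install the pristine backup with top-level renames, then
        # scrap the damaged tree.
        mv "$orig" "$orig,scrap" &&
        mv "$bak" "$orig" &&
        rm -rf "$orig,scrap"
    done
}
recover "${1:-/cvsroot}"
```

The two mv calls are the whole rollback; since lock files are never
copied into the backup, the restored tree comes up unlocked and ready
to go.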
The virtue of working on the original while maintaining a backup is
that, in theory, we can modify CVS simply by instrumenting it with
advice before and after a commit or tagging operation, but otherwise
completely leave it alone.
It's like aspect-oriented programming. Atomicity is modeled as a
concern orthogonal to everything else, and hacked in using auxiliary
(before and after) methods on selected operations, which themselves
aren't aware of the wrapping.
Low-risk preservation of stability is the key.
I have some more detailed ideas about the design.
The backup and rollback actions should be nested within the protection
of the CVS lock. This handles the case of concurrent commits being
tried at different levels of the same tree, and other problems.
The acquisition of the lock means that there won't be any concurrent
backing up going on. When you mkdir() the top-level backup directory,
you can be confident that nobody is trying the same thing. Moreover,
the checkout or update -d logic does not have to recognize the backup
directories and skip over them.
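The property being relied on can be demonstrated directly: mkdir() is
atomic, so when two contenders race for the same name, exactly one
wins. A tiny sketch (the helper name is made up):

```shell
#!/bin/sh
# mkdir() is the classic portable mutex: it either creates the
# directory or fails with EEXIST, atomically.  CVS's own lock
# directories rely on exactly this property.
take_lock() {
    if mkdir "$1" 2>/dev/null; then
        return 0        # we created it: we hold the lock
    else
        return 1        # somebody else got there first
    fi
}
```

With the CVS lock already held, the mkdir() of the top-level backup
directory is simply uncontended, so it cannot fail that way.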
The backup procedure has to watch out for lock files and directories
and skip over them. The backup will not have any locks. That is
convenient---if you have to roll back, there will be no locks to
remove: just do some top-level renames, and the backup is installed in
place of the scrapped tree, unlocked and ready to go.
However, the following concern has occurred to me. What if a process
is trying to acquire a lock while a transaction is being rolled back?
The backup directory is renamed to the original name, and the process
that is trying to get a lock hangs in the discarded original tree,
hammering away on a lock that doesn't exist.
I think that this case may just evaporate when that tree is
recursively deleted. The problem is that the process will have
chdir'ed into it, and will hang on to the directory i-node. Maybe it
can peek at the directory link count to detect that it's in an
orphaned directory and bail out.
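That check is cheap to sketch. A live directory has a link count of at
least 2 (its own ``.'' entry plus the parent's link); once it has been
unlinked out from under us, stat on ``.'' still works through the open
i-node but reports a count of 0, at least on Linux. GNU stat assumed:

```shell
#!/bin/sh
# Sketch: bail out if our working directory has been unlinked.
# A live directory has link count >= 2 ("." plus the parent's entry);
# a deleted one reports 0 even though stat on the open i-node works.
in_orphaned_dir() {
    [ "$(stat -c %h .)" -eq 0 ]   # GNU stat; %h = hard link count
}
if in_orphaned_dir; then
    echo "working directory was removed; giving up on the lock" >&2
    exit 1
fi
```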
I don't know how various filesystems handle this. On Linux's ext2, for
instance, if your current directory has been deleted, you get EPERM
errors trying to create new directory entries. (Hence my ``evaporate''
hope in the above paragraph). But I'm not sure about this because I
tried it only interactively. There could be a race in the kernel
whereby file creation can work briefly in a blown-away directory. Yes,
in fact you have to blow away the files first, and then the directory.
This means blowing away the lock files, which could just give the
other process the opportunity to seize the lock. Ah but in that case
the recursive removal won't be able to blow away the directory. Argh.
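For what it's worth, the deleted-current-directory behaviour is easy
to reproduce non-interactively; on the Linux systems I can test,
creating an entry in a removed directory fails (the exact errno
varies: modern kernels give ENOENT where old ext2 reportedly gave
EPERM). The probe below is a self-contained sketch:

```shell
#!/bin/sh
# Reproduce the "deleted current directory" behaviour: once the
# directory has been removed, file creation inside it fails (errno
# varies by kernel; ENOENT on modern Linux, EPERM reported on ext2).
probe_deleted_cwd() {
    dir=$(mktemp -d)
    old=$PWD
    cd "$dir"
    rmdir "$dir"                 # cwd is now an orphaned i-node
    if touch newfile 2>/dev/null; then
        result=created           # would mean the race is real
    else
        result=refused           # creation fails, as hoped
    fi
    cd "$old"
    echo "$result"
}
```

So once the files and then the directories of the scrapped tree are
gone, the lock-waiter can no longer create anything there; the open
question above is only the window while the lock files are being
removed.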