Re: renaming under CVS

From: Noel Yap
Subject: Re: renaming under CVS
Date: Mon, 4 Mar 2002 13:02:57 -0800 (PST)

--- Paul Sander <address@hidden> wrote:
> >--- Forwarded mail from address@hidden
> >Each file and directory is mapped to a ,v archive
> >file.  The contents of the directory archive files are
> >the mappings of its elements and the types (eg file or
> >directory) of those elements.  The basenames of the
> >archive files will be hex representations of random
> >256-bit numbers generated with a "secure" version of
> >the Mersenne Twister algorithm.
> Why not just sequentially number the containers?  Or
> use a timestamp plus random element to name them?

Sequential numbers have drawbacks.  The computation
slows down unless you save state, and if you save
state, the state can get corrupted.

I'll consider using a timestamp portion in the archive
file names.
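
For concreteness, generating such a basename might look like the sketch below.  Note the caveats: Python's `secrets` module stands in as the cryptographically strong source (a "secure Mersenne Twister" isn't a standard construct), the timestamp variant is the one Paul suggested, and the function name is made up for illustration.

```python
import secrets
import time

def archive_basename(use_timestamp=False):
    """Return a 64-hex-digit (256-bit) basename for a ,v archive.

    With use_timestamp=True, the first 8 hex digits encode the
    current time so names sort roughly by creation order, which
    sidesteps the saved-state problem of sequential numbering."""
    if use_timestamp:
        return "%08x%s" % (int(time.time()), secrets.token_hex(28))
    return secrets.token_hex(32)
```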

> >Locking will occur on a per-repository basis.
> >Permissions can still be done on a per-directory
> >basis.
> Permissions on a directory basis are tough if files
> are linked to multiple directories that have
> different permissions.

I am satisfied with how permissioning is done now.

> >Since the repository structure will no longer be
> >directory-based, module definitions like "module
> >path/to/module" won't be supported.
> Correct, module definitions become implicit in the
> directory mappings.  However, there's still a gotcha
> at the top level.  The things you give as arguments
> to the "cvs checkout" command need to be treated
> specially in some way so that they can be located
> correctly.

Yes, there'll need to be a repository-level mapping.
In a way, the repository is already considered to be a
module since you can "cvs co .".  Or am I mistaken?
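
To make the directory-archive idea concrete, here is a minimal sketch of parsing one revision of a directory mapping.  The one-element-per-line text format and the function name are assumptions for illustration, not a committed design.

```python
def parse_directory_mapping(text):
    """Parse one revision of a directory archive into
    {name: (type, container_basename)}.

    Assumed format, one element per line:
        <name> <type> <container-basename>
    where <type> is "file" or "dir"."""
    mapping = {}
    for line in text.splitlines():
        if line.strip():
            name, kind, container = line.split()
            mapping[name] = (kind, container)
    return mapping
```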

> Limiting operations to adds and renames (without
> replacement if the target already exists) is a start
> when considering this.

I don't understand; can you elaborate?

> Also, I was considering using a special container
> name of "0" to locate
> the top-level definitions.

"0" (well, more likely 64 0's) sounds good to me. 
There also needs to be a way to add/remove from this
top-level list.  What do you think of switches to
"add" and "rm"?
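
Assuming the all-zeros root container, resolving a checkout argument could walk the mappings roughly like this; the `resolve` helper and `load_mapping` callback (whatever fetches and parses a directory archive's current revision) are hypothetical names:

```python
ROOT = "0" * 64  # the special top-level container name

def resolve(path, load_mapping):
    """Walk a slash-separated path from the root container down
    to the container basename of the final element.  load_mapping
    takes a container basename and returns its parsed mapping,
    {name: (type, container_basename)}."""
    container = ROOT
    for part in path.split("/"):
        kind, container = load_mapping(container)[part]
    return container
```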

Also, I'm not a fan of completely wiping out archive
files, but I can see a need for it.  This also needs
more consideration.

> >I think a transition from an old repository to the
> >above shouldn't be too bad assuming people don't have
> >complicated module definitions.  For those with
> >complicated module definitions, a switch could be
> >provided to use the old style (the default would be
> >to support backward compatibility).  A tool can also
> >be provided to convert the old repo into a new repo.
> I think that mapping the modules database to the new
> structure is the easier of two problems faced when
> converting.  The other is mapping the existing
> directory-based mapping to the new one, considering
> dead and resurrected revisions and so on.

Yes, I'll probably attack this in the second phase
since more will be known to me at that time.

> >Old clients will still work on new servers but since
> >the mappings will be done by the server, they'll be
> >slower than new clients.  New clients will store the
> >mappings within the CVS directories.  This implies
> >that the client/server protocol will need to be
> >extended in such a way that a new server will
> >recognize a new client.  If the client can query the
> >server for its version, new clients can also work
> >with old servers.
> I don't really know enough about the protocol to
> comment on this, but I suspect that the current
> mapping is somehow implicit in its implementation.
> I would assume that the client/server protocol would
> need to be redesigned as well, thus making current
> clients incompatible with new servers.

The details of this will surface during development.

> >The command "cvs mv" will be added.  Upon checkin, a
> >mv command will checkin a new version of the archive
> >file(s) of the affected directory(ies).
> A "cvs ln" is needed as well, to copy CVS meta-data
> from one project to another for when artifacts become
> shared.  A variant might also be needed that accepts
> container names and creates the proper mapping for
> the sandbox.

I'll have to think about this one.  I'm inclined to
say, "one thing at a time".

I do know that CC uses "ln" to resurrect files.  I
never really liked this (since it's not so intuitive),
but this need still exists so I'll try to find some
other way to address it.
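
In terms of the directory archives, the core of a "cvs mv" is just an edit to the parent mapping.  A minimal sketch, assuming the mapping is a dict of `{name: (type, container)}` and honoring the no-replacement restriction mentioned earlier (the real command would then check in the modified mapping as a new revision of the directory's ,v archive):

```python
def mv(mapping, old_name, new_name):
    """Rename an element in a directory mapping in place.

    Refuses to clobber an existing target, matching the
    no-replacement rule; the element's container basename is
    untouched, so no file archive needs to be rewritten."""
    if new_name in mapping:
        raise ValueError("target already exists: %s" % new_name)
    mapping[new_name] = mapping.pop(old_name)
    return mapping
```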

> I've been considering a few issues with regard to a
> new implementation.  First, it's not necessary to
> lock the RCS files at all for read-only operations if
> version numbers are known beforehand, or some other
> means of identification is available (e.g.
> branch/timestamp pair).  It might be possible to
> implement a lock-free mechanism to control access to
> the repository.

I'm not sure I understand what you mean by this, but
it doesn't sound like it belongs on a "cvs mv" patch.

> That said, I've come up with a per-file locking
> mechanism that might work (but it's inefficient
> because it's filesystem based).  It involves creating
> a hard link to an RCS file when we want to commit a
> change, using RCS on the link to record changes to
> the container, then renaming the updated RCS file
> back to the original place as the commit completes.

I have a couple of problems with this:
1. Hard links aren't portable.
2. I've crashed OS's with simultaneous hard links.
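
For reference, the link/rename scheme can be sketched roughly as follows, assuming a POSIX filesystem where rename() within a directory is atomic (exactly the place where the portability concern bites).  The function name and the ",bak"/",tmp" suffixes are illustrative, and writing the new contents directly stands in for running RCS on the copy:

```python
import os

def commit_rcs(rcs_path, new_contents):
    """Hard-link/rename commit: keep the old inode reachable
    through a backup link, write the new revision to a temp file,
    then rename() it over the original.  The rename is the atomic
    commit point; a crash-recovery tool would remove or restore
    the ",bak" link depending on how far the commit got."""
    backup = rcs_path + ",bak"
    os.link(rcs_path, backup)       # old contents stay reachable
    tmp = rcs_path + ",tmp"
    with open(tmp, "wb") as f:
        f.write(new_contents)       # stand-in for running RCS here
    os.rename(tmp, rcs_path)        # atomic commit point
    os.unlink(backup)               # success: drop the backup link
```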

> This is essentially a two-phase commit
> implementation, which has the potential to make
> commits truly atomic.  (This actually applies to all
> changes to RCS files, including tagging!)  What's
> missing is a transaction log that records all of the
> affected RCS files, and a crash recovery tool that
> either removes or renames the linked RCS files
> depending on how far down the commit path someone got
> at the time of the crash.  But that's easy to
> implement as well.
> The down side is that for each file affected, there's
> a requirement of three times the size of the RCS file
> per file, plus double the aggregate of all of the RCS
> files updated.
> Another annoyance is that things like "cvs log" that
> operate on sets of revisions may produce unwanted
> results, particularly if they run concurrently with
> transactions that later abort.  But I think that this
> problem can be solved as well.

I'd rather discuss "cvs mv" at this point (at least on
this thread).

> I dug up Dick Grune's third release of his original
> CVS implementation and will try some experiments as
> time permits.  It's a bit of a shock going back to
> one's roots like that, and there's not a lot of
> similarity between it and what we now know as CVS.
> But it's a good place to start tinkering.

Why not use the latest dev snapshot?

