Re: Copying parts of a repository, excluding revisions post-TAGNAME...


From: Mark D. Baushke
Subject: Re: Copying parts of a repository, excluding revisions post-TAGNAME...
Date: Wed, 31 Dec 2003 12:47:52 -0800

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Leander Hasty <address@hidden> writes:

> As part of contracted work on a product, we received a tarball of the
> project source and data from another company.
> 
> We'd very much like to have the entire revision history of all of these
> files; the other company is willing to give us the directory trees for
> the project from their CVS repository.  From what I understand, I could
> normally just accept a tarball of said directories and manually insert
> them into a local CVS repository and possibly edit the CVSROOT files a
> bit.

Yes.

> However, we're not entitled to any of their source on this project past
> a certain release, which is tagged in their repository.

Yes, that starts getting tricky... :-)

> Digging through "man 5 rcsfile" and "doc/RCSFILE", it seems like it
> would be possible to write a script to process each ,v file, determine
> the version associated with the TAGNAME, and then trim out any entries
> in the rcs file that have a version greater than this number.  

Yes, that should be possible.

> I'm more than a bit concerned about the difficulty of creating a
> robust script which can be used on a large (tens of GB) CVS
> repository, remotely, by someone else (who is inexperienced in CVS
> administration).
> 
> So, questions:
> 
> - Does there exist a CVS command, client, utility, or script (even
> third-party) which already has this capability, or some form thereof?

I am not aware of a program or utility that does this exactly... most
such programs are not interested in losing recent changes. :-)

There is a script written by Donald Sharp, contrib/check_cvs.in, that
examines every version in your repository in order to check the
integrity of the repository. You might be able to start with this
utility and hack it into something that does the job for you.

> - If not, given that I'd only have a day or two to roll my own utility,
> what sort of hurdles should I expect?  (Is it possible?)

Sure, it is possible. Going to the latest version on a given branch and
doing an 'rcs -ox.y file,v' will remove version x.y from that RCS file.
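
As a rough illustration of the above, here is a small Python sketch that
only builds the 'rcs -o' command lines for a list of versions to drop.
The function name outdate_commands is made up for this example, and the
list of versions is assumed to already be in a safe removal order.

```python
def outdate_commands(rcs_file, discard):
    """Return one 'rcs -o<rev> <file>' argument list per revision.

    Hypothetical helper: nothing is executed here, so the commands can
    be reviewed (or printed as a dry run) before touching the file.
    """
    return [["rcs", "-o" + rev, rcs_file] for rev in discard]


if __name__ == "__main__":
    for argv in outdate_commands("file,v", ["1.4.2.3", "1.5"]):
        # Each list could be handed to subprocess.run(argv, check=True)
        # on a COPY of the repository once the plan looks right.
        print(" ".join(argv))
```

Working on a copy of the ,v files is strongly suggested, since an
outdated version cannot be brought back.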

>   - Does the rcsfile format (or the way CVS uses it) guarantee
> chronological ordering of the "delta" and "deltatext" entries in the
> file?

No, timestamps may be spoofed in a generic RCS file format. In a typical
case, you would only see a 1.1.1.10 having an older timestamp than a
1.1.1.9 version if an import was done using the 'import -d' flag and
the timestamp of the file was older than the version imported as 1.1.1.9.

The other oddity you need to be aware of is the 'dead' state which is
how versions are marked as removed from the tree even if they are later
resurrected.

You should not remove version 1.1 of a file if there are any branches
that exist even if version 1.1 is dead.

However, version numbers will always be increasing for new commits even
if there are discontinuities in the numbers.

   1.1 1.3 1.4 1.4.0.2 (magic branch tag) 1.4.2.1 1.4.2.2 
   1.4.2.2.0.2 (magic branch tag) 1.4.2.2.2.1 1.4.2.2.2.2 ...
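
To make the numbering concrete, here is a minimal Python sketch for
comparing version numbers and recognizing magic branch tags. The helper
names (parse_rev, is_magic_branch, branch_prefix) are invented for this
example and are not part of CVS or RCS.

```python
def parse_rev(rev):
    """Turn '1.4.2.1' into (1, 4, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


def is_magic_branch(rev):
    """True for magic branch tags such as 1.4.0.2: an even number of
    elements (at least four) with 0 in the next-to-last position."""
    parts = parse_rev(rev)
    return len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == 0


def branch_prefix(rev):
    """Map magic branch tag 1.4.0.2 to (1, 4, 2), the prefix shared by
    the versions 1.4.2.1, 1.4.2.2, ... committed on that branch."""
    parts = parse_rev(rev)
    return parts[:-2] + parts[-1:]


if __name__ == "__main__":
    # A plain string sort would put 1.10 before 1.9; tuples do not.
    print(sorted(["1.10", "1.9", "1.4.2.1"], key=parse_rev))
    print(is_magic_branch("1.4.0.2"), branch_prefix("1.4.0.2"))
```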

>     That is, can I expect any checkins post-TAGNAME to be after the tag
> entry in the ,v file, in each of their respective sections?

This clarification is more confusing than the original question. I am
not sure if I have actually answered your question or not.

> 
>   - What other data in the ,v file would I have to change if I could
> delete all of the post-TAGNAME entries?  ("head", probably...
> "symbols"?)

Removing tags for removed branches and versions would likely be a good
idea.

> - Can anyone suggest any more sane way to do this?

Given an RCS file (file,v) and a TAG:

 - do an 'rlog' command on the file,v and collect all of the <TAG
   version> associations and the list of all of the versions.

 - determine the version number of file,v that matches TAG

 - if the version is of the form w.x.y.z, start by looking for all
   w.x.y.N versions. If N is less than or equal to z, then put w.x.y.N
   into the 'keep' list of versions, otherwise put it into the
   'discard' list of versions. When you have exhausted all w.x.y.N
   versions, move to the w.N versions following the same rule: keep
   w.N if N is less than or equal to x and discard it if N is larger.
   If w is not 1, then the next step is to look for (w-1).N versions
   and you will probably need to keep all of them until you reach
   the first version 1.1 for the file.

 - now that you have your discard list of versions, sort them
   numerically. 

 - prune any version tag that references a version to be deleted.

 - locate any branch tags of the form w.x.0.even and if all w.x.even.N
   versions are being removed, you should remove the branch tag for
   that branch too.

 - now begin pruning the versions starting with the largest number of
   elements so prune w.x.y.27.2.1 before w.x.y.27 which should be pruned
   before w.x.y.26 and so on until all you have left are the versions
   that were direct ancestors of the TAG you were given.
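
The steps above could be sketched in Python roughly as follows. This is
only an outline under some assumptions: the versions and tags have
already been collected from rlog, TAG is a plain version tag (not a
branch tag), and a branch tag is treated as stale only when the branch
has versions and all of them are being discarded. All of the names
(plan_prune and friends) are made up for this example.

```python
def parse_rev(rev):
    """'1.4.2.1' -> (1, 4, 2, 1) so versions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


def is_ancestor_or_self(rev, tag):
    """True if rev is the tagged version or one of its direct ancestors."""
    if len(rev) > len(tag):
        return False
    base = tag[:len(rev)]          # branch point at the depth of rev
    if len(rev) == 2:              # trunk: w may differ (1.9 before 2.1)
        return rev <= base
    return rev[:-1] == base[:-1] and rev[-1] <= base[-1]


def plan_prune(revisions, symbols, tag_name):
    """Return (keep, discard, stale_tags) for one RCS file.

    revisions: list of version strings; symbols: dict of tag -> version.
    discard is ordered longest version first, then highest number first,
    which is the pruning order described above.
    """
    tag = parse_rev(symbols[tag_name])
    keep = [r for r in revisions if is_ancestor_or_self(parse_rev(r), tag)]
    discard = [r for r in revisions if r not in keep]
    discard.sort(key=lambda r: (len(parse_rev(r)), parse_rev(r)), reverse=True)

    gone = set(discard)
    stale = []
    for name, rev in symbols.items():
        parts = parse_rev(rev)
        if len(parts) >= 4 and len(parts) % 2 == 0 and parts[-2] == 0:
            # Magic branch tag w.x.0.even: look at the w.x.even.N versions.
            prefix = parts[:-2] + parts[-1:]
            on_branch = [r for r in revisions if parse_rev(r)[:-1] == prefix]
            if on_branch and all(r in gone for r in on_branch):
                stale.append(name)
        elif rev in gone:
            stale.append(name)
    return keep, discard, stale


if __name__ == "__main__":
    revs = ["1.1", "1.2", "1.3", "1.4", "1.5",
            "1.4.2.1", "1.4.2.2", "1.4.2.3", "1.4.2.2.2.1"]
    tags = {"REL": "1.4.2.2", "LATER": "1.5", "BR": "1.4.0.2"}
    print(plan_prune(revs, tags, "REL"))
```

Note that this sketch does not look at the 'dead' state at all; any
special handling of dead versions would have to be added on top.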

In theory, if you run into a 'dead' version while moving backward, any
predecessor version to the dead version should also be marked for
removal. This assumes that the version of the file that was dead was
not really related to the current version in the repository. In some
development models this may be true; in others it may not be, and
the versions before the 'dead' version may contain much needed history
information. So, I would suggest you not actually prune those dead
versions unless forced.

Note: If they have a recent enough version of CVS and all of the
versions of interest are on the mainline of the repository, then getting
the list of versions of interest may be as easy as 'cvs log -r:TAG'. Be
advised, though, that I believe some older versions of cvs did not
report all versions correctly in some cases, and if TAG is on a branch,
the log will stop at the origin of the branch point.
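
Whichever command produces the log, the tag associations and version
list still have to be pulled out of its text. Here is a minimal Python
sketch of that, assuming the usual rlog / cvs log layout: a 'symbolic
names:' header followed by indented 'TAG: version' lines, and each
'revision x.y' line preceded by a line of dashes. The function name
parse_log is made up, and a log message that happens to imitate this
layout would fool it.

```python
import re

SYMBOL_RE = re.compile(r"^\s+(\S+): (\S+)\s*$")
REVISION_RE = re.compile(r"^revision (\d+(?:\.\d+)+)\b")


def parse_log(text):
    """Return (symbols, revisions) from rlog-style output.

    symbols maps tag name -> version string; revisions lists the
    versions in the order the log reports them.
    """
    symbols, revisions = {}, []
    in_symbols = False
    prev = ""
    for line in text.splitlines():
        sym = SYMBOL_RE.match(line) if in_symbols else None
        if line.startswith("symbolic names:"):
            in_symbols = True
        elif sym:
            symbols[sym.group(1)] = sym.group(2)
        else:
            in_symbols = False
            rev = REVISION_RE.match(line)
            # Only trust 'revision' lines that follow a dashed separator.
            if rev and prev.startswith("-" * 20):
                revisions.append(rev.group(1))
        prev = line
    return symbols, revisions
```

The text would typically come from running rlog on file,v and capturing
its standard output.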

> - What documentation should I be looking at, aside from that mentioned
> above?

The doc/RCSFILES that comes with the cvs distribution is a good source.

        Good luck,
        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/8zX43x41pRYZE/gRAu25AKDMxCmCdiAGHHw2XBYwxQYRweOqoQCbB+av
bAEtBWrRv9ypMeKuYOBIIqY=
=Yze/
-----END PGP SIGNATURE-----



