[Cvs-dev] Re: valid tag identifiers


From: Mark D. Baushke
Subject: [Cvs-dev] Re: valid tag identifiers
Date: Sun, 25 Jun 2006 02:54:25 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jim Hyslop <address@hidden> writes:

> Mark D. Baushke wrote:
> > Hi Folks,
> > 
> > Okay, I have a hack on top of Jim's latest patch which tightens up
> > error checking on valid tag identifiers which he credits to a report
> > by Alan Harder <address@hidden> that I apparently missed seeing.
> 
> The discussion on this was almost a year ago. The original discussion
> can be found here
> http://lists.gnu.org/archive/html/bug-cvs/2005-07/msg00044.html and
> here: http://lists.gnu.org/archive/html/bug-cvs/2005-08/msg00005.html
> 
> I had written the patch, but was having troubles getting it to run under
> Cygwin. I put it aside, and forgot about it until recently.

Thank you for the URLs.

> > Given that I know of multiple sites that converted from RCS to CVS, it
> > is not reasonable to tell them that all of their old tag names are no
> > longer available to them. So, I suggest that a checkout or update using
> > the old tags should still work. I am not as certain what to do if a
> > branch tag is being used and someone tries to add a new file to the
> > branch... I suspect that this current set of patches will disallow
> > adding the new files to the 'illegal' tag in that case. However, I have
> > not yet written any tests for that situation.
> 
> [...]
> 
> > To be honest, I am still not sure I consider the restrictions to CVS
> > tags as something other than RCS tags to be wise. How many times are
> > folks using 8-bit characters out of the ISO 8859-1 or ISO 8859-5
> > character set rather than the "C" locale in tag names?
> I took the approach of making the code conform to the documentation,
> which says that only letters, numbers, underscores and dashes are allowed.

Yup.

> Perhaps the better approach would be to revert the changes, and update
> the documentation. Does anyone know why CVS tightened the restrictions
> on what's allowed? 

To the best of my understanding, CVS only implemented the RCS
restrictions and did not directly impose any others on top in the code.

> I can't see offhand any particular reason for not
> allowing basically any character, except whitespace.

Well, actually, part of the restrictions are based on the internal format
of the RCS ,v files. 

The invalid characters are all treated as 'special' characters in the
RCS file format:

   $      The '$' is used as a prefix for keywords
   ,      The ',' character is used in lists of attributes
   .      The '.' character is used in identifiers, but not symbols.
   :      The ':' character is used to separate name:value pairs.
   ;      The ';' is used to separate clauses.
   @      The '@' character is used as a string delimiter in too many places.

   spaces The space, backspace, tab, newline, vertical tab, form feed, and
          carriage return (collectively, white space) have no significance
          except in strings and may not appear within an id, num, or sym.
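
As a rough illustration of the list above (this is not CVS's or RCS's
actual code), the check could be sketched in Python as follows. The
helper names find_invalid_chars and is_plausible_rcs_tag are
hypothetical, and the sketch only covers the characters named above; it
does not try to reproduce the full RCS grammar for sym.

```python
# Characters the RCS file format treats as special, per the list above.
RCS_SPECIAL = set("$,.:;@")

# White space as listed above: space, backspace, tab, newline,
# vertical tab, form feed, and carriage return.
RCS_WHITESPACE = set(" \b\t\n\v\f\r")


def find_invalid_chars(tag):
    """Return, in order, the characters of tag that the list above rules out."""
    return [ch for ch in tag if ch in RCS_SPECIAL or ch in RCS_WHITESPACE]


def is_plausible_rcs_tag(tag):
    """Hypothetical helper: True if tag is non-empty and contains no
    special or white space character. Other rules are not checked."""
    return bool(tag) and not find_invalid_chars(tag)


if __name__ == "__main__":
    for candidate in ["release_1-0", "rel.1", "a b@c", "c/d"]:
        print(repr(candidate), find_invalid_chars(candidate))
```

Note that this sketch deliberately lets 8-bit and non-ASCII characters
through, since none of them appear in the list above.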

The '/' character used to give the client/server protocol fits due to
how CVS/Entries were transmitted. I believe this has been 'fixed' by
now. So, names like c/d are okay...

I suppose that CVS should have considered prohibiting other tags, but
it was always based on the RCS format rather than having its own CVS
requirements...

Of course, there are also two tags that are 'reserved' for use by CVS.
They are 'BASE' and 'HEAD' which may not be directly added to an RCS
file, but instead mean the checked out baseline revision and the tip
revision for the current branch, respectively.
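
A minimal sketch of a reserved-name check, assuming an exact,
case-sensitive comparison (that is my assumption for the example, not
something taken from the CVS source). The names RESERVED_TAGS and
is_reserved_tag are hypothetical.

```python
# The two tag names described above as reserved for use by CVS.
RESERVED_TAGS = frozenset({"BASE", "HEAD"})


def is_reserved_tag(tag):
    """Hypothetical helper: True if tag is exactly one of the reserved names.

    Assumption: the comparison is case-sensitive.
    """
    return tag in RESERVED_TAGS


if __name__ == "__main__":
    for candidate in ["HEAD", "BASE", "release_1-0"]:
        print(candidate, is_reserved_tag(candidate))
```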

There may also be problems if one committer uses one locale and another
committer uses a different one. At present, CVS, unlike CVSNT, does not
save files in Unicode format internally, so 'odd' things could
potentially happen depending on the Unicode characters chosen in a tag
name.
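
To make the locale concern concrete, here is a small Python sketch
showing that the same visible tag name becomes different byte sequences
under two encodings. The tag "grün" is an invented example, and the
sketch says nothing about how CVS itself would handle either form.

```python
# An invented tag name containing a u-umlaut (U+00FC).
tag = "gr\u00fcn"

# The bytes one committer might produce under an ISO 8859-1 locale ...
latin1_bytes = tag.encode("iso-8859-1")

# ... and the bytes another might produce under a UTF-8 locale.
utf8_bytes = tag.encode("utf-8")

# A byte-for-byte comparison treats these as two different names.
same_bytes = latin1_bytes == utf8_bytes

if __name__ == "__main__":
    print(latin1_bytes, utf8_bytes, same_bytes)
```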

> > All of the characters with umlaut's and the like become illegal with
> > Jim's original change and I suspect this may be highly undesirable.
> 
> I think that may qualify for understatement of the year :=)

:-)

        -- Mark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (FreeBSD)

iD8DBQFEnl1RCg7APGsDnFERAjZbAJ4/6LrMfX5tHXtFbFKNBH4EULz9RwCgjy99
OWwQvCJM0gje7vxmAONDJnE=
=FOE9
-----END PGP SIGNATURE-----



