[Gnu-arch-users] towards standards specifications
From: Tom Lord
Subject: [Gnu-arch-users] towards standards specifications
Date: Wed, 27 Aug 2003 09:57:48 -0700 (PDT)
An idea is being kicked around the list to try to make the conventions
for file tagging a standard across multiple revision control systems.
A few words about that are enclosed. I've tried to give a high-level
overview of the design space questions and sketch in some history
about previous attempts to standardize.
* Things to Standardize
There are several things to standardize, not just tags. The way in
which the "set of things to standardize" breaks down into atomic units
of "proposed standard" is not arbitrary.
I think the things to standardize are:
[A] `inventory'
[B] `mkpatch/dopatch' and changeset format(s)
[C] archives and project trees:
- a global namespace and taxonomy for revisions
- a format and semantics for log files
- the format and semantics of the in-tree patch log
- a transport-independent spec of basic archive transactions
[D] mappings of basic archive transactions onto transports
[E] an extension mechanism for adding additional archive transactions
If you imagine a world of sweetness and light in which those five
things are standards, then what is tla? It's (currently) a particular
implementation of [A..D]. The set of standards described by [D] could
grow -- in which case tla (if unchanged) would be an implementation of
[A..C] with some of [D].
I'll go into greater detail about each of those areas below. I'll
state at the outset, though, that these standards don't necessarily
completely specify a revision control system. For example, if
Subversion supported [A..C] and some of [D], it could, nevertheless,
do more in addition to that.
In other words, it wouldn't be the aim of these standards to turn all
systems into arch. Rather, it would be the aim to turn arch into a
collection of methods for interoperability between revision control
systems: methods that happen to be implementable in stand-alone form
in a very easy way. It would turn out that revision control systems
that employ these methods obtain, as a side effect, the features of
distributed operation, smart merging, and easier integration with
ancillary tools.
* Design Space Notes and Past History of Standardization Efforts
[A] `inventory'
including the `=tagging-method' functionality and the
representation of individual file tags. Those two
go together.
Interesting dependencies: the choice of regexp language;
filename syntaxes and semantics on various platforms.
Problems encountered starting standard discussions with
Subversion and Darcs:
(A1) representatives of both projects maintained that the
"internal" history records of the revision control system
eliminate the need to assign files a "logical identity".
(I am not convinced that their ideas in this regard are
both useful and coherent, but I suppose time will tell.)
(A2) neither project makes significant use of naming
conventions, as far as I know.
(A3) the most immediate reasons for such a standard,
to enable interoperability and to enable the exchange
of changesets, did not appear to have appeal to either
project.
Discussion on these topics was not extensive. There appeared to
be greater interest in standardizing a changeset format. I
reasoned that in any effort to standardize a changeset format,
the need for an inventory mechanism would become clear and that
the question could then be raised in the context of that
motivation. Discussion never got that far (see below).
Unexplored: I haven't reached out much to the meta-cvs project.
I _think_ I recall that it _does_ have a concept of logical
file identifier, so that might be interesting to look into.
Possible partial solution to problems: as with putting 'arch-tag:'
lines in CVS-managed Emacs sources, the cooperation of other
projects is not necessary in this area (though it would benefit
both those other projects and users).
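A minimal sketch of how an embedded tag can give a file a logical
identity that survives renames. It assumes the id is simply the rest of
the line after `arch-tag:'; the names `find_file_tag' and `inventory',
and the sample id, are hypothetical. This is not a description of what
tla's `inventory' actually does.

```python
import re
from typing import Dict, Optional

# Assumption: the logical id is whatever follows "arch-tag:" on one line.
TAG_RE = re.compile(r"arch-tag:[ \t]*(\S.*?)[ \t]*$", re.MULTILINE)


def find_file_tag(text: str) -> Optional[str]:
    """Return the first embedded tag in `text`, or None if there is none."""
    match = TAG_RE.search(text)
    return match.group(1) if match else None


def inventory(files: Dict[str, str]) -> Dict[str, str]:
    """Map logical id -> current path for every file that carries a tag.

    `files` maps a path to that file's contents. Untagged files are
    skipped here; a real tool would need a policy for them.
    """
    ids: Dict[str, str] = {}
    for path, text in files.items():
        tag = find_file_tag(text)
        if tag is not None:
            ids[tag] = path
    return ids
```

Because the id lives in the file's contents rather than in any system's
history records, moving the file to a new path leaves the id unchanged,
which is the property that lets two trees be compared without shared
history.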
It _might_ be interesting to factor out `inventory', `mkpatch',
and `dopatch' into a separate distribution.
[B] `mkpatch/dopatch' and changeset format(s)
Of particular interest, imo, is the behavior of `dopatch'
for inexact patching and the implications that has for
what makes a reasonable changeset format.
It seems wise, in light of the different uses for changesets, to:
describe the format in terms of an abstract syntax; describe the
behavior of mkpatch/dopatch in terms of the abstract syntax;
provide multiple surface syntaxes for the abstract syntax.
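As a rough illustration of separating abstract syntax from surface
syntax, the sketch below defines a tiny set of operations and renders
the same changeset two ways. The operation set (`Rename', `Patch'), the
field names, and both renderings are assumptions made for the example;
they are not the arch changeset format and say nothing about inexact
patching.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Union


@dataclass(frozen=True)
class Rename:
    file_id: str   # logical id of the file, not its path
    old_path: str
    new_path: str


@dataclass(frozen=True)
class Patch:
    file_id: str
    diff: str      # an ordinary diff, carried opaquely


Op = Union[Rename, Patch]
KINDS = {"rename": Rename, "patch": Patch}


def to_json(ops: List[Op]) -> str:
    """One surface syntax: a JSON list of tagged records."""
    records = [{"op": type(o).__name__.lower(), **asdict(o)} for o in ops]
    return json.dumps(records, sort_keys=True)


def from_json(text: str) -> List[Op]:
    """Parse the JSON surface syntax back into the abstract syntax."""
    ops: List[Op] = []
    for record in json.loads(text):
        fields = dict(record)
        kind = fields.pop("op")
        ops.append(KINDS[kind](**fields))
    return ops


def to_text(ops: List[Op]) -> str:
    """A second surface syntax: line-oriented text, e.g. for mail."""
    lines: List[str] = []
    for op in ops:
        if isinstance(op, Rename):
            lines.append(f"rename {op.file_id} {op.old_path} {op.new_path}")
        else:
            lines.append(f"patch {op.file_id}")
            lines.extend("  " + line for line in op.diff.splitlines())
    return "\n".join(lines)
```

The point of the split is that tools like mkpatch/dopatch would be
specified against the `Op' values, so arguments about MIME versus shell
script only affect the rendering functions.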
Interesting dependencies: [A], above; diff and patch tools or
standards; interfaces to external merge tools.
Problems encountered starting standard discussions with
Subversion and Darcs and non-RCS-implementors:
(B1) representatives of both RCS projects reject the importance
of inexact patching. Subversion developers argue that
the internal history records of the revision control
system should be used, in combination with a family of
techniques called "variance adjustment", to turn
inexact patches into exact patches. The Darcs project
has a similar view.
Representatives of arch pointed out that (a) variance
adjustment has significantly different and arguably often
less useful semantics; (b) in foreseeable situations where
history is not available, variance adjustment doesn't
apply usefully at all.
Discussion degenerated after that into arguments about
whether or not history would always be available.
(B2) A very popular intuition is that a whole-tree changeset
is basically a shell script (using `mv', `rm', etc.)
plus a set of ordinary diffs. Discussion frequently
got bogged down on questions of the best surface syntax
for such changesets.
Representatives of arch tried to point out that,
especially when inexact patching is considered, that
intuition is wrong. Discussion should start, we argued,
with consideration of the _semantics_ of mkpatch/dopatch
and development of an abstract syntax to describe what
is exchanged between those two tools.
Discussion frequently got bogged down in questions such
as "Well, suppose we use MIME for this format" or "suppose
it is an actual shell script".
(B3) Another very popular intuition related to (B2) and
(apparently) to the behavior of Bitkeeper is that the
changeset format must make special provisions to carry
revision control history. For example, it was frequently
suggested to add "fields" to carry a revision name and a
log message. Part of the motivation here seemed to be
"I expect to see that information when somebody sends me
a changeset in email."
Representatives of arch pointed out that such components
are _not_ necessary in changesets and, in fact, make no
sense when changesets are used outside the context of a
revision control system. Furthermore, such content could
clearly be _layered_ on top of changesets in a second
standard. Finally, one could not reasonably and usefully
address such fields without dragging in new questions
about the syntax and semantics of revision names and log
messages -- so layering would help to postpone and
separate consideration of those issues.
Discussion along these lines typically spiraled into
discussion of (B1), above.
Possible solution: It _might_ be interesting to factor out
`inventory', `mkpatch', and `dopatch' into a separate
distribution, especially in combination with a mail-friendly
syntax for changesets. _IF_ these tools were more widely seen
as independently useful, and _IF_ they were more widely adopted,
then perhaps developers of other RCS systems would start getting
questions like "why doesn't svn work correctly after I apply a
changeset to my tree?"
[C] archives and project trees:
- a global namespace and taxonomy for revisions
- a format and semantics for log files
- the format and semantics of the in-tree patch log
- a transport-independent spec of basic archive transactions
No detailed discussions got this far.
Most of those items are hopefully pretty clear. By
"a transport-independent spec of basic archive transactions"
I mean:
(a) a taxonomy of revision types (import, commit, tag)
(b) transport-independent specifications of the basic
transactions (e.g., a `commit' requires a .tgz of the
changeset, a copy of the log file, the name of the
revision to create -- it returns a status code which
may be any of revision-locked, no-such-category,
no-such-branch, etc.)
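A sketch of what (b) could look like for `commit', with no transport
involved. The three failure codes come from the text above; the `ok'
code, the in-memory archive, the `--'-separated parsing of the revision
name, and the reading of `revision-locked' as "already present or
explicitly locked" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Set


class Status(Enum):
    OK = "ok"  # assumed success code; not named in the text
    REVISION_LOCKED = "revision-locked"
    NO_SUCH_CATEGORY = "no-such-category"
    NO_SUCH_BRANCH = "no-such-branch"


@dataclass
class CommitRequest:
    revision: str         # name of the revision to create
    changeset_tgz: bytes  # the changeset, as a .tgz
    log: str              # a copy of the log file


@dataclass
class MemoryArchive:
    """Hypothetical stand-in for an archive: category -> set of branches."""
    branches: Dict[str, Set[str]] = field(default_factory=dict)
    locked: Set[str] = field(default_factory=set)
    revisions: Dict[str, CommitRequest] = field(default_factory=dict)

    def commit(self, req: CommitRequest) -> Status:
        # Assumption: names look like category--branch--...
        parts = req.revision.split("--")
        category = parts[0]
        branch = parts[1] if len(parts) > 1 else ""
        if category not in self.branches:
            return Status.NO_SUCH_CATEGORY
        if branch not in self.branches[category]:
            return Status.NO_SUCH_BRANCH
        if req.revision in self.locked or req.revision in self.revisions:
            return Status.REVISION_LOCKED
        self.revisions[req.revision] = req
        return Status.OK
```

A mapping onto a transport, in the sense of [D], would then only have to
say how a `CommitRequest' is delivered and how the `Status' comes back.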
Past experience suggests that the namespace question has _no_
answer that will satisfy all intuitions of "good", and _many_
answers (arch's being one of them) that will satisfy essentially
all actual needs. In other words, it's just about a perfect topic
to discuss endlessly and the only chance for a standard is to have
a standards body whose motivation going in is to pick something with
as much wisdom as they can muster and withstand years of subsequent
flaming.
[D] mappings of basic archive transactions onto transports
No detailed discussions got this far.
Evidence from [gnu-]arch-users is that people too often assume
that the on-disk arch archive format is essential to arch.
In my view, it is _not_; however, it is evidence of a good
choice of basic archive transactions and it _may_ prove to be
the best format in the long run.
[E] an extension mechanism for adding additional archive transactions
No detailed discussions got this far. I have a couple of ideas
but I won't go into them just yet.