Subject: Re: Xdelta and CVS
From: David H. Thornley
Date: Thu, 19 Apr 2001 15:47:14 -0500
"Greg A. Woods" wrote:
> [ On Thursday, April 19, 2001 at 11:16:31 (+0200), Maarten de Boer wrote: ]
> > Subject: Xdelta and CVS
> > We are using CVS, for several projects, with great pleasure.
> > We now have the need to store and track revisions of large
> > binary files (audio analysis data). Because we are already
> > familiar with CVS, and use it with clients on various
> > platforms, we would like to use CVS for this data as well.
> That would be a "not-very-smart" thing to do. CVS does not in any way
> meet your requirements for that kind of data.
More accurately, it meets the requirements rather badly, using
a lot of disk space and offering little benefit you wouldn't get
by gzipping and backing up the data regularly.
CVS exists to allow concurrent development involving incremental
changes to files. Is this useful on the analysis data?
> > Obviously, storing all revisions entirely will not be very
> > efficient. The data is pretty straightforward, and the
> > differences between versions could be extracted very well
> > with Xdelta. So Xdelta integration in CVS seems to be the
> > solution.
> Sure, but if you do that then you'll be off using an incompatible branch
> variant of CVS that has no hope of ever being integrated back into the
> main variant of CVS that uses standard RCS files for storage. Note also
> that you cannot change what is considered to be "the main variant of
> CVS" unless you convince the majority of CVS users to give up on RCS as
> the sole repository file format either.
Um, what's so sacred about the RCS file format? I realize that file
formats are to be changed only with caution, but since the entire
functionality is internalized into CVS (as of 1.10, I believe)
there is no reason why it cannot be changed for a good purpose.
> The second idea is just plain wrong in claiming that it would not change
> the CVS repository format since it would, by definition. RCS uses
> "diff" and only "diff" for delta storage. What it really proposes is to
> change RCS.
No, what it proposes to do is to replace RCS. I thought that the
essence of CVS was something other than its file format.
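
To make the distinction concrete: RCS deltas are line-oriented
edit scripts, which works well for text but poorly for binary
data that has few or no newlines. A byte-level delta instead
records "copy this range from the old file" and "insert these
new bytes". What follows is only a minimal sketch of that idea
in Python, using the standard difflib module. It is not Xdelta's
actual algorithm or file format, and the names make_delta,
apply_delta and literal_bytes are hypothetical, chosen for
illustration.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes):
    """Describe `new` as copies from `old` plus literal inserts.

    Illustrative only: a toy copy/insert encoding, not the
    encoding Xdelta or RCS actually uses.
    """
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged range: refer back to the old revision.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New material has to be stored literally.
            ops.append(("insert", new[j1:j2]))
        # "delete" needs no operation: those old bytes are
        # simply never copied.
    return ops


def apply_delta(old: bytes, ops) -> bytes:
    """Rebuild the new revision from the old one and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(ops) -> int:
    """Count the bytes the delta has to store verbatim."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")
```

Calling apply_delta(old, make_delta(old, new)) should give back
new, and literal_bytes gives a rough idea of how much a revision
would cost to store. A real tool would use a much faster matching
scheme than difflib, which is slow on large inputs.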
> Overall what that web page fails to note is that introduction of such a
> drastic change to the repository data structures would make any such
> repository incompatible with any normal RCS-only version of CVS, and
> indeed incompatible with RCS itself.
Perhaps the web page author considered that it would be obvious
to the intended reader (i.e., somebody considering development
work on CVS). In any case, I really don't care about compatibility
with RCS. RCS is on my list of stuff I never want to use again,
not that far below COBOL.
> BTW, that web page also fails to give full justice to the size of the
> project. If you're really serious about something along these lines
> you'd be INFINITELY better off if you simply started a new design for a
> versioning tool and wrote it right from scratch.
I'm not all that familiar with CVS internals (not having had to
mess around with it like I did Gnats), but it seems to me that
we're talking about changing the repository format, nothing else.
If this is a really large project, then CVS is very badly
designed. (Now, test and validation would be time-consuming.)
There is the obvious need for two-way conversion programs, but
after that I think the Xdelta version would see fairly rapid
acceptance. (How rapid depends partly on how effective the
merging is, which is to say whether two changes to a file
can be merged to produce another useful file. This would
obviously depend partly on the file format, and I'm not
an authority on common binary file formats.)
> > I am well aware of the fact that CVS has not been designed
> > to deal with large binary files, and that some people would
> > consider it undesirable to add such functionality. I think
> > it is worth the try though. As this is an important issue
> > for our project, I can spend some time on this.
> You've obviously been blinded by your situation. What you're proposing
> is something new and entirely different which may have a command-line
> like that of CVS, but which would otherwise not be CVS. Calling it
> "CVS" would be wrong.
Given a copy of CVS, and a copy of XCVS, with the ability to use
both but not examine repository format or source code, could
you necessarily tell the difference? If properly implemented,
it seems to me that the changes could be mostly invisible.
Given that this could be a plug-compatible replacement to
everybody but the admins, why would it be new and entirely
different?
David H. Thornley Software Engineer
at CES International, Inc.: address@hidden or (763)-694-2556
at home: (612)-623-0552 or address@hidden or