Re: [Monotone-devel] [Monotone-users] CAD versioning
From: Hendrik Boom
Subject: Re: [Monotone-devel] [Monotone-users] CAD versioning
Date: Fri, 13 Dec 2013 13:03:28 -0500
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Dec 13, 2013 at 06:16:01PM +0100, Thomas Moschny wrote:
> Hi Hugo,
>
> > Really?? It would be a surprise to me that monotone's delta algorithm
> > would only be efficient for text files, because I have been using
> > monotone for many years on images and pdf files without problem
> > regarding performance.
> >
> > I thought monotone uses xdelta, which is a binary delta algorithm that
> > facilitates binary merges that can be easily applied to both text and
> > non-text files.
> >
> > Am I right, or is monotone's delta algorithm only efficient for text
> > files?
>
> These are two different things. The fact that Monotone uses xdelta to
> efficiently store different versions of a file is an implementation
> detail, that is not (should not) be visible to the user (well, besides
> the fact that it saves disk storage).
>
> An automatic merging attempt on the other hand happens whenever there's
> a conflict for text files to be solved, and this merging attempt is
> line-based.
>
> Do not confuse them, they have nothing to do with each other.
>
> That said, if I remember correctly, one could hook in any other method
> for trying an automatic merge in case of a conflict on file contents,
> and that method could in theory also handle binary files (like zipped
> xml and the like).
Yes, exactly correct. Or at least, that is how I recall it, too.
There is a merge hook. It could be user-coded to check the file
type and subsequently use the default merger, or any other.

The big question is whether custom mergers have been written for
particular file types, and whether they are easy to write. Sometimes
a back-and-forth conversion to a more mergeable file type is the best
remedy.
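
As a rough illustration of that kind of dispatch, here is a sketch in Python. It is not monotone's hook API (monotone's own hooks are Lua functions); `MERGERS_BY_EXTENSION`, `choose_merger`, `merge_file`, and `trivial_merge` are names made up for this example, and `trivial_merge` is only a stand-in for a real default merger.

```python
import os
from typing import Callable, Dict, Optional

# A merger takes (ancestor, left, right) file contents and returns the
# merged contents, or None when it cannot resolve the conflict.
Merger = Callable[[bytes, bytes, bytes], Optional[bytes]]


def trivial_merge(ancestor: bytes, left: bytes, right: bytes) -> Optional[bytes]:
    """Stand-in default merger: succeeds only if at most one side changed."""
    if left == right:
        return left
    if left == ancestor:
        return right
    if right == ancestor:
        return left
    return None


# Hypothetical registry: file extension -> specialized merger.
MERGERS_BY_EXTENSION: Dict[str, Merger] = {}


def choose_merger(path: str, default: Merger = trivial_merge) -> Merger:
    """Pick a merger by file type, falling back to the default one."""
    extension = os.path.splitext(path)[1].lower()
    return MERGERS_BY_EXTENSION.get(extension, default)


def merge_file(path: str, ancestor: bytes, left: bytes, right: bytes) -> Optional[bytes]:
    """Run whichever merger is registered for this path."""
    return choose_merger(path)(ancestor, left, right)
```

A specialized merger for some format would be registered under its extension; every other file still goes through the default.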
But let me discuss how bad this can get.

There's trouble when merging files with large, heavily nested bracket
structures. The default line-by-line mechanism can end up with
mismatched brackets. Now this is no problem when processing, say, C
code, because the textual level is the level at which human beings
interact with the code in an editor, even though the compiler may barf.
But with syntactic trees that the user never sees, such as the
enormously complicated word-processor data structures, when the word
processor barfs, the user is clueless.
The answer would seem to be to write specialized merge tools that
respect brackets. This can be done, say for an XML file that consists
of a sequence of records (with one XML tag) containing fields (of
specific other XML tags) which may contain some further structure.
In other words, where every kind of entity has its own fixed place in
the nesting hierarchy, and they are normally kept in the same order.
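
A minimal sketch of such a record-level three-way merge, in Python. It assumes the XML has already been parsed into a mapping from a record key to that record's fields; the key, the field names, and `merge_records` itself are hypothetical, and records with no fields at all are simply dropped.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]        # field name -> field value
Table = Dict[str, Record]      # record key -> record


def merge_records(base: Table, left: Table, right: Table) -> Tuple[Table, List[Tuple[str, str]]]:
    """Three-way merge, field by field.

    Returns the merged table plus a list of (record key, field name)
    pairs that both sides changed in different ways. For those fields
    the left value is kept as a placeholder.
    """
    merged: Table = {}
    conflicts: List[Tuple[str, str]] = []
    for key in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(key, {}), left.get(key, {}), right.get(key, {})
        fields: Record = {}
        for name in sorted(set(b) | set(l) | set(r)):
            bv, lv, rv = b.get(name), l.get(name), r.get(name)
            if lv == rv:          # both sides agree
                value = lv
            elif lv == bv:        # only the right side changed
                value = rv
            elif rv == bv:        # only the left side changed
                value = lv
            else:                 # both changed, differently
                conflicts.append((key, name))
                value = lv
            if value is not None:  # None means the field was deleted
                fields[name] = value
        if fields:
            merged[key] = fields
    return merged, conflicts
```

Because every field has a fixed place under its record, the merge never has to guess which bracket belongs to which, and the result can be written back out as well-formed XML.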
But for bracketed tree structures where the nesting can vary, where you
can, say, wrap an extra while loop around a bunch of statements, there
is no known efficient algorithm (at least the last time I looked;
anybody know better?).
The only solution as far as I can see to such situations is to change
the way that data structures are written to a file. Perhaps give every
possibly nested entity a unique ID, which you write out with the
entity to give the merge program something to synchronise on. Links to those
entities are then written by writing the unique ID.
The application software must then rigorously maintain these unique IDs
through processing, editing, and the like.
The application can then write all these entities out in, say,
alphabetical order.
Such a file would likely be very mergeable.
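
To make the idea concrete, here is a sketch in Python of writing a tree as one line per entity, sorted by ID, with children referred to by ID only. The `Entity` layout, the IDs, and the JSON-per-line format are assumptions made for this illustration, not any existing file format.

```python
import json
from typing import Dict, List, NamedTuple


class Entity(NamedTuple):
    kind: str
    text: str
    children: List[str]   # IDs of the child entities, never the children inline


def dump(entities: Dict[str, Entity]) -> str:
    """Write one line per entity, ordered by ID, so nesting never spans lines."""
    lines = []
    for entity_id in sorted(entities):
        e = entities[entity_id]
        lines.append(json.dumps([entity_id, e.kind, e.text, e.children]))
    return "\n".join(lines) + "\n"


# Two statements directly inside a block...
before = {
    "e001": Entity("block", "", ["e002", "e003"]),
    "e002": Entity("stmt", "x = x + 1", []),
    "e003": Entity("stmt", "y = y * x", []),
}

# ...and the same statements after wrapping a while loop around them.
after = dict(before)
after["e001"] = Entity("block", "", ["e004"])
after["e004"] = Entity("while", "x < 10", ["e002", "e003"])

# Only the block's line and the new loop's line differ between the dumps;
# the lines for the two statements are untouched.
changed = set(dump(after).splitlines()) - set(dump(before).splitlines())
```

With this layout a change of nesting shows up as a couple of changed lines rather than as shifted brackets, which is the kind of input a line-based merger copes with.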
Though you may still have to worry about parts of the data structure
accidentally becoming disconnected from the rest.
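
One way to catch that after a merge is a reachability check from the root. The sketch below is in Python and assumes the merged file has been read back into a mapping from each entity ID to the IDs it links to; `find_disconnected` and that mapping are hypothetical names for this example.

```python
from typing import Dict, List, Set, Tuple


def find_disconnected(links: Dict[str, List[str]], root: str) -> Tuple[Set[str], Set[str]]:
    """Walk the links from the root and report what a merge may have broken.

    Returns (orphans, dangling):
      orphans  - entities present in the file but unreachable from the root
      dangling - IDs that are linked to but have no entity in the file
    """
    reachable: Set[str] = set()
    dangling: Set[str] = set()
    stack = [root]
    while stack:
        entity_id = stack.pop()
        if entity_id in reachable:
            continue
        if entity_id not in links:
            dangling.add(entity_id)
            continue
        reachable.add(entity_id)
        stack.extend(links[entity_id])
    orphans = set(links) - reachable
    return orphans, dangling
```

A merge tool could run this on its result and refuse to commit, or at least warn, when either set is non-empty.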
-- hendrik