monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 3-way merge considered harmful (was Re: [Monotone-devel] merge weird


From: Nathaniel Smith
Subject: Re: 3-way merge considered harmful (was Re: [Monotone-devel] merge weirdness...)
Date: Mon, 2 May 2005 15:24:54 -0700
User-agent: Mutt/1.5.9i

On Mon, May 02, 2005 at 01:57:51PM -0700, K. Richard Pixley wrote:
> Nathaniel Smith wrote:
> 
> >To expand: I'll assume everyone knows about the criss-cross merge
> >case, which forces us to choose more distant ancestors in some cases,
> >or else risk silently corrupting code.
> >
> I don't.  Is there a reference?

The email I was replying to described it:
  http://article.gmane.org/gmane.comp.version-control.monotone.devel/3256

> >(This problem seems specific to the semantics of "rename"; there
> >doesn't seem to an analogue for file content, because file content
> >cannot be moved, only created or destroyed.
> > 
> >
> How is this different from the following content case...
> 
> A =
> y
> B =
> x
> y
> C =
> y
> x
> D =
> x
> y

Hmm, cute.  This seems more like a variation on the zombie hunk
case than a variation on the disconnected rename case to me, but it
doesn't really matter where you lump it :-).  (The problem with the
renames isn't that it's bad to have two files in the result; it's that
we lost track of the fact that they're really the same file.  "the
same" line isn't defined for 3-way content merge, so you can't have
exactly the same problem.)

> ps, is there a reference on the codeville merge?

Not really.  Bram and Ross keep saying they'll write one, but they
haven't yet.

The basic idea is:
  - take the two files to be merged
  - break them into lines
  - "align" those lines, i.e., figure out which lines on one side
    correspond to which lines on the other.  traditional cdv-merge
    uses diff to do this, "precise" cdv-merge (under development) does
    something smarter and history-based.
  - find all the parts of the two files that didn't get aligned, and
    figure out which unaligned hunk of each file corresponds to which
    unaligned hunk of the other file.  you can think of this as doing
    a 2-way merge, and finding all the 2-way conflicts.
  - for each pair of corresponding, conflicting hunks, figure out the
    id of the last revision(s) that touched that hunk were.  (you can
    think of this as basically running "annotate" on each side.)
  - compare the revisions that touched each hunk to the list of
    revisions already in the ancestry of the other file.  this tells
    you whether each hunk contains "new stuff" compared to the other
    file.
    - if only one side of a conflict contains "new stuff", that side
      wins.
    - if both sides of a conflict contain "new stuff", that's a
      real conflict; escalate to user.
    - if neither side of a conflict contains "new stuff", that's what
      we call an "ambiguous clean merge" -- it looks like it should be
      a clean merge, in which _both_ sides win!  oops.  run in
      circles, scream and shout.  (this is rare, there are some
      strategies for dealing with it that I don't understand yet, and
      "precise" cdv-merge may avoid the problem entirely, we don't
      know yet.)

This glosses over some subtleties, like dealing with accidental
convergence, how you do the annotation, dealing with ambiguous clean,
etc., but for that you'll just have to start hanging out in #revctrl
on irc.freenode.net and tearing your hair out with the rest of us...
:-)

Note you can do it for tree rearrangements too; treat each file as
having a single-line chunk of data, containing its name, and a pointer
to its parent directory.  The tricky part about this in the monotone
case is figuring out which files in the left and right tree actually
have the same logical identity.  We thought we could do this by
tracing up to the 3-way merge common ancestor, but Richard case shows
that this is wrong.  We can still do it by some other tricks, to be
determined -- I think it works to define a "cutting set" as a set of
nodes in the graph such that any ancestral path from node A to node B
must pass through the set, find such a set when we want to merge,
trace up to each member of the set, and unify any files on the left
and right sides which trace to the same file in any of the cutting
nodes -- but then once we've figured out identity, we still have to
resolve possible conflicts, and again because of Richard's case, it
seems that we can't rely on a common ancestor to do that.  So we need
a merge algorithm that doesn't make any assumptions about graph
topology, like cdv-merge.

-- Nathaniel

-- 
When the flush of a new-born sun fell first on Eden's green and gold,
Our father Adam sat under the Tree and scratched with a stick in the mould;
And the first rude sketch that the world had seen was joy to his mighty heart,
Till the Devil whispered behind the leaves, "It's pretty, but is it Art?"
  -- The Conundrum of the Workshops, Rudyard Kipling

This email may be read aloud.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]