[Monotone-devel] Re: test mail


From: Torsten Rueger
Subject: [Monotone-devel] Re: test mail
Date: Tue, 3 Feb 2004 09:20:38 +0200

On 3 Feb 2004, at 00:26, graydon hoare wrote:

>> Are there more details on the merging algorithm that the FAQ refers to?
>
>  so I can give a rundown here:

So now there is. Good summary, thanks. I'd suggest pasting it into some file (code, note or readme.diff).

Not that I understand all of it, but I get the idea. I work on a research project about synchronisation and we use a similar approach (immutable items, but not signed). It has come as a surprise to me how similar the version control problem is to the synchronisation one. (http://pdis.hiit.fi) **

> 1 - given 2 unmerged leaves in the manifest ancestry graph, work out
>     which "common ancestor" to merge them via.

We put the parent list into the item (compacted, but that's only possible because we have "sane" ids).
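
Step 1 can be sketched roughly as follows, assuming the ancestry graph is nothing more than a mapping from item id to its parent list (as when the parent list is stored in the item). The function names and the example graph are made up for illustration, and this is not necessarily how monotone chooses its ancestor.

```python
from collections import deque

def ancestors(parents, node):
    """Return node plus every id reachable by following parent links."""
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

def merge_candidates(parents, left, right):
    """Common ancestors of left and right that are not ancestors of
    another common ancestor (the 'nearest' ones)."""
    common = ancestors(parents, left) & ancestors(parents, right)
    dominated = set()
    for node in common:
        # Everything strictly above a common ancestor is further away.
        dominated |= ancestors(parents, node) - {node}
    return common - dominated

# Hypothetical graph: a <- b <- c and b <- d; c and d are the unmerged leaves.
graph = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}
print(merge_candidates(graph, "c", "d"))  # {'b'}
```

With criss-cross history the result can contain more than one candidate, so a real implementation still needs a rule for picking one.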

> 4 - merge the deltas. this involves dropping down to the file level and
>     computing an edit list for each file on the left and right edges,
>     line-wise, then converting the edit lists to vectors of extents
>     which map coordinates from the ancestor into left and right
>     coordinates, normalizing the extent vectors, and merging them.

This is where it gets interesting to me. I have to do that in our project, just for XML.
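
To make step 4 concrete for myself, here is a minimal line-wise three-way merge. It is a diff3-style sketch built on Python's difflib, not the extent-vector algorithm described above: the matching blocks play the role of the ancestor-to-left and ancestor-to-right coordinate maps. The names merge3 and unchanged_lines are my own, and an XML merge would need tree-aware handling that this does not attempt.

```python
import difflib

def unchanged_lines(ancestor, side):
    """Map ancestor line index -> side line index for lines the side kept."""
    matcher = difflib.SequenceMatcher(None, ancestor, side, autojunk=False)
    mapping = {}
    for a_start, s_start, size in matcher.get_matching_blocks():
        for offset in range(size):
            mapping[a_start + offset] = s_start + offset
    return mapping

def merge3(ancestor, left, right):
    """Return (merged_lines, conflicts) for a line-wise three-way merge."""
    lmap = unchanged_lines(ancestor, left)
    rmap = unchanged_lines(ancestor, right)
    # Sync points: ancestor lines kept by both sides, plus an end sentinel.
    sync = [(i, lmap[i], rmap[i]) for i in range(len(ancestor))
            if i in lmap and i in rmap]
    sync.append((len(ancestor), len(left), len(right)))

    merged, conflicts = [], []
    ia = il = ir = 0  # cursors into ancestor, left and right
    for sa, sl, sr in sync:
        base, ours, theirs = ancestor[ia:sa], left[il:sl], right[ir:sr]
        if ours == theirs:        # untouched, or both made the same change
            merged += ours
        elif ours == base:        # only the right side changed this region
            merged += theirs
        elif theirs == base:      # only the left side changed this region
            merged += ours
        else:                     # both changed it differently
            conflicts.append((base, ours, theirs))
            merged += ["<<<<<<< left", *ours, "=======", *theirs, ">>>>>>> right"]
        if sa < len(ancestor):
            merged.append(ancestor[sa])  # the shared line itself
        ia, il, ir = sa + 1, sl + 1, sr + 1
    return merged, conflicts

base = ["a", "b", "c", "d"]
merged, conflicts = merge3(base, ["a", "B", "c", "d"], ["a", "b", "c", "d", "e"])
print(merged)     # ['a', 'B', 'c', 'd', 'e']
print(conflicts)  # []
```

Regions changed on only one side are taken from that side; regions changed differently on both sides are reported as conflicts and left for the user.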

> these steps have seen quite a bit of churn over the past 8 months,

I bet.

> tromey keeps finding testcases which break my existing algorithms :)

Good on him. Where are those? I might have a look at them.

Torsten

** If you're interested in how close these topics are, and how our synchronisation compares, read the attached summary. It's from our wiki, which isn't public. Otherwise ignore it.
PDIS Wiki   VersionControlLessons Torsten Rueger
 


1 Learn from Version Control

While SynchronisationIsVersionControl aims at more or less "proving" that the two problems of Version Control and Synchronisation may be viewed as very similar, this page follows on to see what can be learned from the Version Control area. In other words, while Pdis has previously looked to other Synchronisation projects to learn from, here we try to see what version control may teach.

2 Version Control Systems

Version control is a large area and has been around for ages. There are more systems than can be evaluated, but since Pdis is a research project, we'll simply eliminate the commercial ones for lack of transparency. CVS has long been king of the area, but its shortcomings have prompted a number of new attempts, some of which fit our purposes quite well.

Of the ones I found, I'll examine the following in more detail:

  • Darcs is based on change sets and also assumes a flat peer model.
  • Arch is also based on change sets, but has the best branch/merge support around. Also excellent networking.
  • Monotone uses hash-signed items, secure certificates and a flat peer model.
  • Subversion is a project out to replace CVS. Its biggest selling point is tree-based versioning, which is explained to some extent in PdisOnSubversion.

    An interesting overview can be found [WWW]here, though it does not include Darcs.

3 Pdis core concepts

Just to see what we're even looking for, let's see how Pdis defines itself and how that relates to Version Control systems: http://pdis.hiit.fi/pdis/plan-2004/

3.1 Device-to-Device Architecture

  • Traditionally, Version Control Systems (like most Synchronisation systems) are client/server, but newer ones do have "multi-site" or peer-to-peer features.
  • Monotone stands out in that it has a flat peer-to-peer model. It just propagates versions, much like Pdis.
  • Darcs and Arch assume a repository per developer and propagate changes.
  • Subversion has a central repository, but good branching support.

3.2 Immutable Object Versions

That is really the core of a version control system, and the basis of the comparison. All version control systems "do" this on a logical level, if not at the storage level. But what is done at the storage level is really a detail: in Pdis, too, it is not necessary to actually store (and never modify) all versions. Differencing and delta encoding can be used.

In more recent systems, however, the "level" of change is the repository, not a single item.

  • Monotone implements exactly this concept. It even signs the versions to make sure they are immutable.
  • Arch and Darcs do not really know about items, only about changes and their application. The changes are immutable and allow reconstruction of items.
  • Subversion implements the repository as immutable. Any edit creates a new repository. For single item changes, as in Pdis, this means immutable items.
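
As a minimal sketch of what "immutable object versions" can look like in code: each version is identified by a hash over its parent ids and content, so any edit necessarily yields a new id. The Item and Store classes and the id format are assumptions made for illustration; they are not the actual Monotone or Pdis data model, and signing is left out.

```python
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Item:
    content: bytes
    parents: tuple = ()  # ids of the versions this one was derived from

    @property
    def item_id(self) -> str:
        """Hash over parent ids and content (hypothetical id scheme)."""
        digest = hashlib.sha1()
        for parent in sorted(self.parents):
            digest.update(parent.encode("ascii") + b"\n")
        digest.update(b"\n" + self.content)
        return digest.hexdigest()

class Store:
    """Append-only store: versions are added, never modified or replaced."""
    def __init__(self):
        self._items = {}

    def put(self, item: Item) -> str:
        # The same content and parents always give the same id,
        # so storing a version twice is harmless.
        self._items.setdefault(item.item_id, item)
        return item.item_id

    def get(self, item_id: str) -> Item:
        return self._items[item_id]

store = Store()
v1 = store.put(Item(b"hello"))
v2 = store.put(Item(b"hello world", parents=(v1,)))
print(v1 != v2)                 # True: an edit is a new version
print(store.get(v2).parents == (v1,))  # True
```

How versions are physically stored (full copies or deltas) stays hidden behind put and get, which matches the point above that storage is a detail.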

3.3 Stateful and Stateless Synchronisation

Traditionally, in the client/server case, version control systems only support Stateful synchronisation, meaning the client knows the last synchronised state of the partner. The concept of "pushing" changes is not common in version control, but can often be implemented with hook scripts. Frequent polling may also be used.

  • Monotone and Darcs have built-in push commands.
  • Subversion and Arch only allow the hook approach.

3.4 Postponable Conflict Resolution

This is also a very common feature of Version Control Systems. They usually merge trivial changes themselves and let the user resolve non-trivial merges (conflicts). As outlined in SynchronisationIsVersionControl, several versions may be mapped to branches. The question of how to avoid conflicts from multiple concurrent merge attempts remains and should be looked into (also in Pdis).

  • Monotone implements the exact same idea as Pdis, distributing changed items, attempting trivial merges, but leaving concurrent versions of the same object in place if that is not possible.
  • Darcs and (especially) Arch propagate change-sets and have good merge support. Arch in particular has enough information available to make a peer-to-peer update (or cyclic change propagation) possible.
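
The idea of leaving concurrent versions in place can be sketched as follows, assuming one object's history is a mapping from version id to parent ids. The names heads and absorb and the ids v1 to v4 are hypothetical; this illustrates the concept rather than any of the systems above.

```python
def heads(parents):
    """Versions that no other version names as a parent."""
    referenced = {p for plist in parents.values() for p in plist}
    return set(parents) - referenced

def absorb(parents, incoming):
    """Add versions received from a peer without touching existing ones,
    then report the current heads."""
    for version_id, plist in incoming.items():
        parents.setdefault(version_id, list(plist))
    return heads(parents)

local = {"v1": [], "v2": ["v1"]}          # local edit v2 on top of v1
pending = absorb(local, {"v3": ["v1"]})   # a peer edited v1 concurrently
print(sorted(pending))                    # ['v2', 'v3']: conflict is postponed

local["v4"] = ["v2", "v3"]                # a later merge names both as parents
print(sorted(heads(local)))               # ['v4']
```

More than one head means a merge is still outstanding; nothing forces it to happen at synchronisation time.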

3.5 Application-Independent Repository

All Version Control Systems are implemented like this. Usually with a choice of access methods/protocols.

It is interesting to see that many, especially the distributed systems (Monotone, Darcs, Arch), use dumb servers, i.e. ftp, sftp, http, to avoid set-up (ownership) problems.

3.6 Queries

Pdis actually introduces a query language for its data. This is not necessary for pure synchronisation, as can be seen from the fact that not all (or is it any?) other Synchronisation systems have one. It is also a feature not found in Version Control Systems. It can be added or performed locally, but it does represent a different way of thinking.

It may be noted that searching and structuring can be seen as equivalent for a primary search. Given that PIM data is not that big, structure may be an alternative for Pdis too.

4 Summary

Version control is an active field which is conceptually very close to Synchronisation. Much may be learned from it. Regarding Pdis we note:

  • Dumb servers are a sought-after option for many distributed systems because of their ease of set-up.
  • Self-signed items have great benefits and can be made to work (Monotone).
  • Query and Structure may be seen as equivalent choices. Pdis chose query; version control chose structure (which is always implemented at the repository level, not the directory level).
  • There is benefit in calling an item a file: it's easy to understand. There is little benefit in the restriction to XML, other than with regard to the choice about querying.
  • Many of the systems could be used as the basis for a Pdis implementation.

Specifically, leaving aside the access protocol, and noting that all of these systems could support Pdis sync even though they already have their own:

  • Subversion has an excellent versioning file-system storage solution that may serve as a Pdis storage.
  • Monotone is conceptually closest to Pdis, using immutable items. Pdis sync may be added beneficially to the already present dumb servers.
  • Arch has good merge support already. It has enough information to do cyclic updates, but doesn't implement it.
  • Darcs has the benefit of not distinguishing the checkout from the repository.

