[Monotone-devel] Fw: address@hidden : Re: ANNOUNCE: Debian build scripts


From: Nathaniel Smith
Subject: [Monotone-devel] Fw: address@hidden : Re: ANNOUNCE: Debian build scripts on a public Monotone server
Date: Thu, 28 Sep 2006 02:06:28 -0700
User-agent: Mutt/1.5.13 (2006-08-11)

I just posted the attached message to an interesting thread going on
over on comp.lang.ada:
  
http://groups.google.com/group/comp.lang.ada/browse_thread/thread/a326ac15995ef20e/bf2f9970828367bd#bf2f9970828367bd

Nothing in my post that will be new to this list, but forwarding it
since it might be handy to have around in the archives.

-- Nathaniel

-- 
"Lull'd in the countless chambers of the brain,
Our thoughts are link'd by many a hidden chain:
Awake but one, and lo! what myriads rise!
Each stamps its image as the other flies"
  -- Ann Ward Radcliffe, The Mysteries of Udolpho
--- Begin Message ---
Subject: Re: ANNOUNCE: Debian build scripts on a public Monotone server
Date: Thu, 28 Sep 2006 01:52:20 -0700
User-agent: G2/1.0

Georg Bauhaus wrote:
> I wasn't clear enough about what I mean by "email changeset".
> The same content that is needed by mtn sync
> could be passed in a mailbox style, decoupling operations.
> (Going from rendezvous to something else, in Ada terms.)
> I wasn't thinking of sending a standard patch in some email,
> I was thinking of using SMTP as a "Monotone sync packet wrapper".
> Monotone merging wouldn't have to be changed I think, if you follow
> the  "commit first" policy that Monotone recommends.

Merging is a bit of a red herring here; the reason monotone uses
bidirectional communication is more subtle.

The simplest way to explain it is to imagine a hypothetical example.
Say you have some files on one or more remote computers, and you want
to make them match the files you have on a local computer.  (Basically
a simple mirroring problem.)  Because bandwidth is expensive, and you
perform this operation often, you can't just send a whole tarball over
there.  So you have basically two options.

One is to somehow guess what the other side has -- perhaps by
remembering what they had the last time you talked to them, plus
assuming they've received any changes you've sent since then.  With
this information, you can calculate some sort of diff that needs to be
applied to the remote files, and send that diff off into the aether.
Some possible problems: You might not have a convenient way to keep
track of what the other side has.  Even if you do, you might get it
wrong.  If you get it wrong, then you might send too little
information, making your patch useless, or you might send too much
information, making your patch wasteful.  You need some external
mechanism to recover from such loss-of-sync if it does occur.  Losing
sync is pretty darn easy -- all it takes is some patch getting lost in
the mail.
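
To make that failure mode concrete, here is a minimal Python sketch of
the "guess and send a diff" approach.  It is only an illustration under
stated assumptions: files are modeled as a dict of name -> contents,
deletions are ignored, and the helper names (make_patch, apply_patch)
are made up for this example and do not come from any real tool.

```python
def make_patch(assumed_remote, local):
    """Compute what to send, based on a *guess* about the remote state."""
    return {name: data for name, data in local.items()
            if assumed_remote.get(name) != data}

def apply_patch(remote, patch):
    """Apply a received patch on the remote side."""
    remote.update(patch)

local = {"a.txt": "v1"}
remote = {}    # what the remote side really has
assumed = {}   # what the sender believes the remote side has

# Round 1: the patch is mailed, but it gets lost and is never applied.
patch1 = make_patch(assumed, local)
assumed = dict(local)   # the sender optimistically assumes delivery

# Round 2: a new file appears locally; this patch does arrive.
local["b.txt"] = "v1"
patch2 = make_patch(assumed, local)
apply_patch(remote, patch2)
assumed = dict(local)

# The sender now believes both sides match, but a.txt never arrived.
missing = sorted(set(local) - set(remote))
print(missing)
```

Nothing in the one-way scheme itself notices the gap; the sender's next
patch is empty, so recovery needs some external mechanism.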

The other option is to use rsync, which does a bit of algorithmic
tricksiness (requiring bidirectional communication), and figures out
what changes need to be transmitted on the fly, for each remote host.
This is always reasonably efficient, and completely immune to
loss-of-sync issues.  If some connection gets lost, you just hit
up-arrow and run the same command again; it'll magically figure out how
to start up again where it left off.  If something happens to the
remote site -- maybe its disk gets wiped, maybe it gets
a bunch of information from someone else that you no longer have to
send yourself -- then no problem.

It is not a coincidence that monotone calls its network protocol
"netsync" :-).  The main algorithm we use isn't actually the rsync
algorithm (because we need to synchronize sets, not sequences), but it
has similar properties (and is actually asymptotically more efficient).
Doing this just eliminates whole swathes of failure modes and thus makes
the overall tool much more robust (and means that users get to spend
less time worrying about these sorts of obscure failure modes and how
to avoid them, too -- a regrettably common way to spend one's brain
cells when using traditional systems, I think...).
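
As a rough illustration of what "synchronizing sets" by bidirectional
exchange can look like, here is a small Python sketch.  It is not
monotone's netsync code and makes no claim about its actual algorithm
or efficiency; it assumes items are identified by SHA-1 hex strings,
and the function names and the leaf_size parameter are hypothetical.
The two sides compare one digest per id prefix, skip any prefix whose
digests agree, and only exchange ids where they differ.

```python
import hashlib

def item_id(data: bytes) -> str:
    """Identify an item by the SHA-1 of its contents."""
    return hashlib.sha1(data).hexdigest()

def bucket_digest(ids, prefix):
    """Digest summarizing every id in the set that starts with prefix."""
    h = hashlib.sha1()
    for i in sorted(x for x in ids if x.startswith(prefix)):
        h.update(i.encode())
    return h.hexdigest()

def sync(a, b, prefix="", leaf_size=4):
    """Bring sets a and b to their union, in place.

    In a real protocol each side would compute only its own digest and
    send it over the wire; here both sides live in one process.
    """
    if bucket_digest(a, prefix) == bucket_digest(b, prefix):
        return  # this whole bucket already matches; nothing to send
    in_a = {x for x in a if x.startswith(prefix)}
    in_b = {x for x in b if x.startswith(prefix)}
    if len(in_a) + len(in_b) <= leaf_size:
        # Small bucket: just exchange the ids directly.
        a |= in_b
        b |= in_a
        return
    for c in "0123456789abcdef":
        sync(a, b, prefix + c, leaf_size)

common = {item_id(f"rev {n}".encode()) for n in range(1000)}
alice = common | {item_id(b"alice only")}
bob = common | {item_id(b"bob only")}
sync(alice, bob)
print(alice == bob)
```

Neither side needs to remember what the other had last time, so running
sync again after an interrupted or repeated exchange is harmless.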

For a concrete example... say your project server's disk melts.
Solution: drop an empty db on a host somewhere, start serving out of
it.  The first person to use it will automatically push up whatever old
history they have available locally.  If they weren't quite up to date,
no problem; as everyone else hits the server in the normal course of
their work, they'll fill in whatever parts are missing.
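
The recovery scenario above can be sketched in a few lines of Python.
This is a toy model, not monotone itself: each database is a plain set
of revision names, and sync() is a hypothetical stand-in for a
bidirectional sync that leaves both sides holding the union.

```python
def sync(a, b):
    """Stand-in for a bidirectional sync: both sets end up as the union."""
    a |= b
    b |= a

server = set()                      # fresh, empty replacement db
alice = {"r1", "r2", "r3"}          # not quite up to date
carol = {"r1", "r2"}                # further behind
bob = {"r1", "r2", "r3", "r4"}      # has the newest revision

sync(alice, server)   # first user pushes up whatever history she has
sync(carol, server)   # carol picks up r3 from the server
sync(bob, server)     # bob fills in the missing r4

print(sorted(server))
```

Alice picks up r4 the next time she syncs in the normal course of her
work; nobody had to coordinate the restore.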

It also helps that, since one generally stores all of the history
related to a project in a single database, when you get online you run
just one command to sync everything related to that
project.  (Compare to systems like mercurial, where each branch
generally needs to be pushed separately, to a separate destination --
assuming you remember which branches you
changed, and thus need to push.)  Also, syncing is always a legal
operation -- in particular, you never have to merge-before-push.
(Again, compare to systems like mercurial,
where you may have to do this.)  The biggest reason I know of to use
email is that we all already have a lot of infrastructure to make it
convenient, so even when you're offline you can just "queue and
forget".  But we've tried to make "hit the button and forget" about as
convenient.

-- Nathaniel
(I don't normally read this group, so CC's on replies appreciated.)



--- End Message ---
