
[Gnu-arch-users] Re: arch lkml


From: Pau Aliagas
Subject: [Gnu-arch-users] Re: arch lkml
Date: Fri, 26 Sep 2003 01:11:14 +0200 (CEST)

On Thu, 25 Sep 2003, Tom Lord wrote:

>     > From: Andrea Arcangeli <address@hidden>
> 
>     > On Thu, Sep 25, 2003 at 11:15:33AM -0600, Eric W. Biederman wrote:
> 
>     >> poorly and is not distributed.  SVN is not distributed.  ARCH is
>     >> barely distributed and architecturally it makes distributed merging
>     >> hard.  [..]
> 
> I think (Eric) that you'll find that many of us believe pretty much
> the opposite of what you say about arch.    That raises the
> interesting question:  what the heck do you mean?
> 
> (If we're missing something fundamental, let's hear about it  --- if
> you've misunderstood something, perhaps we can clear that up.)
> 
>     > actually it seems I can use it for my tree too, after I modify it
>     > to be able to tag into a plain source tree forked with hardlinks
>     > (something I certainly couldn't do with b*tkeeper), and a way to change
>     > the patchsets internally (and a way to extract all the patches ordered
>     > with meaningful names for marcelo and other trees not based on
>     > arch).
> 
> I don't quite follow you (Andrea) there.   Perhaps you could explain further.

I think I understand what he means. He wants to develop in a local tree.
You can do that. Usually, once the archive (repo) is created, you "tla
get" your project. But if you want to create copies with hardlinks to save
space, you can do that too; I don't see a problem.

The automatic extraction of patches from CVS, as Denys Duchier suggested,
can be done with his tool; I had thought it was done with Miles' tools.
I'm sure they'll correct me if I have misunderstood.

>     > This my proposal to change arch to be able to hook into a random tree in
>     > the filesystem (instead of a base-0 tarball), could radically change the
>     > way of doing distributed development, in term of resource savings (I bet
>     > that would be an order of magnitude better than bk too, I mean, I've
>     > lots of tarballs open anyways here, since various trees starts on top of
>     > official tar.gz packages, not bkcvs, I so I could save gigabytes of
>     > space and dcache efficiency with that feature, even across the non arch
>     > usages that are the most common to me at the moment). The hardlinks will
>     > make an huge difference and it'll be natural the way arch is designed to
>     > take the best advance of them as soon as we can hook into an unpacked
>     > tar.gz.
>
> Again, I don't _quite_ follow you however, perhaps this is relevent:

I think I answered it in the first paragraph. Once there is a repo, having
multiple trees hardlinked in the filesystem can be done without affecting
anything. Make sure you copy-on-write if you want changes only in one
tree. If the trees belong to the same project and there are patches
missing, you'll have to replay them before you can commit.

In fact, revision libraries are built this way, hardlinking to the
previous version unless the file has changed.
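
A minimal Python sketch of the hardlink-plus-copy-on-write idea, assuming
both trees live on the same filesystem. The function names are made up for
illustration; this is not how tla itself implements revision libraries.

```python
import os


def clone_with_hardlinks(src, dst):
    """Recreate the directory layout of src under dst, hardlinking each file."""
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            # Both names now refer to the same inode, so no data is copied.
            os.link(os.path.join(root, name), os.path.join(target, name))


def write_breaking_link(path, data):
    """Copy-on-write: write a new file and swap it in, leaving the old inode alone."""
    tmp = path + ".tmp"  # hypothetical temporary name, assumed to be unused
    with open(tmp, "w") as fh:
        fh.write(data)
    os.replace(tmp, path)  # path now points at a fresh inode
```

Editing a file in place (opening it for writing under its existing name)
would change every tree that shares the inode, which is why the write goes
through a temporary file and a rename.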

To hook to a source tree you first have to import it into the repo. That's
a matter of a few commands. I do it for many projects and it isn't a
problem for me. Keeping up to date means importing the changes, or just
unpacking the next tar and tagging a new release.

> You can use `mkpatch/dopatch' (the arch changeset tools) more or less
> independently of using arch.    `dopatch', in particular, "does the
> right thing" when patching a tree that is the clone of some other tree
> created using hardlinks.
> 
>     > arch looks infact more an 'archive for patches' than anything
>     > else. 
> 
> Archive, cataloging system, and _set_of_tools_for_manipulating_ in
> various useful ways, yes.
> 
>     > But in terms of automated "merging" design between two separate
>     > distributed trees ala b*tkeeper, the problem after you run into rejects
>     > is just unsolvable better than 'arch' already does, as far as I can
>     > tell. I don't see how b*tkeeper can do better just because I can't
>     > imagine anything better possible to do during merging (I can't use bk
>     > myself so I can't know if it does it differently). And not even
>     > b*tkeeper can know that the merging went right like an human can do,
>     > there's no way it can understand the sematics of the code during merging
>     > (actually defined as star-merging as from the arch specifications). If
>     > one product does better than the other given the same development
>     > simulation, it could be simply that its heuristics are less strict (i.e.
>     > simular to diff -u0 vs diff -u vs diff -u10)
> 
> I really don't know much about bk's merge operators -- but it is
> highly likely that (syntactic trivialities and GUIs aside) they come
> out to be a subset of what you can do with arch.   Andrea is right
> with his suggestion (my paraphrase) that we've nailed that space.
> 
> (The syntactic triviality: conflict mark-ups that look like unified
> diffs, is in the patch-queue.   GUIs: I think we have all the
> arch-side bits lined up and now its mostly a matter of someone making
> the interface to any of the existing graphical merge tools fit
> smoothly into use.)
> 
> 
>     > Then there are quite tons of issues with the on disk format, I feel the
>     > gzip slows down things, I didn't try to remove it, but that's my
>     > feeling. Those are small patches, and small files, a tar is sure a good
>     > idea, but gzip I think should be optional. 
> 
> They reduce network traffic in a natural way.   In purely local
> set-ups, we have not seen any evidence that they impact performance in
> any way worth worrying about.

I'd like to have bz2 :) Anyway, this is a minor issue that could be solved
with a small hack and a config parameter compress=[no|gzip|bzip2]
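
To make the suggestion concrete, here is a rough Python sketch of what such
a switch could look like. The `compress` setting and the `pack`/`unpack`
names are hypothetical; arch has no such option today, and this only shows
the dispatch, not the archive format.

```python
import bz2
import gzip

# Map each proposed config value to a (compress, decompress) pair.
COMPRESSORS = {
    "no": (lambda data: data, lambda data: data),
    "gzip": (gzip.compress, gzip.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
}


def _lookup(compress):
    try:
        return COMPRESSORS[compress]
    except KeyError:
        raise ValueError(f"unknown compress setting: {compress!r}") from None


def pack(data: bytes, compress: str = "gzip") -> bytes:
    """Encode a changeset payload according to the configured setting."""
    encode, _decode = _lookup(compress)
    return encode(data)


def unpack(blob: bytes, compress: str = "gzip") -> bytes:
    """Reverse pack() for the same setting."""
    _encode, decode = _lookup(compress)
    return decode(blob)
```

With compress=no the payload is stored as-is, which is what Andrea asked
for on purely local set-ups.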

>     > About the inventories I would simply go forcing the extended mode, that
>     > force the equivalent of cvs add on every file archived, this is much
>     > prefereable IMHO because in a big thing like the kernel there's lots of
>     > garbage with all sort of names, and so the commit operation could never
>     > get right which files to checkin and which not. It's much easier to
>     > consider only the files explicitly added to the repository with an
>     > 'arch' command during the commit/update/reply etc...
> 
> Do you mean `explicit' mode?   I think it can be configured to do
> pretty much what you want.

Yes, explicit, read the other mail.

>     > About other misc comments I don't see:
> 
>     > 1) the difference between update/reply/merge-star
> 
> "star-merge".   

update undoes your local changes, applies the missing patchsets, and then
reapplies your local changes on top.

replay applies the missing patches on the top of your tree.

star-merge tries to do the best :)
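
A toy Python model of the difference between update and replay, assuming
each change can be treated as an opaque name and that nothing conflicts.
It only shows the order in which changes end up applied; the function and
argument names are mine, and real arch works on changesets, not lists.

```python
def replay(applied, local, missing):
    """Apply the missing patches on top of the tree as it stands."""
    return applied + local + missing


def update(applied, local, missing):
    """Set local changes aside, apply the missing patches, put local changes back."""
    return applied + missing + local


applied = ["patch-1"]              # already in your tree
local = ["my-edit"]                # uncommitted changes
missing = ["patch-2", "patch-3"]   # in the archive, not yet in your tree

print(replay(applied, local, missing))
print(update(applied, local, missing))
```

In this model replay leaves your edit underneath the new patches, while
update puts it back on top of them.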

>     > 2) why there isn't an automated make-log + vi + commit command

Make an alias. It's only a matter of getting used to it; documenting
changes as you go becomes addictive.
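
If an alias is not enough, a small wrapper does the same job. This is a
sketch in Python under two assumptions: that `tla make-log` prints the path
of the log file it creates, and that a plain `tla commit` picks that file
up. The `log_edit_commit` name and the `run` parameter are hypothetical.

```python
import os
import subprocess


def log_edit_commit(run=subprocess.run, editor=None):
    """Create the log file, open it in an editor, then commit."""
    editor = editor or os.environ.get("EDITOR", "vi")

    # Assumption: make-log prints the log file's path on stdout.
    made = run(["tla", "make-log"], check=True, capture_output=True, text=True)
    log_file = made.stdout.strip()

    # Let the user write the log message; a failing editor aborts the commit.
    run([editor, log_file], check=True)

    # Assumption: commit finds the log file created above on its own.
    run(["tla", "commit"], check=True)
    return log_file
```

The `run` argument is only there so the sequence can be exercised without
arch installed; calling `log_edit_commit()` inside a project tree uses the
real commands.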

>     > 3) why the +/= in front of the names

To avoid conflicts. You can always specify --log FILE to commit.

>     > About the {arch} directory, using the { is a great idea to reduce
>     > dramatically the namespace pollution. I've never seen a { in a directory
>     > before ;)
> 
>     > I also had a problem with cvs2arch, it worked fine for a repository (not
>     > the kernel, a small one) but not for another. It happened on a directory
>     > that didn't exist in the original cvs import, then it was created over
>     > time. At the time the directory is created in cvs, cvs2arch fails and
>     > says the directory doesn't exist in the arch data directory.
> 
> I wonder if the new gateway mechanisms BM is working on provide an
> alternative worth considering.

BM provides a CVS gateway, usually up to date and error free, but NOT
always.

Pau





