From: Tom Lord
Subject: Re: [Gnu-arch-users] Re: Working out a branching scheme [was: tag --seal --fix]
Date: Sat, 3 Apr 2004 07:28:44 -0800 (PST)

    > From: Dustin Sallings <address@hidden>

    > On Apr 2, 2004, at 10:26, Tom Lord wrote:

    >> 11k revisions in a year comes out to a bit more than one per hour, 24
    >> hours a day, 365.25 days.  No development process that is doing that
    >> in a single arch version is doing anything useful.  (A process doing
    >> that across a few (coalescing) branches, on the other hand, is
    >> entirely realistic.)

    >   Except this was the sum of the work from 15 extremely productive 
    > developers over about 16 months.  For about nine of those months, it 
    > was four of us in a room closed off to the world pampered in every way 
    > we could be to produce the ideal code generation machine.

All of this has to be prefaced by wondering how accurately cscvs is
counting in this case, but...

11k commits/16 months comes out to an average of just a tiny bit
less than one commit per hour --- if y'all were working non-stop, 24
hours/day, every day, for the entire 16 months.

More realistically, but still a stretch (10-hour days), that means you
were doing mainline integration (from uncontrolled sources, no less)
on average every half hour.
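Just to make the arithmetic explicit, here's a rough back-of-the-envelope
sketch (assuming an average month of about 30.4 days; exact calendar
months don't change the picture):

    # Rough check of the commit-rate arithmetic above.
    commits = 11000
    months = 16
    days = months * 30.44              # assumed average month length

    nonstop_hours = days * 24          # working round the clock
    workday_hours = days * 10          # 10-hour days, every day

    print(commits / nonstop_hours)     # ~0.94 commits/hour: just under one per hour
    print(workday_hours / commits * 60)
    # ~27 minutes between mainline commits: roughly one every half hour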

I presume that the commit rate was not constant across the shift from
4 to 15 programmers --- presumably with 15 you had mainline
integrations going on at well under 30 minute intervals.   (And this
is one reason why I have some doubts about the way cscvs is counting.
It's hard to fit your numbers into a realistic picture of a project.)

Such rapid mainline integrations in turn strongly suggest (especially
given the file-oriented nature of CVS) that much of the time, your
developers were developing against a random tree state that wasn't
up-to-date with any actual baseline: were programmers _really_
bringing their tree up to date between commits?  Testing against the
up-to-date source and then committing?  Your numbers suggest they
wouldn't have had time.  I wonder how many times developers wound up
with sufficiently confused working dir states that they just checked
out fresh trees and migrated uncommitted work "by hand"?

At that commit rate, you couldn't even reliably CVS-tag a baseline
without a shop-wide "hands off keyboard" step -- odds would otherwise
be high that the tag operation would overlap with some commits.

All I'm saying is: yes, use a centralized archive if that's what you
want; yes, sustain that commit rate; but do more on branches and toss
some more careful integration steps in the cycle there.  Your
productivity will go _up_.

It would have been interesting to have someone study your process as
it actually happened.  It's not too late to do some post-mortem analysis:

For example, it would be interesting to look at those 11k commit
points and (I'm assuming this was a programming project) see what
percentage of them left the tree in a compilable state or one that
passed tests.
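Here's the kind of thing I mean -- a rough sketch, not something I've
run against your repository.  The CVSROOT, module name, sample
timestamps, and build/test commands below are all made up for
illustration; the idea is just to check out the tree as of a given
moment and see whether it builds and passes tests:

    # Hypothetical post-mortem sketch: for a sample of commit times, check
    # out the tree as of that moment and see whether it builds and tests.
    # CVSROOT, MODULE, SAMPLE_TIMES, and the make targets are placeholders.
    import os
    import subprocess
    import tempfile

    CVSROOT = ":pserver:anonymous@cvs.example.org:/cvsroot/project"   # placeholder
    MODULE = "project"                                                # placeholder
    SAMPLE_TIMES = ["2003-01-15 12:00", "2003-06-15 12:00", "2003-12-15 12:00"]

    def tree_ok(when):
        workdir = tempfile.mkdtemp(prefix="postmortem-")
        checkout = subprocess.run(
            ["cvs", "-d", CVSROOT, "-q", "checkout", "-D", when, MODULE],
            cwd=workdir)
        if checkout.returncode != 0:
            return False
        src = os.path.join(workdir, MODULE)
        build = subprocess.run(["make"], cwd=src)            # placeholder build step
        tests = subprocess.run(["make", "check"], cwd=src)   # placeholder test step
        return build.returncode == 0 and tests.returncode == 0

    good = sum(tree_ok(t) for t in SAMPLE_TIMES)
    print("%d of %d sampled tree states built and passed tests"
          % (good, len(SAMPLE_TIMES)))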

I can only speak in generalities, having not been in the room with the
4 or 15 of you, but in my experience and just based on common sense,
wouldn't the "dog pile on the mainline" style you were using _create_
inefficiencies?  That is, it creates a needless interdependency and
interference among members of the team.  If hacker Alice needs to
prepare a demo or run a test on the official mainline, hackers Bob
.. Olivia have to cease committing for a while or at least restrict
what changes they make.  And, often, Alice's _first_ job in preparing
the demo or test is going to be "get the thing running".  Of course,
Alice could just blow off revision control and demo or test an
anonymous, uncontrolled tree but that is not a recommended procedure.

I'm not saying it can never work.  "Dog pile" is a venerable technique
that has _often_ worked.  But it's unreliable, non-reproducible, and
disaster-prone.  It's flaky.  It's one of the techniques that revision
control is supposed to put to bed.

GCC is an interesting case.  As I recall, it comes out to having many
more committers (but roughly the same number of "most active"
committers) and a lower commit rate.  But there are a couple of catches:
First, they mostly treat the CVS mainline as a proper integration branch,
with actual work taking place elsewhere; they have fairly strict
integration policies.  It's not people just bashing on a shared tree.
Second, although CVS only barely recognizes the fact, GCC partitions
into 23 sub-projects which, in arch, would most likely be separate
categories.

The GCC mainline commit rate is sufficiently high that the integration
policies involve distributed testing.  That is: they (slightly) relax
the constraint that the mainline increases monotonically in quality
but, at the same time, they keep a (weakly) changeset-oriented
approach to modifying mainline so that, usually, when a given patch
breaks something, the fault can be isolated to that patch, which
remains available in its entirety: exactly the kind of arrangement
that arch automates well if you use branches.
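To make the fault-isolation point concrete, here's a toy sketch.  It has
nothing to do with arch's or GCC's actual machinery, and the
apply_changeset/tree_passes_tests hooks are invented for illustration;
the point is only that when mainline advances by whole changesets, the
first failing integration step names the offending changeset, which you
still have in full:

    # Toy illustration of changeset-oriented fault isolation.
    # apply_changeset() and tree_passes_tests() are hypothetical hooks
    # supplied by the caller; mainline advances one whole changeset at a
    # time, so the first failure identifies the changeset that broke things.
    def first_breaking_changeset(baseline, changesets,
                                 apply_changeset, tree_passes_tests):
        tree = baseline
        for cs in changesets:
            tree = apply_changeset(tree, cs)
            if not tree_passes_tests(tree):
                return cs          # the whole offending changeset, intact
        return None                # everything integrated cleanly

For example, changesets could be modelled as sets of (file, new contents)
pairs, with apply_changeset merging them into the tree and
tree_passes_tests running the project's build and test suite against the
result.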

So with GCC, even at your commit rate:

  a) The moment to moment work would be in (usually remote) branches.

  b) The integration commits would be distributed (non-linearly, I
     presume) among 23 categories.

  c) Over a 16 month period, tossing a couple of archive cycles in
     there would not be disruptive.

There's no way you'd need to or want to have an 11k-revision single
version for GCC: a project whose mainline management is in the same
ballpark of scale as the project you describe (albeit with the
equivalent of at least dozens of additional non-committer programmers
actually authoring the incoming changes).


-t
