Re: Commitinfo question

From: Paul Sander
Subject: Re: Commitinfo question
Date: Tue, 24 Oct 2000 10:57:11 -0700

I agree with Dennis' comments, but I would add a few things.  First, I
believe and recommend that there should be a differentiation between
"checked-in code" and "code that is eligible for the build."  There are
several reasons for this:  Developers should be encouraged to commit their
code early and often; a hand-off process quantifies the changes
made between builds, making them easier to report; a hand-off process also
enables sharing of early code without having it prematurely affect builds
used by other parties, such as Q/A; adopting a task-oriented approach allows
for the addition and removal of specific features (within limits) in special
circumstances.

Developers should be encouraged to commit code early and often.  This is
somewhat obvious, because making copies of the code under development
spreads risk in case of a system failure.  It's becoming more common for
IT departments to forgo backing up individual workstations, favoring large
fileserver arrays instead.  Some people copy their work areas to a backed-up
storage area, but I believe that committing at the end of the day is better
practice if for no other reason than someone else can perform a checkout and
resume development if a colleague is hit by a bus.

It's also important to note that if work is distributed in such a way that
the directory trees that individual developers modify are somewhat isolated,
then branching is unnecessary if workgroups choose specific timestamps or
tags of known working code to update their sandboxes with.  Alternatively,
known good baseline builds can be replicated in a user's sandbox and

A hand-off process quantifies changes made between builds, making them
easier to report.  During the hand-off process, an inventory of the affected
files is taken, which in the end is a partial (or complete) difference between
two builds.  This difference typically takes the form of a set of files, each
file associated with a range of version numbers.  Tools such as rinfo can be
brought to bear with this information to produce meaningful change reports
upon completion of a build.  In addition, a tight integration between the
hand-off process and the defect tracking system can modify the status of the
defect to indicate that the developer is done.  A really good integration
can also record the exact files and version numbers implementing the repairs
and change the status of the defect, perhaps to indicate that the developer
is done but the fix is not yet available for testing.  (Subsequent updates
at the completion of the build specify a state where features are available
for testing.)
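
The inventory described above can be pictured as a small data structure.
The following is only a minimal sketch in Python, not the implementation
behind the tools mentioned here; the names FileChange, Task and
change_report, and the sample task ID and developer, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileChange:
    """One file in a hand-off, with the range of revisions it covers."""
    path: str
    old_rev: str  # revision used by the previous build, e.g. "1.4"
    new_rev: str  # revision handed off for the next build, e.g. "1.7"


@dataclass
class Task:
    """A unit of work handed off for a build (hypothetical structure)."""
    task_id: str    # e.g. a defect number from the tracking system
    developer: str
    changes: list = field(default_factory=list)


def change_report(tasks):
    """Return report lines: one header per task, one line per file."""
    lines = []
    for task in tasks:
        lines.append(f"{task.task_id} ({task.developer})")
        for change in task.changes:
            lines.append(
                f"  {change.path}: {change.old_rev} -> {change.new_rev}")
    return lines


if __name__ == "__main__":
    handed_off = [
        Task("D-101", "alice", [FileChange("src/main.c", "1.4", "1.7")]),
    ]
    print("\n".join(change_report(handed_off)))
```

The same records could be pushed into a defect tracker to record which
files and revisions implement a repair.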

A hand-off process also enables sharing of early code without having it
prematurely affect builds used by other parties, such as Q/A.  Though CVS
provides branching capabilities to isolate work, developers frequently view
them as overhead and some have difficulty understanding them.  If developers
adopt a standard for code-sharing (which may be less rigorous than the
acceptance criteria for Q/A) then they may commit code meeting that standard
and permit colleagues to update their working areas with the new code.  This
formalizes the tried-and-true method of copying files out of other users'
environments, which is often used to circumvent formal, heavyweight
processes.  It also has the bonus effect of avoiding merge conflicts
of identical changes between the sharing parties later.  The code becomes
real during the hand-off, which enforces the higher quality standard.

Adopting a task-oriented approach allows for the addition and removal of
specific features (within limits) in special circumstances.  By collecting
sets of files, each with a range of version numbers, there are now handles
for specific features implemented by each set.  By treating such sets of
files as units, specific features can be included or excluded from builds.
The decision for inclusion can be made via automation or review.  A form
of automation would be an assumption for inclusion into a build, with
subsequent removal after an analysis of a build failure.  A review could
be a defect triage in which a committee makes the decision to include or
exclude a specific feature in the product.
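
The automated form of that decision (assume inclusion, then back out
whatever the failure analysis blames) can be sketched as a loop.  This is
a hedged illustration in Python, not the production mechanism; assemble
and find_culprits are hypothetical names, and find_culprits stands in for
a real build plus failure analysis.

```python
def assemble(task_ids, find_culprits):
    """Include every task, then back out tasks blamed for build failures.

    find_culprits(included) is assumed to build the given tasks and return
    the set of task IDs blamed for a failure, or an empty set on success.
    Returns (included, backed_out).
    """
    included = list(task_ids)
    backed_out = []
    while included:
        culprits = set(find_culprits(included))
        if not culprits:
            break  # the build succeeded with the remaining tasks
        remaining = [t for t in included if t not in culprits]
        if len(remaining) == len(included):
            # The analysis blamed nothing we can remove; stop rather
            # than loop forever.
            raise RuntimeError("failure not attributable to any task")
        backed_out += [t for t in included if t in culprits]
        included = remaining
    return included, backed_out


if __name__ == "__main__":
    # Pretend that task "T2" breaks the build whenever it is included.
    def blame(tasks):
        return {"T2"} if "T2" in tasks else set()

    print(assemble(["T1", "T2", "T3"], blame))
```

If every task is backed out, the included list is empty and the build is
identical to its predecessor.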

The sum of these features provides a very nice environment in which build
reports provide not only success/fail states and compilation error messages,
but also difference reports identifying specific changes to files and the
developers' commentary.  Defect databases are automatically updated to
indicate specific files and their versions that implement repairs, and
queries show whether or not specific features are available for testing.
(Or in the event of a build failure, defects can be returned to the
developer.)  But most importantly, the build success rate can be increased
(to as high as 100% success, depending on such factors as resource
availability, build duration and the sophistication of the backout mechanism)
while allowing the development teams to specify the commit criteria for their
code (which may be much lower than Q/A's acceptance criteria).

Search the archives for "submit/assemble" for more detailed discussion
of these capabilities.  This stuff has been implemented on top of CVS for
real production use, and these claims have all been borne out, with a hand-off
process that is highly automated and minimally intrusive to the developers.
(Note that the 100% success rate relies on two degenerate cases in which a
build is identical to its successful predecessor, because there may be no new
tasks, or because all new tasks have been backed out due to failures or
omitted after review.)

Other things to consider:  Minimize the set of uncontrolled sources
contributing to your builds; remember what you can't control, and check
those artifacts during subsequent builds or rebuilds; parameterize your
builds with data tables.

Minimize the set of uncontrolled sources contributing to your builds.
Your .cshrc file, the tools in /usr/local, the compiler, even the operating
system itself are uncontrolled sources that will affect the way your
product will build.  A change to any one of them can break your build in
subtle ways that can be very difficult to repair.  This is especially true
when your product goes into the maintenance phase of the software life cycle.
Upgrading GNU Make on your build system, for example, could render it
incapable of producing patches for past releases.  So put as much as you can
under source control, and keep your production build environment to a
minimum.  (This applies to your build script and its auxiliary tools as well.)

Remember what you can't control, and check those artifacts during subsequent
builds or rebuilds.  The OS on your build system affects your builds as
much as anything else.  A change to a header file can break your build in
fundamental ways.  Linking with new runtime libraries may make your product
unusable with older versions of the OS.  You can version-control your
equipment, but sometimes it's more convenient to keep big build servers
working on new development while using smaller ones for maintenance.  In
cases such as this, it's worthwhile to remember things like the OS version,
what value-add products are installed (e.g. compilers), what's mounted and
where, and so on.  Have your build script collect these metrics and check
them in.  Also have it compare the current metrics with the build's
predecessor (also checked in!) and issue warnings if there are differences.
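
As an illustration of collecting and comparing such metrics, here is a
minimal Python sketch.  The function names, the choice of metrics and the
JSON file format are assumptions made for the example; a real build
script would record whatever matters on its own platform (mounts,
installed products, and so on).

```python
import json
import platform
import shutil


def collect_metrics(tools=("cc", "make")):
    """Snapshot facts about the build host that are not version-controlled."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        # Where each tool resolves on PATH (None if it is not installed).
        "tools": {name: shutil.which(name) for name in tools},
    }


def diff_metrics(previous, current):
    """Return one warning string per metric that differs between builds."""
    warnings = []
    for key in sorted(set(previous) | set(current)):
        old, new = previous.get(key), current.get(key)
        if old != new:
            warnings.append(f"WARNING: {key} changed from {old!r} to {new!r}")
    return warnings


def save_metrics(path, metrics):
    """Write metrics as stable, diff-friendly JSON suitable for check-in."""
    with open(path, "w") as handle:
        json.dump(metrics, handle, indent=2, sort_keys=True)


def load_metrics(path):
    """Read back the metrics recorded by a previous build."""
    with open(path) as handle:
        return json.load(handle)


if __name__ == "__main__":
    print(json.dumps(collect_metrics(), indent=2, sort_keys=True))
```

A build script could call load_metrics on the predecessor's checked-in
file, print the result of diff_metrics, and then save and commit the new
snapshot.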

Parameterize your builds with data tables.  This allows you to reduce your
tool set from a large collection of ad-hoc scripts to a small collection of
scripts that are easier to maintain and a large collection of data tables
that can be manipulated with relative ease.  This increases productivity
of an established build team and lowers the learning curve for new hires.
It also maximizes the reuse of skills while affording the amount of
flexibility that's needed to support the various types of builds that
are performed.  Features such as naming conventions, debug/optimization
switches, the locations of builds, the number of builds kept, inter-project
references, build hosts, environment variable settings, and notification
mailing lists are all candidates for such tables.
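
To make the idea concrete, here is a small Python sketch in which one
generic function is driven by a table.  The table contents, the column
names and the example addresses are all hypothetical; in practice the
table would more likely live in a checked-in data file than in the script.

```python
# One row per build type; adding a build type means adding a row, not
# writing a new script.  All values here are illustrative.
BUILD_TABLE = {
    "debug": {
        "cflags": ["-g", "-O0"],
        "keep": 3,                      # number of builds to retain
        "notify": ["dev@example.com"],  # notification mailing list
    },
    "release": {
        "cflags": ["-O2"],
        "keep": 10,
        "notify": ["dev@example.com", "qa@example.com"],
    },
}


def make_command(build_type, target="all", table=BUILD_TABLE):
    """Build the make invocation for a build type from its table row."""
    row = table[build_type]
    cflags = " ".join(row["cflags"])
    return ["make", "CFLAGS=" + cflags, target]


if __name__ == "__main__":
    for name in BUILD_TABLE:
        print(name, make_command(name))
```

The script stays the same for every build type; only the table grows.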

Take a look at where you might find
an interesting tool or two.  The rinfo and lmerge tools are useful for
generating difference reports.  The buildref tools are useful for automating
inter-project references.  The envctrl tools are useful for controlling
environment settings and other services.  These tools are pretty primitive
in themselves, but they provide some basic capabilities that enable an
environment like the one I just described.

--- Forwarded mail from address@hidden

Why not build on a nightly basis, instead of on a file-commit basis?
Compiling on every commit seems like tremendous overkill to me, not to
mention the possible performance degradation of the CVS server while it is
compiling your source code.  And multiple developers could commit files
simultaneously, which might cause multiple compiles to be executing at the
same time.  That could cause a real, serious drag on system performance.

I highly recommend that you create a script that runs nightly, at a time
when most (if not all) of the developers have gone home for the day.  The
script would check out all the source code and build everything from
scratch, and the script could run on any machine that has the appropriate

Doing your builds on a nightly basis has several advantages.  First, it is
possible that someone will check in only one or two files that are part of a
larger change, where several (or many) files would be required for the
entire change to be in effect.  If you try to build as soon as the first
file or first few files have been checked in, it is very likely that the
build will fail, and that build would have been a waste of time because you
knew it was probably going to fail.  On the other hand, if the developer
takes several hours to commit all of his/her changes -- no problem -- as
long as all the changes are committed by the time the nightly build kicks
off at whatever time you choose, the build should succeed.

Second, by building everything from scratch, you stand a much better chance
of catching errors in the code, not only at build time, but also the next
day when you have a full build of all of your applications ready for testing
by your QA person(s).  Of course, if you have an automated test system, it
can also be kicked off by the script to test the programs, right after the
build finishes.  Then, when you come in the next morning, everything has
been built and tested and is ready for QA.  If there were any errors in the
build, have the script send an email to all the developers with a log of the
compiler's error messages.  The person who broke the build will probably
recognize the errors he/she made and realize that he/she forgot to check
something in, or knows exactly what to do to fix the build.  Then, that
night, the build should succeed (unless he, or someone else, has made
another mistake!).

And third, at night, it is less likely that anyone will be checking in any
code changes, so the machine will be free to compile and test your code
without having to do double duty both compiling and serving up developers'
CVS requests.  This will undoubtedly save your developers lots of
frustration during the day.
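
A nightly script along these lines might be sketched as follows.  This is
only an outline in Python under stated assumptions: the repository root,
module name, work directory and the "make test" target are placeholders,
and mailing the log to the developers is left out.

```python
import subprocess


def run(cmd, cwd=None):
    """Run one command; return (succeeded, combined stdout and stderr)."""
    proc = subprocess.run(cmd, cwd=cwd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    return proc.returncode == 0, proc.stdout


def default_steps(cvsroot, module, workdir):
    """Fresh checkout, full build, then tests (the make targets are assumed)."""
    source_dir = f"{workdir}/{module}"
    return [
        (["cvs", "-d", cvsroot, "checkout", module], workdir),
        (["make", "all"], source_dir),
        (["make", "test"], source_dir),
    ]


def nightly(steps, run_step=run):
    """Run the steps in order, stopping at the first failure.

    Returns (ok, log), where log holds each command and its output so it
    can be mailed to the developers when the build breaks.
    """
    log = []
    for cmd, cwd in steps:
        ok, output = run_step(cmd, cwd)
        log.append("$ " + " ".join(cmd) + "\n" + output)
        if not ok:
            return False, "\n".join(log)
    return True, "\n".join(log)


if __name__ == "__main__":
    # Placeholder values; schedule this from cron after working hours.
    ok, log = nightly(default_steps(":local:/cvsroot", "myproject",
                                    "/tmp/nightly"))
    print("BUILD OK" if ok else "BUILD FAILED")
    print(log)
```

Passing run_step as a parameter lets the sequencing be exercised without
a CVS server, and the script can run on a machine other than the server.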

This process can help to make your development organization much more
streamlined.  I hope this helps answer your question, or at least gives you
some ideas.

----- Original Message -----
From: "Ryan Hennig" <address@hidden>
To: <address@hidden>
Sent: Monday, October 23, 2000 6:56 PM
Subject: Commitinfo question
> Hello,
> I am currently setting up an automated build and testing system that rides
> on top of CVS, using the commitinfo file to run compilations and tests on
> each commit.  I was just wondering if anyone on this list who has done this
> type of thing could share some wisdom with me.
> I am particularly interested in the following:
> - What kind of problems did you run into early on?
> - How was your system designed (generally)?
> - Did you run compilations and/or tests on the same box as the CVS server,
> or did you send the files off to another box (this is what I am trying to
> do)?
> - Is there a better way to do this besides using commitinfo?
> Any other input would be GREATLY appreciated.
--- End of forwarded message from address@hidden
