FYI: add Tom's ``Dependency Tracking in Automake'' doc to the manual

automake-patches

From:	Alexandre Duret-Lutz
Subject:	FYI: add Tom's ``Dependency Tracking in Automake'' doc to the manual
Date:	Wed, 15 Sep 2004 22:15:45 +0200
User-agent:	Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3.50 (gnu/linux)
This starts the `History' chapter of the manual, which I'll have
to furnish before Sunday.

The "Dependency Tracking Evolution" node is just a Texinfoized
version of http://sources.redhat.com/automake/dependencies.html
so I can @xref to it easily from the upcoming history.  Some of
the original text is out-of-date; I'll update it in the next
commit.

2004-09-15  Alexandre Duret-Lutz  <address@hidden>

        * doc/automake.texi (History): New node.
        (Dependency Tracking Evolution): New node, filled with a Texinfo
        version of Tom Tromey's ``Dependency Tracking in Automake''
        document, initially published on the Automake homepage on
        2001-06-29.

Index: doc/automake.texi
===================================================================
RCS file: /cvs/automake/automake/doc/automake.texi,v
retrieving revision 1.48
diff -u -r1.48 automake.texi
--- doc/automake.texi   3 Aug 2004 23:02:54 -0000       1.48
+++ doc/automake.texi   15 Sep 2004 20:09:07 -0000
@@ -109,6 +109,7 @@
 * API versioning::              About compatibility between Automake versions
 * Upgrading::                   Upgrading to a Newer Automake Version
 * FAQ::                         Frequently Asked Questions
+* History::                     Notes about the history of Automake
 * Copying This Manual::         How to make copies of this manual
 * Indices::                     Indices of variables, macros, and concepts
 
@@ -257,6 +258,10 @@
 * renamed objects::             Why are object files sometimes renamed?
 * Multiple Outputs::            Writing rules for tools with many output files
 
+History of Automake
+
+* Dependency Tracking Evolution::  Evolution of Automatic Dependency Tracking
+
 Copying This Manual
 
 * GNU Free Documentation License::  License for copying this manual
@@ -7874,6 +7879,285 @@
 portable, but they can be convenient in packages that assume GNU
 @command{make}.
 
+@node History
+@chapter History of Automake
+
+This chapter presents various aspects of the history of Automake.  The
+exhausted reader can safely skip it; this will be more of interest to
+nostalgic people, or to those curious to learn about the evolution of
+Automake.
+
+@menu
+* Dependency Tracking Evolution::  Evolution of Automatic Dependency Tracking
+@end menu
+
+@node Dependency Tracking Evolution
+@section Dependency Tracking in Automake
+
+Over the years Automake has deployed three different dependency
+tracking methods.  Each method, including the current one, has had
+flaws of various sorts.  Here we lay out the different dependency
+tracking methods, their flaws, and their fixes.  We conclude with
+recommendations for tool writers, and by indicating future directions
+for dependency tracking work in Automake.
+
+@subsection First Take
+@unnumberedsubsubsec Description
+
+Our first attempt at automatic dependency tracking was based on the
+method recommended by GNU @command{make}.
+
+This version worked by precomputing dependencies ahead of time.  For
+each source file, it had a special @file{.P} file which held the
+dependencies.  There was a rule to generate a @file{.P} file by
+invoking the compiler appropriately.  All such @file{.P} files were
+included by the @file{Makefile}, thus implicitly becoming dependencies
+of @file{Makefile}.
+
+@unnumberedsubsubsec Bugs
+
+This approach had several critical bugs.
+
+@itemize
+@item
+The code to generate the @file{.P} file relied on @code{gcc}.
+(A limitation, not technically a bug.)
+@item
+The dependency tracking mechanism itself relied on GNU @command{make}.
+(A limitation, not technically a bug.)
+@item
+Because each @file{.P} file was a dependency of @file{Makefile}, this
+meant that dependency tracking was done eagerly by @command{make}.
+For instance, @code{make clean} would cause all the dependency files
+to be updated, and then immediately removed.  This eagerness also
+caused problems with some configurations; if a certain source file
+could not be compiled on a given architecture for some reason,
+dependency tracking would fail, aborting the entire build.
+@item
+As dependency tracking was done as a pre-pass, compile times were
+doubled--the compiler had to be run twice per source file.
+@item
+@code{make dist} re-ran @command{automake} to generate a
+@file{Makefile} which did not have automatic dependency tracking (and
+which was thus portable to any version of @command{make}).  In order to
+do this portably, Automake had to scan the dependency files and remove
+any reference which was to a source file not in the distribution.
+This process was error-prone.  Also, if @code{make dist} was run in an
+environment where some object file had a dependency on a source file
+which was only conditionally created, Automake would generate a
+@file{Makefile} which referred to a file which might not appear in the
+end user's build.  A special, hacky mechanism was required to work
+around this.
+@end itemize
+
+@unnumberedsubsubsec Historical Note
+
+The code generated by Automake is often inspired by the
+@file{Makefile} style of a particular author.  In the case of the first
+implementation of dependency tracking, I believe the impetus and
+inspiration was Jim Meyering.  (I could be mistaken.  If you know
+otherwise feel free to correct me.)
+
+@subsection Dependencies As Side Effects
+@unnumberedsubsubsec Description
+
+The next refinement of Automake's automatic dependency tracking scheme
+was to implement dependencies as side effects of the compilation.
+This was aimed at solving the most commonly reported problems with the
+first approach.  In particular we were most concerned with eliminating
+the weird rebuilding effect associated with @code{make clean}.
+
+In this approach, the @file{.P} files were included using the
+@code{-include} command, which let us create these files lazily.  This
+avoided the @code{make clean} problem.
+
+We only computed dependencies when a file was actually compiled.  This
+avoided the performance penalty associated with scanning each file
+twice.  It also let us avoid the other problems associated with the
+first, eager, implementation.  For instance, dependencies would never
+be generated for a source file which was not compilable on a given
+architecture (because it in fact would never be compiled).
+
+@unnumberedsubsubsec Bugs
+
+@itemize
+@item
+This approach also relied on the existence of @command{gcc} and GNU
+@command{make}.  (A limitation, not technically a bug.)
+@item
+Dependency tracking was still done by the developer, so the problems
+from the first implementation relating to massaging of dependencies by
+@code{make dist} were still in effect.
+@item
+This implementation suffered from the ``deleted header file'' problem.
+Suppose a lazily-created @file{.P} file includes a dependency on a
+given header file, like this:
+
+@example
+maude.o: maude.c something.h
+@end example
+
+Now suppose that the developer removes @file{something.h} and updates
+@file{maude.c} so that this include is no longer needed.  If he runs
+@command{make}, he will get an error because there is no way to create
+@file{something.h}.
+
+We fixed this problem in a later release by further massaging the
+output of @command{gcc} to include a dummy dependency for each header
+file.
+@end itemize
+
+@subsection Dependencies for the User
+@unnumberedsubsubsec Description
+
+The bugs associated with @code{make dist}, over time, became a real
+problem.  Packages using Automake were being built on a large number
+of platforms, and were becoming increasingly complex.  Broken
+dependencies were distributed in ``portable'' @file{Makefile.in}s,
+leading to user complaints.  Also, the requirement for @command{gcc}
+and GNU @command{make} was a constant source of bug reports.  The next
+implementation of dependency tracking aimed to remove these problems.
+
+We realized that the only truly reliable way to automatically track
+dependencies was to do it when the package itself was built.  This
+meant discovering a method portable to any version of make and any
+compiler.  Also, we wanted to preserve what we saw as the best point
+of the second implementation: dependency computation as a side effect
+of compilation.
+
+In the end we found that most modern make implementations support some
+form of include directive.  Also, we wrote a wrapper script which let
+us abstract away differences between dependency tracking methods for
+compilers.  For instance, some compilers cannot generate dependencies
+as a side effect of compilation.  In this case we simply have the
+script run the compiler twice.  Currently our wrapper script knows
+about twelve different compilers (including a ``compiler'' which simply
+invokes @command{makedepend} and then the real compiler, which is
+assumed to be a standard Unix-like C compiler with no way to do
+dependency tracking).
+
+@unnumberedsubsubsec Bugs
+
+@itemize
+@item
+Running a wrapper script for each compilation slows down the build.
+@item
+Many users don't really care about precise dependencies.
+@item
+This implementation, like every other automatic dependency tracking
+scheme in common use today (indeed, every one we've ever heard of),
+suffers from the ``duplicated new header'' bug.
+
+This bug occurs because dependency tracking tools, such as the
+compiler, only generate dependencies on the successful opening of a
+file, and not on every probe.
+
+Suppose for instance that the compiler searches three directories for
+a given header, and that the header is found in the third directory.
+If the programmer erroneously adds a header file with the same name to
+the first directory, then a clean rebuild from scratch could fail
+(suppose the new header file is buggy), whereas an incremental rebuild
+will succeed.
+
+What has happened here is that people have a misunderstanding of what
+a dependency is.  Tool writers think a dependency encodes information
+about which files were read by the compiler.  However, a dependency
+must actually encode information about what the compiler tried to do.
+
+This problem is not serious in practice.  Programmers typically do not
+use the same name for a header file twice in a given project.  (At
+least, not in C or C++.  This problem may be more troublesome in
+Java.)  This problem is easy to fix, by modifying dependency
+generators to record every probe, instead of every successful open.
+
+@item
+Since Automake generates dependencies as a side effect of compilation,
+there is a bootstrapping problem when header files are generated by
+running a program.  The problem is that, the first time the build is
+done, there is no way by default to know that the headers are
+required, so make might try to run a compilation for which the headers
+have not yet been built.
+
+This was also a problem in the previous dependency tracking implementation.
+
+The current fix is to use @code{BUILT_SOURCES} to list built
+headers.  This causes them to be built before any other build
+rules are run.  This is unsatisfactory as a general solution; however,
+in practice it seems sufficient for most actual programs.
+@end itemize
+
+This code has not yet been in an official release of Automake.  So,
+while it has seen some testing, it has not been stressed the way that
+the other implementations were.  The most serious problems probably
+remain unknown.
+
+@subsection Techniques for Computing Dependencies
+
+There are actually several ways for a build tool like Automake to
+cause tools to generate dependencies.
+
+@table @asis
+@item @command{makedepend}
+This was a commonly-used method in the past.  The idea is to run a
+special program over the source and have it generate dependency
+information.  Traditional implementations of @command{makedepend} were
+not completely precise; ordinarily they were conservative and
+discovered too many dependencies.
+@item The tool
+An obvious way to generate dependencies is to simply write the tool so
+that it can generate the information needed by the build tool.  This is
+also the most portable method.  Many compilers have an option to
+generate dependencies.  Unfortunately, not all tools provide such an
+option.
+@item The file system
+It is possible to write a special file system that tracks opens,
+reads, writes, etc., and then feed this information back to the build
+tool.  @command{clearmake} does this.  This is a very powerful
+technique, as it doesn't require cooperation from the
+tool.  Unfortunately it is also very difficult to implement and also
+not practical in the general case.
+@item @code{LD_PRELOAD}
+Rather than use the file system, one could write a special library to
+intercept @code{open} and other syscalls.  This technique is also quite
+powerful, but unfortunately it is not portable enough for use in
+@command{automake}.
+@end table
+
+@subsection Recommendations for Tool Writers
+
+We think that every compilation tool ought to be able to generate
+dependencies as a side effect of compilation.  Furthermore, at least
+while @command{make}-based tools are nearly universally in use (at
+least in the free software community), the tool itself should generate
+dummy dependencies for header files, to avoid the deleted header file
+bug.  Finally, the tool should generate a dependency for each probe,
+instead of each successful file open, in order to avoid the duplicated
+new header bug.
+
+@subsection Future Directions for Automake's Dependency Tracking
+
+In GCC 3.0, we managed to convince the maintainers to add special
+command-line options to help Automake more efficiently do its job.  We
+hoped this would let us avoid the use of a wrapper script when
+Automake's automatic dependency tracking was used with @command{gcc}.
+
+Unfortunately, this code doesn't quite do what we want.  In
+particular, it removes the dependency file if the compilation fails;
+we'd prefer that it instead only touch the file in any way if the
+compilation succeeds.
+
+Nevertheless, in a future (probably minor) release of Automake we hope
+to make it so that if @command{gcc} is detected at
+@command{configure} time, then we will inline the
+dependency-generation code and not require the use of a wrapper
+script.  This will make compilations that much faster for those using
+this compiler (probably our primary user base).
+
+Currently, only languages and compilers understood by Automake can
+have dependency tracking enabled.  We would like to see if it is
+practical (and worthwhile) to let this support be extended by the user
+to languages unknown to Automake.
+
 @c ========================================================== Appendices
 
 @page
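
For readers who want to see the ``deleted header file'' fix from the
second implementation in concrete terms, here is a minimal sketch.  It
is not the code Automake uses.  The function name
add_dummy_header_targets is hypothetical, and the sketch assumes the
dependency file holds simple `target: prerequisites' rules.  It also
does not try to tell headers from sources, so every prerequisite gets
an empty rule.

```python
def add_dummy_header_targets(dep_text: str) -> str:
    """Append an empty rule for each prerequisite found in dep_text.

    With an empty rule such as `something.h:' present, make no longer
    fails when something.h has been deleted; it simply treats the
    file as out of date and rebuilds whatever depended on it.
    (Hypothetical helper, for illustration only.)
    """
    # Join backslash-continued lines so each rule sits on one line.
    joined = dep_text.replace("\\\n", " ")

    prereqs = []
    for line in joined.splitlines():
        if ":" not in line:
            continue
        _target, _, deps = line.partition(":")
        for dep in deps.split():
            if dep not in prereqs:  # keep first-seen order, no duplicates
                prereqs.append(dep)

    dummies = "".join(f"{dep}:\n" for dep in prereqs)
    return dep_text.rstrip("\n") + "\n" + dummies


if __name__ == "__main__":
    print(add_dummy_header_targets("maude.o: maude.c something.h\n"), end="")
```

Run on the maude.o example from the patch, this emits the original
rule followed by the empty rules `maude.c:' and `something.h:'.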
-- 
Alexandre Duret-Lutz