[PATCH 0/4] faster gnulib-tool
From: Ralf Wildenhues
Subject: [PATCH 0/4] faster gnulib-tool
Date: Sun, 28 Dec 2008 11:15:32 +0100
User-agent: Mutt/1.5.18 (2008-05-17)

Hello, and I hope you're all having or have had nice holidays,
here's a short patch series to speed up common gnulib-tool usage a bit:
1) cache module metainformation.
The first observation is that the bulk of the forks in a typical --update
are spent on 'sed' parsing the module metainformation files. So let's
cache them: the contents are parsed into shell variables.
The cache variable names consist of 'c_' plus the flattened module name.
For Bash, the function to flatten the name uses ${var//subst/repl} to
avoid forking for module names that contain non-alphanumeric characters
(such as '/').
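
For illustration, the caching scheme could look roughly like the sketch
below. Only the 'c_' prefix and the use of ${var//subst/repl} come from
the description above; the function names (flatten_name, cache_store,
cache_fetch) and the variable layout are assumptions made for this
example, not the code in the patch.

```shell
# Hypothetical sketch; names are illustrative, not gnulib-tool's actual code.

# Pick a fork-free flattening function when the shell supports
# ${var//pattern/replacement}. The test and the definition are wrapped in
# eval so that shells without this expansion never have to parse it.
if (eval 'x=a/b-c; test "${x//[!a-zA-Z0-9_]/_}" = a_b_c') 2>/dev/null; then
  eval 'flatten_name () {
    flat_name=${1//[!a-zA-Z0-9_]/_}
  }'
else
  # Fallback: one fork per call.
  flatten_name () {
    flat_name=`echo "$1" | LC_ALL=C sed -e "s/[^a-zA-Z0-9_]/_/g"`
  }
fi

# Store one metainformation field of a module in a cache variable.
cache_store () {   # usage: cache_store MODULE FIELD VALUE
  flatten_name "$1"
  eval "c_${flat_name}_$2=\$3"
}

# Fetch a cached field into $cached_value; fail if it is not cached.
cache_fetch () {   # usage: cache_fetch MODULE FIELD
  flatten_name "$1"
  eval "test -n \"\${c_${flat_name}_$2+set}\"" || return 1
  eval "cached_value=\$c_${flat_name}_$2"
}

cache_store sys/stat description 'A <sys/stat.h> for broken systems.'
cache_fetch sys/stat description && echo "$cached_value"
```

Results are passed back in variables ($flat_name, $cached_value) rather
than on stdout, so callers do not need a command substitution.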
FWIW, the values of $lookedup_file and $lookedup_tmp are not cached;
doing so would, if --local-dir were used and module files patched,
require that the patched files be kept (and not overwritten) for the
duration of the script. I have checked that no caller site uses the
lookedup_{file,tmp} values for the module metainformation files, so
we don't have to worry about this.
By itself, this patch does not help; it even slows gnulib-tool down
(see timings below), because much of the module file reading happens in
subshells, which cannot populate the parent shell's cache.
2) avoid forks with func_get_* functions.
This patch turns (1) into a speed boost, by eliminating lots of forks
related to calls of the func_get_* functions, thus allowing the cache
to be used a few times in a typical --update or --test operation.
(Of course the additional fork elimination itself also helps. :-)
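
The interaction between (1) and (2) can be reproduced with a small
sketch. The names below (lookup_cached, get_echo, parse_count) are made
up for this example, and the cache is simplified to a single variable;
the counter stands in for the costly 'sed' call.

```shell
# Hypothetical sketch; lookup_cached and its variables are illustrative names.

parse_count=0

# Return a module's description in $lookup_result, parsing at most once.
lookup_cached () {
  if test -z "${cached_description+set}"; then
    parse_count=$((parse_count + 1))   # stands in for the costly sed call
    cached_description="description of $1"
  fi
  lookup_result=$cached_description
}

# Pattern A: command substitution. The function body runs in a forked
# subshell, so the cache it fills is discarded when the subshell exits.
get_echo () { lookup_cached "$1"; echo "$lookup_result"; }
a1=`get_echo stdio`
a2=`get_echo stdio`
count_after_a=$parse_count     # the parent shell never saw the cache

# Pattern B: call the function in the current shell and read the result
# from a variable. No fork, and the cache survives between calls.
lookup_cached stdio; b1=$lookup_result
lookup_cached stdio; b2=$lookup_result
count_after_b=$parse_count     # parsed once; the second call hit the cache

echo "A: parent counted $count_after_a parses; B: parent counted $count_after_b"
```

In pattern A each call parses again inside its own subshell, yet the
parent's counter and cache stay untouched, which matches the slowdown
seen with (1) alone.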
3) abort loops early where possible.
A couple of loops only test for the presence of some condition and have
no other side effects; they can be aborted as soon as we have a definite
answer.
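
A minimal sketch of the pattern, using a made-up file list rather than
any loop from gnulib-tool:

```shell
# Hypothetical sketch: does any file in the list live under tests/ ?
files='lib/stdio.in.h m4/stdio_h.m4 tests/test-stdio.c lib/dummy.c'

have_tests=false
examined=0
for f in $files; do
  examined=$((examined + 1))
  case $f in
    tests/*)
      have_tests=true
      break ;;        # the answer is known; skip the remaining items
  esac
done
echo "have_tests=$have_tests after examining $examined files"
```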
4) faster string handling for Posix shells.
This introduces a shell function for splitting off literal prefixes and
suffixes from strings, avoiding 'sed' when the shell is Posixy enough
(idea copied from Libtool).
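
As a rough sketch of the idea (not the function from the patch or from
Libtool): the name strip_affixes, its argument order, and the sed
fallback are assumptions made for this example.

```shell
# Hypothetical sketch modeled on the idea, not the actual patch.
# strip_affixes PREFIX SUFFIX STRING
# sets $stripped to STRING without the literal PREFIX and SUFFIX.
if (eval 'x=lib/foo.c; y=${x#lib/}; z=${y%.c}; test "$z" = foo') 2>/dev/null; then
  # Fork-free variant. Quoting "$1" and "$2" inside the expansion makes
  # the shell match them literally instead of as glob patterns.
  eval 'strip_affixes () {
    stripped=${3#"$1"}
    stripped=${stripped%"$2"}
  }'
else
  # Fallback: three forks of sed per call. Regex metacharacters in the
  # affixes are backslash-quoted first so that they match literally.
  quote_re='s/[][\\.*^$%]/\\&/g'
  strip_affixes () {
    p=`printf '%s\n' "$1" | sed "$quote_re"`
    s=`printf '%s\n' "$2" | sed "$quote_re"`
    stripped=`printf '%s\n' "$3" | sed -e "s%^$p%%" -e "s%$s\$%%"`
  }
fi

strip_affixes lib/ .c lib/foo.c
echo "$stripped"
```

As with the other sketches, the result is returned in a variable so that
the fork-free variant does not reintroduce a subshell at the call site.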
I have tested the changes with M4, using 'gnulib-tool --test', and on a
couple of other packages using gnulib, and ensured that the only change
they cause is the harmless removal of some empty lines in generated files.
Testing was done on GNU/Linux using bash and pdksh, and Solaris ksh.
The patches are posted using 'git format-patch' so they can be fed
directly into 'git am', for those so inclined.
OK to apply?
The whole series gives me about 50% improvement for
  gnulib-tool --update
on the git M4 tree:
  before:   21.63 s
  after 1:  27.46 s
  after 2:  16.46 s
  after 3:  12.94 s
  after 4:  10.83 s
With
  gnulib-tool --with-tests --test
there is about 20% improvement (a couple of minutes), but note that
this also runs the other autotools, configure and make. (1) and (2)
slightly slow down things like
  gnulib-tool --extract-description ...
but since these modes are typically faster than the other modes,
I consider that an acceptable trade-off. Otherwise, one could also
reorganize gnulib-tool a bit so that it can use one script on all
modules in question, like
  sed "$sed_extract_license_only" $modules
Doing that throughout the code (i.e., also for --update) would need
more intrusive changes, though.
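
To illustrate the one-script-on-all-modules idea: in the sketch below
the two module files and the contents of $sed_extract_license_only are
invented for the example and are not the script gnulib-tool uses.

```shell
# Hypothetical sketch: the module files and the sed script are made up.
tmp=`mktemp -d` || exit 1
printf 'Description:\nfoo\n\nLicense:\nLGPL\n\nMaintainer:\nAlice\n' > "$tmp/mod-a"
printf 'Description:\nbar\n\nLicense:\nGPL\n\nMaintainer:\nBob\n' > "$tmp/mod-b"
modules="$tmp/mod-a $tmp/mod-b"

# Print the body of the License: section, dropping section headers and
# empty lines.
sed_extract_license_only='
/^License:$/,/^[A-Z][A-Za-z-]*:$/{
  /^[A-Z][A-Za-z-]*:$/d
  /^$/d
  p
}'

# One sed process per module.
per_module=`for m in $modules; do sed -n "$sed_extract_license_only" "$m"; done`

# One sed process for all modules.
all_at_once=`sed -n "$sed_extract_license_only" $modules`

rm -rf "$tmp"
echo "$all_at_once"
```

One caveat with this approach: sed treats all input files as a single
stream, so a range that is not closed within one file would continue
into the next.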
Thanks,
Ralf