autoconf

Re: parallelization of ./configure compiler test processes


From: Thomas Jahns
Subject: Re: parallelization of ./configure compiler test processes
Date: Thu, 30 Mar 2023 01:05:36 +0200

Hello Danny,

I have spent some time thinking about improvements to autoconf configure
scripts (while waiting for builds to proceed). In my view, it is currently
still easier to pursue small efficiency gains that, taken together, could
improve run time substantially than to parallelize the whole script, because
there is so much untapped potential:

* Use bash builtin shell commands to fork and especially exec less. On modern
systems, data paths are fast compared to anything that affects resource
control, like changing memory mappings, dropping caches etc., so syscalls
can be a substantial source of slowdown.
* Use TMPDIR to prevent temporary files from hitting disk (use /dev/shm or 
similar instead of /tmp)
* In the case of spack I've seen substantial improvements from running the 
whole build in /dev/shm and think spack should at least mention the possibility 
for systems with substantial amounts of RAM (and guess what kind of system many 
sites using spack just happen to have?).
* The gcc option -pipe is similarly helpful to get build phases to start as 
soon as possible.
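
To make the first, second and last points concrete, here is a minimal sketch.
The function names strip_dir and pick_tmpdir are made up for illustration, and
/dev/shm and /run/shm are assumptions about a Linux-like system. Which tools
honor TMPDIR varies; gcc does for its intermediate files.

```shell
# Set $base to the last path component of $1 using only parameter
# expansion, so no process is forked or exec'ed (unlike `basename`).
strip_dir() {
    base=${1##*/}
}

# Print a RAM-backed directory if one is usable, else the fallback in $1.
# The candidate paths are assumptions; adjust them for the target system.
pick_tmpdir() {
    for d in /dev/shm /run/shm; do
        if [ -d "$d" ] && [ -w "$d" ]; then
            printf '%s\n' "$d"
            return 0
        fi
    done
    printf '%s\n' "${1:-/tmp}"
}

# Possible use (commented out so the snippet has no side effects):
# TMPDIR=$(pick_tmpdir /tmp) CFLAGS="-O2 -pipe" ./configure
```

With -pipe, gcc passes data between compilation stages through pipes instead
of temporary files, so the TMPDIR setting mainly matters for whatever still
writes temporaries.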

I'm writing this because I feel that quite a few bright minds went for the
all-or-nothing goal of full parallelization, only to end up with something
that, if it materialized at all, did not make it into general use, whereas
smaller, incremental improvements can be introduced with much less risk to
correctness.

And, especially in the context of a package manager that has almost full
control over what factors into the build but sits outside the source tree of
each package, I feel it's very useful to consider the whole machinery as open
to improvement.

Regarding parallelization for autoconf in particular, I think autoconf could
very much benefit from first making the effects of each macro more explicit,
i.e. which variables end up being set, which files will be appended to, etc.
To my knowledge this is mostly well documented for the human reader, but not
programmatically available in the M4 phase at all. E.g. if the script
generation "knew" that some test macro invocations only affected confdefs.h
via an atomic write, and that no macro affecting a shell variable of
consequence came in between, those tests could indeed safely be performed in
parallel, as far as I can see.
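
As a rough illustration of that idea (this is not what autoconf generates
today, and check_command and run_parallel_checks are hypothetical names), two
independent checks could each write to a private fragment file, with the
fragments merged into the confdefs.h-style output in a fixed order once both
have finished:

```shell
# Hypothetical stand-in for one feature test: prints a #define for macro $1
# if command $2 is found, and prints nothing otherwise.
check_command() {
    if command -v "$2" >/dev/null 2>&1; then
        printf '#define %s 1\n' "$1"
    fi
}

# Run two independent checks concurrently and append their results to the
# file named by $1 (standing in for confdefs.h).
run_parallel_checks() {
    out=$1
    frag_dir=$(mktemp -d) || return 1

    # Each background job writes only to its own fragment, so the jobs
    # cannot clobber each other's output.
    check_command HAVE_SH sh > "$frag_dir/01.h" &
    pid1=$!
    check_command HAVE_NO_SUCH_TOOL no-such-tool-xyz > "$frag_dir/02.h" &
    pid2=$!
    wait "$pid1" "$pid2"

    # Merge in a fixed order so the result does not depend on scheduling.
    cat "$frag_dir/01.h" "$frag_dir/02.h" >> "$out"
    rm -rf "$frag_dir"
}
```

This only works because neither check sets a shell variable that the other one
reads, which is exactly the information the M4 phase would need to have.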

Also, there is a discussion of this particular topic on this mailing list 
started by Paul Eggert on June 14, 2022, Message-ID: 
<b2d57714-3519-7929-7ddf-34c4ca774f5e@cs.ucla.edu>

Kind regards,
Thomas  



> On Mar 29, 2023, at 22:12 , Danny McClanahan <dmcc2@hypnicjerk.ai> wrote:
> 
> Hello autoconf,
> 
> I work on a cross-platform package manager named spack (https://spack.io) 
> which builds lots of GNU software from source and has fantastic support for 
> autotools projects. Because spack provides a shell script `cc` to wrap the 
> compiler, each invocation of `cc` for feature tests executed by `./configure` 
> takes a little bit longer than normal, so configuring projects that 
> necessarily have a lot of feature tests takes much longer in spack 
> (particularly `gettext`, which we use as a benchmark in this change: 
> https://github.com/spack/spack/pull/26259). However, we can fix that 
> additional overhead ourselves without any changes in autoconf, by generating 
> our `cc` wrapper instead of doing any logic in the shell script. The reason I 
> messaged this board is because of a separate idea that the above situation 
> made me start thinking about: *parallelizing feature test executions*, in 
> order to speed up `./configure`.
> 
> So a few questions:
> 1. Are there any intrinsic blockers to parallelizing the generated feature 
> tests that execute in an autotools `./configure` script?
>    - For example, I've been told that feature tests currently write to a 
> single output file, which would get clobbered if we were to naively 
> parallelize the test execution, but I was hoping that each test could be made 
> to write to a temp file instead if that's true.
> 2. Which codebase (autoconf, automake, m4, ?) does the work of generating the 
> script that executes tests in serial, and where in that codebase does this 
> occur?
>    - I've been perusing clones of the autoconf and automake codebases and 
> I've been unable to locate the logic that actually executes each test in 
> sequence.
> 3. How should we expose the option to execute tests in parallel?
>    - In order to serve the purpose of improving `./configure` invocation 
> performance, we would probably want to avoid requiring an `autoreconf` (spack 
> avoids executing `autoreconf` wherever possible).
>    - Possibly an option `autoreconf 
> --experimental-also-generate-parallel-tests`, which would enable the end user 
> to execute `./configure --experimental-execute-parallel-tests`?
> 
> Please feel free to link me to any existing content/discussions on this if 
> I've missed them, or redirect me to another mailing list. I'm usually pretty 
> good at figuring things out on my own but have been having some difficulty 
> getting started here.
> 
> Thanks so much,
> Danny



