[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH 4/5] Efficient license compatibility checks with --test/--create-
From: |
Ralf Wildenhues |
Subject: |
[PATCH 4/5] Efficient license compatibility checks with --test/--create-testdir. |
Date: |
Sun, 17 Jan 2010 10:51:33 +0100 |
User-agent: |
Mutt/1.5.20 (2009-10-28) |
For licence compatibility testing of requested modules in --test
or --create-testdir modes, walk the directed dependency graph;
in postorder traversal, annotate the set of potential license
incompatibilites in the respective subtrees. Warn about circular
dependencies, where the algorithm does not currently give precise
answers.
Cache as many data as possible, and (ab)use function positional
parameters as cheap emulation of function-local variables for
recursion.
* gnulib-tool (func_license_compatibility): New function.
(func_create_testdir): Use it, simplify accordingly. For
license incompatibilities, output the names of all modules
in the problematic dependency subgraph, not just the offending
modules.
Signed-off-by: Ralf Wildenhues <address@hidden>
---
Hello,
The --test/--create-testdir code currently tries to find license
incompatibilities in the following manner:
For each requested module, it looks at the transitive closure of its
dependencies (excluding tests and their deps), and checks whether any of
the licenses conflict. This is done in order to expose, for example, an
LGPL module depending on a GPLed build tool depending on a GPL module
(we have to assume that the LGPL module may be using code from the
latter GPL module).
With lots of requested modules (or all of them), the code walks the
dependency graph many times; very expensive, and the prime offender for
<http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/20351/focus=20361>.
The directed graph of git gnulib modules and dependencies currently
consists of roughly 1900 modules and 4300 listed dependencies, so it
is fairly sparse. It is almost a tree: it contains 13 cycles, of which
7 are typos, and the rest are shown here:
close -> fclose -> close
mbrtowc -> mbsinit -> mbrtowc
xalloc -> xalloc-die -> xalloc
,-> chdir-long ----------v
,-------------> save-cwd -> xgetcwd -> getcwd -> openat
openat -> fdopendir ---^ ^
\ `-> openat-die ----------------------'
`--------------^
My approach to license compatibility testing assumes a tree; it warns
about, and produces less precise results in the presence of, cycles.
With respect to the cycles involving xalloc and openat, there is no
problem in this approach: these two modules are GPL, thus any
incompatibility would still be noted, but not all GPL dependencies
might be listed.
However, maybe the first three cycles can be eliminated or somehow
avoided nonetheless:
- close depending on fclose looks like it could be avoided if
gl_REPLACE_FCLOSE is removed from close.m4,
- xalloc-die probably doesn't need to depend on xalloc if some files
are listed twice.
- modules in small cycles might just be merged (and one be made an
empty alias of the other).
I further chose to change the output of the license checking, to produce
not only the pairs (requested_module, incompatible_dependency) but all
modules in the subgraph containing the incompatibility; it might be
easier to find the dependency chain leading to the issue that way.
(This can trivially be reverted if it is undesirable.)
The new output currently looks like this:
warning: cannot determine license compatibility: circular dependency of
mbrtowc
warning: cannot determine license compatibility: circular dependency of xalloc
warning: cannot determine license compatibility: circular dependency of close
warning: cannot determine license compatibility: circular dependency of openat
warning: cannot determine license compatibility: circular dependency of openat
warning: cannot determine license compatibility: circular dependency of openat
warning: cannot determine license compatibility: circular dependency of
unistdio/u16-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/u16-u16-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/u32-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/u32-u32-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/u8-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/u8-u8-sprintf
warning: cannot determine license compatibility: circular dependency of
unistdio/ulc-sprintf
warning: module euidaccess depends on a module with an incompatible license
through:
euidaccess
exit
exitfail
getgroups
group-member
xalloc
xalloc-die
warning: module fts-lgpl depends on a module with an incompatible license
through:
chdir-long
chown
d-ino
exit
exitfail
fdopendir
fts-lgpl
getcwd
lchown
openat
openat-die
save-cwd
unistd-safer
xalloc
xalloc-die
xgetcwd
warning: module obstack depends on a module with an incompatible license
through:
exit
exitfail
obstack
warning: module regex depends on a module with an incompatible license
through:
langinfo
nl_langinfo
regex
(BTW, speak up if you want me to post some of the debugging code that I
used for this.)
Cheers,
Ralf
ChangeLog | 16 ++++++
gnulib-tool | 157 +++++++++++++++++++++++++++++++++++++++++++++--------------
2 files changed, 137 insertions(+), 36 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 260821a..a7d0163 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,21 @@
2010-01-17 Ralf Wildenhues <address@hidden>
+ Efficient license compatibility checks with --test/--create-testdir.
+ For licence compatibility testing of requested modules in --test
+ or --create-testdir modes, walk the directed dependency graph;
+ in postorder traversal, annotate the set of potential license
+ incompatibilites in the respective subtrees. Warn about circular
+ dependencies, where the algorithm does not currently give precise
+ answers.
+ Cache as many data as possible, and (ab)use function positional
+ parameters as cheap emulation of function-local variables for
+ recursion.
+ * gnulib-tool (func_license_compatibility): New function.
+ (func_create_testdir): Use it, simplify accordingly. For
+ license incompatibilities, output the names of all modules
+ in the problematic dependency subgraph, not just the offending
+ modules.
+
gnulib-tool: small sed optimization.
* gnulib-tool (func_get_automake_snippet): Merge multiple sed
invocations.
diff --git a/gnulib-tool b/gnulib-tool
index 5ab640f..00b4b08 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -3996,6 +3996,101 @@ s,//*$,/,'
echo " - invoke ${macro_prefix}_INIT in $configure_ac."
}
+
+# func_license_compatibility module...
+# Annotate module graph according to which licenses they are compatible with.
+# This is not precise in presence of circular dependencies.
+#
+# NOTE: This function calls itself recursively. Since sh doesn't have
+# local variables (without unportable `local'), the only variables
+# that make it across the recursion are the positional parameters.
+#
+# Input:
+# $1 ... list of modules to check
+#
+# Output:
+# ${module}_gpl2_incomp nonempty if its transitive dependency closure
+# contains a module incompatible with GPLv2+
+# ${module}_lgpl_incomp likewise for LGPL
+# ${module}_lgpl2_incomp likewise for LGPLv2+
+# ${module}_license_seen nonempty if this module has already been seen
+func_license_compatibility ()
+{
+ for lic_module
+ do
+ func_cache_var $lic_module
+ lic_cache=$cachevar
+ eval "lic_seen=\$${lic_cache}_license_seen
+ lic_circular=\$${lic_cache}_circular
+ ${lic_cache}_circular=yes"
+ if test -n "$lic_seen"; then
+ continue
+ fi
+ if test -n "$lic_circular"; then
+ echo "warning: cannot determine license compatibility: circular
dependency of $lic_module"
+ continue
+ fi
+ # Don't consider $module to depend on $module-tests here. This is needed
+ # becauses tests are implicitly GPL and may depend on GPL modules; we don't
+ # want a warning in this case.
+ func_get_dependencies $lic_module
+
+ # Save the module name, cache var name, and list of dependencies in the
+ # positional parameters, for recursion.
+ lic_deps=$module_deps
+ set $lic_cache $lic_module $lic_deps
+ func_license_compatibility $lic_deps
+ lic_cache=$1
+ lic_module=$2
+ shift; shift
+ lic_deps=$*
+ gpl2_incomp= lgpl_incomp= lgpl2_incomp=
+ for lic_dep in $lic_deps; do
+ func_cache_var $lic_dep
+ eval '
+ if test -n "$'$cachevar'_gpl2_incomp"; then
+ func_append gpl2_incomp "$nl$'$cachevar'_gpl2_incomp"
+ fi
+ if test -n "$'$cachevar'_lgpl_incomp"; then
+ func_append lgpl_incomp "$nl$'$cachevar'_lgpl_incomp"
+ fi
+ if test -n "$'$cachevar'_lgpl2_incomp"; then
+ func_append lgpl2_incomp "$nl$'$cachevar'_lgpl2_incomp"
+ fi'
+ done
+ func_get_license $lic_module
+ case $module_license in
+ 'GPLed build tool') ;;
+ 'public domain' | 'unlimited' | 'unmodifiable license text') ;;
+ 'LGPLv2+') ;;
+ 'GPLv2+')
+ func_append lgpl_incomp "$nl$lic_module"
+ func_append lgpl2_incomp "$nl$lic_module"
+ ;;
+ 'LGPL')
+ func_append lgpl2_incomp "$nl$lic_module"
+ ;;
+ *)
+ func_append gpl2_incomp "$nl$lic_module"
+ func_append lgpl_incomp "$nl$lic_module"
+ func_append lgpl2_incomp "$nl$lic_module"
+ ;;
+ esac
+ if test -n "$gpl2_incomp"; then
+ eval ${lic_cache}'_gpl2_incomp=`echo "$lic_module$gpl2_incomp" | sort
-u`'
+ fi
+ if test -n "$lgpl_incomp"; then
+ eval ${lic_cache}'_lgpl_incomp=`echo "$lic_module$lgpl_incomp" | sort
-u`'
+ fi
+ if test -n "$lgpl2_incomp"; then
+ eval ${lic_cache}'_lgpl2_incomp=`echo "$lic_module$lgpl2_incomp" | sort
-u`'
+ fi
+ eval "${lic_cache}_license_seen=yes
+ ${lic_cache}_circular="
+ done
+}
+
+
# func_create_testdir testdir modules
# Input:
# - local_gnulib_dir from --local-dir
@@ -4021,46 +4116,36 @@ func_create_testdir ()
# $module-tests. Need this becauses tests are implicitly GPL and may depend
# on GPL modules - therefore we don't want a warning in this case.
inctests=""
+
+ func_license_compatibility $saved_modules
for requested_module in $saved_modules; do
func_get_license "$requested_module"
requested_license=$module_license
if test "$requested_license" != GPL; then
- # Here we use func_modules_transitive_closure, not just
- # func_get_dependencies, so that we also detect weird situations like
- # an LGPL module which depends on a GPLed build tool module which depends
- # on a GPL module.
- modules="$requested_module"
- func_modules_transitive_closure
- for module in $modules; do
- func_get_license "$module"
- license=$module_license
- case "$license" in
- 'GPLed build tool') ;;
- 'public domain' | 'unlimited' | 'unmodifiable license text') ;;
- *)
- case "$requested_license" in
- GPLv2+)
- case "$license" in
- GPLv2+ | LGPLv2+) ;;
- *) echo "warning: module $requested_module depends on a
module with an incompatible license: $module" 1>&2 ;;
- esac
- ;;
- LGPL)
- case "$license" in
- LGPL | LGPLv2+) ;;
- *) echo "warning: module $requested_module depends on a
module with an incompatible license: $module" 1>&2 ;;
- esac
- ;;
- LGPLv2+)
- case "$license" in
- LGPLv2+) ;;
- *) echo "warning: module $requested_module depends on a
module with an incompatible license: $module" 1>&2 ;;
- esac
- ;;
- esac
- ;;
- esac
- done
+ func_cache_var $requested_module
+ eval "gpl2_state=\$${cachevar}_gpl2_incomp
+ lgpl_state=\$${cachevar}_lgpl_incomp
+ lgpl2_state=\$${cachevar}_lgpl2_incomp"
+ case "$requested_license" in
+ GPLv2+)
+ if test -n "$gpl2_state"; then
+ echo "warning: module $requested_module has an incompatible
license dependency subgraph:" 1>&2
+ echo "$gpl2_state" | sed 's/^/ /' 1>&2
+ fi
+ ;;
+ LGPL)
+ if test -n "$lgpl_state"; then
+ echo "warning: module $requested_module has an incompatible
license dependency subgraph:" 1>&2
+ echo "$lgpl_state" | sed 's/^/ /' 1>&2
+ fi
+ ;;
+ LGPLv2+)
+ if test -n "$lgpl2_state"; then
+ echo "warning: module $requested_module has an incompatible
license dependency subgraph:" 1>&2
+ echo "$lgpl2_state" | sed 's/^/ /' 1>&2
+ fi
+ ;;
+ esac
fi
done
modules="$saved_modules"
--
1.6.5.1.31.gad12b
- [PATCH 0/5] speed up gnulib-tool some more, Ralf Wildenhues, 2010/01/17
- [PATCH 1/5] gnulib-tool: cache module metainformation., Ralf Wildenhues, 2010/01/17
- [PATCH 2/5] gnulib-tool: avoid forks with func_get_* functions., Ralf Wildenhues, 2010/01/17
- [PATCH 3/5] gnulib-tool: small sed optimization., Ralf Wildenhues, 2010/01/17
- [PATCH 4/5] Efficient license compatibility checks with --test/--create-testdir.,
Ralf Wildenhues <=
- [PATCH 5/5] Drop self-dependencies in modules/unistdio/*., Ralf Wildenhues, 2010/01/17
- Re: [PATCH 0/5] speed up gnulib-tool some more, Bruno Haible, 2010/01/17