From: Bruno Haible
Subject: [bug-libunistring] [bug #54453] abort in u8_possible_linebreaks in libunistring 0.9.10
Date: Mon, 3 Jan 2022 13:04:51 -0500 (EST)
User-agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:95.0) Gecko/20100101 Firefox/95.0

Update of bug #54453 (project libunistring):

                  Status:               Need Info => Fixed                  
             Assigned to:                    None => haible                 
             Open/Closed:                    Open => Closed                 


Follow-up Comment #8:

I think the root cause of the problem is that Alpine packages get built with
deficient versions of 'sed', 'join' and other POSIX utilities.

Here's my complete analysis:

1) I built, on a glibc system, the packages that you mention: gettext, against libunistring 0.9.10.

$ ./configure --prefix=/tmp/inst
$ make
$ make install

2) The symbols in the libunistring.so.2.1.0 that I built and the one in
are different:

$ nm --dynamic /tmp/inst/lib/libgettextlib.so | grep lbrk
0000000000035280 T unilbrk_is_all_ascii
0000000000035230 T unilbrk_is_utf8_encoding
0000000000057b80 R unilbrkprop
00000000000578a0 R unilbrk_table
$ nm --dynamic /gnu-inst-libunistring/0.9.10/lib/libunistring.so.2 | grep
0000000000022b60 T libunistring_unilbrk_is_all_ascii
0000000000022b20 T libunistring_unilbrk_is_utf8_encoding
00000000000bff60 R libunistring_unilbrkprop
00000000000bfbc0 R libunistring_unilbrk_table

The Ubuntu binary packages have these symbols:

$ nm --dynamic /usr/lib/x86_64-linux-gnu/libunistring.so.2.1.0 | grep lbrk
0000000000021f50 T libunistring_unilbrk_is_all_ascii
0000000000021f00 T libunistring_unilbrk_is_utf8_encoding
00000000000c0260 R libunistring_unilbrkprop
00000000000bfec0 R libunistring_unilbrk_table
$ nm --dynamic /usr/lib/i386-linux-gnu/libunistring.so.2.1.0 | grep lbrk
0001ea10 T libunistring_unilbrk_is_all_ascii
0001e9c0 T libunistring_unilbrk_is_utf8_encoding
000bc0a0 R libunistring_unilbrkprop
000bbd00 R libunistring_unilbrk_table

With the Alpine apk, however, the symbols are different:

$ nm --dynamic usr/lib/libunistring.so.2.1.0 | grep lbrk
000000000001ecde T unilbrk_is_all_ascii
000000000001eca6 T unilbrk_is_utf8_encoding
00000000000b4180 R unilbrkprop
00000000000b3de0 R unilbrk_table
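
The comparison above can be automated. The following is a minimal sketch, not
part of libunistring: the function name is made up, and it assumes (as in the
listings above) that every dynamic symbol whose name contains "lbrk" is an
internal one that should carry the 'libunistring_' prefix.

```shell
# Hypothetical helper: reads `nm --dynamic` output on stdin
# (columns: address, type, name) and prints every symbol whose name
# contains "lbrk" but does not start with "libunistring_".
unprefixed_lbrk_symbols() {
  awk '$NF ~ /lbrk/ && $NF !~ /^libunistring_/ { print $NF }'
}

# Usage (the library path is a placeholder):
#   nm --dynamic path/to/libunistring.so.2.1.0 | unprefixed_lbrk_symbols
# Empty output means no unprefixed "lbrk" symbol was found.
```

Fed with the Alpine listing, this would print all four names; fed with the
Ubuntu listings, it would print nothing.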

3) A program like msgmerge is linked against these libraries, in order:

msgmerge - libgettextsrc.so - libgettextlib.so - libunistring.so

This means that, by the usual ELF rules, symbols defined in libgettextsrc or
libgettextlib override symbols of the same name in libunistring.

The invocation chain of your example, with the binary that provides each
function or symbol, is as follows. On Alpine Linux:

main                                                 - msgmerge
write-po.c                                           - libgettextsrc.so
ulc_width_linebreaks                                 - libunistring.so
u8_width_linebreaks                                  - libunistring.so
u8_possible_linebreaks                               - libunistring.so
unilbrkprop, unilbrk_table                           - libgettextlib.so

Whereas on Ubuntu and other distros it is:

main                                                 - msgmerge
write-po.c                                           - libgettextsrc.so
ulc_width_linebreaks                                 - libunistring.so
u8_width_linebreaks                                  - libunistring.so
u8_possible_linebreaks                               - libunistring.so
libunistring_unilbrkprop, libunistring_unilbrk_table - libunistring.so

4) The libunistring/lib/Makefile.am contains logic to make the libunistring
library namespace-clean, that is, to avoid collisions with symbols that may
possibly occur in executables and other libraries. This is done by prefixing
all *internal* symbols of the library with 'libunistring_'. unilbrkprop,
unilbrk_table are such symbols. This is what, e.g. on Ubuntu, avoids a
collision between 'unilbrk_table' (in libgettextlib) and
'libunistring_unilbrk_table' (in libunistring).

This explains why the crash is seen on Alpine Linux but not on other OSes.
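
The prefixing idea from 4) can be sketched as follows. This is not the actual
rule from libunistring/lib/Makefile.am, only an illustration under
assumptions: the function name and the two input files (one listing all
defined symbols, one listing the public ones, one name per line) are
hypothetical.

```shell
# Hypothetical helper: emit_prefix_defines ALL_SYMBOLS PUBLIC_SYMBOLS
# Prints one "#define NAME libunistring_NAME" line for every symbol that
# is listed in ALL_SYMBOLS but not in PUBLIC_SYMBOLS.
emit_prefix_defines() {
  all_sorted=$(mktemp)
  public_sorted=$(mktemp)
  # join requires both inputs sorted in the same collation order.
  LC_ALL=C sort -u "$1" > "$all_sorted"
  LC_ALL=C sort -u "$2" > "$public_sorted"
  # join -v 1 prints the lines of the first file that have no match in
  # the second file, i.e. the internal symbols.
  LC_ALL=C join -v 1 "$all_sorted" "$public_sorted" |
    sed -e 's/^\(.*\)$/#define \1 libunistring_\1/'
  rm -f "$all_sorted" "$public_sorted"
}

# Usage (file names are placeholders):
#   emit_prefix_defines all-symbols.txt public-symbols.txt
```

In a pipeline of this shape, a 'join' or 'sed' that behaves differently from
what the script expects yields an incomplete list, and the affected symbols
stay unprefixed.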

5) From the lack of 'libunistring_' prefix in the Alpine binaries one can
infer that the logic in libunistring/lib/Makefile.am did not work right. That
is, the libunistring.sym file that it composed was incorrect.

In the meantime, there have been two improvements to this logic, to work
around deficient versions of 'join' and 'sed' on Alpine Linux.

For 'join', the problem is described in
https://lists.gnu.org/archive/html/bug-gnulib/2021-04/msg00041.html , and the
workaround is in

For 'sed', one of the problems is described in
http://lists.busybox.net/pipermail/busybox/2022-January/089400.html , and the
workaround is in

This explains why the symbol list was wrong on Alpine Linux and correct
everywhere else.

6) The build recipe in Alpine looks reasonable. There's no apparent bug here.

In summary, dealing with the deficient utilities in Alpine Linux costs me time
(two workarounds already, plus this bug report). Things would be simpler if
Alpine Linux packages were built with GNU coreutils and GNU sed in $PATH.
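
A build script could check which 'sed' and 'join' it will pick up from $PATH
before running. The following is a rough sketch with hypothetical function
names; it assumes that GNU tools print a first '--version' line of the form
"sed (GNU sed) 4.8" or "join (GNU coreutils) 8.32", and treats everything
else as "not GNU or unknown".

```shell
# Hypothetical helper: succeeds if the given banner line looks like the
# first line of a GNU tool's --version output, e.g. "sed (GNU sed) 4.8".
is_gnu_banner() {
  case $1 in
    *"(GNU "*")"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: reports, for each tool found via $PATH, whether its
# --version banner matches the assumed GNU form.
check_build_tools() {
  for tool in sed join; do
    banner=$("$tool" --version 2>/dev/null | head -n 1)
    if is_gnu_banner "$banner"; then
      echo "$tool: GNU"
    else
      echo "$tool: not GNU (or unknown)"
    fi
  done
}

# Usage:
#   check_build_tools
```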

