libtool-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.


From: Peter Rosin
Subject: Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.
Date: Mon, 14 Mar 2016 12:39:46 +0100
User-agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0

On 2016-03-14 11:01, Michael Haubenwallner wrote:
> On 03/12/2016 12:13 AM, Peter Rosin wrote:
>> On 2016-03-11 22:22, Michael Haubenwallner wrote:
>>> Hi Peter,
>>>
>>> thanks for looking at the patch!
>>>
>>> On 03/10/2016 12:29 PM, Peter Rosin wrote:
>>>> Hi Michael,
>>>>
>>>> I had a look since I wrote a patch for POSIX nm a couple of years ago
>>>> that I never submitted (I didn't see any use case) which looked very
>>>> similar, excepting the AIX-ism in your version.
>>>>
>>>> On 2016-03-10 10:01, Michael Haubenwallner wrote:
>>>>> * m4/libtool.m4 (LT_PATH_NM): Detect POSIX-compatible nm for AIX.  In
>>>>> BSD mode, the AIX nm does not tell whether a symbol is weak, need to use
>>>>> POSIX mode instead.
>>>>> (_LT_CMD_GLOBAL_SYMBOLS): Support POSIX-compatible nm.  Reorder to allow
>>>>> for platform specific hooks during transformation of global_symbol_pipe
>>>>> into C source code.  For AIX, set hook to transform even weak text
>>>>> symbols as text symbols.
>>>>> (_LT_LINKER_SHLIBS): Use global_symbol_pipe to simplify forming the
>>>>> export_symbols_cmds for AIX.
>>>>> ---
>>>>>  m4/libtool.m4 | 101 
>>>>> ++++++++++++++++++++++++++++++++--------------------------
>>>>>  1 file changed, 55 insertions(+), 46 deletions(-)
>>>>>
>>>>> diff --git a/m4/libtool.m4 b/m4/libtool.m4
>>>>> index 2c0e657..6134522 100644
>>>>> --- a/m4/libtool.m4
>>>>> +++ b/m4/libtool.m4
>>>>> @@ -3755,10 +3755,10 @@ _LT_DECL([], [want_nocaseglob], [1],
>>>>>  
>>>>>  # LT_PATH_NM
>>>>>  # ----------
>>>>> -# find the pathname to a BSD- or MS-compatible name lister
>>>>> +# find the pathname to a BSD-, POSIX- or MS-compatible name lister
>>>>>  AC_DEFUN([LT_PATH_NM],
>>>>>  [AC_REQUIRE([AC_PROG_CC])dnl
>>>>> -AC_CACHE_CHECK([for BSD- or MS-compatible name lister (nm)], 
>>>>> lt_cv_path_NM,
>>>>> +AC_CACHE_CHECK([for BSD-, POSIX- or MS-compatible name lister (nm)], 
>>>>> lt_cv_path_NM,
>>>>>  [if test -n "$NM"; then
>>>>>    # Let the user override the test.
>>>>>    lt_cv_path_NM=$NM
>>>>> @@ -3808,6 +3808,26 @@ else
>>>>>    : ${lt_cv_path_NM=no}
>>>>>  fi])
>>>>>  if test no != "$lt_cv_path_NM"; then
>>>>> +  case $host_os in
>>>>> +  aix[[4-9]]*)
>>>>> +    # With AIX nm we need the '-l' flag to get the "weak" information
>>>>> +    # for the Import File, but '-l' is ignored with the '-B' flag.  So
>>>>> +    # we use the '-P' (POSIX) flag instead.  As users often provide the
>>>>> +    # '-B' flag, which conflicts with '-P', we drop any provided flag.
>>>>> +    # AIX nm needs the '-C' flag to disable demangling.  For both GNU
>>>>> +    # and AIX nm, the '-g' flag shows public (global) symbols only,
>>>>> +    # and the '-p' flag disables sorting to improve performance.
>>>>> +    set dummy $lt_cv_path_NM
>>>>> +    case address@hidden|@2 -V 2>&1` in
>>>>> +    *GNU* | *'with BFD'*)
>>>>> +      lt_cv_path_NM="@S|@2 -Bgp"
>>>>> +      ;;
>>>>> +    *)
>>>>> +      lt_cv_path_NM="@S|@2 -PlCgp"
>>>>> +      ;;
>>>>> +    esac
>>>>> +    ;;
>>>>> +  esac
>>>>
>>>> You are overriding the user provided $NM. Not good. If a user says
>>>> NM="nm --this-will-not-work", then you will have to trust that even if
>>>> it is not likely to work. User error, so what? Adding -Bgp or -PlCgp
>>>> can only be done when the user has not specified $NM.
>>>
>>> Agreed. I've added a check whether NM will mark weak symbols instead.
>>
>> I was thinking that you needed to try various flags for each nm in the
>> mentioned loop until you find a good nm/flags combo, and keep looking if you
>> think you might find an even better combo later (i.e. what is there today,
>> where a BSD nm is preferred over other name listers, but tweaked to suite
>> AIX which seemingly prefers posix nm above all else).
> 
> It is the AIX nm only that fails in BSD mode (in ignoring the -l flag).
> GNU nm does correctly report weak symbols even in BSD mode (it does not
> support TLS symbols yet, and fails to ignore -X32 and -X64, but that's
> another story).
> 
> Is the user really required to provide all the nm flags?
> IMO, if the user provides a specific nm to use, we should know how to
> use that one correctly: -B with GNU nm, -PlC with non-GNU (AIX) nm.
> If the nm provided is an unknown one ought to support AIX objects,
> it should support the AIX nm flags at least.

That is not what is there today. If the user sets NM, libtool takes it
as is. The exception is AIX, and that seems like a strange exception.

And yes, I think that if the user tells Libtool what NM to use, the
user should also be required at provide flags. And if the user wants
no flags added, that should be possible too. The only way to support
that (without expanding the interface) is to use the given NM if it is
set.

If you just fix the code that locates a good NM, so that it makes sure
that -Plc is added if AIX nm is found, then the user has little (or no)
reason to override NM in the first place.

> If the user specifies AIX 'nm -B':
> Instead of ignoring the flags, what about replacing -B by -P?

The issue that AIX nm -B isn't supporting weak symbols might be fixed
with a bugfix release? Other parts of the libtool-using project might
require BSD nm output for unrelated reasons? It is not friendly to
override explicit user actions.

> Or better check and abort?

Yes, do check indeed. But no, abort seems a bit premature, what if the
project does not need weak symbols? Why abort then, and how can you
tell beforehand? Maybe warn about it?

> And if the user really wants to force some kown-to-break flags,
> she might be able to set the cache value (lt_cv_path_NM) instead.

NM= is for all practical purposes a shorter form of the cache value,
I think they should be equivalent.

> Also, when the user specifies some nm, I don't want to require her
> to provide the ABI flags (-X32|-X64) as well (done in patch 2/4).
> (ohw, that one also should not modify the cached value)

This feels like a mess, fix the autodetector as outlined above so that
the user has no reason to override NM, then you can add whatever flags
are needed...

>> Then, when you have found an nm/flags combo (or if the user has provided
>> it), and this part was already ok in the patch, you make libtool detect if
>> the $NM interface is posix, bsd, MS dumpbin or ..., and build the symbol
>> pipe accordingly.
>>
>>>> Yes, I see that
>>>> AIX has previously added nm flags behind the back of the user, but there
>>>> is no reason to continue with that now that you are changing things.
>>>>
>>>> You need to modify innards of the lt_tmp_nm loop in the else branch
>>>> a few lines up (just above the context).
>>>>
>>>>>    NM=$lt_cv_path_NM
>>>>>  else
>>>>>    # Didn't find any BSD compatible name lister, look for dumpbin.
>>>>> @@ -3832,7 +3852,7 @@ fi
>>>>>  test -z "$NM" && NM=nm
>>>>>  _LT_SET_TOOL_ABI_FLAG([NM])
>>>>>  AC_SUBST([NM])
>>>>> -_LT_DECL([], [NM], [1], [A BSD- or MS-compatible name lister])dnl
>>>>> +_LT_DECL([], [NM], [1], [A BSD-, POSIX- or MS-compatible name lister])dnl
>>>>>  
>>>>>  AC_CACHE_CHECK([the name lister ($NM) interface], [lt_cv_nm_interface],
>>>>>    [lt_cv_nm_interface="BSD nm"
>>>>> @@ -3847,6 +3867,8 @@ AC_CACHE_CHECK([the name lister ($NM) interface], 
>>>>> [lt_cv_nm_interface],
>>>>>    cat conftest.out >&AS_MESSAGE_LOG_FD
>>>>>    if $GREP 'External.*some_variable' conftest.out > /dev/null; then
>>>>>      lt_cv_nm_interface="MS dumpbin"
>>>>> +  elif $GREP '^[[         ]]*_*some_variable' conftest.out > /dev/null; 
>>>>> then
>>>>> +    lt_cv_nm_interface="POSIX nm"
>>>>
>>>> Isn't this a pretty weak check, perhaps append ' B' and remove the 
>>>> possibility
>>>> for leading whitespace? (see my last comment below for reasoning on spaces)
>>>
>>> As long as the expected symbol name comes first, isn't it POSIX then?
>>> Anyway, 've added "[\t ][\t ]*[A-Za-z]" now, as $symcode is defined later.
>>> And there is no check for BSD style after all.
>>
>> Since it is POSIX output, my point is that it should be fairly safe to assume
>> B as the symbol type, maybe it could be a D if the tools do not put zero-vars
>> in bss, but why wouldn't they? So, perhaps [BD] is a more palatable pattern?
>> I simply don't think you need to match every possible symbol type. Do you?
> 
> There is some standard, and there are implementations, which have bugs. And
> the standard does not tell about weak symbols at all. So I'd prefer to take
> the standard as defining the column layout, but not necessarily the values.
> This is what $symcode is for, but that isn't defined in AC_PROG_NM yet.
> But I'm fine with [A-Z] rather than [A-Za-z] (in the "C" locale).

Yes, $symcode is a property of the symbol pipe, so depends on the input to the
symbol pipe, and the symbol pipe input is what we are trying to find here.
Chicken/egg.

There are no weak symbols involved in the object file that is tested. Why worry
about that? I'm trying to say that something that lists symbol names but isn't
even remotely POSIX is quite likely to start lines with the symbol name, and to
be in some tabular format. Just checking for that is IMHO too weak.

Since we are trying to find the egg (nm) that fits the chicken (symbol pipe), we
can make assumptions about the chicken given the current egg, and hard-code the
symcode. So to speak...

>>>>>    fi
>>>>>    rm -f conftest*])
>>>>>  ])# LT_PATH_NM
>>>>> @@ -4012,8 +4034,33 @@ symcode='[[BCDEGRST]]'
>>>>>  # Regexp to match symbols that can be accessed directly from C.
>>>>>  sympat='\([[_A-Za-z]][[_A-Za-z0-9]]*\)'
>>>>>  
>>>>> +if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>>>> +  # Gets list of data symbols to import.
>>>>> +  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>>>> +  # Adjust the below global symbol transforms to fixup imported 
>>>>> variables.
>>>>> +  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char 
>>>>> \1;/p'"
>>>>> +  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>>>> +  lt_c_name_lib_hook="\
>>>>> +  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>>>> +  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>>>> +else
>>>>> +  # Disable hooks by default.
>>>>> +  lt_cv_sys_global_symbol_to_import=
>>>>> +  lt_cdecl_hook=
>>>>> +  lt_c_name_hook=
>>>>> +  lt_c_name_lib_hook=
>>>>> +fi
>>>>> +
>>>>>  # Define system-specific variables.
>>>>>  case $host_os in
>>>>> +aix[[4-9]]*)
>>>>> +  case `$NM -V 2>&1` in
>>>>> +  *GNU* | *'with BFD'*) ;;
>>>>> +  *)
>>>>> +    symcode='[[BDLTVWZ]]'
>>>>> +    lt_cdecl_hook=" -e 's/^W/T/p'" # weak text symbol
>>>>> +  esac
>>>>> +  ;;
>>>>>  aix*)
>>>>>    symcode='[[BCDT]]'
>>>>>    ;;
>>>>> @@ -4054,23 +4101,6 @@ case `$NM -V 2>&1` in
>>>>>    symcode='[[ABCDGIRSTW]]' ;;
>>>>>  esac
>>>>>  
>>>>> -if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>>>> -  # Gets list of data symbols to import.
>>>>> -  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>>>> -  # Adjust the below global symbol transforms to fixup imported 
>>>>> variables.
>>>>> -  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char 
>>>>> \1;/p'"
>>>>> -  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>>>> -  lt_c_name_lib_hook="\
>>>>> -  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>>>> -  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>>>> -else
>>>>> -  # Disable hooks by default.
>>>>> -  lt_cv_sys_global_symbol_to_import=
>>>>> -  lt_cdecl_hook=
>>>>> -  lt_c_name_hook=
>>>>> -  lt_c_name_lib_hook=
>>>>> -fi
>>>>> -
>>>>>  # Transform an extracted symbol line into a proper C declaration.
>>>>>  # Some systems (esp. on ia64) link data and code symbols differently,
>>>>>  # so use this general approach.
>>>>> @@ -4128,6 +4158,9 @@ for ac_symprfx in "" "_"; do
>>>>>  "     s[1]~/address@hidden/{print f,s[1],s[1]; next};"\
>>>>>  "     s[1]~prfx {split(s[1],t,\"@\"); print 
>>>>> f,t[1],substr(t[1],length(prfx))}"\
>>>>>  "     ' prfx=^$ac_symprfx]"
>>>>> +  elif test "$lt_cv_nm_interface" = "POSIX nm"; then
>>>>> +    symxfrm="\\2 $ac_symprfx\\1 \\1"
>>>>> +    lt_cv_sys_global_symbol_pipe="sed -n -e 's/^[[        
>>>>> ]]*$ac_symprfx$sympat[[         ]][[    ]]*\($symcode$symcode*\)[[      
>>>>> ]][[    ]]*.*$opt_cr$/$symxfrm/p'"
>>>>>    else
>>>>>      lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[[      
>>>>> ]]\($symcode$symcode*\)[[       ]][[    
>>>>> ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
>>>>>    fi
>>>>> @@ -5009,19 +5042,7 @@ m4_if([$1], [CXX], [
>>>>>    _LT_TAGVAR(exclude_expsyms, 
>>>>> $1)=['_GLOBAL_OFFSET_TABLE_|_GLOBAL__F[ID]_.*']
>>>>>    case $host_os in
>>>>>    aix[[4-9]]*)export_symbols_cmds
>>>>> -    # If we're using GNU nm, then we don't want the "-C" option.
>>>>> -    # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>>>> -    # Without the "-l" option, or with the "-B" option, AIX nm treats
>>>>> -    # weak defined symbols like other global defined symbols, whereas
>>>>> -    # GNU nm marks them as "W".
>>>>> -    # While the 'weak' keyword is ignored in the Export File, we need
>>>>> -    # it in the Import File for the 'aix-soname' feature, so we have
>>>>> -    # to replace the "-B" option with "-P" for AIX nm.
>>>>> -    if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs 
>>>>> $convenience | awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == 
>>>>> "B") || (\$ 2 == "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") 
>>>>> { print \$ 3 " weak" } else { print \$ 3 } } }'\'' | sort -u > 
>>>>> $export_symbols'
>>>>> -    else
>>>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e 
>>>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if 
>>>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 
>>>>> 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != 
>>>>> ".")) { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 
>>>>> " weak" } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>>>> -    fi
>>>>> +    _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience | 
>>>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = 
>>>>> "" } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u > 
>>>>> $export_symbols'
>>>
>>> On a side note:
>>> As the C++ value is identical to the C one for various platforms,
>>> wouldn't it work for them to do something like:
>>>   _LT_TAGVAR(export_symbols_cmds, $1)=$_LT_TAGVAR(export_symbols_cmds)
>>>   _LT_TAGVAR(exclude_expsyms, $1)=$_LT_TAGVAR(exclude_expsyms)
>>
>> Would you not need to change tag between the get and the set for that
>> to work as I think you intend? What am I missing?
> 
> The right hand side uses the untagged value, which is the C one.

Ah, ok. Yes, that seems like a simplification. Orthogonal to this
discussion though, and doesn't look like it's going to save that much.

Cheers,
Peter



reply via email to

[Prev in Thread] Current Thread [Next in Thread]