libtool-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.


From: Michael Haubenwallner
Subject: Re: [PATCH 3/4] Use POSIX nm to simplify AIX export_symbols_cmds.
Date: Mon, 14 Mar 2016 11:01:04 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0


On 03/12/2016 12:13 AM, Peter Rosin wrote:
> On 2016-03-11 22:22, Michael Haubenwallner wrote:
>> Hi Peter,
>>
>> thanks for looking at the patch!
>>
>> On 03/10/2016 12:29 PM, Peter Rosin wrote:
>>> Hi Michael,
>>>
>>> I had a look since I wrote a patch for POSIX nm a couple of years ago
>>> that I never submitted (I didn't see any use case) which looked very
>>> similar, excepting the AIX-ism in your version.
>>>
>>> On 2016-03-10 10:01, Michael Haubenwallner wrote:
>>>> * m4/libtool.m4 (LT_PATH_NM): Detect POSIX-compatible nm for AIX.  In
>>>> BSD mode, the AIX nm does not tell whether a symbol is weak, need to use
>>>> POSIX mode instead.
>>>> (_LT_CMD_GLOBAL_SYMBOLS): Support POSIX-compatible nm.  Reorder to allow
>>>> for platform specific hooks during transformation of global_symbol_pipe
>>>> into C source code.  For AIX, set hook to transform even weak text
>>>> symbols as text symbols.
>>>> (_LT_LINKER_SHLIBS): Use global_symbol_pipe to simplify forming the
>>>> export_symbols_cmds for AIX.
>>>> ---
>>>>  m4/libtool.m4 | 101 
>>>> ++++++++++++++++++++++++++++++++--------------------------
>>>>  1 file changed, 55 insertions(+), 46 deletions(-)
>>>>
>>>> diff --git a/m4/libtool.m4 b/m4/libtool.m4
>>>> index 2c0e657..6134522 100644
>>>> --- a/m4/libtool.m4
>>>> +++ b/m4/libtool.m4
>>>> @@ -3755,10 +3755,10 @@ _LT_DECL([], [want_nocaseglob], [1],
>>>>  
>>>>  # LT_PATH_NM
>>>>  # ----------
>>>> -# find the pathname to a BSD- or MS-compatible name lister
>>>> +# find the pathname to a BSD-, POSIX- or MS-compatible name lister
>>>>  AC_DEFUN([LT_PATH_NM],
>>>>  [AC_REQUIRE([AC_PROG_CC])dnl
>>>> -AC_CACHE_CHECK([for BSD- or MS-compatible name lister (nm)], 
>>>> lt_cv_path_NM,
>>>> +AC_CACHE_CHECK([for BSD-, POSIX- or MS-compatible name lister (nm)], 
>>>> lt_cv_path_NM,
>>>>  [if test -n "$NM"; then
>>>>    # Let the user override the test.
>>>>    lt_cv_path_NM=$NM
>>>> @@ -3808,6 +3808,26 @@ else
>>>>    : ${lt_cv_path_NM=no}
>>>>  fi])
>>>>  if test no != "$lt_cv_path_NM"; then
>>>> +  case $host_os in
>>>> +  aix[[4-9]]*)
>>>> +    # With AIX nm we need the '-l' flag to get the "weak" information
>>>> +    # for the Import File, but '-l' is ignored with the '-B' flag.  So
>>>> +    # we use the '-P' (POSIX) flag instead.  As users often provide the
>>>> +    # '-B' flag, which conflicts with '-P', we drop any provided flag.
>>>> +    # AIX nm needs the '-C' flag to disable demangling.  For both GNU
>>>> +    # and AIX nm, the '-g' flag shows public (global) symbols only,
>>>> +    # and the '-p' flag disables sorting to improve performance.
>>>> +    set dummy $lt_cv_path_NM
>>>> +    case address@hidden|@2 -V 2>&1` in
>>>> +    *GNU* | *'with BFD'*)
>>>> +      lt_cv_path_NM="@S|@2 -Bgp"
>>>> +      ;;
>>>> +    *)
>>>> +      lt_cv_path_NM="@S|@2 -PlCgp"
>>>> +      ;;
>>>> +    esac
>>>> +    ;;
>>>> +  esac
>>>
>>> You are overriding the user provided $NM. Not good. If a user says
>>> NM="nm --this-will-not-work", then you will have to trust that even if
>>> it is not likely to work. User error, so what? Adding -Bgp or -PlCgp
>>> can only be done when the user has not specified $NM.
>>
>> Agreed. I've added a check whether NM will mark weak symbols instead.
> 
> I was thinking that you needed to try various flags for each nm in the
> mentioned loop until you find a good nm/flags combo, and keep looking if you
> think you might find an even better combo later (i.e. what is there today,
> where a BSD nm is preferred over other name listers, but tweaked to suite
> AIX which seemingly prefers posix nm above all else).

It is the AIX nm only that fails in BSD mode (in ignoring the -l flag).
GNU nm does correctly report weak symbols even in BSD mode (it does not
support TLS symbols yet, and fails to ignore -X32 and -X64, but that's
another story).

Is the user really required to provide all the nm flags?
IMO, if the user provides a specific nm to use, we should know how to
use that one correctly: -B with GNU nm, -PlC with non-GNU (AIX) nm.
If the nm provided is an unknown one ought to support AIX objects,
it should support the AIX nm flags at least.

If the user specifies AIX 'nm -B':
Instead of ignoring the flags, what about replacing -B by -P?
Or better check and abort?

And if the user really wants to force some kown-to-break flags,
she might be able to set the cache value (lt_cv_path_NM) instead.

Also, when the user specifies some nm, I don't want to require her
to provide the ABI flags (-X32|-X64) as well (done in patch 2/4).
(ohw, that one also should not modify the cached value)

> Then, when you have found an nm/flags combo (or if the user has provided
> it), and this part was already ok in the patch, you make libtool detect if
> the $NM interface is posix, bsd, MS dumpbin or ..., and build the symbol
> pipe accordingly.
> 
>>> Yes, I see that
>>> AIX has previously added nm flags behind the back of the user, but there
>>> is no reason to continue with that now that you are changing things.
>>>
>>> You need to modify innards of the lt_tmp_nm loop in the else branch
>>> a few lines up (just above the context).
>>>
>>>>    NM=$lt_cv_path_NM
>>>>  else
>>>>    # Didn't find any BSD compatible name lister, look for dumpbin.
>>>> @@ -3832,7 +3852,7 @@ fi
>>>>  test -z "$NM" && NM=nm
>>>>  _LT_SET_TOOL_ABI_FLAG([NM])
>>>>  AC_SUBST([NM])
>>>> -_LT_DECL([], [NM], [1], [A BSD- or MS-compatible name lister])dnl
>>>> +_LT_DECL([], [NM], [1], [A BSD-, POSIX- or MS-compatible name lister])dnl
>>>>  
>>>>  AC_CACHE_CHECK([the name lister ($NM) interface], [lt_cv_nm_interface],
>>>>    [lt_cv_nm_interface="BSD nm"
>>>> @@ -3847,6 +3867,8 @@ AC_CACHE_CHECK([the name lister ($NM) interface], 
>>>> [lt_cv_nm_interface],
>>>>    cat conftest.out >&AS_MESSAGE_LOG_FD
>>>>    if $GREP 'External.*some_variable' conftest.out > /dev/null; then
>>>>      lt_cv_nm_interface="MS dumpbin"
>>>> +  elif $GREP '^[[  ]]*_*some_variable' conftest.out > /dev/null; then
>>>> +    lt_cv_nm_interface="POSIX nm"
>>>
>>> Isn't this a pretty weak check, perhaps append ' B' and remove the 
>>> possibility
>>> for leading whitespace? (see my last comment below for reasoning on spaces)
>>
>> As long as the expected symbol name comes first, isn't it POSIX then?
>> Anyway, 've added "[\t ][\t ]*[A-Za-z]" now, as $symcode is defined later.
>> And there is no check for BSD style after all.
> 
> Since it is POSIX output, my point is that it should be fairly safe to assume
> B as the symbol type, maybe it could be a D if the tools do not put zero-vars
> in bss, but why wouldn't they? So, perhaps [BD] is a more palatable pattern?
> I simply don't think you need to match every possible symbol type. Do you?

There is some standard, and there are implementations, which have bugs. And
the standard does not tell about weak symbols at all. So I'd prefer to take
the standard as defining the column layout, but not necessarily the values.
This is what $symcode is for, but that isn't defined in AC_PROG_NM yet.
But I'm fine with [A-Z] rather than [A-Za-z] (in the "C" locale).

>>>>    fi
>>>>    rm -f conftest*])
>>>>  ])# LT_PATH_NM
>>>> @@ -4012,8 +4034,33 @@ symcode='[[BCDEGRST]]'
>>>>  # Regexp to match symbols that can be accessed directly from C.
>>>>  sympat='\([[_A-Za-z]][[_A-Za-z0-9]]*\)'
>>>>  
>>>> +if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>>> +  # Gets list of data symbols to import.
>>>> +  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>>> +  # Adjust the below global symbol transforms to fixup imported variables.
>>>> +  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char 
>>>> \1;/p'"
>>>> +  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>>> +  lt_c_name_lib_hook="\
>>>> +  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>>> +  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>>> +else
>>>> +  # Disable hooks by default.
>>>> +  lt_cv_sys_global_symbol_to_import=
>>>> +  lt_cdecl_hook=
>>>> +  lt_c_name_hook=
>>>> +  lt_c_name_lib_hook=
>>>> +fi
>>>> +
>>>>  # Define system-specific variables.
>>>>  case $host_os in
>>>> +aix[[4-9]]*)
>>>> +  case `$NM -V 2>&1` in
>>>> +  *GNU* | *'with BFD'*) ;;
>>>> +  *)
>>>> +    symcode='[[BDLTVWZ]]'
>>>> +    lt_cdecl_hook=" -e 's/^W/T/p'" # weak text symbol
>>>> +  esac
>>>> +  ;;
>>>>  aix*)
>>>>    symcode='[[BCDT]]'
>>>>    ;;
>>>> @@ -4054,23 +4101,6 @@ case `$NM -V 2>&1` in
>>>>    symcode='[[ABCDGIRSTW]]' ;;
>>>>  esac
>>>>  
>>>> -if test "$lt_cv_nm_interface" = "MS dumpbin"; then
>>>> -  # Gets list of data symbols to import.
>>>> -  lt_cv_sys_global_symbol_to_import="sed -n -e 's/^I .* \(.*\)$/\1/p'"
>>>> -  # Adjust the below global symbol transforms to fixup imported variables.
>>>> -  lt_cdecl_hook=" -e 's/^I .* \(.*\)$/extern __declspec(dllimport) char 
>>>> \1;/p'"
>>>> -  lt_c_name_hook=" -e 's/^I .* \(.*\)$/  {\"\1\", (void *) 0},/p'"
>>>> -  lt_c_name_lib_hook="\
>>>> -  -e 's/^I .* \(lib.*\)$/  {\"\1\", (void *) 0},/p'\
>>>> -  -e 's/^I .* \(.*\)$/  {\"lib\1\", (void *) 0},/p'"
>>>> -else
>>>> -  # Disable hooks by default.
>>>> -  lt_cv_sys_global_symbol_to_import=
>>>> -  lt_cdecl_hook=
>>>> -  lt_c_name_hook=
>>>> -  lt_c_name_lib_hook=
>>>> -fi
>>>> -
>>>>  # Transform an extracted symbol line into a proper C declaration.
>>>>  # Some systems (esp. on ia64) link data and code symbols differently,
>>>>  # so use this general approach.
>>>> @@ -4128,6 +4158,9 @@ for ac_symprfx in "" "_"; do
>>>>  "     s[1]~/address@hidden/{print f,s[1],s[1]; next};"\
>>>>  "     s[1]~prfx {split(s[1],t,\"@\"); print 
>>>> f,t[1],substr(t[1],length(prfx))}"\
>>>>  "     ' prfx=^$ac_symprfx]"
>>>> +  elif test "$lt_cv_nm_interface" = "POSIX nm"; then
>>>> +    symxfrm="\\2 $ac_symprfx\\1 \\1"
>>>> +    lt_cv_sys_global_symbol_pipe="sed -n -e 's/^[[         
>>>> ]]*$ac_symprfx$sympat[[         ]][[    ]]*\($symcode$symcode*\)[[      
>>>> ]][[    ]]*.*$opt_cr$/$symxfrm/p'"
>>>>    else
>>>>      lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[[       
>>>> ]]\($symcode$symcode*\)[[       ]][[    
>>>> ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
>>>>    fi
>>>> @@ -5009,19 +5042,7 @@ m4_if([$1], [CXX], [
>>>>    _LT_TAGVAR(exclude_expsyms, 
>>>> $1)=['_GLOBAL_OFFSET_TABLE_|_GLOBAL__F[ID]_.*']
>>>>    case $host_os in
>>>>    aix[[4-9]]*)export_symbols_cmds
>>>> -    # If we're using GNU nm, then we don't want the "-C" option.
>>>> -    # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>>> -    # Without the "-l" option, or with the "-B" option, AIX nm treats
>>>> -    # weak defined symbols like other global defined symbols, whereas
>>>> -    # GNU nm marks them as "W".
>>>> -    # While the 'weak' keyword is ignored in the Export File, we need
>>>> -    # it in the Import File for the 'aix-soname' feature, so we have
>>>> -    # to replace the "-B" option with "-P" for AIX nm.
>>>> -    if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience 
>>>> | awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 
>>>> == "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 " 
>>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>>> -    else
>>>> -      _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e 
>>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if 
>>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 
>>>> 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != 
>>>> ".")) { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 
>>>> " weak" } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>>> -    fi
>>>> +    _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience | 
>>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = "" 
>>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u > 
>>>> $export_symbols'
>>
>> On a side note:
>> As the C++ value is identical to the C one for various platforms,
>> wouldn't it work for them to do something like:
>>   _LT_TAGVAR(export_symbols_cmds, $1)=$_LT_TAGVAR(export_symbols_cmds)
>>   _LT_TAGVAR(exclude_expsyms, $1)=$_LT_TAGVAR(exclude_expsyms)
> 
> Would you not need to change tag between the get and the set for that
> to work as I think you intend? What am I missing?

The right hand side uses the untagged value, which is the C one.

>>>>      ;;
>>>>    pw32*)
>>>>      _LT_TAGVAR(export_symbols_cmds, $1)=$ltdll_cmds
>>>> @@ -5464,19 +5485,7 @@ _LT_EOF
>>>>    exp_sym_flag='-Bexport'
>>>>    no_entry_flag=
>>>>        else
>>>> -  # If we're using GNU nm, then we don't want the "-C" option.
>>>> -  # -C means demangle to GNU nm, but means don't demangle to AIX nm.
>>>> -  # Without the "-l" option, or with the "-B" option, AIX nm treats
>>>> -  # weak defined symbols like other global defined symbols, whereas
>>>> -  # GNU nm marks them as "W".
>>>> -  # While the 'weak' keyword is ignored in the Export File, we need
>>>> -  # it in the Import File for the 'aix-soname' feature, so we have
>>>> -  # to replace the "-B" option with "-P" for AIX nm.
>>>> -  if $NM -V 2>&1 | $GREP 'GNU' > /dev/null; then
>>>> -    _LT_TAGVAR(export_symbols_cmds, $1)='$NM -Bpg $libobjs $convenience | 
>>>> awk '\''{ if (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == 
>>>> "W")) && ([substr](\$ 3,1,1) != ".")) { if (\$ 2 == "W") { print \$ 3 " 
>>>> weak" } else { print \$ 3 } } }'\'' | sort -u > $export_symbols'
>>>> -  else
>>>> -    _LT_TAGVAR(export_symbols_cmds, $1)='`func_echo_all $NM | $SED -e 
>>>> '\''s/B\([[^B]]*\)$/P\1/'\''` -PCpgl $libobjs $convenience | awk '\''{ if 
>>>> (((\$ 2 == "T") || (\$ 2 == "D") || (\$ 2 == "B") || (\$ 2 == "L") || (\$ 
>>>> 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) && ([substr](\$ 1,1,1) != 
>>>> ".")) { if ((\$ 2 == "W") || (\$ 2 == "V") || (\$ 2 == "Z")) { print \$ 1 
>>>> " weak" } else { print \$ 1 } } }'\'' | sort -u > $export_symbols'
>>>> -  fi
>>>> +  _LT_TAGVAR(export_symbols_cmds, $1)='$NM $libobjs $convenience | 
>>>> $global_symbol_pipe | $EGREP -v " ($exclude_expsyms)$" | awk '\''{ kw = "" 
>>>> } /^[[VWZ]] / { kw = " weak" } { print $ 3 kw }'\'' | sort -u > 
>>>> $export_symbols'
>>
>> The main motivation here is this simplification after all,
>> as this needs another symbol exclusion (patch 4/4), which
>> does make sense for the preloaded symbols list as well.
> 
> Yes, it is much cleaner to adjust the symbol pipe according to $NM, than
> trying to "fix" $NM by adding flags. That part is nice indeed!

Thanks!
/haubi/



reply via email to

[Prev in Thread] Current Thread [Next in Thread]