bug-coreutils
bug#23556: sort(1): misleading description of option -n


From: Assaf Gordon
Subject: bug#23556: sort(1): misleading description of option -n
Date: Mon, 16 May 2016 15:07:59 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2

Hello Carsten,

On 05/14/2016 10:17 AM, Carsten Hey wrote:
> the man page sort(1) contains a misleading description of the option -n:
> [...]
>
>      $ man sort | grep -A1 -- --numeric-sort | sed -n -e 's/^ *//' -e '1!p'
>      compare according to string numerical value
> [...]
> This description reads as if this command:
>
> $ printf '%s\n' 'x 9' 'x 10' | sort -n
> x 10
> x 9
> [...]
> but instead, -n stops doing its magic after finding the first
> non-numeric, non-whitespace character. There is a short and simple
> way to summarize this behaviour.

IIUC, you are disputing the accuracy (or clarity) of the term "string
numerical value" on the manual page, and not the actual behavior of
"sort -n" (which is mandated by POSIX and has been this way for many years,
as opposed to "sort -V", which was only introduced as a GNU extension in
coreutils version 7.0 in 2008).

The description says "string numerical value", which (to me) does not mean
anything other than numeric value (implying letters will not be sorted
properly), but opinions clearly differ.
Using the "--debug" option would immediately reveal the problem, since it
shows that no numeric key was found on either line:

    $ printf '%s\n' 'x 9' 'x 10' | sort --debug -n
    sort: using ‘en_US.UTF-8’ sorting rules
    x 10
    ^ no match for key
    ____
    x 9
    ^ no match for key
    ___
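
For illustration, here is a minimal sketch of two ways to get the ordering
the reporter expected. It assumes GNU sort (for -V) and forces LC_ALL=C so
that tie-breaking does not depend on the locale; the variable names are only
for this example.

```shell
# Plain -n: neither line starts with a number, so both keys count as 0 and
# the order comes from the last-resort byte comparison of the whole line.
plain=$(printf '%s\n' 'x 9' 'x 10' | LC_ALL=C sort -n)

# Restrict the numeric key to the second field instead of the whole line.
keyed=$(printf '%s\n' 'x 9' 'x 10' | LC_ALL=C sort -k2,2n)

# Version sort (GNU extension) compares embedded digit runs by value.
versioned=$(printf '%s\n' 'x 9' 'x 10' | LC_ALL=C sort -V)

printf 'sort -n:\n%s\n\n' "$plain"          # x 10, then x 9
printf 'sort -k2,2n:\n%s\n\n' "$keyed"      # x 9, then x 10
printf 'sort -V:\n%s\n' "$versioned"        # x 9, then x 10
```

Which of the two fits better depends on the data: -k2,2n keeps strict numeric
semantics for one field, while -V is meant for version-like strings.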


If you have a suggestion for improved wording, I'm sure it can be considered
for inclusion.
A patch against the usage() function in sort.c would go even further.
Note that, unlike FreeBSD/OpenBSD, the description in the man page is derived
from "sort --help", and is thus kept brief.

For completeness, here are similar descriptions of "sort -n" from other sources:

POSIX says
(http://pubs.opengroup.org/onlinepubs/9699919799/utilities/sort.html):
   -n    Restrict the sort key to an initial numeric string, consisting of
         optional <blank> characters, optional minus-sign, and zero or more
         digits with an optional radix character and thousands separators
         (as defined in the current locale), which shall be sorted by
         arithmetic value. An empty digit string shall be treated as zero.
         Leading zeros and signs on zeros shall not affect ordering.

The GNU Coreutils manual (which is the official documentation, not the man
page) says
(http://www.gnu.org/software/coreutils/manual/coreutils.html#sort-invocation):
  -n
  --numeric-sort
  --sort=numeric
      Sort numerically. The number begins each line and consists of optional
      blanks, an optional ‘-’ sign, and zero or more digits possibly
      separated by thousands separators, optionally followed by a
      decimal-point character and zero or more digits. An empty number is
      treated as ‘0’. The LC_NUMERIC locale specifies the decimal-point
      character and thousands separator. By default a blank is a space or a
      tab, but the LC_CTYPE locale can change this.


OpenBSD's man page has:
     -n, --numeric-sort, --sort=numeric
             An initial numeric string, consisting of optional blank space,
             optional minus sign, and zero or more digits (including decimal
             point) is sorted by arithmetic value.  Leading blank characters
             are ignored.

FreeBSD's man page has:
     -n, --numeric-sort, --sort=numeric
             Sort fields numerically by arithmetic value.  Fields are supposed
             to have optional blanks in the beginning, an optional minus sign,
             zero or more digits (including decimal point and possible thou-
             sand separators).
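
All four descriptions come down to the same rule: only the initial numeric
string of the key is compared, and a key without one counts as zero. A small
sketch of that rule follows; the sample lines are made up for illustration,
and LC_ALL=C is set so the result does not depend on locale-specific
separators.

```shell
# Hypothetical sample input: leading blanks, a minus sign, trailing text,
# and one line with no leading number at all.
input=$(printf '%s\n' '10 apples' '9 pears' ' 2x' '-1' 'abc')

# Only the initial numeric string of each line is used as the key:
#   '-1' -> -1, 'abc' -> 0 (empty number), ' 2x' -> 2,
#   '9 pears' -> 9, '10 apples' -> 10
sorted=$(printf '%s\n' "$input" | LC_ALL=C sort -n)

printf '%s\n' "$sorted"
# -1
# abc
#  2x
# 9 pears
# 10 apples
```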



I'm leaving the bug open; other comments and feedback are welcome.

regards,
 - assaf






