emacs-bug-tracker

[debbugs-tracker] bug#18073: closed (defect with sort multiple arguments)


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#18073: closed (defect with sort multiple arguments)
Date: Mon, 21 Jul 2014 21:14:03 +0000

Your message dated Mon, 21 Jul 2014 15:13:05 -0600
with message-id <address@hidden>
and subject line Re: bug#18073: defect with sort multiple arguments
has caused the debbugs.gnu.org bug report #18073,
regarding defect with sort multiple arguments
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
18073: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18073
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message ---
Subject: defect with sort multiple arguments
Date: Mon, 21 Jul 2014 14:57:19 -0500

I was seeing some odd behaviour with sort -n -u.  I ran sort -n -u dataset and expected the same output as sort -n dataset | uniq, but instead got something different.  sortbug is a script file showing the usage described above; dataset is the dataset.
Here is the version I am running.

sort (GNU coreutils) 8.21
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.

Thanks,
Nathan


Attachment: dataset
Description: Binary data

Attachment: sortbug
Description: Binary data


--- End Message ---
--- Begin Message ---
Subject: Re: bug#18073: defect with sort multiple arguments
Date: Mon, 21 Jul 2014 15:13:05 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0

tag 18073 notabug
thanks

On 07/21/2014 01:57 PM, n buckner wrote:
> I was seeing some odd behaviour with sort -n -u.  I ran sort -n -u dataset
> and expected the same output as sort -n dataset| uniq but instead got
> something different.  sortbug is a script file showing the usage described
> above, dataset is the dataset.
> here is the version I am running.
> 
> sort (GNU coreutils) 8.21

Thanks for the report.  However, the problem is not in sort, but in your
usage of the command line parameters to sort.  Let's use the --debug
flag to see what is REALLY going on:

$ sort -n -u dataset --debug
sort: using ‘en_US.UTF-8’ sorting rules
2012-09-07 (Srikrishna Bodanapu
____
2013-06-15 (Chetana Nair
____
2014-02-24 (Subba Juturi
____

Aha - sort's -u treats two lines as duplicates whenever they compare
equal on the sort keys you specified, disregarding any portion of the
line outside those keys.  But the sort key you specified, -n, ends as
soon as it hits a non-numeric character, so only the year (the part
underlined above) is compared, and every line sharing a year counts as
a duplicate.  If you WANT to compare the entire line, then you need to
do something like:

sort -k1,1n -k1 -u dataset

which says to sort _first_ numerically on the first field (the number
ends at the first non-digit character), and _second_ by the entire
line; and then keep only the unique lines.  Adding the second key over
the entire line makes the difference that matches what you were seeing
with uniq:

$ diff -u <(sort -k1,1n -k1 dataset -u) <(sort -n dataset | uniq)
$
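
The same effect can be reproduced without the attached dataset.  The
following is only a minimal sketch: the four input lines are invented
for illustration (they are NOT the reporter's data), and LC_ALL=C is an
assumption added here to keep the ordering independent of the locale.

```shell
# Hypothetical four-line input, NOT the reporter's dataset.
data='2012-09-07 (Alice
2012-11-02 (Bob
2013-06-15 (Carol
2013-06-15 (Carol'

# Assumption: force the C locale so results do not depend on the environment.
export LC_ALL=C

# -n alone: the key is just the leading number (the year), so -u keeps
# one line per distinct year.
by_year=$(printf '%s\n' "$data" | sort -n -u)

# Numeric key first, whole line second: -u now drops only exact duplicates.
by_line=$(printf '%s\n' "$data" | sort -k1,1n -k1 -u)

# Reference result from the pipeline the reporter expected to match.
via_uniq=$(printf '%s\n' "$data" | sort -n | uniq)

printf '%s\n' "$by_year" | wc -l    # number of lines kept by sort -n -u
printf '%s\n' "$by_line" | wc -l    # number of lines kept with the extra key
```

With this input, by_year holds two lines (one per year), while by_line
and via_uniq hold the same three lines.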

Oh, and if you wanted to sort by all three fields of the date, instead
of just the year, you probably want:

sort -t - -k1,1n -k2,2n -k3,3n -k1 -u dataset

although for the particular dataset you posted, it makes no difference.
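
A case where the per-field keys do change the result is a file whose
dates are not zero-padded.  This is a sketch on invented input (the
lines visible in the --debug output above are zero-padded, which would
explain why the posted dataset is unaffected); LC_ALL=C is again an
assumption made for reproducibility.

```shell
# Hypothetical input with a non-zero-padded month, NOT from the posted dataset.
data='2014-10-01 (A
2014-2-24 (B
2013-06-15 (C'

# Assumption: byte-wise comparison for the whole-line key.
export LC_ALL=C

# Year compared numerically, then the whole line as a string:
# "2014-10..." sorts before "2014-2..." because '1' < '2'.
year_only=$(printf '%s\n' "$data" | sort -k1,1n -k1 -u)

# Year, month and day each compared numerically, with "-" as the separator.
all_fields=$(printf '%s\n' "$data" | sort -t - -k1,1n -k2,2n -k3,3n -k1 -u)

printf '%s\n' "$year_only"
echo ---
printf '%s\n' "$all_fields"
```

With this input, year_only lists 2014-10-01 before 2014-2-24, while
all_fields puts 2014-2-24 first because month 2 is numerically less
than month 10.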

I'm closing this as not a bug, but please feel free to reply if you have
further questions.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Attachment: signature.asc
Description: OpenPGP digital signature


--- End Message ---
