From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#26422: closed (historical feature or grand daddy bug?)
Date: Sun, 09 Apr 2017 19:05:02 +0000

Your message dated Sun, 9 Apr 2017 12:04:34 -0700
with message-id <address@hidden>
and subject line Re: bug#26422: historical feature or grand daddy bug?
has caused the debbugs.gnu.org bug report #26422,
regarding historical feature or grand daddy bug?
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)

--
26422: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=26422
GNU Bug Tracking System
Contact address@hidden with problems

--- Begin Message ---
Subject: historical feature or grand daddy bug?
Date: Sun, 9 Apr 2017 11:37:34 -0700

when a file is sorted
By the sort program
the lines which start with line feed
output earlier than lines which begin with tab.
LF ASCII value is 10.
Tab ASCII value is 9.
Tabs should be first?

However, to strings
if the lines are converted
then to mitigate a larger address space
presumably with 0 the LF are replaced.
Yet after the LF if the 0 byte was placed
then the expected output would become.

If expected behavior becomes
then historical behavior relied upon scripts might break.

The sort.c source code was not viewed.
Therefore, a patch is not offered.
Discussion is solicited.
Concerning empty lines first.
Is it a bug?
Should it be fixed?

Because I am not on the email list;
if the topic is worth discussion
if a decision is made
then please forward.

Thanks for maintaining and sharing awesome software.
--- End Message ---
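
The question above turns on whether the line terminator takes part in the comparison. The following is a minimal Python sketch of the two possibilities, not code taken from sort.c. The sample lines and the helper name content are illustrative assumptions, and plain byte-value comparison is used in place of locale collation.

```python
# Three sample lines as raw bytes: one starting with TAB, one empty, one plain.
lines = [b"\tindented\n", b"\n", b"plain\n"]

# Ordering 1: compare whole lines, terminator included.
# The empty line is the single byte LF (10), which sorts after TAB (9).
with_newline = sorted(lines)

# Ordering 2: compare line contents only, terminator excluded.
# The empty line becomes the empty string, which sorts before everything.
def content(line: bytes) -> bytes:
    """Return the line without its trailing LF, if it has one."""
    return line[:-1] if line.endswith(b"\n") else line

without_newline = sorted(lines, key=content)

print(with_newline)
print(without_newline)
```

In the first ordering the tab-indented line comes first; in the second the empty line does, which matches the behavior the report describes.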
--- Begin Message ---
Subject: Re: bug#26422: historical feature or grand daddy bug?
Date: Sun, 9 Apr 2017 12:04:34 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

Historically, 'sort' ignored the \n at the end of each line, so that empty lines (i.e., lines consisting only of a single \n) collated before all other lines. An earlier version of the POSIX spec was (mis)written to require treating the \n as part of the data, and during development in 1999 GNU sort was briefly changed to conform to that, but this was an error in the POSIX spec that was eventually fixed and GNU sort was changed back to the traditional behavior, before any release was made with the funky behavior.

So, it's not a bug that \t\n collates after \n, since "\t" is lexicographically after "".

As I understand it, the empty string should collate before all other strings in all POSIX locales, so empty lines should always sort first in 'sort' output. I'm by no means a collation expert, though, and if I'm wrong I'd like to see a counterexample.

Come to think of it, 'sort' might be able to improve performance in the common case of sorting text files containing many empty lines, by merely counting the lines rather than storing them internally. I suppose this is a different topic, though.
--- End Message ---
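
The idea at the end of the reply, counting empty lines instead of storing them, could look roughly like the sketch below. It is an illustration under stated assumptions, not GNU sort's implementation: the function name sort_counting_empties is hypothetical, lines are assumed to have had their trailing newline removed already, and Python's default string comparison stands in for locale collation. It relies on the empty string collating before every other string, as the reply states for POSIX locales.

```python
from typing import Iterable, List

def sort_counting_empties(lines: Iterable[str]) -> List[str]:
    """Sort lines, keeping only a count of the empty ones.

    Hypothetical helper: assumes each line has already lost its
    trailing newline and that "" collates before any other string.
    """
    empty_count = 0
    non_empty: List[str] = []
    for line in lines:
        if line == "":
            empty_count += 1        # remember how many, store nothing
        else:
            non_empty.append(line)  # only non-empty lines are kept
    non_empty.sort()
    # Empty lines go first, re-created from the count.
    return [""] * empty_count + non_empty

sample = ["pear", "", "\tapple", "", "fig"]
result = sort_counting_empties(sample)
print(result)
```

For this comparison rule the result equals sorted(sample), since the empty string is the smallest key either way; the only difference is that the empty lines never enter the list being sorted.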