bug-coreutils

Re: sort "b" option in pos2 has strange effect


From: Pádraig Brady
Subject: Re: sort "b" option in pos2 has strange effect
Date: Tue, 24 Feb 2009 11:01:29 +0000
User-agent: Thunderbird 2.0.0.6 (X11/20071008)

Davide Canova wrote:
> It seems to be doing as you describe, plus if a "b" option is used in
> POS2, it also eats the leading blanks in the field after (POS2 field if
> ".0" is specified, POS2 field + 1 if ".0" is implied):
> 
> $ sort -k2b,3.0b
> a a  b
> z a a
> ^D
> z a a
> a a  b
> 
> The location of a field-end is not affected by whether initial blanks
> are skipped, therefore a "b" option in POS2 should have some effect only
> if a non-zero '.c' character position is provided.
> 
>> I don't know what's going on exactly, though, as
>> I don't know what's expected. It certainly seems buggy.
> 
> I tried to omit the ".0" AND the "b" option in POS2 in all our examples
> and I think what I get is the expected behavior. Specifying them
> shouldn't change anything.

I think the attached patch should fix this issue up.
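
To make the intended behaviour concrete, here is a minimal standalone sketch of
the end-of-key computation as the patch leaves it. It is not the actual
limfield code: key_end, is_blank and the parameter names are hypothetical, it
only models the default field splitting (no -t), treats just space and tab as
blanks, and takes eword as the zero-based end field number, as key->eword is
after option parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: only space and tab count as blanks in this sketch.  */
static int is_blank (char c) { return c == ' ' || c == '\t'; }

/* Hypothetical helper: return the offset in LINE where the sort key ends.
   EWORD is the zero-based end field, ECHAR the '.c' offset from POS2
   (0 if absent or given as ".0"), and SKIP_END_BLANKS models a "b"
   option attached to POS2.  */
static size_t
key_end (const char *line, size_t eword, size_t echar, int skip_end_blanks)
{
  assert (line != NULL);

  const char *ptr = line;
  const char *lim = line + strlen (line);

  if (echar == 0)
    eword++;                    /* No char offset: skip all of the end field.  */

  /* Each field is its leading blanks followed by a run of non-blanks.  */
  while (ptr < lim && eword--)
    {
      while (ptr < lim && is_blank (*ptr))
        ++ptr;
      while (ptr < lim && !is_blank (*ptr))
        ++ptr;
    }

  if (echar != 0)               /* Only now can "b" have any effect.  */
    {
      if (skip_end_blanks)
        while (ptr < lim && is_blank (*ptr))
          ++ptr;

      /* Advance by ECHAR, but no further than the end of the line.  */
      size_t remaining = (size_t) (lim - ptr);
      ptr = echar < remaining ? ptr + echar : lim;
    }

  return (size_t) (ptr - line);
}
```

With this model, key_end ("a  y", 0, 0, 1) (i.e. -k1,1b) returns 1, so the
key is just "a" and the blanks before "y" are not part of it, while
key_end ("a a  b", 2, 0, 1) (i.e. -k2b,3.0b) returns 6, the end of field 3.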

cheers,
Pádraig.
From 3ca0151f2761ae0cbef14d3b4d36c183337ed6f7 Mon Sep 17 00:00:00 2001
From: Pádraig Brady <address@hidden>
Date: Tue, 24 Feb 2009 08:37:18 +0000
Subject: [PATCH] sort: Fix a couple of bugs with determining end of fields

Issue reported by Davide Canova <address@hidden>

* src/sort.c: When no specific number of chars is specified
to skip in the end field, always skip the whole field.
Also never include leading spaces from next field.
* tests/misc/sort: Add 2 new tests for these cases.
---
 src/sort.c      |   36 +++++++++++++++++++-----------------
 tests/misc/sort |    5 +++++
 2 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/src/sort.c b/src/sort.c
index f438563..a5416f9 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -1412,6 +1412,9 @@ limfield (const struct line *line, const struct keyfield *key)
   size_t eword = key->eword, echar = key->echar;
   size_t remaining_bytes;
 
+  if (echar == 0)
+    eword++; /* Skip all of end field.  */
+
   /* Move PTR past EWORD fields or to one past the last byte on LINE,
      whichever comes first.  If there are more than EWORD fields, leave
      PTR pointing at the beginning of the field having zero-based index,
@@ -1487,19 +1490,21 @@ limfield (const struct line *line, const struct keyfield *key)
     }
 #endif
 
-  /* If we're ignoring leading blanks when computing the End
-     of the field, don't start counting bytes until after skipping
-     past any leading blanks. */
-  if (key->skipeblanks)
-    while (ptr < lim && blanks[to_uchar (*ptr)])
-      ++ptr;
+  if (echar != 0) /* We need to skip over a portion of the end field.  */
+    {
+      if (key->skipeblanks) /* blanks not counted in echar.  */
+        {
+          while (ptr < lim && blanks[to_uchar (*ptr)])
+            ++ptr;
+        }
 
-  /* Advance PTR by ECHAR (if possible), but no further than LIM.  */
-  remaining_bytes = lim - ptr;
-  if (echar < remaining_bytes)
-    ptr += echar;
-  else
-    ptr = lim;
+      /* Advance PTR by ECHAR (if possible), but no further than LIM.  */
+      remaining_bytes = lim - ptr;
+      if (echar < remaining_bytes)
+        ptr += echar;
+      else
+        ptr = lim;
+    }
 
   return ptr;
 }
@@ -3152,12 +3157,9 @@ main (int argc, char **argv)
                  badfieldspec (optarg, N_("field number is zero"));
                }
              if (*s == '.')
-               s = parse_field_count (s + 1, &key->echar,
-                                      N_("invalid number after `.'"));
-             else
                {
-                 /* `-k 2,3' is equivalent to `+1 -3'.  */
-                 key->eword++;
+                 s = parse_field_count (s + 1, &key->echar,
+                                        N_("invalid number after `.'"));
                }
              s = set_ordering (s, key, bl_end);
            }
diff --git a/tests/misc/sort b/tests/misc/sort
index 3e8eda6..3e34f30 100755
--- a/tests/misc/sort
+++ b/tests/misc/sort
@@ -110,6 +110,7 @@ my @Tests =
 ["07b", '-k 2,3', {IN=>"a a b\nz a a\n"}, {OUT=>"z a a\na a b\n"}],
 ["07c", '-k 2,3', {IN=>"y k b\nz k a\n"}, {OUT=>"z k a\ny k b\n"}],
 ["07d", '+1 -3', {IN=>"y k b\nz k a\n"}, {OUT=>"z k a\ny k b\n"}],
+["07e", '-k 2,3.0', {IN=>"a a b\nz a a\n"}, {OUT=>"z a a\na a b\n"}],
 #
 # report an error for `.' without following char spec
 ["08a", '-k 2.,3', {EXIT=>2},
@@ -210,6 +211,10 @@ my @Tests =
 # key start and key end.
 ["18e", '-nb -k1.1,1.2', {IN=>" 901\n100\n"}, {OUT=>"100\n 901\n"}],
 
+# When ignoring leading blanks for end position, ensure blanks from
+# next field are not included in the sort. I.E. order should not change here.
+["18f", '-k1,1b', {IN=>"a  y\na z\n"}, {OUT=>"a  y\na z\n"}],
+
 # This looks odd, but works properly -- 2nd keyspec is never
 # used because all lines are different.
 ["19a", '+0 +1nr', {IN=>"b 2\nb 1\nb 3\n"}, {OUT=>"b 1\nb 2\nb 3\n"}],
-- 
1.5.3.6
