bug-coreutils

bug#17637: bug "cut of end-line is skipped"


From: Pádraig Brady
Subject: bug#17637: bug "cut of end-line is skipped"
Date: Fri, 30 May 2014 03:34:06 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2

On 05/29/2014 11:53 PM, Eric Blake wrote:
> On 05/29/2014 04:24 PM, Pádraig Brady wrote:
>> tag 17637 notabug
>> close 17637
>> stop
> 
> On the one hand, this feels a bit premature.
> 
>>
>> That change 
>> http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=51ce0bf8
>> was made in v8.21 to fix http://bugs.gnu.org/13498
> 
> Are you sure you didn't mean the next commit:
> 
> http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=d302aed

Right, sorry.

> But both of those commits are in coreutils 8.21, whereas the Fedora 20
> build of coreutils 8.21 does not have that behavior.  Is downstream
> patching things in a way to make it work, and if so, why can't we
> backport what Fedora has added on top?

That's the downstream i18n patch, which has diverged here:

  $ seq 10 | LANG=C cut -s -f2 -d$'\n'
  $ seq 10 | cut -s -f2 -d$'\n'
  2

Unfortunately that means we have an inconsistency.
Also, many users might still be getting the old behavior
(and thus not complaining about the new one),
and Rudy may be hitting this only because the script is
being run in the C locale.

>> It was made for a good reason, to handle the buffering issues detailed
>> in the above bug. Your existing usage was a bit of an edge case and not
>> supported with other cut implementations, and while we try to avoid
>> changes like this it was thought the benefits outweighed the impact
>> for the very few who use cut in this way.
> 
> But while you documented the improved buffering behavior in NEWS, you
> failed to document the corner-case change to -d$'\n'.
> 
> On the other hand, I confirmed that both Solaris and FreeBSD cut behave
> the same way as the new GNU cut behavior.
> 
> $ nl='
> '
> $ printf 'a\t1\nb\t2\n' | cut -d"$nl" -f1
> a       1
> b       2
> 
> So keeping the new behavior in the name of consistency makes sense,
> although it still might be nice to add a retroactive NEWS entry.

Ugh, I'm not sure now. Consistency is good if that consistent
behavior is needed, though I suppose the use case of -s -d$'\n'
to suppress a last line that has no trailing newline is a lot more
esoteric than using cut like this, for example:

  $ seq 10 | cut -f2,3,7 -d$'\n' --output-delimiter='|'
  2|3|7
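
For scripts that need this result regardless of which cut behavior is
installed, the same output can be produced without -d$'\n' at all. The
following is only a sketch, not something proposed in this thread: it
assumes POSIX sed and paste are available, and select_lines is a
hypothetical helper name.

```shell
# Hypothetical helper (not from this thread): print lines 2, 3 and 7 of
# stdin joined by '|', without depending on how cut handles -d$'\n'.
select_lines() {
  # sed -n prints only the addressed lines; paste -s joins them serially.
  sed -n '2p;3p;7p' | paste -s -d '|' -
}

seq 10 | select_lines    # prints 2|3|7
```

The line numbers are hard-coded here to mirror the cut example; a real
script would likely pass them in.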

So I'm leaning towards restoring the old behavior, where -d$'\n'
treats each line as a field.
(I notice cut keeps consuming input even after the last
line (field) needed has been output, so we could improve that too.)
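
To illustrate what stopping early would look like, here is a sketch
using sed rather than cut. It is an assumption-laden illustration, not
a description of any cut implementation: print_until_last is a
hypothetical helper, and the line numbers are hard-coded.

```shell
# Hypothetical helper (not from this thread): print lines 2, 3 and 7,
# then quit as soon as line 7 has been printed instead of reading on.
print_until_last() {
  # 'q' makes sed exit after the last wanted line is output.
  sed -n '2p;3p;7{p;q;}'
}

seq 10 | print_until_last
```

With a long or slow producer on the left of the pipe, quitting at the
last wanted line avoids reading the rest of the input.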

I'll sleep on it.

thanks,
Pádraig.




