coreutils
Re: Question about uniq's treatment of spaces-only lines


From: Sudarshan S Chawathe
Subject: Re: Question about uniq's treatment of spaces-only lines
Date: Sun, 31 Jul 2022 12:26:22 -0400


On 2022-07-30T13:25:34+0100 (Saturday), Pádraig Brady writes:
>
> More succinctly:
>
>    $ printf '%s\n' first blah ' ' '  ' 'l ast' | uniq -f1
>    first
>    l ast
>
> I.e. skipping one field will compare all but the 'l ast' line as equal.
> This is operating as per the POSIX standard which states:
>
> "Ignore the first fields fields on each input line when doing comparisons,
> where fields is a positive decimal integer. A field is the maximal string
> matched by the basic regular expression:
>
> [[:blank:]]*[^[:blank:]]*
>
> If the fields option-argument specifies more fields than appear on an input
> line, a null string shall be used for comparison."

Thank you for the clarification.  For me, the key to resolving my
earlier confusion was the realization that the blanks are included in
the field as opposed to being interpreted as inter-field separators.
This is obvious now based on what you quote above from the POSIX docs
but escaped me earlier because I hadn't thought of checking those
docs. The GNU info docs for uniq do not seem to describe what exactly a
field is in this context.  Perhaps it would be useful to include the
above quote or an equivalent description (or pointer) there.
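
As a rough illustration (my own sketch, not taken from the POSIX text or
the GNU docs), the comparison keys left after skipping one field can be
approximated by stripping one match of the quoted basic regular
expression from the start of each line with sed. The helper name
skip_one_field is made up for this example, and the square brackets are
added only to make the leftover blanks visible:

```shell
# Hypothetical helper: remove one "field" as defined by the POSIX BRE
# [[:blank:]]*[^[:blank:]]* from the start of every input line.
skip_one_field() {
  sed 's/^[[:blank:]]*[^[:blank:]]*//'
}

# Same five input lines as in the quoted example; wrap each remaining
# comparison key in brackets so that empty keys and blanks can be seen.
printf '%s\n' first blah ' ' '  ' 'l ast' | skip_one_field | sed 's/.*/[&]/'
# Prints:
# []
# []
# []
# []
# [ ast]
```

The first four keys are all the null string, so those lines compare
equal and only "first" is kept; the key for 'l ast' is " ast" (with the
leading blank), which matches the two-line output quoted above.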

Regards,

-chaw
