coreutils

Re: RFE: uniq --sequential


From: Pádraig Brady
Subject: Re: RFE: uniq --sequential
Date: Wed, 10 Jun 2015 22:32:07 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0

On 10/06/15 22:04, Daiki Ueno wrote:
> Hello,
> 
> I occasionally have to deal with sequential numbers which are largely
> contiguous, but contain gaps (e.g., Unicode code points).
> 
> To detect gaps, I usually write a shell-script loop, which is not
> trivial.  So, I thought that it would be handy if this is supported by
> coreutils, like this:
> 
>   $ { seq 1 10; seq 12 22; seq 26 34; } | uniq --sequential
>   1
>   12
>   26
> 
> or, a more practical use-case:
> 
>   $ wc -l UnicodeData.txt
>   27268 UnicodeData.txt
>   $ cut -f1 -d';' UnicodeData.txt | sed 's/^/0x/' | uniq --sequential | wc -l
>   612
> 
> where contiguous numbers are treated as duplicates.  I'm attaching a
> patch which implements this.

Thanks for the suggestion and especially the patch.
This is related to the merging of sort --key functionality into uniq
in the next major version of coreutils. That will give numeric comparison
functionality to uniq. Then this functionality could be added with
a --sequential[=interval] or maybe a --min-separation=2 option.
It seems like it could be quite useful with the --group option also.
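
In the meantime, the behaviour requested above can be approximated with
awk. What follows is only a sketch: seq_runs is a hypothetical helper
name, not a coreutils feature, and reading "interval" as the expected
step between adjacent lines is an assumption about how such an option
might work. It handles decimal input only, so the 0x-prefixed
UnicodeData.txt case would need a conversion step first.

```shell
# Hypothetical helper (not part of coreutils): print the first number of
# each run in which adjacent lines differ by exactly "step" (default 1).
seq_runs() {
  awk -v step="${1:-1}" '
    NR == 1 || $1 != prev + step { print $1 }   # this line starts a new run
    { prev = $1 }                               # remember the last value seen
  '
}

{ seq 1 10; seq 12 22; seq 26 34; } | seq_runs
```

With the input from the original example this prints 1, 12 and 26, one
per line. Passing a step, as in "seq_runs 2", treats numbers two apart
as contiguous.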

thanks!
Pádraig.



