bug-coreutils

Re: efficient version of 'sort | uniq -c | sort -n'?

From:	Matthew Woehlke
Subject:	Re: efficient version of 'sort | uniq -c | sort -n'?
Date:	Mon, 21 May 2007 14:03:17 -0500
User-agent:	Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.10) Gecko/20070221 Thunderbird/1.5.0.10 Mnenhy/0.7.4.0

James Youngman wrote:

On 5/21/07, Matthew Woehlke <address@hidden> wrote:

Is there an efficient implementation of 'sort | uniq -c | sort -n'? I
have a 4 GB core file I want to run 'strings' on, and the above is
really slow.


I would suggest that the appropriate factorisation would be

countitems | sort -n

Here, countitems could be "sort" with some options or "uniq" with some
options...

I thought about that, but /maximum/ efficiency is only achievable doing
everything in one go. Anyway I think 'countitems' would still be a big
improvement; I would do that as 'sort --unique-with-count' (preferably
aliased 'sort -U') since IMO this is a missing feature of 'sort -u'.


--
Matthew
When in doubt, duct tape!
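
For illustration, the 'countitems | sort -n' factorisation discussed above can be approximated with tools that already exist. The sketch below is not the proposed 'sort --unique-with-count' option (which is only a suggestion in this thread); 'countitems' is a hypothetical shell function that uses an awk associative array to count each distinct line in a single pass. It assumes the set of distinct lines fits in memory, and its output is "count line" without the padding that 'uniq -c' adds.

```shell
#!/bin/sh
# countitems: hypothetical helper, named after the suggestion in the thread.
# It counts every distinct input line in one pass with an awk associative
# array, so the full input is never sorted; only the distinct lines are
# kept in memory (an assumption that may not hold for every input).
countitems() {
    awk '{ count[$0]++ } END { for (line in count) print count[line], line }'
}

# Usage: only the (usually much smaller) list of distinct lines reaches sort.
printf 'b\na\nb\nc\nb\na\n' | countitems | sort -n
```

For the real use case this would read something like 'strings core | countitems | sort -n'. Whether it beats 'sort | uniq -c | sort -n' depends on how many distinct strings the input contains, since awk has to hold all of them at once.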

Current Thread

efficient version of 'sort | uniq -c | sort -n'?, Matthew Woehlke, 2007/05/21
- Re: efficient version of 'sort | uniq -c | sort -n'?, James Youngman, 2007/05/21
  - Re: efficient version of 'sort | uniq -c | sort -n'?, Matthew Woehlke <=
    - Re: efficient version of 'sort | uniq -c | sort -n'?, Philip Rowlands, 2007/05/21
    - Re: efficient version of 'sort | uniq -c | sort -n'?, Matthew Woehlke, 2007/05/21
- Re: efficient version of 'sort | uniq -c | sort -n'?, Paul Eggert, 2007/05/21
  - Re: efficient version of 'sort | uniq -c | sort -n'?, Matthew Woehlke, 2007/05/21
