help-gnu-emacs

Re: Most used words in current buffer


From: Ben Bacarisse
Subject: Re: Most used words in current buffer
Date: Wed, 18 Jul 2018 23:39:53 +0100

Udyant Wig <udyantw@gmail.com> writes:
<snip>
> they were left behind by this old Awk solution (also using hashing) I
> found in the classic /The Unix Programming Environment/ by Kernighan and
> Pike:
>
> ---
> #!/bin/sh
>
> awk '    { for (i = 1; i <= NF; i++) num[$i]++ }
> END      { for (word in num) print word, num[word] }
> ' $* | sort +1 -nr | head -10 | awk '{ print $1 }'
> ---
>
> I appended the last awk pipeline to only give the words without the
> counts.

The Unix command cut does this task.  Nothing wrong with using another
awk, but I often feel sorry for poor old cut.  It's been around for
decades, and yet is so very often overlooked!  Mind you, it uses TABs to
delimit fields by default, so maybe it only has itself to blame.
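
For instance, here is a minimal sketch of the quoted pipeline with cut in
place of the final awk.  Because awk's print puts a single space between
the word and its count, cut has to be told to split on a space with -d.
The function name top_words is made up for illustration, and I have
assumed the POSIX spelling -k2nr for the obsolete "+1 -nr", which newer
sort implementations may reject.

```shell
#!/bin/sh
# top_words (hypothetical name): print up to ten of the most frequent
# whitespace-separated words from the named files, or from standard input.
top_words() {
    awk '    { for (i = 1; i <= NF; i++) num[$i]++ }
    END      { for (word in num) print word, num[word] }
    ' "$@" |
        sort -k2nr |      # assumed modern equivalent of "sort +1 -nr"
        head -10 |
        cut -d ' ' -f 1   # keep the word, drop the count
}
```

Calling top_words file.txt should then give the same kind of list as the
original script.  Words with equal counts come out in whatever order sort
leaves them.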

-- 
Ben.

