Re: letter occurence in a text
From: gregory benison
Subject: Re: letter occurence in a text
Date: Wed, 23 May 2012 06:57:38 -0700
On Tue, May 22, 2012 at 7:26 AM, nrichard <address@hidden> wrote:
>
> hello my problem is to count occurence of letter in a text and come out with
> an assoc-list like
> '((a.16) (b.10) ... (z.5))
Why store the alist keys as one-character strings, rather than just as
characters? Storing as characters would be simpler:
;; Given an alist 'lst' containing character counts and character 'c',
;; return an alist with the count of 'c' incremented (set to 1 if it
;; doesn't exist).  Note that assoc-set! may modify 'lst' in place.
(define (lettre-test c lst)
  (let ((current (assoc-ref lst c)))
    (assoc-set! lst c (+ 1 (or current 0)))))
Character frequency analysis can be performed with a "fold" operation
(fold comes from SRFI-1; in Guile, (use-modules (srfi srfi-1)) makes it
available):
> (fold lettre-test '() (string->list "hello, world!"))
((#\! . 1)
(#\d . 1)
(#\r . 1)
(#\w . 1)
(#\space . 1)
(#\, . 1)
(#\o . 2)
(#\l . 3)
(#\e . 1)
(#\h . 1))
I think it would be best to separate the filtering for alphabetic
chars from the "lettre-test" function; they're separate ideas:
> (fold lettre-test '()
        (filter char-alphabetic? (string->list "hello, world!")))
((#\d . 1)
(#\r . 1)
(#\w . 1)
(#\o . 2)
(#\l . 3)
(#\e . 1)
(#\h . 1))
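To get closer to the format asked for in the original question (one
entry per letter, in alphabetical order), the result can be case-folded
and sorted. This is only a minimal sketch, assuming Guile with SRFI-1
loaded; the name 'letter-frequencies' is made up here for illustration
and is not something from the thread:

```scheme
(use-modules (srfi srfi-1))   ; provides fold

;; Same helper as earlier in this message, repeated so the sketch
;; runs on its own.
(define (lettre-test c lst)
  (let ((current (assoc-ref lst c)))
    (assoc-set! lst c (+ 1 (or current 0)))))

;; Hypothetical helper: count the alphabetic characters of 'str',
;; treating upper and lower case as the same letter, and return the
;; alist sorted by character.
(define (letter-frequencies str)
  (sort (fold lettre-test '()
              (map char-downcase
                   (filter char-alphabetic? (string->list str))))
        (lambda (a b) (char<? (car a) (car b)))))

(letter-frequencies "Hello, World!")
;; => ((#\d . 1) (#\e . 1) (#\h . 1) (#\l . 3) (#\o . 2) (#\r . 1) (#\w . 1))
```

Filtering, case-folding, counting and sorting stay separate steps here,
so any one of them can be dropped or swapped without touching
"lettre-test".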
The drawback of this solution, as currently written, is that it can't
lazily read a file; you'd have to read the entire file into a string
first. It should be possible to modify this to use streams rather
than lists, though.
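
One way to avoid reading the whole file into a string is to pull
characters from a port one at a time. Note that this is a plain
read-char loop rather than the stream-based version suggested above,
and it is only a sketch assuming Guile; 'count-letters-from-port' is a
hypothetical name introduced for this example:

```scheme
;; Same helper as earlier in this message, repeated so the sketch
;; runs on its own.
(define (lettre-test c lst)
  (let ((current (assoc-ref lst c)))
    (assoc-set! lst c (+ 1 (or current 0)))))

;; Hypothetical helper: read 'port' one character at a time until
;; end of file, counting only alphabetic characters.  No intermediate
;; string or list of the input is built.
(define (count-letters-from-port port)
  (let loop ((c (read-char port)) (counts '()))
    (cond ((eof-object? c) counts)
          ((char-alphabetic? c)
           (loop (read-char port) (lettre-test c counts)))
          (else
           (loop (read-char port) counts)))))

;; Try it on a string port first.
(call-with-input-string "hello, world!" count-letters-from-port)
```

For a real file the same procedure can be handed to
call-with-input-file, e.g. (call-with-input-file "some-file.txt"
count-letters-from-port), where "some-file.txt" is a placeholder name.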
--
Greg Benison <address@hidden>
[blog] http://gcbenison.wordpress.com
[twitter] @gcbenison