Re: horrible utf-8 performace in wc
From: Bo Borgerson
Subject: Re: horrible utf-8 performace in wc
Date: Thu, 08 May 2008 09:37:28 -0400
User-agent: Thunderbird 2.0.0.12 (X11/20080227)
Bruno Haible wrote:
> If you want wc to count characters after canonicalization, then you can
> invent a new wc command-line option for it. But I would find it more useful
> to have a filter program that reads from standard input and writes the
> canonicalized output to standard output; that would be applicable in many
> more situations.
I like the sound of that!
I suppose the not-yet-implemented gnulib Unicode normalization library
you mentioned in another post would be a prerequisite for such a tool.
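
To make the idea concrete, here is a rough sketch of the kind of filter Bruno describes. It is only an illustration under some assumptions: it is written in Python rather than C, it uses Python's standard unicodedata module as a stand-in for the gnulib library (which does not exist yet), and the names normalize_stream and filter_stream and the default form NFC are made up for this example.

```python
import unicodedata


def normalize_stream(lines, form="NFC"):
    """Yield each line converted to the given Unicode normalization form.

    Working line by line is safe here because a newline never combines
    with the characters around it.
    """
    for line in lines:
        yield unicodedata.normalize(form, line)


def filter_stream(src, dst, form="NFC"):
    """Copy text from src to dst, normalizing it on the way (hypothetical helper)."""
    for line in normalize_stream(src, form):
        dst.write(line)


# "e" followed by U+0301 (combining acute accent) is two code points;
# after NFC normalization it is the single code point U+00E9.
decomposed = "e\u0301\n"
composed = next(normalize_stream([decomposed]))
print(len(decomposed) - 1, len(composed) - 1)  # code points, newline excluded
```

Used as a filter, this would be called as filter_stream(sys.stdin, sys.stdout), with wc -m then run on the output to count characters after canonicalization.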
I'm definitely interested in helping out here, but I think someone with
a more thorough understanding of Unicode (Pádraig?) would probably be
more useful.
Bo