Re: horrible utf-8 performace in wc
From: Bo Borgerson
Subject: Re: horrible utf-8 performace in wc
Date: Wed, 07 May 2008 07:41:24 -0400
User-agent: Thunderbird 2.0.0.12 (X11/20080227)
Pádraig Brady wrote:
> canonically équivalent
> canonically équivalent
>
> Pádraig.
>
> p.s. I notice that gnome-terminal still doesn't handle
> combining characters correctly, and my mail client thunderbird
> is putting the accent on the q rather than the e, sigh.
They both render correctly here (Thunderbird 2.0.0.12).
Is there a good library for combining-character canonicalization
available? That seems like something that would be useful to have in a
lot of text-processing tools. Also, for Unicode, something to shuffle
between the normalization forms might be helpful for comparisons.
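
To illustrate the comparison problem, here is a minimal sketch in Python using the standard unicodedata module. It is not taken from coreutils or from the patch under discussion; the helper name canonically_equal is hypothetical, and the choice of NFC as the common form is an assumption (NFD would serve equally well for an equality check).

```python
import unicodedata

def canonically_equal(a: str, b: str) -> bool:
    """Compare two strings after converting both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "\u00e9quivalent"   # e-acute as a single code point, U+00E9
decomposed = "e\u0301quivalent"   # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(precomposed == decomposed)                   # raw code point comparison
print(canonically_equal(precomposed, decomposed))  # comparison after normalization
print(len(precomposed), len(decomposed))           # code point counts differ
```

The raw comparison fails while the normalized one succeeds, and the two spellings differ by one code point, which is the kind of discrepancy a character count would see.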
I may be misinterpreting your patch, but it seems to me that
decrementing count for zero-width characters could potentially lead to
confusion. Not all zero-width characters are combining characters, right?
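
A small sketch of that concern, again in Python with the standard unicodedata module. The helper is_combining_mark is hypothetical, and it assumes that "combining character" means a code point in one of the Unicode Mark categories (Mn, Mc, Me). Python's standard library has no wcwidth, so the zero-width samples are chosen by name rather than by measured width; whether a given wcwidth implementation reports 0 for each of them is not checked here.

```python
import unicodedata

def is_combining_mark(ch: str) -> bool:
    """True if ch is in a Unicode Mark category: Mn, Mc, or Me (hypothetical helper)."""
    return unicodedata.category(ch).startswith("M")

# One combining mark and three zero-width format characters.
samples = ["\u0301", "\u200b", "\u200d", "\ufeff"]

for ch in samples:
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"category={unicodedata.category(ch)}, "
        f"combining mark={is_combining_mark(ch)}"
    )
```

Only U+0301 is reported as a combining mark. The zero-width space, zero-width joiner, and zero-width no-break space are format characters (category Cf), so a rule that keys on width alone would treat them the same way as accents.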
Bo