Re: [Chicken-users] UTF8 egg case conversion problem
From: Alex Shinn
Subject: Re: [Chicken-users] UTF8 egg case conversion problem
Date: Tue, 4 Jan 2011 11:37:42 +0900
On Tue, Jan 4, 2011 at 8:24 AM, Mehmet Köse <address@hidden> wrote:
>
> utf8-string-downcase cannot convert "Latin Capital Letter I With Dot
> Above" correctly. (U+0130, only Turkish has this letter afaik.) This
> one-line patch fixes the problem for me:
Teşekkür ederim (thank you)! I think this bug came from sharing the
same code between upcase and downcase.
> - ((#x0130) (if (lang? opt "tr" "az") #\I dotted-small-i))
> + ((#x0130) (if (lang? opt "tr" "az") #\i dotted-capital-i))
This is char-downcase, so both #\I and dotted-capital-i
are wrong. I think it should be
((#x0130) (if (lang? opt "tr" "az") #\i dotted-small-i))
That is, in Turkish (and Azeri) the dotted capital I always downcases
to plain i. In other locales dotted capital I isn't a letter of the
alphabet at all, so it should follow Unicode's locale-independent
lowercasing rules, which require "dotted-small-i" to preserve
canonical equivalence. "dotted-small-i" is just small i followed by
a combining dot above (U+0069 U+0307). Uppercased, it becomes capital I
plus combining dot above (U+0049 U+0307), which is canonically
equivalent to U+0130. Without the combining dot, small i uppercases to
plain capital I and the dot is lost.
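
The round trip described above can be checked outside the egg. The
following is only a sketch in Python, not the egg's Scheme code: the
helper downcase_dotted_capital_i is a hypothetical stand-in for the
corrected clause, and Python's built-in str.lower()/str.upper() are
used as a reference for the locale-independent mappings.

```python
import unicodedata

DOTTED_CAPITAL_I = "\u0130"     # LATIN CAPITAL LETTER I WITH DOT ABOVE
COMBINING_DOT_ABOVE = "\u0307"  # COMBINING DOT ABOVE


def downcase_dotted_capital_i(lang=None):
    """Hypothetical helper mirroring the corrected clause for U+0130.

    In Turkish/Azeri ("tr", "az") the result is plain i; everywhere
    else it is small i followed by a combining dot above.
    """
    if lang in ("tr", "az"):
        return "i"
    return "i" + COMBINING_DOT_ABOVE


# Python's str.lower() applies the locale-independent mapping.
default_lower = DOTTED_CAPITAL_I.lower()

# Uppercasing keeps the combining dot; NFC then recomposes the pair
# into the single precomposed code point U+0130.
round_trip = unicodedata.normalize("NFC", default_lower.upper())

# Plain i has no dot to carry along, so the dot would be lost.
lossy = "i".upper()

print([hex(ord(c)) for c in default_lower])
print(round_trip == DOTTED_CAPITAL_I, lossy)
```

The non-Turkic branch returns two code points, which is why a
single-character char-downcase cannot express it without a special
value like dotted-small-i.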
I'll double check some of these cases and release a new
version shortly.
--
Alex