Re: idn2 fails with uppercase (german) umlauts
From: Simon Josefsson
Subject: Re: idn2 fails with uppercase (german) umlauts
Date: Thu, 24 Oct 2013 20:09:48 +0200
User-agent: Gnus/5.130008 (Ma Gnus v0.8) Emacs/24.3 (gnu/linux)
Tim Ruehsen <address@hidden> writes:
> Hi,
>
> could you give me a hint why
> idn2 ä
> works (translates to xn--4ca), but
> idn2 Ä
> results in 'lookup: string contains a disallowed character' ?
Hi Tim! Uppercase characters are disallowed in IDNA2008; see RFC 5894 for
a more accessible explanation:
   There is provision for exception cases but, in general, characters
   are placed into DISALLOWED if they fall into one or more of the
   following groups:
   ...
   o  The character is an uppercase form or some other form that is
      mapped to another character by Unicode case folding.
In particular, Ä is U+00C4, which is disallowed by RFC 5892:
   00B8..00DE ; DISALLOWED # CEDILLA..LATIN CAPITAL LETTER THORN
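To make the range check concrete, here is a small Python sketch. It only tests membership in the single range quoted above; the function name is made up for illustration, and it is not a full RFC 5892 table lookup.

```python
# Range quoted above from RFC 5892: 00B8..00DE ; DISALLOWED
DISALLOWED_START, DISALLOWED_END = 0x00B8, 0x00DE


def in_quoted_disallowed_range(ch: str) -> bool:
    """Return True if ch falls inside the one DISALLOWED range quoted above.

    Hypothetical helper: it knows nothing about the rest of the RFC 5892
    tables, so a False result does not by itself mean the character is valid.
    """
    return DISALLOWED_START <= ord(ch) <= DISALLOWED_END


# U+00C4 is the uppercase A with diaeresis, U+00E4 the lowercase one.
for ch in ("\u00c4", "\u00e4"):
    print(f"U+{ord(ch):04X} in quoted range: {in_quoted_disallowed_range(ch)}")
```

The uppercase form lands inside the range and the lowercase form does not, which matches the behaviour you are seeing from idn2.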
> The 'old' idn works in both cases.
Yes, but that's IDNA2003. IDNA2008 is not fully backwards compatible
with IDNA2003.
> Is this by purpose (should I convert my utf-8 domains names to lowercase
> before calling libidn2 lookup functions) or will this behaviour be fixed ?
It is on purpose. See RFC5894 for some recommendations:
   User interface
   programs can meet the expectations of users who are accustomed to the
   case-insensitive DNS environment by performing case folding prior to
   IDNA processing, but the IDNA procedures themselves should neither
   require such mapping nor expect them when they are not natural to the
   localized environment.
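As a rough illustration of what such a pre-processing step in the application could look like, here is a Python sketch. It does not use libidn2; the function name is hypothetical, the choice of simple lowercasing plus NFC normalization is an assumption on my part, and it performs none of the IDNA2008 validity checks. It only shows that mapping to lowercase first yields the same result for both spellings.

```python
import unicodedata


def to_a_label_sketch(label: str) -> str:
    """Lowercase and NFC-normalize one label, then Punycode-encode it.

    Hypothetical helper for illustration only: no IDNA2008 validity
    checks (DISALLOWED, CONTEXTJ/CONTEXTO, bidi rules) are performed.
    """
    # Assumption: plain lowercasing is used here. Full case folding
    # (str.casefold) would additionally turn sharp s into "ss".
    mapped = unicodedata.normalize("NFC", label.lower())
    if mapped.isascii():
        # Pure ASCII labels need no Punycode encoding.
        return mapped
    # Python's built-in "punycode" codec produces the part after "xn--".
    return "xn--" + mapped.encode("punycode").decode("ascii")


print(to_a_label_sketch("\u00c4"))  # uppercase A with diaeresis
print(to_a_label_sketch("\u00e4"))  # lowercase a with diaeresis
```

Both calls print xn--4ca, the same output idn2 gives for the lowercase input. In a real program you would pass the lowercased string on to the libidn2 lookup functions rather than encode it yourself.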
Unicode has published TR46, which describes one way to perform
IDNA2008 mapping in a way that maximizes IDNA2003 compatibility:
http://unicode.org/reports/tr46/
It is not implemented in libidn2 though.
/Simon