bug#16581: suggested code simplification in dfa.c
From: Aaron Crane
Subject: bug#16581: suggested code simplification in dfa.c
Date: Wed, 29 Jan 2014 14:20:10 +0000
Paul Eggert <address@hidden> wrote:
> +/* The following functions exploit the commutativity and associativity of ^,
> + and the fact that X ^ X is zero. POSIX requires that C equals
> + either tolower (C) or toupper (C); if the former, then C ^ tolower (C)
> + is zero so C ^ xor_other (C) equals toupper (C), and similarly
> + for the latter. */
> +
> +/* Return the exclusive-OR of C and C's other case, or zero if C is
> + not a letter that changes case. */
> +
> +static wint_t
> +xor_wother (wint_t c)
> +{
> + return towlower (c) ^ towupper (c);
> +}
[…]
> + if (case_fold)
> {
> + wchar_t xor = xor_wother (wc);
> + if (xor)
> + {
> + addtok_wc (wc ^ xor);
> + addtok (OR);
> + }
I don't think this works for the wide-character case, because a
titlecase character is equal to neither its lowercase nor its uppercase
form. For example, in a suitable locale, I'd expect U+01C8 LATIN
CAPITAL LETTER L WITH SMALL LETTER J ("Lj", roughly) to be U+01C7 LATIN
CAPITAL LETTER LJ ("LJ") under towupper(), and U+01C9 LATIN SMALL
LETTER LJ ("lj") under towlower(). This matches the behaviour I can
observe with a simple test program under the en_GB.UTF-8 locale on both
Linux and Mac OS.

Since 0x1c7 ^ 0x1c9 == 0x0e (14), and 0x1c8 ^ 0x0e == 0x1c6, this means
we'd call addtok_wc(0x1c6), and U+01C6 is LATIN SMALL LETTER DZ WITH
CARON, which isn't a desired character.
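
The arithmetic above can be checked with a small locale-independent
sketch. The names demo_upper, demo_lower, demo_xor_wother, demo_folded
and demo_self_check are hypothetical and are not part of dfa.c; the two
mapping functions are hard-coded stand-ins for towupper() and towlower()
that only know about the LJ/Lj/lj trio, assuming the mappings described
above.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for towupper(): maps Lj (U+01C8) and
   lj (U+01C9) to LJ (U+01C7); every other value is unchanged.  */
static wint_t
demo_upper (wint_t c)
{
  return (c == 0x1c8 || c == 0x1c9) ? 0x1c7 : c;
}

/* Hypothetical stand-in for towlower(): maps LJ (U+01C7) and
   Lj (U+01C8) to lj (U+01C9); every other value is unchanged.  */
static wint_t
demo_lower (wint_t c)
{
  return (c == 0x1c7 || c == 0x1c8) ? 0x1c9 : c;
}

/* Same shape as the quoted xor_wother(), using the stand-ins.  */
static wint_t
demo_xor_wother (wint_t c)
{
  return demo_lower (c) ^ demo_upper (c);
}

/* The value the quoted patch would pass to addtok_wc().  */
static wint_t
demo_folded (wint_t c)
{
  return c ^ demo_xor_wother (c);
}

/* Self-check: the XOR trick is fine when C is already upper or
   lower case, but yields an unrelated code point for titlecase.  */
static void
demo_self_check (void)
{
  assert (demo_folded (0x1c7) == 0x1c9);   /* LJ -> lj, as intended */
  assert (demo_folded (0x1c9) == 0x1c7);   /* lj -> LJ, as intended */
  assert (demo_folded (0x1c8) == 0x1c6);   /* Lj -> U+01C6, wrong */
}
```

Calling demo_self_check() from a test main() exercises all three cases.
Only the titlecase input lands outside the LJ/Lj/lj trio, which is the
failure described above.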
--
Aaron Crane ** http://aaroncrane.co.uk/