From: | Paul Eggert |
Subject: | Re: commit-msg hook |
Date: | Mon, 13 Apr 2015 11:37:35 -0700 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0 |
On 04/13/2015 08:48 AM, Eli Zaretskii wrote:
> why do we need to rely on system libraries to implement UTF-8 and [:print:] correctly?

Some POSIXish environments do not support unibyte locales; even the C locale is multibyte (it uses UTF-8). In such environments, I expect that a regular expression like /\342\202\254/ won't necessarily match the string "€" (U+20AC, equivalent to the byte string "\342\202\254") in any locale, just as in Emacs the call (re-search-forward "\342\202\254") won't find the string "€" in a UTF-8 file, regardless of language settings.
I suppose the script could detect whether we're in such an environment, but I dunno, it sounds like more trouble than it's worth.
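
To make the byte-versus-character mismatch concrete, here is a rough Python analogy. It is only a sketch: Python's re module is not the regex engine the hook uses, and the function names are made up for illustration. It shows the same octal escapes matching "€" when both pattern and text are treated as bytes, and failing to match when each escape is read as a separate character.

```python
import re

EURO = "\u20ac"                    # the character "€" (U+20AC)
EURO_UTF8 = EURO.encode("utf-8")   # the three bytes \342\202\254


def matches_as_bytes(data: bytes) -> bool:
    """Unibyte view: pattern and subject are both raw byte strings."""
    return re.search(b"\342\202\254", data) is not None


def matches_as_chars(text: str) -> bool:
    """Multibyte view: each octal escape names one character
    (U+00E2, U+0082, U+00AC), so the pattern is three characters long."""
    return re.search("\342\202\254", text) is not None


if __name__ == "__main__":
    print(matches_as_bytes(EURO_UTF8))  # byte pattern against UTF-8 bytes
    print(matches_as_chars(EURO))       # character pattern against "€"
```

The character-level pattern does match the three-character string U+00E2 U+0082 U+00AC, which is not what anyone writing the regular expression intended.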