From: Elias Mårtenson
Subject: Re: On language-dependent defaults for character-folding
Date: Sat, 20 Feb 2016 18:08:20 +0800
Your interpretation is wrong, because every implementation of
character-folding in search uses normalization forms. So if you want
to maintain that whoever does that is abusing normalization forms, you
are not just up against Emacs, you are up against the ICU library and
others. You are also up against http://www.unicode.org/notes/tn5/.
It is possible that you only see the "equivalence" parts of all these
sources. But in that case, you are actually claiming that folding
characters should never be done at all! "Folding" means mapping
_distinct_ character sequences to the same basic sequence. You start
from a normalization form, then compare the results disregarding
certain secondary, tertiary, etc. differences.
> Again (I really apologise for repeating myself, I'm starting to sound like a troll and that is truly not my intention),
> the purpose of normalisation forms is to ensure that the two variants of ñ compare the same. They are not
> designed to provide a mechanism to allow n to compare equal to ñ.
Under character-folding that ignores diacritics, ñ should indeed
compare equal to n.
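
To make the distinction concrete, here is a minimal sketch in Python using only the standard `unicodedata` module. The helper name `fold_diacritics` is made up for illustration, and dropping every combining mark after NFD is a crude stand-in for real folding tables; it is not meant to show how Emacs or ICU implement this.

```python
import unicodedata

def fold_diacritics(text: str) -> str:
    """Hypothetical helper: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns 0 for non-combining characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

precomposed = "\u00f1"   # ñ as a single code point
combining = "n\u0303"    # n followed by COMBINING TILDE

# Normalization alone: the two encodings of ñ become equal, but ñ is still not n.
assert unicodedata.normalize("NFC", precomposed) == unicodedata.normalize("NFC", combining)
assert unicodedata.normalize("NFD", precomposed) != "n"

# Folding on top of normalization: both variants of ñ now compare equal to n.
assert fold_diacritics(precomposed) == fold_diacritics(combining) == "n"
```

The normalization step handles the equivalence of the two ñ encodings; the extra step of ignoring the combining mark is the folding part.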
> Sure, but doesn't it make sense to fall back to the user's default if the buffer does not have an overriding
> locale?
I don't know what you mean by "buffer has an overriding locale".
Emacs buffers don't have a locale, and they cannot do that in
principle because we support multiple languages. E.g., what could the
locale of the HELLO buffer created by "C-h H" be?
> As opposed to having no concept of locale at all?
Yes. A multilingual environment cannot have a locale in principle.
It will cease being multilingual if it does.
> Strange, I always thought the data was there. Perhaps you should ask
> a question on the Unicode mailing list, then.
>
> That's a good idea actually.
That's a relief. I was beginning to suspect I don't have any good
ideas at all.