Re: utf-8 input under X11
Sun, 04 Nov 2001 19:36:40 +0200
> From: Dave Love <address@hidden>
> Date: 04 Nov 2001 15:56:42 +0000
> >>>>> "EZ" == Eli Zaretskii <address@hidden> writes:
> EZ> Since using UTF-8 input automatically means all non-ASCII
> EZ> characters in your buffers belong to the mule-unicode-* character
> EZ> sets, we effectively limit users to files in either UTF-8 or
> EZ> Latin-1. They will not be able, for example, to produce KOI8-R or
> EZ> Latin-3,
> I can't believe I'm seeing this. It's no more true for utf-8 encoding
> than it is for latin-2, say. That should be clear even without
> implementing and testing such a language environment, as I did.
What I wrote is true for Emacs 21.1 as released, without any
add-ons. I explicitly said that installing such add-ons invalidates
what I wrote:
> EZ> unless they either install add-on packages such as Mule-UCS or
> EZ> otherwise modify the code which encodes and decodes characters.
The issue was whether to turn the UTF-8 locale on by default, given
the appropriate value of $LANG or similar variables. I think, with
stock Emacs 21.1, we shouldn't.
What else is unclear or unbelievable here? In particular, it should
be clear that if code such as yours, Dave, is added to Emacs, what I
said will no longer hold. How can I ever make myself more clear than
that?
> EZ> This is because currently, mule-unicode-* characters are treated
> EZ> as disjoint from the other character sets supported by Emacs.
> Just like all the other internal charsets. They're not special --
> it's simply misleading to suggest otherwise.
There's a small but significant difference. No one would expect Emacs
to unify characters from, say, iso8859-5 and iso8859-3. By contrast,
characters from iso8859-5 and the Cyrillic part of
mule-unicode-0100-24ff _will_ be expected to be the same characters.
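To illustrate the expectation (not Emacs internals), here is a minimal
sketch in Python, whose codecs merely stand in for "other programs" that
map every encoding onto one character repertoire. The variable names are
made up for the example, and nothing here reflects how Emacs 21.1
represents characters.

```python
# Illustration only: Python's codecs play the role of a program that
# unifies characters across encodings. This is not Emacs code.
byte_8859_5 = b"\xb0"      # CYRILLIC CAPITAL LETTER A in ISO 8859-5
bytes_utf8 = b"\xd0\x90"   # the same letter encoded as UTF-8

from_8859_5 = byte_8859_5.decode("iso8859_5")
from_utf8 = bytes_utf8.decode("utf-8")

# Both inputs decode to the single code point U+0410.
same_character = from_8859_5 == from_utf8

# Because the character is unified, text that arrived as UTF-8 can be
# written back out in a legacy encoding such as KOI8-R.
as_koi8_r = from_utf8.encode("koi8_r")

print(same_character, hex(ord(from_utf8)), as_koi8_r)
```

A user who sees this behavior everywhere else would expect the same
letter typed via UTF-8 input to be saveable as KOI8-R or ISO 8859-5.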
> EZ> so perhaps we shouldn't yet force it on the user.
> I hope that isn't the reason, any more than for other locales.
> It isn't forced on the user, anyhow -- they request it by specifying
> the locale.
Setting the locale affects many programs. Users will rightfully
expect Emacs to behave like those other programs, given the same
locale. However, in the case of Emacs 21.1, as shipped, the effect
might surprise them.
> People should understand that the utf-8 coding system is essentially
> the same as any other CCL-based one. Assuming anything else, e.g. in
> Gnus, just causes lossage.
> There is too much FUD flying around.
I don't think that by explaining the facts and making our decisions
clear to users we can add any FUD. I'd expect that to have the
opposite effect. Users should understand the trade-offs when they
decide how to set up their systems.