Re: utf-8 input under X11

From: Dave Love
Subject: Re: utf-8 input under X11
Date: 04 Nov 2001 15:56:42 +0000
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.0.107

>>>>> "EZ" == Eli Zaretskii <address@hidden> writes:

 EZ> Since using UTF-8 input automatically means all non-ASCII
 EZ> characters in your buffers belong to the mule-unicode-* character
 EZ> sets, we effectively limit users to files in either UTF-8 or
 EZ> Latin-1.  They will not be able, for example, to produce KOI8-R or
 EZ> Latin-3,

I can't believe I'm seeing this.  It's no more true for utf-8 encoding
than it is for latin-2, say.  That should be clear even without
implementing and testing such a language environment, as I did.
Experimentally, it didn't stop me editing arbitrarily-encoded files.
Emacs is multilingual.

 EZ> unless they either install add-on packages such as Mule-UCS or
 EZ> otherwise modify the code which encodes and decodes characters.

Or Emacs could just include the file which does that job.

 EZ> And if they mix characters from files encoded in anything but
 EZ> UTF-8 or Latin-1 with what they type, 

mac-roman is a counter-example currently in Emacs.  It's essentially
trivial to define other 8-bit coding systems in terms of Unicode.
I've posted 30+.
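
As a rough illustration of why this is essentially trivial, here is a minimal sketch in Python (not Emacs Lisp, and not the code posted to the list). It treats an 8-bit coding system as nothing more than a table from byte values to Unicode characters. The helper name make_8bit_codec is hypothetical, and the KOI8-R table is borrowed from Python's standard codecs purely for convenience.

```python
# Hypothetical sketch: an 8-bit coding system defined purely as a table
# from byte values to Unicode characters. Not the Emacs implementation.

def make_8bit_codec(high_half):
    """Return (decode, encode) functions for an 8-bit charset.

    high_half: a 128-character string giving the Unicode characters for
    bytes 0x80..0xFF. Bytes 0x00..0x7F are assumed to be plain ASCII.
    """
    if len(high_half) != 128:
        raise ValueError("need exactly 128 characters for bytes 0x80..0xFF")

    # Index = byte value, entry = Unicode character.
    decoding_table = [chr(b) for b in range(128)] + list(high_half)
    # Inverse mapping: Unicode character -> byte value.
    encoding_table = {ch: b for b, ch in enumerate(decoding_table)}

    def decode(data):
        return "".join(decoding_table[b] for b in data)

    def encode(text):
        # Raises KeyError for characters the charset cannot represent.
        return bytes(encoding_table[ch] for ch in text)

    return decode, encode


# Assumption: take the upper half of KOI8-R from Python's built-in codec
# instead of typing out 128 characters by hand.
koi8r_high = bytes(range(0x80, 0x100)).decode("koi8_r")
koi8r_decode, koi8r_encode = make_8bit_codec(koi8r_high)

# Text read from a KOI8-R file becomes ordinary Unicode characters, so it
# can be written back out as KOI8-R or as UTF-8 without any special case.
text = koi8r_decode("Привет".encode("koi8_r"))
as_utf8 = text.encode("utf-8")
```

Once every charset is expressed through the same Unicode characters, text typed via UTF-8 input and text decoded from a KOI8-R file are interchangeable, and the choice of encoding on save is independent of how the characters arrived.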

 EZ> Emacs will confuse them by refusing to save the result in UTF-8.

You also have the contribution to extend what mule-utf-8 encodes.

 EZ> This is because currently, mule-unicode-* characters are treated
 EZ> as disjoint from the other character sets supported by Emacs.

Just like all the other internal charsets.  They're not special --
it's simply misleading to suggest otherwise.

 EZ> If UTF-8 input is the only reasonable input mode in such locales,
 EZ> then using it would be a lesser evil than any other alternative.
 EZ> But if users can reasonably use other input encoding, we might be
 EZ> preventing them from having a more useful Emacs.

Indeed.  I've worked on a number of Unicode input methods and I can't
input utf-8 directly.

 >> I don't see why the user should have to do 
 >> something special (like set-keyboard-coding-system) if using utf-8, 
 >> but not if using koi8-r !

 EZ> See above: the reason is that Unicode support is not yet complete
 EZ> enough,

[Complete enough for what?]

 EZ> so perhaps we shouldn't yet force it on the user.

I hope that isn't the reason, any more than for other locales.

It isn't forced on the user, anyhow -- they request it by specifying
the locale.  koi8-r support is hardly complete either, but it's
invoked automatically at startup.  (The coding system should be
completed using Unicode characters, or made completely Unicode-based.)

People should understand that the utf-8 coding system is essentially
the same as any other CCL-based one.  Assuming anything else, e.g. in
Gnus, just causes lossage.  There is too much FUD flying around.
