Re: Emacs, XEmacs, X11(?), "man"(?) i18n/utf-8 brokenness
From: Eli Zaretskii
Subject: Re: Emacs, XEmacs, X11(?), "man"(?) i18n/utf-8 brokenness
Date: Tue, 24 May 2005 22:14:55 +0300
> From: Olaf Klischat <address@hidden>
> Date: Tue, 24 May 2005 13:42:41 +0200
> Cc:
>
> http://user.cs.tu-berlin.de/~klischat/emacs-i18n-broken-by-design.png
>
> I.e. several instances of the german umlaut "ü" in a buffer, some of
> which are found by isearch, while others aren't.
I think these problems are solved in the current CVS, and will go away
completely once a Unicode-based Emacs is released (don't ask me when,
but there's a CVS branch where people are actively working on this).
> Looks like a design error to me -- it should
> store buffer contents internally as a sequence of Unicode codepoints,
> not as sequences of bytes + encoding (which is what I presume it
> does atm).
Historically, the multilingual Emacs was based on an encoding other
than Unicode, where Latin-n character sets don't intersect.
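
To illustrate why disjoint Latin-n character sets make isearch miss some
instances of the same letter, here is a minimal Python sketch. It is not
Emacs code: the TaggedChar type and the to_unicode helper are hypothetical,
and the model assumes only that each character is stored as a
(charset, code) pair, so "ü" read from a Latin-1 file and "ü" read from a
Latin-2 file never compare equal until both are unified to a Unicode
codepoint.

```python
from typing import NamedTuple


class TaggedChar(NamedTuple):
    """Hypothetical model: a character identified by charset plus code."""
    charset: str  # Python codec name, e.g. "iso8859_1"
    code: int     # byte value within that charset


def to_unicode(ch: TaggedChar) -> str:
    """Unify a tagged character to a single Unicode codepoint."""
    return bytes([ch.code]).decode(ch.charset)


# "ü" sits at 0xFC in both ISO 8859-1 and ISO 8859-2.
u_latin1 = TaggedChar("iso8859_1", 0xFC)
u_latin2 = TaggedChar("iso8859_2", 0xFC)

# Charset-tagged comparison: the two characters are treated as different.
tagged_match = u_latin1 == u_latin2

# Unicode comparison: both map to U+00FC, so a search would find both.
unicode_match = to_unicode(u_latin1) == to_unicode(u_latin2)

print(tagged_match, unicode_match)
```

In this model a search for the Latin-1 "ü" skips the Latin-2 one, which
matches the symptom in the screenshot; comparing unified codepoints does
not have that problem.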
> When running under that locale, the "man" program (or is it nroff, or
> troff, or groff?), for reasons that are beyond me, decides to turn the
> perfectly valid ASCII character 0x27 ("'", U+0027 APOSTROPHE) into the
> UTF-8 sequence 0xe2 0x80 0x99 [1], which, according to
> http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder,
> is the character U+2019 RIGHT SINGLE QUOTATION MARK (similar things
> happen with the "-" character, and probably others).
I'm guessing that Groff automatically uses the UTF-8 encoding and
passes the -Tutf8 option to the TTY driver.
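
The byte sequence quoted above can be checked without an online decoder.
This is a small Python sketch using only the standard library; the variable
names are illustrative, and it verifies only the decoding, not where in the
man/Groff pipeline the substitution happens.

```python
import unicodedata

# The three bytes reported in place of the ASCII apostrophe.
raw = bytes([0xE2, 0x80, 0x99])

# Decode them as UTF-8 and look up the resulting character.
ch = raw.decode("utf-8")
codepoint = f"U+{ord(ch):04X}"
name = unicodedata.name(ch)

# For comparison, the character that was expected in the output.
expected = b"\x27".decode("ascii")
expected_name = unicodedata.name(expected)

print(codepoint, name)
print(f"U+{ord(expected):04X}", expected_name)
```

The first line of output names the character that actually ends up in the
buffer, the second the one a search for "'" is looking for, which is why
such a search does not match.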
> I don't know who is to blame for all this. Are those automatic
> character conversions mandated by some standard?
Some of them. You should read the manuals and complain to the
respective maintainers.
> All things considered, it seems that it is still quite impossible (or
> should one say "adventurous"?) to use GNU and Emacs for programming
> tasks under multibyte encodings.
Some of the problems you mention have nothing to do with Emacs.
Anyway, thanks for the reports.