bug#13936: Default to UTF-8 for most Emacs source files

bug-gnu-emacs

From:	Paul Eggert
Subject:	bug#13936: Default to UTF-8 for most Emacs source files
Date:	Wed, 20 Mar 2013 09:43:38 -0700
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130311 Thunderbird/17.0.4

On 03/20/13 01:18, Kenichi Handa wrote:
> Among CJK files, I think K(orean) files can be in UTF-8
> without problem.

It's easy enough to convert the K files to UTF-8 too, and I'll propose
a patch to do that in a follow-up email.

> Are there any people familiar with Korean situation?

Sorry, I don't know.

For what it's worth, when I use Emacs to convert TUTORIAL.ko to
UTF-8 and back, the result is identical to the original, so no
information is lost by making that change.  (This is not true
for TUTORIAL.ja.)
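
To illustrate the kind of round-trip check meant here, the following is
a minimal sketch in Python rather than the Emacs procedure I actually
used.  The helper name round_trips and the choice of the euc_kr codec
for the sample are assumptions made for illustration only; the sketch
says nothing about which encoding TUTORIAL.ko really uses, and Python's
codecs need not agree with Emacs's coding systems in every detail.

```python
def round_trips(original: bytes, legacy_codec: str) -> bool:
    """Return True if legacy -> UTF-8 -> legacy reproduces the bytes exactly."""
    # Decode the legacy bytes to text, then store that text as UTF-8.
    utf8 = original.decode(legacy_codec).encode("utf-8")
    # Convert back: read the UTF-8 and re-encode in the legacy encoding.
    back = utf8.decode("utf-8").encode(legacy_codec)
    # The conversion is lossless only if we get the same bytes back.
    return back == original


# Hypothetical sample text, not taken from TUTORIAL.ko.
sample = "한글".encode("euc_kr")
print(round_trips(sample, "euc_kr"))
```

A file for which this comparison fails would behave like TUTORIAL.ja
above: converting it to UTF-8 and back would not reproduce the original.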

I have another question.  Shouldn't it be OK to convert Elisp source
files such as leim/quail/japanese.el to UTF-8 as well?  Emacs
internally converts their text to UTF-8 while compiling them, so the
corresponding .elc files are in UTF-8 already, and there should be no
functional difference if we convert the .el files to UTF-8.

Converting these files to UTF-8 would fix an inconsistency in Emacs
behavior.  For example, if I visit the file leim/quail/japanese.el I see
this definition:

  (defvar quail-japanese-use-double-n nil
    "If non-nil, use type \"nn\" to insert ん.")

where the character 'ん' is displayed using code point 0x2473 in
charset japanese-jisx0208.  But if I *use* the above definition string,
by typing "C-h v quail-japanese-use-double-n RET", the help string
that I see has been translated to UTF-8, so Emacs displays that
character using code point 0x3093 in charset unicode instead.  It
would be better if the runtime behavior matched the source code, and
an easy way to do that would be to convert the source code to UTF-8.
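
The correspondence between the two code points can be checked outside
Emacs.  The following is a small Python sketch, not anything Emacs
itself runs; it assumes Python's euc_jp codec as a stand-in for the
japanese-jisx0208 charset, relying on the fact that EUC-JP represents a
JIS X 0208 code by setting the high bit of each of its two bytes.

```python
# JIS X 0208 code for the character, as displayed when visiting the file.
jis_code = 0x2473

# EUC-JP encodes a JIS X 0208 code by setting the high bit of both bytes.
euc_bytes = bytes([(jis_code >> 8) | 0x80, (jis_code & 0xFF) | 0x80])

# Decoding yields the Unicode character that the help string shows.
char = euc_bytes.decode("euc_jp")
print(char, hex(ord(char)))  # prints: ん 0x3093
```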

Here is the list of the remaining .el files that I'd like to convert
to UTF-8:

        leim/quail/cyril-jis.el
        leim/quail/hanja-jis.el
        leim/quail/japanese.el
        leim/quail/py-punct.el
        leim/quail/pypunct-b5.el
        lisp/international/ja-dic-cnv.el
        lisp/international/ja-dic-utl.el
        lisp/international/kinsoku.el
        lisp/international/kkc.el
        lisp/international/titdic-cnv.el
        lisp/language/japan-util.el
        lisp/language/japanese.el
        lisp/term/x-win.el

x-win.el is a special case: it has two "Kana: Fixme:" lines that
describe problems when converting to UTF-8.  Evidently these are
already issues in our current setup, since Emacs converts the text to
UTF-8 before compiling it.
