Re: detect big5, utf8, gb2312 as good as firefox

From: Kevin Rodgers
Subject: Re: detect big5, utf8, gb2312 as good as firefox
Date: Wed, 01 Mar 2006 09:16:31 -0700
User-agent: Mozilla Thunderbird 0.9 (X11/20041105)

Dan Jacobson wrote:
Gentlemen, with three plain text files encoded in my most commonly
encountered coding systems, I did this test:
$ for c in zh_CN.gb2312 zh_TW.utf8 zh_TW.big5;
do LC_CTYPE=$c LANG=$c LC_ALL=$c emacs -q gb2312 utf8 big5; done
Well, zh_CN.gb2312 guessed right each time!
With zh_TW.utf8, emacs guessed the big5 and gb files were latin-1, as
seen by the 1 in the modeline and the jumble on the screen.
With zh_TW.big5, the gb2312 file was seen jumbled as type big5.
In .emacs I can do
(set-language-environment "UTF-8")
(prefer-coding-system 'utf-8-unix)
(set-coding-priority ;So that big5 is still guessed right after utf-8.
 (reverse ;Found these lisp thingies and it works.
   (reverse;no lisp pro me
    (append(list 'coding-category-utf-8
to detect all but gb2312 OK. What should I do, make my whole
environment CN even though I only visit that kind of file once a
week, and plan to live in UTF-8 / big5 land ... BTW, firefox guessed
right each time even though they were plain text files with no
charset= hints.

I would instead try this first:

;; From least- to most-preferred:
(prefer-coding-system 'gb2312)
(prefer-coding-system 'big5)
(prefer-coding-system 'utf-8)
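
Since each call to prefer-coding-system moves its argument to the front,
the calls above have to run from least to most preferred. A minimal
sketch of a wrapper that lets you write the list most-preferred first
instead; the name my-prefer-coding-systems is hypothetical, not part of
Emacs, and the final check assumes an Emacs that provides
coding-system-priority-list (hence the fboundp guard):

```elisp
;; Hypothetical helper (not part of Emacs): takes coding systems
;; most-preferred first and applies them in reverse, because each
;; prefer-coding-system call puts its argument at the front.
(defun my-prefer-coding-systems (&rest systems)
  "Prefer SYSTEMS for detection, most-preferred first."
  (dolist (coding-system (reverse systems))
    (prefer-coding-system coding-system)))

;; Same effect as the three calls above: utf-8 first, then big5,
;; then gb2312.
(my-prefer-coding-systems 'utf-8 'big5 'gb2312)

;; Optional check, only where the function exists (an assumption
;; about your Emacs version): report the top-priority coding system.
(when (fboundp 'coding-system-priority-list)
  (message "Highest priority: %s" (coding-system-priority-list t)))
```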

If you insist on frobbing coding-category-list:

 ;; Insert big5 after utf-8:
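 (apply 'nconc
        (mapcar (lambda (coding-category)
                  (if (eq coding-category 'coding-category-utf-8)
                      (list 'coding-category-utf-8 'coding-category-big5)
                    (list coding-category)))
                ;; Assumed completion: the list mapped over is
                ;; coding-category-list, the variable named above.
                coding-category-list))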

Note that you can replace (apply 'nconc (mapcar ...)) with
(mapcan ...) after evaluating (require 'cl).
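
For illustration, the mapcan form might look like the sketch below. The
function name my-insert-after is hypothetical, and the sample list is
made up for the example rather than taken from a real
coding-category-list. It also drops any existing copy of the new element
first, which the mapcar version does not do, so the result has no
duplicate:

```elisp
;; mapcan is built in on recent Emacs versions; older ones get it
;; from the cl package, as noted above.
(unless (fboundp 'mapcan)
  (require 'cl))

;; Hypothetical helper (not part of Emacs).
(defun my-insert-after (new anchor list)
  "Return a copy of LIST with NEW placed right after each ANCHOR.
Any existing occurrence of NEW is removed first."
  (mapcan (lambda (elt)
            (if (eq elt anchor)
                (list elt new)          ; anchor followed by NEW
              (list elt)))              ; everything else unchanged
          (remq new list)))

;; Example with a made-up three-element priority list:
(my-insert-after 'coding-category-big5
                 'coding-category-utf-8
                 '(coding-category-big5
                   coding-category-utf-8
                   coding-category-raw-text))
;; => (coding-category-utf-8 coding-category-big5 coding-category-raw-text)
```

Because the lambda returns a fresh list for every element, the
destructive concatenation done by mapcan does not modify the original
list.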

Do you get better or worse results with those approaches?

Kevin Rodgers
