Re: how to change file coding system

From: Peter Dyballa
Subject: Re: how to change file coding system
Date: Fri, 19 Aug 2005 15:24:42 +0200

On 19.08.2005 at 12:20, Martin Monsorno wrote:

Peter Dyballa <address@hidden> writes:

I have GNU Emacs 21.3.50 from CVS in rare use too.

You have 21.3.50 from CVS?  I have a stable 21.4.1 version!  But I
tried out 22.0.50 from CVS and it NEVER shows me a 'ü'.
Instead it always displays the code: \374 in the latin-1 file, and
\303\274 in the UTF-8 file.
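
The two escape sequences are simply the raw bytes of 'ü' in each encoding, printed in octal because the file was not decoded. A minimal sketch to confirm this, assuming an Emacs with Unicode-based internals (23 or later), where character code #xfc is 'ü':

```elisp
;; Encode 'ü' (U+00FC) in both coding systems and show each byte in octal.
;; Assumption: Emacs 23 or later, where (string #xfc) yields "ü".
(let ((u-umlaut (string #xfc)))
  (list
   ;; Latin-1: one byte
   (mapcar (lambda (byte) (format "%o" byte))
           (append (encode-coding-string u-umlaut 'iso-8859-1) nil))
   ;; UTF-8: two bytes
   (mapcar (lambda (byte) (format "%o" byte))
           (append (encode-coding-string u-umlaut 'utf-8) nil))))
;; => (("374") ("303" "274"))
```

Seeing \374 or \303\274 in the buffer therefore suggests the file was read as raw bytes rather than decoded as Latin-1 or UTF-8.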

What about a nice fontset?

(create-fontset-from-fontset-spec "-adobe-courier-medium-r-*-*-11-*-*-*-*-*-fontset-11pt_adobe_courier" t 'noerror)
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-1 '("adobe-courier" . "iso8859-1"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-2 '("adobe-courier" . "iso8859-2"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-3 '("adobe-courier" . "iso8859-3"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-4 '("adobe-courier" . "iso8859-4"))
(set-fontset-font "fontset-11pt_adobe_courier" 'greek-iso8859-7 '("adobe-couriergr" . "iso8859-7"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-9 '("adobe-courier" . "iso8859-9"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-14 '("adobe-courier" . "iso8859-14"))
(set-fontset-font "fontset-11pt_adobe_courier" 'latin-iso8859-15 '("adobe-courier" . "iso8859-15"))
(set-fontset-font "fontset-11pt_adobe_courier" 'mule-unicode-0100-24ff '("adobe-courier" . "iso10646-1"))
(set-fontset-font "fontset-11pt_adobe_courier" 'mule-unicode-2500-33ff '("adobe-courier" . "iso10646-1"))
(set-fontset-font "fontset-11pt_adobe_courier" 'mule-unicode-e000-ffff '("adobe-courier" . "iso10646-1"))
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0370) (decode-char 'ucs #x03cf)) '("courier new" . "iso10646-1")) ; Greek
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x03d0) (decode-char 'ucs #x03ff)) '("lucida sans typewriter" . "iso10646-1")) ; Coptic
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0400) (decode-char 'ucs #x04ff)) '("lucida sans typewriter" . "iso10646-1")) ; Cyrillic
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0500) (decode-char 'ucs #x052f)) '("lucida sans typewriter" . "iso10646-1")) ; Cyrillic Supplement
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0530) (decode-char 'ucs #x058f)) '("aramian unicode" . "iso10646-1")) ; Armenian (sylfaen
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0590) (decode-char 'ucs #x05af)) '("lucida sans typewriter" . "iso10646-1")) ; Hebrew
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x05b0) (decode-char 'ucs #x05ff)) '("courier new" . "iso10646-1")) ; Hebrew
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0600) (decode-char 'ucs #x066f)) '("courier new" . "iso10646-1")) ; Arabic
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0670) (decode-char 'ucs #x06ff)) '("lucida sans typewriter" . "iso10646-1")) ; Arabic
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0700) (decode-char 'ucs #x074f)) '("titus cyberbit basic" . "iso10646-1")) ; Syriac
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0780) (decode-char 'ucs #x07bf)) '("titus cyberbit basic" . "iso10646-1")) ; Thaana
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0900) (decode-char 'ucs #x097f)) '("titus cyberbit basic" . "iso10646-1")) ; Devanagari
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0980) (decode-char 'ucs #x09ff)) '("code2000" . "iso10646-1")) ; Bengali
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0a00) (decode-char 'ucs #x0a7f)) '("code2000" . "iso10646-1")) ; Gurmukhi
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0a80) (decode-char 'ucs #x0aff)) '("code2000" . "iso10646-1")) ; Gujarati
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0b00) (decode-char 'ucs #x0b7f)) '("code2000" . "iso10646-1")) ; Oriya
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0b80) (decode-char 'ucs #x0bff)) '("code2000" . "iso10646-1")) ; Tamil
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0c00) (decode-char 'ucs #x0c7f)) '("code2000" . "iso10646-1")) ; Telugu
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0c80) (decode-char 'ucs #x0cff)) '("code2000" . "iso10646-1")) ; Kannada
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0d00) (decode-char 'ucs #x0d7f)) '("code2000" . "iso10646-1")) ; Malayalam
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0d80) (decode-char 'ucs #x0dff)) '("akshar unicode" . "iso10646-1")) ; Sinhala
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0e00) (decode-char 'ucs #x0e7f)) '("code2000" . "iso10646-1")) ; Thai
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0e80) (decode-char 'ucs #x0eff)) '("code2000" . "iso10646-1")) ; Lao
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x0f00) (decode-char 'ucs #x0fff)) '("xtashi" . "iso10646-1")) ; Tibetan
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1000) (decode-char 'ucs #x109f)) '("code2000" . "iso10646-1")) ; Myanmar
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x10a0) (decode-char 'ucs #x10ff)) '("everson mono unicode" . "iso10646-1")) ; Georgian
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1100) (decode-char 'ucs #x11ff)) '("code2000" . "iso10646-1")) ; Hangul Jamo
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1200) (decode-char 'ucs #x137f)) '("ethiopia jiret" . "iso10646-1")) ; Ethiopic
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x13a0) (decode-char 'ucs #x13ff)) '("everson mono unicode" . "iso10646-1")) ; Cherokee
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1400) (decode-char 'ucs #x167f)) '("everson mono unicode" . "iso10646-1")) ; Canadian Aboriginal Syllabics
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1680) (decode-char 'ucs #x169f)) '("everson mono unicode" . "iso10646-1")) ; Ogham
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x16a0) (decode-char 'ucs #x16ff)) '("everson mono unicode" . "iso10646-1")) ; Runic
; (set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1700) (decode-char 'ucs #x171f)) '("code2000" . "iso10646-1")) ; Tagalog
; (set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1720) (decode-char 'ucs #x173f)) '("code2000" . "iso10646-1")) ; Hanunoo
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1740) (decode-char 'ucs #x175f)) '("code2000" . "iso10646-1")) ; Buhid
; (set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1760) (decode-char 'ucs #x177f)) '("code2000" . "iso10646-1")) ; Tagbanwa
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1780) (decode-char 'ucs #x17ff)) '("code2000" . "iso10646-1")) ; Khmer
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1800) (decode-char 'ucs #x18af)) '("code2000" . "iso10646-1")) ; Mongolian
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1900) (decode-char 'ucs #x194f)) '("code2000" . "iso10646-1")) ; Limbu
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1950) (decode-char 'ucs #x197f)) '("tai le valentinium" . "iso10646-1")) ; Tai Le
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x19e0) (decode-char 'ucs #x19ff)) '("cdt khmer" . "iso10646-1")) ; Khmer Symbols
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1d00) (decode-char 'ucs #x1d7f)) '("everson mono unicode" . "iso10646-1")) ; Phonetic Extensions
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1e00) (decode-char 'ucs #x1eff)) '("courier" . "iso10646-1")) ; Latin Extended Additional
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x1f00) (decode-char 'ucs #x1fff)) '("everson mono unicode" . "iso10646-1")) ; Greek Extended
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2000) (decode-char 'ucs #x206f)) '("everson mono unicode" . "iso10646-1")) ; General Punctuation
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2070) (decode-char 'ucs #x209f)) '("everson mono unicode" . "iso10646-1")) ; Superscripts and Subscripts
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x20a0) (decode-char 'ucs #x20cf)) '("everson mono unicode" . "iso10646-1")) ; Currency Symbols
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x20d0) (decode-char 'ucs #x20ff)) '("everson mono unicode" . "iso10646-1")) ; Combining Marks for Symbols
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2100) (decode-char 'ucs #x214f)) '("everson mono unicode" . "iso10646-1")) ; Letterlike Symbols
; (set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2150) (decode-char 'ucs #x218f)) '("courier new" . "iso10646-1")) ; Number Forms
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2190) (decode-char 'ucs #x21ff)) '("code2000" . "iso10646-1")) ; *Arrows
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2200) (decode-char 'ucs #x22ff)) '("code2000" . "iso10646-1")) ; Mathematical Operators
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2300) (decode-char 'ucs #x23ff)) '("code2000" . "iso10646-1")) ; Miscellaneous Technical
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2400) (decode-char 'ucs #x243f)) '("code2000" . "iso10646-1")) ; Control Pictures
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2440) (decode-char 'ucs #x245f)) '("code2000" . "iso10646-1")) ; Optical Character Recognition
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2460) (decode-char 'ucs #x24ff)) '("code2000" . "iso10646-1")) ; Enclosed Alphanumerics
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2500) (decode-char 'ucs #x257f)) '("code2000" . "iso10646-1")) ; Box Drawing
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2580) (decode-char 'ucs #x259f)) '("code2000" . "iso10646-1")) ; Block Elements
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x25a0) (decode-char 'ucs #x25ff)) '("code2000" . "iso10646-1")) ; Geometric Shapes
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2600) (decode-char 'ucs #x26ff)) '("code2000" . "iso10646-1")) ; Miscellaneous Symbols
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2700) (decode-char 'ucs #x27bf)) '("code2000" . "iso10646-1")) ; Dingbats
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x27c0) (decode-char 'ucs #x27ef)) '("code2000" . "iso10646-1")) ; Miscellaneous Math Symbols-A
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x27f0) (decode-char 'ucs #x27ff)) '("code2000" . "iso10646-1")) ; Supplemental Arrows-A
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2800) (decode-char 'ucs #x28ff)) '("code2000" . "iso10646-1")) ; Braille Patterns
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2900) (decode-char 'ucs #x297f)) '("code2000" . "iso10646-1")) ; Supplemental Arrows-B
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2980) (decode-char 'ucs #x29ff)) '("code2000" . "iso10646-1")) ; Miscellaneous Math Symbols-B
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2a00) (decode-char 'ucs #x2aff)) '("code2000" . "iso10646-1")) ; Supplemental Math Operators
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2b00) (decode-char 'ucs #x2bff)) '("code2000" . "iso10646-1")) ; Miscellaneous Symbols and Arrows
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2e80) (decode-char 'ucs #x2eff)) '("code2000" . "iso10646-1")) ; CJK Radicals Supplement
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2f00) (decode-char 'ucs #x2fdf)) '("code2000" . "iso10646-1")) ; Kangxi Radicals
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x2ff0) (decode-char 'ucs #x2fff)) '("code2000" . "iso10646-1")) ; Ideographic Description Characters
(set-fontset-font "fontset-11pt_adobe_courier" (cons (decode-char 'ucs #x3000) (decode-char 'ucs #x303f)) '("code2000" . "iso10646-1")) ; CJK Symbols and Punctuation

It looks ugly here ... and it still needs to be continued!
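
One possible way to make the list less ugly is to keep the Unicode block ranges in a table and loop over it. This is only a sketch, not the configuration from the message: the variable name `my-fontset-ranges` is hypothetical, only three of the blocks are shown, and it assumes the fontset has already been created and that the named fonts are installed.

```elisp
;; Hypothetical table: (FROM TO FAMILY) per Unicode block.
;; Only a few of the blocks from the message are listed here.
(defvar my-fontset-ranges
  '((#x0370 #x03cf "courier new")             ; Greek
    (#x0400 #x04ff "lucida sans typewriter")  ; Cyrillic
    (#x2200 #x22ff "code2000"))               ; Mathematical Operators
  "Unicode ranges and the font family to use for each (illustrative).")

;; Apply each entry to the fontset named in the message.
;; Assumes the fontset exists and Emacs runs on a graphical display.
(dolist (entry my-fontset-ranges)
  (set-fontset-font "fontset-11pt_adobe_courier"
                    (cons (decode-char 'ucs (nth 0 entry))
                          (decode-char 'ucs (nth 1 entry)))
                    (cons (nth 2 entry) "iso10646-1")))
```

Adding another block then means adding one line to the table instead of repeating the whole call.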

Specifying the coding system in the headerline doesn't change
anything: buffer-file-coding-system's value remains raw-text-unix.

Then you're doing something the wrong way ... At least my experience (more than 10 years) tells me so. This coding cookie has never failed for me (and I am no politician):

        -*- coding: iso-8859-15; -*-

And yes, I meant, and I wrote, C-x RET r. Usually I first view a file's
contents and then judge whether I need to change the encoding Emacs uses by

C-h k C-x RET r is undefined
-->  C-x RET r is undefined

C-h b shows, amongst many others:

C-x RET l       set-language-environment
C-x RET c       universal-coding-system-argument
C-x RET C-\     set-input-method
C-x RET X       set-next-selection-coding-system
C-x RET x       set-selection-coding-system
C-x RET p       set-buffer-process-coding-system
C-x RET k       set-keyboard-coding-system
C-x RET t       set-terminal-coding-system
C-x RET F       set-file-name-coding-system
C-x RET r       revert-buffer-with-coding-system
C-x RET f       set-buffer-file-coding-system
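
Where the key binding is missing, the same effect can be approximated from Lisp. A minimal sketch, assuming an Emacs version that provides `revert-buffer-with-coding-system`; the file name is hypothetical:

```elisp
;; Re-read the file of the current buffer, decoding it as ISO-8859-15.
;; This is the command listed above for C-x RET r.
(revert-buffer-with-coding-system 'iso-8859-15)

;; Alternative: force the decoding while visiting a file.
;; "~/umlauts.txt" is a hypothetical file name.
(let ((coding-system-for-read 'iso-8859-15))
  (find-file "~/umlauts.txt"))
```

The first form must be evaluated in a buffer that visits a file, for example with M-: in that buffer.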

BTW: to which value is LC_CTYPE set? LC_ALL? LANG?
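
These values can also be checked from inside Emacs, which shows what the running process actually inherited. A small sketch using only standard variables and `getenv`; evaluate it with M-: or in the *scratch* buffer:

```elisp
;; Collect the locale-related environment variables as seen by Emacs,
;; plus the coding system and language environment Emacs derived from them.
(list (mapcar (lambda (var) (cons var (getenv var)))
              '("LC_ALL" "LC_CTYPE" "LANG"))
      locale-coding-system
      current-language-environment)
```

A variable that is unset shows up with the value nil.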

With peaceful greetings


Honking cars don't bite.

