Re: Usage of standard-display-table in MSDOS
From: Kenichi Handa
Subject: Re: Usage of standard-display-table in MSDOS
Date: Mon, 06 Sep 2010 14:14:01 +0900
In article <address@hidden>, "Ehud Karni" <address@hidden> writes:
> I attach a tar.bz2 file with 3 files:
> 1. lit1 - the sample file.
> 2. lit1-tty.png - how it should show on text terminal.
> 3. lit1-x.png - how it should show on X.
> I can do it if I read the file with the iso-latin-1 coding-system
> and change the display table to show the Hebrew glyphs for the Hebrew
> [#xE0-#xFA] bytes. But in this way it is not Hebrew characters (e.g.
> for the new bidi display). I want it the other way around, to read it
> with hebrew-iso-8bit and to tweak the display table to show all
> the bytes not belonging to the Hebrew set.
Does that mean you want bidi reordering for the bytes
#xE0..#xFA (code points of iso-8859-8), but do not need it
for the bytes #x80..#x9A (code points of cp862)?
However, your file "lit1" contains bytes #xE0..#xFA (code
points of iso-8859-8) on its second through fourth lines, in
visual order. If bidi reordering is applied to them, you
will get a different view from the one in lit1-tty.png and
lit1-x.png. Is that OK?
> I had a similar problem a long time ago. In 2001 you suggested to use
> the following code:
> (make-coding-system
> 'hebrew-iso-8bit 2 ?8
> "ISO 2022 based 8-bit encoding for Hebrew (MIME:ISO-8859-8)"
> '(ascii hebrew-iso8859-8 nil nil
> nil ascii-eol ascii-cntl nil nil nil nil nil t)
> '((safe-charsets ascii hebrew-iso8859-8 eight-bit-control)
> (mime-charset . iso-8859-8)))
> Maybe I can define a new coding system that will have bytes #x80-#xFF
> as legal characters and be recognized as Hebrew variant.
This code will do that. I think it is not difficult to
understand what it is doing.
------------------------------------------------------------
;; Bytes #x80..#xDF take their meaning from CP862.
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x80 #xDF]
  :subset '(cp862 #x80 #xDF #x00))

;; Bytes #xE0..#xFA take their meaning from ISO-8859-8.
(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

;; iso-8859-8-sub is listed before cp862-sub, so it is preferred
;; when encoding.
(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8 and CP862"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub)
  :ascii-compatible-p t)
------------------------------------------------------------
Please try C-x C-m c mix-hebrew RET C-x C-f lit1 RET.
But if you do that, you must consider the problem Eli described:
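The same visit can be done from Lisp by binding
coding-system-for-read around find-file. This is only a minimal
sketch: it assumes the mix-hebrew definition above has already been
evaluated and that the sample file lit1 is in the current directory.

```elisp
;; Sketch: visit the sample file with mix-hebrew forced for decoding.
;; Assumes mix-hebrew is already defined and that "lit1" (the sample
;; file from this thread) is in the current directory.
(let ((coding-system-for-read 'mix-hebrew))
  (find-file "lit1"))
```

The binding affects only this one visit, so the default coding-system
priorities are left alone.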
In article <address@hidden>, Eli
Zaretskii <address@hidden> writes:
> But if you want all the Hebrew characters to be treated by Emacs as
> such (e.g., for bidi display), no matter what's their encoding in the
> file, you will have to define a coding-system that will decode them
> all into Unicode codepoints of Hebrew characters. There's a problem
> you will need to solve for defining such a coding system: it has 2
> different encodings for the same character, one from hebrew-iso-8bit,
> the other from cp862. So you will need to decide how will Hebrew
> characters be encoded when the file is saved.
In the above definition of mix-hebrew, as iso-8859-8-sub is
listed before cp862-sub, all Hebrew characters are encoded
into bytes #xE0..#xFA even if they were originally decoded
from bytes #x80..#x9A.
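One way to see this for yourself is to round-trip a single CP862
byte through the coding system. This is a sketch that assumes the
first mix-hebrew definition above has been evaluated; the variable
names are only for illustration.

```elisp
;; Sketch: round-trip the CP862 byte #x80 through mix-hebrew.
;; Assumes the first mix-hebrew definition above has been evaluated.
(let* ((decoded (decode-coding-string (unibyte-string #x80) 'mix-hebrew))
       (encoded (encode-coding-string decoded 'mix-hebrew)))
  ;; Return the decoded text together with the list of bytes it is
  ;; written back as.
  (list decoded (append encoded nil)))
```

Going by the explanation above, the re-encoded byte would be #xE0
rather than the original #x80.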
If you don't like that, you must give up decoding bytes
#x80..#x9A into Hebrew characters. Instead, decode them as
raw bytes and set up a display table to show them as Hebrew
characters. This code does that:
------------------------------------------------------------
;; Only bytes #x9B..#xDF are decoded via CP862 now; #x80..#x9A fall
;; through to eight-bit and stay raw bytes.
(define-charset 'cp862-sub
  "Subset of CP862"
  :code-space [#x9B #xDF]
  :subset '(cp862 #x9B #xDF #x00))

(define-charset 'iso-8859-8-sub
  "Subset of ISO-8859-8"
  :code-space [#xE0 #xFA]
  :subset '(iso-8859-8 #xE0 #xFA #x00))

(define-coding-system 'mix-hebrew
  "Mixture of ISO-8859-8, CP862, and raw 8-bit bytes"
  :mnemonic ?H
  :coding-type 'charset
  :charset-list '(ascii iso-8859-8-sub cp862-sub eight-bit)
  :ascii-compatible-p t)

(require 'disp-table)

;; Create the standard display table in case it is still nil.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))

;; Display bytes #x80..#x9A as Hebrew chars (code-points #xE0..#xFA of
;; ISO-8859-8).
(dotimes (i #x1B)
  (aset standard-display-table
        (unibyte-char-to-multibyte (+ #x80 i))
        (vector (decode-char 'iso-8859-8 (+ #xE0 i)))))
------------------------------------------------------------
This display-table setting also works on a text terminal, as
long as you set the terminal coding system to mix-hebrew.
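Setting the terminal coding system could look like this; it is a
minimal sketch that assumes the mix-hebrew definition above has
already been evaluated.

```elisp
;; Sketch: make a text terminal use mix-hebrew for output.
;; Assumes mix-hebrew is already defined.
(set-terminal-coding-system 'mix-hebrew)
```

Interactively, C-x RET t mix-hebrew RET runs the same command.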
---
Kenichi Handa
address@hidden