Re: eight-bit char handling in emacs-unicode
From: Stefan Monnier
Subject: Re: eight-bit char handling in emacs-unicode
Date: 18 Nov 2003 12:12:10 -0500
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
>>> The basic problem is that we don't distinguish a character
>>> (code) and a number. So, we introduce a character object
>> That's one way to look at the problem.
>> Another is to say that the problem is instead that we do not distinguish
>> between arrays of chars and arrays of bytes.
> I agree that it's possible to grasp the problem in that way,
> but I'm not sure which is the better way. Could you explain
> WHY yours is better?
I'm not sure whether it's better or worse. The problem I have with
introducing a new type for chars is that it is a change with far-reaching
consequences, and I'm not sure it would solve all our problems, since many
of them come from bad elisp code.
>> Which of 1 to 3 is the best is not clear, and maybe we can just live with
>> `make-string-unibyte' and `make-string-multibyte'.
> I think you mean string-make-unibyte/multibyte, but, for the
> current problem, we can't use it because string-make-unibyte
> may behave differently in different language environment.
> Such a lang. env. that makes iso-8859-1 or Unicode the
> highest priority for the character `À' is ok.
> (string-make-unibyte (concat '(?a 192))) = "a\300"
> But, if some lang. env. prefers such a charset for `À' that
> encodes it not to 192 (e.g. Vietnamese VSCII), we fail.
No. My `make-string-unibyte' should only work to convert "bytes in
multibyte string" to "bytes in unibyte string": there's no char, thus no
coding-system. If the multibyte string argument contains a char that's
not an eight-bit-char, then it's an error.
To do what your string-make-unibyte does you should use
`encode-coding-string' where the coding system is passed explicitly.
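The strict conversion described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the name `my-strict-make-unibyte' is hypothetical, and the sketch assumes an Emacs in which raw bytes inside a multibyte string belong to the `eight-bit' charset and in which `unibyte-string' is available. It is not the actual change made to `string-make-unibyte'.

```elisp
;; Hypothetical helper, for illustration only.
;; Assumes raw bytes in a multibyte string have charset `eight-bit'.
(defun my-strict-make-unibyte (string)
  "Return a unibyte copy of STRING, which must contain only bytes.
Signal an error if STRING holds a character that is neither ASCII
nor an eight-bit (raw byte) character.  No coding system is involved."
  (if (not (multibyte-string-p string))
      ;; Already unibyte: nothing to convert.
      string
    (apply #'unibyte-string
           (mapcar (lambda (ch)
                     (cond
                      ;; ASCII is the same in both representations.
                      ((< ch 128) ch)
                      ;; A raw byte: map it back to its byte value.
                      ((eq (char-charset ch) 'eight-bit)
                       (multibyte-char-to-unibyte ch))
                      ;; A real character: refuse instead of guessing.
                      (t (error "Not a byte: %S" ch))))
                   string))))
```

With this sketch, (my-strict-make-unibyte (string-to-multibyte "a\300")) should give back the two bytes, while a string containing the character `À' should signal an error; to turn `À' into a byte, the caller would write something like (encode-coding-string "À" 'iso-8859-1) and so name the coding system explicitly.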
I've changed my Emacs so that string-make-unibyte does the above
(i.e. signals an error if it encounters a non-byte char) and it works fairly
well, except for the few places where the elisp code is sloppy and needs to
be fixed.
>> Note that 1-3 are not mutually exclusive so we can use
>> them all.
> Yes, but, at least, I really want to avoid "(3) Make a
> series of new functions".
  (defun concat-unibyte (&rest x)
    (make-string-unibyte (apply 'concat x)))
...
so we don't need this series of new functions, but if some of them are used
often enough, we can add them of course.
Stefan