From: Stefan Monnier
Subject: bug#11073: 24.0.94; BIDI-related crash in redisplay with certain byte sequences
Date: Thu, 29 Mar 2012 12:04:22 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.94 (gnu/linux)

>> I understand this part.  The part I don't understand is why we do
>> unification when reading a char from the buffer's text.  That is: why
>> unify chars in `int' (or Lisp_Object) form but not in the
>> internal-utf-8 representation?

>> I would expect the unification to happen during encoding/decoding

> Usually, yes.  But as far as there is a code space in high
> area for a CJK charset, it is unavoidable to have a
> buffer/string that contains a character represented by a
> byte sequence in that high area as the test case of
> Bug#11073.  And, as "unification" means to treat such a
> character the same way as the unified character, I thought
> they both have the same character code.

Since there are two internal byte-sequence representations, I don't see
any good reason why we shouldn't have two internal int representations.
I.e., if unification failed for the byte sequence (which might be the
result of a bug, for all I know), we may as well keep the characters
non-unified in the int representation too.

