bug#31315: wrong font encoding for fallback font


From: Werner LEMBERG
Subject: bug#31315: wrong font encoding for fallback font
Date: Tue, 01 May 2018 08:36:44 +0200 (CEST)

> And I think you might be mistaken in your interpretation of what
> "gb18030.2000" in the font name means: I think it's the font registry,
> not its encoding.

Yes, but the font registry implies the encoding used to access the
font.

> How sure are you that the encoding of this font is indeed
> gb18030.2000?

Quite sure.  To be more precise: The real encoding of the font is
irrelevant (the Droid Sans Fallback font is a standard TrueType font
that has only a Unicode cmap); what matters is how the font backend
provides the font to the client.  Calling `xlsfonts' I see that X11
offers access as follows.

  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-2
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-cns11643-3
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb18030.2000-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-gb2312.1980-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-iso10646-1
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0201.1976-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1983-0
  -misc-droid sans fallback-medium-r-normal--0-0-0-0-p-0-jisx0208.1990-0

>> The problem now is that the encoding of the fallback font is not
>> respected.  In the image, the highlighted character is U+83EF, but
>> Emacs incorrectly displays U+51BF instead.
>>
>> The GB 18030 bytes to represent U+51BF are \x83\xEF; this clearly
>> shows that Emacs lacks an iconv call (or an equivalent to that);
>> instead, it seems to simply feed the Unicode value to the font
>> backend.
>
> Tz-tz-tz, how can you even suggest something like that about Emacs ;-)
>
> If you look in xfont_encode_char, you will see that it does encode
> the character before handing it to the font-drawing function.  But I
> see that font-encoding-alist has this to say about gb18030:
>
>  ("gb18030" unicode)
>
> Does replacing that with something like this:
>
>  ("gb18030" (gb18030 . unicode))
>
> solve the problem?

Yes, it seems so.

> What we put in font-encoding-alist now was a deliberate change in
> Jan 2008, in response to a bug report; see
>
>   http://lists.gnu.org/archive/html/emacs-devel/2008-01/msg00754.html
>
> If fonts like this one need to have characters encoded by gb18030,
> then I think we need to change what the value says.

As can be seen above, the font itself doesn't need GB18030.  It's the
font backend that provides this encoding, and Emacs accesses it.

> But this area in Emacs is under-documented, so I'm not sure I've
> got it right, in particular what is the effect of ENCODING and
> REPERTORY in this context.  For most font back-ends, ENCODING is
> ignored, because the back-end is capable of encoding the character we
> hand to it.  But the xfont back-end indeed uses Emacs's encoding
> functions to do that externally to the corresponding X APIs.  Which
> might explain why this problem, if indeed we fail to specify the
> correct encoding for this charset, was never reported till now:
> xfont is rarely if ever used.

Emacs doesn't fail to specify the correct encoding.  The problem is
that it feeds the font backend with characters in the wrong encoding
(namely Unicode instead of GB 18030).
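
The mismatch can be reproduced outside Emacs.  The Python sketch below
relies only on Python's built-in `gb18030' codec; the helper names are
hypothetical.  It uses the 16-bit Unicode value directly as a two-byte
font index (which is what appears to happen here) and reports which
character a GB 18030-indexed font would return for it, then prints the
code that a proper conversion would send instead.

```python
def misread_as_gb18030(codepoint):
    """Character a GB 18030-indexed font yields when it is handed the
    raw 16-bit Unicode value instead of the GB 18030 code.

    Assumes a BMP code point whose two bytes happen to form a valid
    two-byte GB 18030 sequence; otherwise decoding raises an error.
    """
    raw = bytes([codepoint >> 8, codepoint & 0xFF])
    return raw.decode("gb18030")


def gb18030_code(char):
    """Bytes that should be sent to a GB 18030-indexed font for CHAR."""
    return char.encode("gb18030")


wanted = "\u83ef"  # the character the buffer actually contains
shown = misread_as_gb18030(ord(wanted))
print("wanted U+%04X, font shows U+%04X" % (ord(wanted), ord(shown)))
print("correct GB 18030 code:", gb18030_code(wanted).hex())
```

If the analysis in this thread is right, the first line of output names
U+51BF as the character shown.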

>> It's a completely different question why on my system Emacs uses a
>> font encoded in GB 18030 as a fallback font.  It's probably related
>> to the fact that I use `mew' as my e-mail program, manually
>> extended to cover GB 18030.  Unfortunately, I haven't yet been able
>> to trigger the issue with `emacs -Q' (which by default uses iso10646
>> for the fallback font).
>
> Well, we cannot try helping you to unlock this unless you tell how
> you "manually extended" Emacs.

Oh, I haven't extended Emacs, sorry for the bad wording.  I've simply
added a line to mew's elisp code to make it recognize GB18030 in
e-mails.

> In general, the way to request that Emacs uses fonts you like with
> certain characters or charsets is by customizing your fontsets.  I
> cannot say more without hearing the details.

I don't have any fontsets customized in my `.emacs' file.

>> On the other hand, as soon as the problem happens, it happens with
>> any buffer containing CJK characters not displayable with the
>> current font, so it seems a genuine Emacs core bug.
>
> What "problem" do you allude to here?  The first (seemingly
> incorrect encoding) or the second (fallback to this particular
> font)?

Both.  If I open a new Unicode-encoded file, Emacs continues to
use GB18030.2000 as the charset registry/encoding for displaying
fallback characters, failing to convert Unicode to GB18030 before
accessing the characters from the font backend.


    Werner




