bug#49066: 26.3; Segmentation fault on specific utf8 string
From: Eli Zaretskii
Subject: bug#49066: 26.3; Segmentation fault on specific utf8 string
Date: Thu, 17 Jun 2021 16:59:42 +0300
> From: Robert Pluim <rpluim@gmail.com>
> Cc: larsi@gnus.org, 49066@debbugs.gnu.org, mvsfrasson@gmail.com
> Date: Thu, 17 Jun 2021 15:07:18 +0200
>
> Full backtrace from an unoptimized build:
Thanks.
> >> Thread 1 "emacs" received signal SIGSEGV, Segmentation fault.
> >> ftfont_shape_by_flt (matrix=<optimized out>, otf=<optimized out>,
> >>     ft_face=<optimized out>, font=<optimized out>, lgstring=...)
> >> at ftfont.c:2573
> >> 2573 g->g.to = LGLYPH_TO (LGSTRING_GLYPH (lgstring, g->g.to));
>
> Eli> So, is 'g' a NULL pointer or something? Or is 'lgstring' faulty in
> Eli> some way? IOW, what is the immediate reason for the
> Eli> segfault?
>
> Itʼs lgstring, I think this is one of those 'nil's in lgstring
Yes, I think so. We can verify that by looking at the value of
g->g.to:
  (gdb) p *g
  $3 = {
    g = {
      c = 2453,
      code = 20,
      from = 0,
      to = 2,  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
And the LGLYPH whose index is 2 is indeed nil:
  (gdb) pp lgstring
  [[#<font-object "-GOOG-Noto Sans Bengali-normal-normal-normal-*-19-*-*-*-*-0-iso10646-1"> 2453 8204]
   nil
   [0 0 2453 20 16 -1 17 12 0 nil]
   [1 1 8204 658 0 -1 1 15 4 nil]
   nil  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
   nil
   nil
   [5 5 0 3039 11 0 12 7 5 nil]
   [6 6 1606 1044 11 0 11 8 3 nil]
   nil]
I think this is a bug in that loop: it should actually exit whenever
it finds the first LGLYPH that is nil, and update gstring.used
accordingly. Something like this:
  for (i = 0; i < gstring.used; i++)
    {
      MFLTGlyphFT *g = (MFLTGlyphFT *) (gstring.glyphs) + i;
      if (NILP (LGSTRING_GLYPH (lgstring, g->g.from))
          || NILP (LGSTRING_GLYPH (lgstring, g->g.to)))
        break;
      g->g.from = LGLYPH_FROM (LGSTRING_GLYPH (lgstring, g->g.from));
      g->g.to = LGLYPH_TO (LGSTRING_GLYPH (lgstring, g->g.to));
    }
  gstring.used = i;
CC'ing Handa-san, as I'm not really familiar with this code.
> This is enough to cause the crash: ক
>
> Thats #x995 followed by #x200c. Why are we trying to compose a ZWNJ?
Because #x995 is a Bengali character, and lisp/language/indian.el
says:
(defconst bengali-composable-pattern
  (let ((table
         '(("a" . "\u0981") ; SIGN CANDRABINDU
           ("A" . "[\u0982\u0983]") ; SIGN ANUSVARA .. VISARGA
           ("V" . "[\u0985-\u0994\u09E0\u09E1]") ; independent vowel
           ("C" . "[\u0995-\u09B9\u09DC-\u09DF\u09F1]") ; consonant
           ("B" . "[\u09AC\u09AF\u09B0\u09F0]") ; BA, YA, RA
           ("R" . "[\u09B0\u09F0]") ; RA
           ("n" . "\u09BC") ; NUKTA
           ("v" . "[\u09BE-\u09CC\u09D7\u09E2\u09E3]") ; vowel sign
           ("H" . "\u09CD") ; HALANT
           ("T" . "\u09CE") ; KHANDA TA
           ("N" . "\u200C") ; ZWNJ <<<<<<<<<<<<<<<<<<<<<<<<<<<
           ("J" . "\u200D") ; ZWJ
           ("X" . "[\u0980-\u09FF]")))) ; all coverage
    (indian-compose-regexp
     (concat
      ;; syllables with an independent vowel, or
      "\\(?:RH\\)?Vn?\\(?:J?HB\\)?v*n?a?A?\\|"
      ;; consonant-based syllables, or
      "Cn?\\(?:J?HJ?Cn?\\)*\\(?:H[NJ]?\\|v*[NJ]?v?a?A?\\)\\|"
      ;; another syllables with an independent vowel, or
      "\\(?:RH\\)?T\\|"
      ;; special consonant form, or
      "JHB\\|"
      ;; any other singleton characters
      "X")
     table))
  "Regexp matching a composable sequence of Bengali characters.")
(which is used below that in setting up composition-function-table for
Bengali characters).
> Eli> It could be some problem with the shaping engine: I guess versions
> Eli> after Emacs 26 are built with HarfBuzz, not m17n-flt? If you forcibly
> Eli> use m17n-flt in a later Emacs, does it still not crash?
>
> emacs-27 built '--without-harfbuzz' and thus with m17n-flt crashes the same
> way.
Yes, it figures.
I hope Handa-san will suggest a solution, for those who want to stick
with m17n-flt.