bug#39799: 28.0.50; Most emoji sequences don’t render correctly

    Eli> I'd prefer not to add Python as prerequisite for building Emacs.  We
    Eli> already use Awk, so using that'd be fine.
    >> I suck at awk, but my attempt is attached.

    Eli> Thanks.  I wonder if we could make the output more human-readable...
    Eli> Glenn, any advice or comments?

Why does it need to be human-readable? The other files generated from
the unicode data are not particularly readable.

    >> It DTRT for me under Cairo if I change my fontset settings to use
    >> 'Noto Color Emoji' instead of Symbola for:

    Eli> Is that a free font (it's from Google, AFAIK, so it might not be)?  If
    Eli> it is free, we could modify fontset.el to use this font if available.
    Eli> (Or maybe there are better free Emoji fonts out there?)

Its license is Apache 2.0. It seems fairly popular. I have no opinion
either way.

    >> (#x1F300 . #x1F5FF)      ;; Misc Symbols and Pictographs
    >> (#x1F900 . #x1F9FF)      ;; Supplemental Symbols and Pictographs
    >> It matches forward off the first char, so the
    >> composition-function-table entries all have '0' as the number of chars
    >> to match. Would it be better to match backwards?

    Eli> I don't think matching backwards is better in general.  Did you have a
    Eli> reason for thinking it was?

I thought I saw a comment in composite.c that says matching is done
backward, but I see that itʼs done forwards as well.

    >> Weʼd run into the 4-character maximum for that, since some of the
    >> sequences are 7 or more characters long.

    Eli> If the sequences are 7 character long, then the forward-matching
    Eli> pattern will hit the same limitation as well, no?

C-h v composition-function-table says:

    PREV-CHARS is a non-negative integer (less than 4) specifying how many
    characters before C to check the matching with PATTERN.  If it is 0,
    PATTERN must match C and the following characters.  If it is 1,
    PATTERN must match a character before C and the following characters.

which, on careful re-reading, says that the lookback canʼt be more than
3 characters, but places no limit on how far the match extends forward.
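
To make that reading concrete, here is a minimal sketch of a
forward-matching rule for the flag sequences (pairs of regional
indicator symbols, U+1F1E6..U+1F1FF). It is illustrative only and is
not the rule set generated by the awk script in this thread. Using
font-shape-gstring as FUNC is an assumption borrowed from common
ligature setups, and note that set-char-table-range replaces any rules
already present for that range.

```elisp
;; Illustrative sketch, not the generated table from this thread.
;; A flag is two consecutive regional indicator symbols.
(let ((ri "[\U0001F1E6-\U0001F1FF]"))
  (set-char-table-range
   composition-function-table
   '(#x1F1E6 . #x1F1FF)
   ;; Each rule is [PATTERN PREV-CHARS FUNC].  PREV-CHARS is 0, so
   ;; PATTERN must match starting at C itself and may extend forward
   ;; over as many following characters as the regexp requires.
   ;; Assumption: `font-shape-gstring' is a suitable FUNC here.
   (list (vector (concat ri ri) 0 'font-shape-gstring))))
```

A longer sequence would only need a longer PATTERN with PREV-CHARS
still 0; the limit of 3 applies to the lookback alone.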

    Eli> The ones in 
    Eli> and specifically the flag sequences and the skin color sequences.  At
    Eli> least AFAIU the original report.

As Mike clarified, you need to change the fontsets in order to get
them to display in colour (uncomposed, of course).
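
For reference, a fontset change along those lines might look like the
sketch below. It assumes a font with the family name "Noto Color Emoji"
is installed, covers only the two ranges quoted earlier, and puts the
font ahead of whatever is already registered for them rather than
replacing it.

```elisp
;; Sketch only: assumes the "Noto Color Emoji" family is installed.
(when (member "Noto Color Emoji" (font-family-list))
  (dolist (range '((#x1F300 . #x1F5FF)    ; Misc Symbols and Pictographs
                   (#x1F900 . #x1F9FF)))  ; Supplemental Symbols and Pictographs
    ;; FONTSET t means the default fontset; 'prepend tries this font
    ;; before any font already registered for the range (e.g. Symbola).
    (set-fontset-font t range
                      (font-spec :family "Noto Color Emoji")
                      nil 'prepend)))
```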


