bug#27544: 25.1; Visualization of Unicode bidirectional marks

From: Eli Zaretskii
Subject: bug#27544: 25.1; Visualization of Unicode bidirectional marks
Date: Sat, 01 Jul 2017 13:36:24 +0300

> From: Itai Berli <address@hidden>
> Date: Sat, 1 Jul 2017 12:58:28 +0300
> Emacs supports 12 Unicode bidirectional marks (ALM, RLM, LRM, LRE,
> RLE, LRO, RLO, PDF, FSI, LRI, RLI, and PDI), each of which displays as
> a very thin space. This raises two problems.
> 1. On the one hand, the fact that these inherently invisible
> marks manifest, by default, as thin spaces undermines attempts at
> precise alignment and positioning. Moreover, in the case of LRM, RLM
> and ALM, this behavior contradicts explicit directions given in the
> Unicode Bidirectional Algorithm 8.0.0 specifications (section 2.6
> Implicit Directional Marks):
> > they do not appear in the display
> (To my understanding, this is meant to apply to all bidi marks, even
> if only stated explicitly for LRM, RLM and ALM.)
> 2. On the other hand, the fact that these spaces are so thin as to be
> barely noticeable, and the fact that they are indistinguishable from
> one another, make it difficult to debug and resolve strange and/or
> erroneous behavior that can happen in a bidi document, an example of
> which is given below.

The above is the default way these control characters are displayed.
This default was chosen to avoid, on the one hand, making them
entirely invisible, which was deemed un-Emacsy, and on the other hand
to keep them barely visible, so that they won't disrupt the legibility
of the displayed text.

However, Emacs being Emacs, this is just the default, and it can be
changed.  The visual appearance of these (and other similar)
characters can be customized via the variable
'glyphless-char-display-control', which is described in the Emacs
manual, and in more detail in the ELisp manual.
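
As a minimal sketch of such a customization: the bidi controls belong
to the 'format-control' group of that variable, so changing the method
for that group changes how they are shown.  The choice of 'acronym'
here is only an illustration; 'zero-width' would instead hide them
completely.  The variable has a custom setter, which is why the sketch
goes through 'customize-set-variable' rather than a plain 'setq'.

```elisp
;; Show format-control characters (which include LRM, RLM, RLO, etc.)
;; as boxed acronyms instead of thin spaces.  Other groups in the
;; alist are left as they were.
(customize-set-variable
 'glyphless-char-display-control
 (cons '(format-control . acronym)
       (assq-delete-all 'format-control
                        (copy-alist glyphless-char-display-control))))
```

With 'acronym', each mark is drawn with its own short name, which
makes the marks distinguishable from one another.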

> The solution to both problems is to make the bidi marks visible in
> `whitespace` mode only, and to give them glyphs that are (a) easy to
> notice, (b) distinguishable from other whitespace visualization glyphs, (c)
> distinct from one another.

You can do that using 'glyphless-char-display-control'.  If that is
somehow not enough, you could also define a display-table entry for
these characters, specifically for whitespace-mode.  Patches to that
effect are welcome (I think this should be a user option, if we want
such a feature).
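
A rough sketch of the display-table approach follows.  This is not an
existing Emacs feature: the names 'my-bidi-mark-labels',
'my-bidi-marks-show' and 'my-bidi-marks-hide' are hypothetical, and the
bracketed labels and the use of the 'escape-glyph' face are arbitrary
choices made for illustration.

```elisp
;; Hypothetical helper code; none of the my-bidi-* names exist in Emacs.
(defvar my-bidi-mark-labels
  '((#x061C . "ALM") (#x200E . "LRM") (#x200F . "RLM")
    (#x202A . "LRE") (#x202B . "RLE") (#x202C . "PDF")
    (#x202D . "LRO") (#x202E . "RLO") (#x2066 . "LRI")
    (#x2067 . "RLI") (#x2068 . "FSI") (#x2069 . "PDI"))
  "Alist mapping bidi control characters to the labels shown for them.")

(defun my-bidi-marks-show ()
  "Display bidi controls in the current buffer as bracketed labels."
  (interactive)
  ;; Create a buffer-local display table if the buffer has none yet.
  (unless buffer-display-table
    (setq buffer-display-table (make-display-table)))
  (dolist (entry my-bidi-mark-labels)
    ;; Each display-table entry is a vector of glyphs; here every
    ;; character of the label is given the `escape-glyph' face.
    (aset buffer-display-table (car entry)
          (vconcat (mapcar (lambda (ch)
                             (make-glyph-code ch 'escape-glyph))
                           (concat "[" (cdr entry) "]"))))))

(defun my-bidi-marks-hide ()
  "Remove the labels installed by `my-bidi-marks-show'."
  (interactive)
  (when buffer-display-table
    (dolist (entry my-bidi-mark-labels)
      (aset buffer-display-table (car entry) nil))))
```

These functions could presumably be called from 'whitespace-mode-hook',
but how that interacts with whitespace-mode's own handling of the
buffer's display table would need checking before turning this into a
patch.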

> If we were able to visualize the whitespace, we would have realized from
> the beginning that the sequence of characters in this paragraph was, from
> left to right:
> RTL-RTL-RLO-RLO-H-e-l-l-o-PDO-,-PDO-SPACE-w-o-r-l-d-!
> Thus, our first three actions removed the first three characters, leaving us
> with:
> RLO-H-e-l-l-o-PDO-,-PDO-SPACE-w-o-r-l-d-!
> We now realize that even the final, correct form, is in fact littered
> with bidi errors and potential landmines!

Overriding the bidi attributes with the likes of RLO can indeed lead
to confusing display.  Emacs has functions that Lisp applications can
use to discover these confusing situations, where the application
would like to warn users.  See the description of
'bidi-find-overridden-directionality' in the ELisp manual.
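
For illustration, a small command built on that function might look
like the sketch below.  The command name 'my-goto-bidi-override' is
hypothetical; the call uses the three-argument form (FROM TO OBJECT),
with nil as OBJECT meaning the current buffer.

```elisp
;; Hypothetical command; only `bidi-find-overridden-directionality'
;; itself is part of Emacs.
(defun my-goto-bidi-override ()
  "Move point to the first character whose directionality is overridden."
  (interactive)
  (let ((pos (bidi-find-overridden-directionality
              (point-min) (point-max) nil)))
    (if pos
        (progn
          (goto-char pos)
          (message "Directionality overridden at position %d" pos))
      (message "No overridden directionality found"))))
```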
