Re: Unicode confusables and reordering characters considered harmful

From: Eli Zaretskii
Subject: Re: Unicode confusables and reordering characters considered harmful
Date: Wed, 03 Nov 2021 15:44:08 +0200

> From: Stefan Kangas <stefan@marxist.se>
> Date: Wed, 3 Nov 2021 12:19:58 +0100
> Cc: Eli Zaretskii <eliz@gnu.org>,
>  Clément Pit-Claudel <cpitclaudel@gmail.com>,
>  Stefan Monnier <monnier@iro.umontreal.ca>,
>  Emacs developers <emacs-devel@gnu.org>
> Depending on how you define it, there is at least one major world
> language (Arabic) that has a RTL script, and other major languages
> such as Urdu, Farsi and Hebrew also use it (and a couple of others
> too).  So I think we should consider to what extent your proposal
> might hurt users of such languages.
> Are these characters important to write comments and strings in any of
> those languages?

Yes, definitely.  Especially when the comments mix RTL characters with
ASCII punctuation and separators (which have "weak" directionality,
and change their actual directionality depending on the surrounding
strong directional text).  This happens quite frequently, because
comments can include arithmetic operators and other similar symbol and
punctuation characters.  Without the formatting controls, this could
make comments and strings almost unreadable in some cases.
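
To make the "weak" vs. "strong" distinction concrete, here is a small
Python sketch (not Emacs code) that prints the Unicode bidi class of
each character in a string.  It relies only on the standard
unicodedata module; the helper name bidi_classes and the STRONG set
are my own, for illustration.  The classes L, R and AL are the strong
ones; everything else (ES, CS, ON, WS, ...) takes its direction from
the surrounding text.

```python
import unicodedata

# Bidi classes with strong directionality per the Unicode Bidirectional
# Algorithm: L (left-to-right), R (right-to-left), AL (Arabic letter).
STRONG = {"L", "R", "AL"}

def bidi_classes(text):
    """Return a list of (char, bidi_class, is_strong) tuples.

    Hypothetical helper for illustration; not part of Emacs.
    """
    result = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        result.append((ch, cls, cls in STRONG))
    return result

if __name__ == "__main__":
    # A Latin letter, a plus sign, and HEBREW LETTER ALEF (U+05D0).
    for ch, cls, strong in bidi_classes("a+\u05d0"):
        print(f"U+{ord(ch):04X} {cls:3} {'strong' if strong else 'not strong'}")
```

The plus sign has no strong direction of its own, so where it ends up
on display depends on the letters around it.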

> Will your proposal make it harder to type in such languages?

Yes, in some cases.

> If yes, are there less invasive solutions?

Yes: detect the situations where the use of these controls is
suspicious.  For example, the current implementation of
bidi-find-overridden-directionality detects when characters that
normally have left-to-right directionality (example: 'a') are forced
to behave as strong right-to-left characters instead -- this is
something "normal" human-readable text should rarely if ever need to
do, and OTOH its potential to confuse is very high.
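
The idea can be sketched outside Emacs as well.  The Python below is a
deliberately simplified illustration, not the algorithm
bidi-find-overridden-directionality actually uses: it tracks only the
explicit embedding and override controls (LRE, RLE, LRO, RLO, PDF),
ignores the isolate controls and the nesting-depth limits of the full
Unicode Bidirectional Algorithm, and treats a newline as the end of a
paragraph.  The function name find_overridden_ltr is made up for this
example.

```python
import unicodedata

LRE = "\u202a"  # LEFT-TO-RIGHT EMBEDDING
RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING
LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE

def find_overridden_ltr(text):
    """Return indices of strong left-to-right characters forced RTL by RLO.

    Simplified sketch: isolates (LRI, RLI, FSI, PDI) are not handled.
    """
    stack = []  # currently open embedding/override controls
    hits = []
    for i, ch in enumerate(text):
        if ch in (LRE, RLE, LRO, RLO):
            stack.append(ch)
        elif ch == PDF:
            if stack:
                stack.pop()
        elif ch == "\n":
            # Explicit controls do not carry over into the next paragraph.
            stack.clear()
        elif stack and stack[-1] == RLO and unicodedata.bidirectional(ch) == "L":
            # The innermost open control is an RLO, so this strong-L
            # character would be displayed as if it were right-to-left.
            hits.append(i)
    return hits
```

Right-to-left letters inside an RLO are not reported, since the
override does not change their direction; only strong left-to-right
characters such as 'a' are flagged.  Text that merely uses embeddings
(LRE/RLE) to arrange mixed-direction comments is left alone.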
