From: Eli Zaretskii
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Thu, 04 Nov 2021 13:20:43 +0200

> Date: Thu, 04 Nov 2021 10:41:41 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: cpitclaudel@gmail.com, stefan@marxist.se, emacs-devel@gnu.org, 
>     db48x@db48x.net, monnier@iro.umontreal.ca, yuri.v.khan@gmail.com
> >> If you could find an actual source code file in an actual project in 
> >> which these characters are used with their intended purpose, it would 
> >> be a pertinent example.
> >
> > Why do you need me to find an actual source code which uses those 
> > controls?  Isn't it clear that any human-readable text in comments and 
> > strings in a program's source code can and will use these controls? How 
> > does the tutorial text that explains technical stuff related to a 
> > computer program differ from what a programmer could wish to write in a 
> > comment or a string in his/her program?
> >
> From a theoretical point of view, that's correct.  From a practical point 
> of view, if these control characters are only found in 0.01% of the files 
> that are hosted on, say, GitLab, and given that these controls can have a 
> dangerous effect, it is reasonable for an editor to make them stand out. 

Since when is it OK to flag characters that are used very rarely?
What would be the sense of doing that?  Should we perhaps flag all the
Egyptian hieroglyphs for the same reason?

> Just like Emacs makes no-break spaces stand out for example (although 
> AFAIK they are not dangerous in any way), with a thin brown line.

It isn't "just like", because those no-break spaces are very
frequent.  I see them almost every day in the email messages I receive
and read.

> AFAIU the solutions you propose are:
> 1. Customize glyphless-char-display-control to display all control 
> characters in a different way.  This is a much cruder solution, it would 
> also have an effect for example on ZWNJ which might be undesirable, and it 
> is also not buffer-local.  Users who want to use these characters 
> legitimately are unlikely to use that solution.
> 2. Improve bidi-find-overridden-directionality to detect such 
> non-legitimate cases.  This has to be done.
> In comparison, the minor-mode exists, it's a small patch, and it's 
> orthogonal to the two solutions you propose.

Small doesn't necessarily mean good.

> Anyway, I think it is time to abandon all hope.

It would be a shame if we abandoned all hope of solving this issue in a
good way.  I, for one, don't abandon hope in this matter.  Making
glyphless-char-display-control support buffer-local customizations is
one way to solve the issue better than by displaying arbitrary glyphs
in place of these characters.  bidi-find-overridden-directionality
will be extended soon to find the problematic text in the examples
from that paper.
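
For readers who want to see which characters are at stake, here is a
minimal sketch in Python.  It is not Emacs code, and it is not how
bidi-find-overridden-directionality works or will work.  It only lists
the explicit directional formatting characters (embeddings, overrides,
isolates and their terminators) found in a piece of text, and it makes no
attempt to tell legitimate uses from malicious ones.  The function name
find_bidi_controls is made up for this illustration.

```python
import unicodedata

# Bidi classes of the explicit directional formatting characters:
# embeddings (LRE, RLE), overrides (LRO, RLO), isolates (LRI, RLI, FSI)
# and their terminators (PDF, PDI).
BIDI_CONTROL_CLASSES = {
    "LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI",
}


def find_bidi_controls(text):
    """Return a list of (line, column, name) for each explicit bidi control.

    Line and column numbers are 1-based.  This is a hypothetical helper
    written for illustration; it only reports positions and does not judge
    whether a given use is legitimate.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES:
                # Fall back to the code point if the character has no name.
                name = unicodedata.name(ch, "U+%04X" % ord(ch))
                hits.append((lineno, col, name))
    return hits
```

Note that ZWNJ (U+200C) has bidi class BN and the implicit marks LRM
and RLM have classes L and R, so none of them are reported by this
sketch.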