From: Eli Zaretskii
Subject: Re: Unicode confusables and reordering characters considered harmful, a simple solution
Date: Thu, 04 Nov 2021 11:45:26 +0200

> Date: Thu, 04 Nov 2021 09:14:42 +0000
> From: Gregory Heytings <gregory@heytings.org>
> cc: Daniel Brooks <db48x@db48x.net>, cpitclaudel@gmail.com, 
>     yuri.v.khan@gmail.com, stefan@marxist.se, monnier@iro.umontreal.ca, 
>     emacs-devel@gnu.org
>
> > The mere presence of these characters is NOT the root cause.  These
> > characters are legitimate and helpful when used as intended.  See
> > TUTORIAL.he for a pertinent example.
>
> But TUTORIAL.he is not a pertinent example, because it's not a file with
> source code.  It's a pertinent example to show that these characters do
> have legitimate uses, which is obvious.

It's a pertinent example, because it shows that these characters have
their use in human-readable text of technical nature (which frequently
mixes RTL characters with LTR letters and punctuation).  That is
exactly what happens in comments and strings which use RTL scripts
within source code.

> If you could find an actual source code file in an actual project in
> which these characters are used with their intended purpose, it
> would be a pertinent example.

Why do you need me to find actual source code that uses those
controls?  Isn't it clear that any human-readable text in comments and
strings in a program's source code can and will use these controls?
How does the tutorial text that explains technical stuff related to a
computer program differ from what a programmer could wish to write in
a comment or a string in his/her program?

Would it be enough if I wrote such source code myself and showed it to
you?  That would be an invented example, but so are the examples in
the paper that brought up this subject, so how is that different?

> Otherwise it is safe and reasonable to assume (as the Rust
> developers did) that the mere presence of these characters in source
> code files is a potential problem and must be flagged as such.

It's easy, that's for sure.  Reasonable it isn't.  Nor is it safe,
because any user who wants to use these characters legitimately will
quickly turn off that warning for good.

So it works for the Rust developers to tick a checkbox, but it isn't a
solution for the problem.
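
To illustrate why a check for mere presence cannot tell legitimate use
from abuse, here is a minimal Python sketch of such a check.  The
function name and the table are my own choices for illustration (the
code points are the bidi formatting characters listed in UAX #9); this
is an assumption about how a presence-only check might look, not what
the Rust compiler actually implements.

```python
# Bidirectional formatting characters from UAX #9 (illustrative table).
BIDI_CONTROLS = {
    "\u061c": "ALM", "\u200e": "LRM", "\u200f": "RLM",
    "\u202a": "LRE", "\u202b": "RLE", "\u202c": "PDF",
    "\u202d": "LRO", "\u202e": "RLO",
    "\u2066": "LRI", "\u2067": "RLI", "\u2068": "FSI", "\u2069": "PDI",
}


def find_bidi_controls(text):
    """Return (line, column, name) for every bidi control in TEXT.

    This reports mere presence; it knows nothing about whether the
    control is used as intended.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, BIDI_CONTROLS[ch]))
    return hits


# A string literal that isolates a Hebrew word with RLI ... PDI,
# which is the intended use of these controls.
legit = 'title = "\u2067\u05e9\u05dc\u05d5\u05dd\u2069"'
print(find_bidi_controls(legit))  # [(1, 10, 'RLI'), (1, 15, 'PDI')]
```

The properly isolated Hebrew word is reported exactly as an abuse of
the same controls would be, so a user who writes such strings has no
option but to silence the check.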
