Re: Unicode confusables and reordering characters considered harmful

emacs-devel

From:	Eli Zaretskii
Subject:	Re: Unicode confusables and reordering characters considered harmful
Date:	Thu, 04 Nov 2021 10:21:12 +0200

> From: Reini Urban <reini.urban@gmail.com>
> Date: Thu, 4 Nov 2021 08:50:14 +0100
> Cc: emacs-devel@gnu.org
> 
>      int hi = 5;
>      int שָׁלוֹם = hi;
>      int hello = 10;
>      int السّلامعليك = hello;
>      myfun(שָׁלוֹם ,السّلامعليكم)
> 
>  IMO this code is fundamentally valid: we should allow
>  programmers to write identifiers in their native tongue.
> 
> Sure, nobody wants to forbid Unicode identifiers. The rules only ensure that
> identifiers remain identifiable.
> I converted it to Perl (because I dislike Java and Rust), and ran it through
> cperl.
> The problem is that on a casual look, or in code review, you won't see any
> problem, hence the security risk.
> You need to adjust your tools.
> 
> But the very first RTL identifier שָׁלוֹם already contains non-identifier
> characters.

Which of its characters are non-identifier, and why?  That identifier
uses characters of a single script, AFAICT.
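
One rough way to look at this question is to list each character of the identifier with its general category. The sketch below uses only Python's standard unicodedata module. The spelling of the identifier is an assumption, written out as explicit code points in canonical order; the name IDENT and the helper describe() are hypothetical. Python has no script property in its standard library, so the "HEBREW" prefix of the character name serves here only as a crude stand-in for the script. The check reflects Python's own XID_Start/XID_Continue rules and says nothing about the identifier status assigned by UTS #39.

```python
import unicodedata

# Assumed spelling of the first RTL identifier in the quoted snippet:
# shin, qamats, shin dot, lamed, vav, holam, final mem.
IDENT = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"


def describe(identifier):
    """Return (code point, general category, character name) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
        for ch in identifier
    ]


rows = describe(IDENT)
for row in rows:
    print(*row)

# str.isidentifier() applies Python's XID_Start/XID_Continue rules,
# which is not the same test as the UTS #39 restriction levels.
print("valid Python identifier:", IDENT.isidentifier())
```

Every character reported is either a letter (Lo) or a nonspacing mark (Mn) whose name begins with HEBREW, so under these assumptions nothing outside a single script shows up.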

> So I cannot tell you if this code doesn't violate any of the 4 unicode mixed 
> script profiles
> (http://www.unicode.org/reports/tr39/#Mixed_Script_Detection 2-5)
> Or if any of the unreadable characters are of the recommended scripts:

Which characters in that fragment are "unreadable" for this purpose?
