Re: Unicode confusables and reordering characters considered harmful

From: Stefan Monnier
Subject: Re: Unicode confusables and reordering characters considered harmful
Date: Wed, 03 Nov 2021 08:20:01 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)

> AFAIK, these specific characters are not necessary to write comments and
> strings in these languages.  Here are two random files which use RTL strings
> and comments, and in which these characters are not used:

I was more worried about the fact that, while highlighting those chars
might be helpful to warn about accidental uses of them, if attackers
want to trick the reader, I'm pretty sure they can get similar results
without having to use those special LTR/RTL override chars:

    int hi = 5;
    int שָׁלוֹם = hi;
    int hello = 10;
    int السّلامعليك = hello;
    myfun(שָׁלוֹם ,السّلامعليكم)

There's no override here, but did I call `myfun` with args 5 and 10 or
did I call it with args 10 and 5?

[ OK, admittedly, for a bidi-idiot like me, it looks like neither, since
  the Arabic shaping of the two occurrences of the identifier actually looks
  different (and I truly have no clue why that is here), so I'm led to
  believe that the second is a reference to a non-existent
  variable ;-)  ]
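
One way to take the rendering out of the question is to look at the call
in logical (storage) order, which is the order a compiler sees, and not in
the order the display engine paints it.  The following is only a rough
sketch in Python, not anything Emacs does: the helper names (`has_rtl`,
`code_points`, `report`) are made up for illustration, and the identifier
strings are copied by hand from the snippet above, so they are only as
accurate as that copy.  It lists each argument by position, says whether it
contains strong right-to-left characters, and checks whether the name
matches one of the declared names code point for code point.

```python
import unicodedata

# Strong right-to-left bidi classes: "R" (e.g. Hebrew) and "AL" (Arabic letters).
STRONG_RTL = {"R", "AL"}


def has_rtl(text):
    """Return True if any character in text has a strong RTL bidi class."""
    return any(unicodedata.bidirectional(ch) in STRONG_RTL for ch in text)


def code_points(text):
    """Spell out text as code points in logical (storage) order."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Names copied by hand from the snippet (an assumption: the copy is faithful).
declared = ["hi", "שָׁלוֹם", "hello", "السّلامعليك"]
call_args = ["שָׁלוֹם", "السّلامعليكم"]


def report(args, known_names):
    """Describe each argument in logical order: position, code points,
    whether it contains RTL characters, and whether it was declared."""
    rows = []
    for pos, name in enumerate(args, start=1):
        rows.append((pos, code_points(name), has_rtl(name), name in known_names))
    return rows


if __name__ == "__main__":
    for pos, cps, rtl, known in report(call_args, declared):
        print(f"arg {pos}: rtl={rtl} declared={known} {cps}")
```

Since the comparison is done on code points, two names that merely look
alike on screen but differ in storage come out as different names, and the
argument positions printed are the ones the compiler would use.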


