[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Unicode confusables and reordering characters considered harmful

From: Vasilij Schneidermann
Subject: Unicode confusables and reordering characters considered harmful
Date: Tue, 2 Nov 2021 13:57:20 +0100

There's a paper going around that demonstrates how two Unicode features
can be used to trick source code auditors into misinterpreting program
logic. The authors have suggested that language specifications should be
amended, implementations should warn or raise errors and editor tooling
should display visual warnings. Both issues are tracked as
CVE-2021-42574 and CVE-2021-42694.

The first issue is about bidirectional reordering characters. If bidi
text rendering is not needed, it's easy enough to work around with
`(setq-default bidi-display-reordering nil)`. Some people already make
use of this to speed up redisplay. Maybe there's a better solution, such
as automatically detecting whether the user is working with a RTL script
and only then enable bidi text rendering.

The second issue is about mixed-script confusable characters. Emacs does
not appear to have a workaround for that. I've come across the
uni-confusables package in GNU ELPA, but it merely sets up character
tables. The only mention of confusables I can find in the Emacs sources
is for `help-uni-confusables` which contains a much smaller list for
quotation marks, used in help buffers and elisp buffers. A
possible solution would be to implement the Unicode confusables
algorithm and expose it in the uni-confusables package.



Attachment: signature.asc
Description: PGP signature

reply via email to

[Prev in Thread] Current Thread [Next in Thread]