[bug #63334] \[u....] syntax for ASCII characters handled inconsistently

From: Dave
Subject: [bug #63334] \[u....] syntax for ASCII characters handled inconsistently
Date: Tue, 8 Nov 2022 04:34:05 -0500 (EST)


                 Summary: \[u....] syntax for ASCII characters handled inconsistently
                 Project: GNU troff
               Submitter: barx
               Submitted: Tue 08 Nov 2022 03:34:02 AM CST
                Category: Core
                Severity: 2 - Minor
              Item Group: Warning/Suspicious behaviour
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any
         Planned Release: None


Follow-up Comments:

Date: Tue 08 Nov 2022 03:34:02 AM CST By: Dave <barx>
I see the same behavior in groff 1.22.4 and in the latest git code.  (And for
that matter, going all the way back to at least 1.19.2.)

ASCII characters represented in \[u....] form are handled inconsistently.  A
simple demonstration of the difference:

$ echo '\[u0021]\[u0022]' | nroff | cat -s
troff: <standard input>:1: warning: can't find special character '\!'

\[u0022] is correctly converted to, and output as, a quotation mark.  But
\[u0021], rather than being converted to a "!", is for some reason converted
to the sequence "\!", which (unsurprisingly) is not a recognized character.

It's not clear to me what internal mechanism might cause this: if "\[u0021]"
were parsed as a backslash followed by "[u0021]", the bracketed sequence
wouldn't be specially interpreted at all.

Looking at all the pre-alphabet ASCII symbols:

$ printf "\\[u%04x] " $(seq 32 64) | nroff | cat -s

Of these 33 code points, five are handled as expected, 15 are converted to
unrecognized backslash sequences (like the "\!" above), and 13 are not
recognized at all.
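
To attribute each result to a single code point, every escape can be fed to
nroff on its own line. What follows is only a minimal sketch, assuming a POSIX
shell; list_escapes and probe_escapes are hypothetical helper names, not part
of groff. It writes the hex digits in uppercase (%04X), the form the groff
documentation uses for these names, unlike the lowercase %04x in the command
above.

```shell
# list_escapes FIRST LAST  (hypothetical helper, not part of groff)
# Prints one line per code point: "<decimal> <\[uXXXX] escape>".
list_escapes() {
    n=$1
    while [ "$n" -le "$2" ]; do
        # '\\' in the printf format yields a single literal backslash.
        printf '%d \\[u%04X]\n' "$n" "$n"
        n=$((n + 1))
    done
}

# probe_escapes FIRST LAST  (hypothetical helper; requires nroff in PATH)
# Runs nroff once per escape so each warning maps to one code point.
probe_escapes() {
    list_escapes "$1" "$2" | while read -r code esc; do
        printf '%s: ' "$code"
        # %s passes the escape through untouched; stderr is merged so the
        # warnings appear next to the code point that caused them.
        printf '%s\n' "$esc" | nroff 2>&1 | cat -s
    done
}
```

Running "probe_escapes 32 64" then prints each decimal code point followed by
whatever nroff emits for it, which should make the three groups easy to tell
apart.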

That last case I don't consider a bug, since (current) groff does not specify
that any of them should be recognized.  (The 1.22.4 groff_char(7) page sort of
gave the impression that some of them would be, but these sequences have been
removed from the drastically rewritten 1.23 groff_char(7).)  Arguably, none of
this is a bug, since no documentation explicitly states that, for example,
"\[u0021]" will be recognized as "!".  But the way it _is_ handled is
surprising enough that I wanted to at least bring it to the development team's
attention.

