[bug #58930] take baby steps toward Unicode
From: Dave
Subject: [bug #58930] take baby steps toward Unicode
Date: Sat, 28 May 2022 13:35:56 -0400 (EDT)
Follow-up Comment #13, bug #58930 (project groff):
[comment #0 original submission:]
> But if the input is some other encoding, preconv converts
> the character into the string "\[u00A0]", which groff does
> _not_ recognize.
The resolved bug #62300 has fixed preconv to emit "\~" rather than "\[u00A0]"
for a U+00A0 input character.
In preconv 1.22.4:
$ echo -e '\xA0' | preconv -elatin1
.lf 1 -
\[u00A0]
In preconv built from the latest code:
$ echo -e '\xA0' | preconv -elatin1
.lf 1 -
\~
So I think we can mark this part as resolved, despite one remaining issue that
bug #62300 points out in its comment #2:
"The input sequence '\[u00A0]' is _syntactically_ valid...but like '\[uFFFF]'
and '\[u0000]', it's not _meaningful_"
This is true of the current implementation but less true conceptually: U+0000
and U+FFFF are not meaningful input characters to groff, but U+00A0 is, and
users ideally ought to be able to specify the character as \[u00A0].
But this is an edge case I don't intend to pursue. Users who want to stick to
pure-ASCII input have the escape sequence \~ to specify the nonbreaking space,
so don't need the alternate spelling \[u00A0].
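
For readers who want the before/after behavior in one place, here is a rough
Python sketch of the mapping described above. It is only an illustration, not
preconv's actual code: the names to_groff_escape and convert and the "fixed"
flag are hypothetical, and the sketch ignores the ".lf" line and everything
else preconv does.

```python
def to_groff_escape(ch: str, fixed: bool = True) -> str:
    """Hypothetical helper: map one decoded character to a groff input spelling.

    fixed=False imitates the 1.22.4 output shown above;
    fixed=True imitates the output after the bug #62300 fix.
    """
    cp = ord(ch)
    if cp < 0x80:
        # ASCII characters pass through unchanged.
        return ch
    if fixed and cp == 0xA0:
        # After the fix, U+00A0 becomes the nonbreaking-space escape.
        return "\\~"
    # Other non-ASCII characters become a \[uXXXX] escape.
    return "\\[u%04X]" % cp


def convert(text: str, fixed: bool = True) -> str:
    """Apply the mapping to every character of an already-decoded string."""
    return "".join(to_groff_escape(c, fixed) for c in text)


if __name__ == "__main__":
    nbsp = b"\xa0".decode("latin-1")   # the byte from the transcripts above
    print(convert(nbsp, fixed=False))  # old spelling
    print(convert(nbsp, fixed=True))   # new spelling
```

With fixed=False the sketch yields the \[u00A0] spelling that groff does not
recognize; with fixed=True it yields \~, which groff does.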
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?58930>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/