[bug #58796] preconv: want option to write traditional [g|t]roff special
From: Ingo Schwarze
Subject: [bug #58796] preconv: want option to write traditional [g|t]roff special characters where possible
Date: Sat, 25 Jul 2020 16:41:12 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; OpenBSD amd64; rv:77.0) Gecko/20100101 Firefox/77.0
Follow-up Comment #2, bug #58796 (project groff):
Hi Dave,
> a bit of a hack
Not so much, actually. Making good use of pipes is among the design
principles of the whole roff ecosystem, to harmonize with the overall UNIX
design philosophy that every tool should solve one task only, but solve it
well and in a way that facilitates combination with the other tools. In this
sense, groff is actually more UNIXy than mandoc, which does integrate
preconv.
> wrapper for iconv
I would hate it if groff started requiring iconv. I consider it an
important asset that so far, it does not.
> the language has standard libraries to handle UTF-8
Yes, indeed the C language contains a vast array of C library functions to
deal with wide characters and with multibyte characters. But the design of
these C library facilities is atrocious, and using something else which is
non-standard would even be worse. Either way, rewriting a program to natively
support wide characters is usually an extremely tedious, extremely intrusive,
very time-consuming and highly error-prone task. Even when done as designed,
it adds horrible complication to the code and makes the code much more
fragile. For small programs, ways exist to cheat one's way around these
notorious downsides, see my presentation at EuroBSDCon in Beograd a few years
ago. But i doubt something like that could be pulled off for a program as
large as groff, at least not easily.
> not sure why preconv need emit things like \['e] or \[u00E9] at all
Because single-byte 8-bit locales have been obsolete for many years and some
operating systems don't even support them any longer. And even for people
using Linux: almost nobody uses LC_CTYPE=*.Latin-1 nowadays, which would imply
that you could no longer look at the preconv output with a pager. When you do
groff-specific encoding anyway, it's much better to encode all non-ASCII
characters and not force users to adopt an obsolete locale.
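To make the two output styles concrete, here is a minimal sketch in Python of the kind of encoding under discussion. It is not preconv's actual code: the function name, the `prefer_traditional` flag, and the tiny name table are assumptions made up for illustration, and the table is only a small subset of the names documented in groff_char(7).

```python
# Illustrative subset only (an assumption for this sketch); a real table
# would be derived from the special character names in groff_char(7).
TRADITIONAL_NAMES = {
    0x00E9: "'e",  # e with acute accent
    0x00E8: "`e",  # e with grave accent
    0x00E4: ":a",  # a with dieresis
    0x00F1: "~n",  # n with tilde
    0x00DF: "ss",  # sharp s
}


def encode_for_roff(text, prefer_traditional=False):
    """Return text with every non-ASCII character replaced by a roff escape.

    By default each such character becomes \\[uXXXX]. With the hypothetical
    prefer_traditional flag, characters listed in TRADITIONAL_NAMES use
    their traditional special character name instead.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # ASCII passes through untouched.
            out.append(ch)
        elif prefer_traditional and cp in TRADITIONAL_NAMES:
            out.append("\\[" + TRADITIONAL_NAMES[cp] + "]")
        else:
            # At least four uppercase hex digits, more if needed.
            out.append("\\[u%04X]" % cp)
    return "".join(out)
```

With this sketch, `encode_for_roff("café")` yields `caf\[u00E9]`, while `encode_for_roff("café", prefer_traditional=True)` yields `caf\['e]`. Either way the result is pure ASCII, so it can be inspected with a pager in any locale.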
While in general, i hate adding options to programs, in particular when it can
be expected that they will be used rarely, i do see that an occasional need
for what Brandon asks for might arise. When picking new options, please don't
forget to look at https://mandoc.bsd.lv/man/man.options.1.html - the groff/man
option space is seriously crowded already, and having several programs in a
single package or in two very closely related packages that all use the same
option letter but each one for a different purpose isn't user-friendly at
all.
Either way, i would judge this task as somewhat low-priority. It only matters
when you want to maintain the document source in US-ASCII (which implies
there are only occasional non-ASCII characters in it, otherwise you would
surely maintain the document source in UTF-8 in the first place), and yet
there are enough stray wide characters in it that you want to encode them
automatically rather than just fixing them manually one by one. That
situation may occasionally occur, but not all that often, i think.
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?58796>