Re: A few observations regarding tbl

From: Oliver Corff
Subject: Re: A few observations regarding tbl
Date: Sat, 19 Jun 2021 10:02:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.1

Thank you for your suggestions and experiments. I think U+2423 is the
way to go, and for the tab character the T in a circle is as close to
tradition as possible.

I experimented with .tr after reading your mail, and it seems that the
way space is used as a delimiter in general input processing prevents
.tr from accepting it as an element of the input character set.

Conversely, normal characters can be converted to spaces by .tr;
just consider the following example:

.tr n iI
This is a string.

I added the substitution pair iI a) to signal to .tr that the space is
a valid input and b) to have a quick check whether .tr works as
expected with spaces in the chain of substitution pairs. It does.
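
The pairing in the .tr argument above can be modelled outside groff. The
following is only a rough Python sketch, not groff's implementation; the
helper name tr_pairs is hypothetical, and padding an odd-length argument
with a plain space is an assumption made for illustration.

```python
def tr_pairs(spec):
    """Build a translation table from a .tr-style argument string.

    Characters are consumed in pairs: the first of each pair maps to
    the second. An unpaired final character is mapped to a plain space
    (an assumption of this sketch).
    """
    if len(spec) % 2:
        spec += " "
    # Even positions are the source characters, odd positions the targets.
    return str.maketrans(spec[0::2], spec[1::2])


# Argument of ".tr n iI": n -> space, i -> I
table = tr_pairs("n iI")
print("This is a string.".translate(table))
```

In this model the space sits in the second position of the first pair,
which is why it is taken as a translation target and not as a delimiter.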

Another possibility I thought of is to use the hard space 0xa0, which
provides a visual space and can trigger, via character substitution, any
desired symbol to appear in its place. This leaves the problem that
readers of the future tutorial might want to study examples by
copy+paste and then wonder why nothing works as promised, not seeing
that the spaces are not real spaces.

I'll find a way.


On 18/06/2021 18:07, T. Kurt Bond wrote:
I would have thought .tr would be of use here, but I can't get it to
translate a space to something else, probably due to how its arguments
are parsed. (It doesn't strip off a leading double quote, of course.)
Cursory experiments with .char didn't seem to work either.

So I tried manually replacing spaces with the Unicode character U+2423
(OPEN BOX): ␣ (if that shows up in your mail client).  I remember
seeing spaces represented with a similar glyph in various sources. 
That character is not in the Courier font, but is in DejaVuSansMono,
so I installed the DejaVu fonts (thanks to Peter Schaffter's
<> script) and
that worked ok.  It should be possible to run the tbl source of your
tables through a script that substitutes the OPEN BOX character for
all spaces.

I was going to suggest that the script change tabs into the Unicode
Character CIRCLED LATIN CAPITAL LETTER T, but that's not in
DejaVuSansMono.  (Lesk used a T overstruck on a circle.)
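
A substitution script of the kind described above could look roughly
like the following Python sketch. The function name mark_whitespace and
the sample table are made up for illustration; the default tab marker
is the circled T, which would have to be swapped for another glyph if
the font lacks it.

```python
import unicodedata

# Look the markers up by their Unicode names to avoid typing them literally.
OPEN_BOX = unicodedata.lookup("OPEN BOX")                         # U+2423
CIRCLED_T = unicodedata.lookup("CIRCLED LATIN CAPITAL LETTER T")  # U+24C9


def mark_whitespace(text, space_mark=OPEN_BOX, tab_mark=CIRCLED_T):
    """Replace every space and tab with a visible marker.

    Newlines are left alone so the line structure of the tbl source
    survives.
    """
    return text.translate({ord(" "): space_mark, ord("\t"): tab_mark})


# Hypothetical tbl source to be displayed as a code example.
sample = ".TS\ntab(;);\nl l.\nleft item;right\titem\n.TE\n"
print(mark_whitespace(sample))
```

The marked text is meant for display only; it is no longer valid tbl
input, and groff still needs a font that contains the chosen glyphs.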

And then I got to thinking: If you want something in Courier (so as to
not have to install a font), you might try substituting Unicode
Character WHITE CIRCLE for spaces and WHITE SQUARE for tabs.  That is
not as mnemonic as the circled T and OPEN BOX, alas.  Or you could use
the groff characters \[ci] and \[sq]; that would probably be easier
than messing with Unicode characters.  Or you could try overstriking S
and T on \[ci], although when I try that the result has the bottom of
the S and T over the bottom of the circle.

Anyway, here's the test document I used:

    This is a paragraph.
    .ft DejaVuSansMonoR
    .ft C
    This○is□a◊sentence with Unicode WHITE CIRCLE, and WHITE SQUARE.
    This\[ci]is\[sq]a\[ci]sentence with groff characters \\[ci], and
    Using "\fC.char \\[overstrucks] \\o'\\[ci]\\s[-4]S\\s0'\fP", etc.
    .char \[overstrucks] \o'\[ci]\s[-4]S\s0'
    .char \[overstruckt] \o'\[ci]\s[-4]T\s0'
    .ft C
    This\[overstrucks]is\[overstruckt]a\[overstrucks]sentence with
    overstruck characters and .char.
    This is a paragraph.

and here is what the output looks like:

[output image not preserved]

Does anybody have other ideas?

On Fri, Jun 18, 2021 at 9:16 AM Oliver Corff wrote:


    I thought over the subject and I decided to write a new
    introduction to tbl, akin to Lesk's introduction, but under a FLOSS
    license and with a focus on the GNU extensions.

    For this purpose, I have one obviously uninformed and stupid question.
    How do I show code examples in groff (I think I'll opt for the ms
    set) with the whitespace character 0x20 marked, akin to the \verb*
    command in LaTeX? \verb|...| produces the included text in tt courier
    (or another fixed-pitch font suitable for displaying code) with spaces
    shown as blank, whereas \verb*|...| inserts a special symbol for every
    0x20 character.

    Does the ms macro package feature an environment for displaying source
    code, or do I mimic that with font and margin settings?

    Since my introduction will demonstrate things like nospaces, tab
    settings etc., it would be nice to show the spaces in the source code.

    Thanks a lot, and I am happy to take the beating if this question
    demonstrates that I was the last one to ask.


    On 17/06/2021 18:46, G. Branden Robinson wrote:
    > Hi, Oliver!
    > At 2021-06-15T12:39:02+0200, Oliver Corff wrote:
    >> my huge text project which involved typesetting approx. 1,300 tables,
    >> tiny, small, large and huge, demonstrated that tbl is a remarkably
    >> powerful and reliable tool for this work, and I can say with
    >> confidence that the question which type of table software to use
    >> (LaTeX? (x)html?  others?) was best answered by tbl which helped me
    >> recreate tables with a fidelity so close to the printed sources
    >> the uninitiated reader could not tell an image of the page from the
    >> typeset reproduction.
    > That's excellent news!
    > What is the copyright licensing status of these 1,300 tables?  Is
    > there a chance we could get a small, potentially simplified subset of
    > them under a FLOSS license so that we could use them to illustrate GNU
    > tbl's feature set?  An excellent property of Lesk's tbl paper was the
    > suite of examples, but we don't have that document in our distribution
    > and the few examples in our tbl(1) man page compare poorly.
    > Speaking of the feature set, how much of GNU tbl's feature set do you
    > figure you ended up exercising by the end of this project?  Was there
    > anything that you expected to use but ended up not needing?
    >> I came across a few very minor discrepancies between expected and
    >> actual behaviour, though.
    >> 1) For the global option "tab(x)", the man page says:
    >>      tab(x) Use the character x instead of a tab to separate items
    >>      in a line of input data.
    >> This works as long as x is a 7-bit ASCII character; it does not work
    >> with UTF-8 characters. E.g.: "tab(|)" (with the pipe symbol) works,
    >> "tab(¦)" does not work and yields the message: "argument to `tab'
    >> option must be a single character".
    >> I suggest specifying "7-bit ASCII character" in the man page
    >> and/or making the tbl parser UTF-8-aware.
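
One plausible reading of that error message, assuming tbl looks at the
bytes of its input and not at decoded characters, is that the broken bar
simply is not a single byte in UTF-8. The short Python sketch below only
illustrates the encoding widths; it says nothing about how tbl actually
parses the option, and utf8_width is a hypothetical helper name.

```python
def utf8_width(ch):
    """Return the number of bytes a single character needs in UTF-8."""
    return len(ch.encode("utf-8"))


# The pipe symbol is 7-bit ASCII; the broken bar (U+00A6) is not.
for ch in ("|", "\u00a6"):
    print(f"U+{ord(ch):04X} occupies {utf8_width(ch)} byte(s) in UTF-8")
```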
    > Hmmm, yes--since tbl parses the table for itself, *roff special
    > character escapes will not serve as a workaround.  And UTF-8 support
    > would be a significant undertaking.
    > I've filed this as <
    >> 2) The global option "nospaces", according to the manpage, is
    >> described as:
    >>      Ignore leading and trailing spaces in data items (GNU tbl
    >> The following point may be a question of correct interpretation of
    >> this statement. Does the underbar "_" qualify as a data item in tbl
    >> terminology? I positively think so, because the man page states
    >>      If a data line consists of only ‘_’ or ‘=’, a single or double
    >>      line, respectively, is drawn across the table at that point;
    >> If my data line consists of a single '_', that line is drawn.
    >> If that '_' is followed by spurious whitespace, then only the '_'
    >> appears in the first cell, and no line is drawn, or a line spanning
    >> the first cell only is drawn. From a logical point of view, this is
    >> clear, as the statement says "consists of only ...", but the nospaces
    >> option does not seem to work here as expected.
    > Doug's follow-up to this point seems reasonable.  For me, it
    > the principle I espouse that diligent management of one's lexicon is
    > one of the most important things you can do in a software project.
    > When revising the tbl(1) man page in the future, I will attend
    > to the uses of the terms "data line" and "data item", and try to make
    > sure they're correct and consistent.
    > I once got partway through a rewrite of tbl(1) (the page), with
    > much terminological alteration around "global option", "column
    > specifier", and "column modifier".  I disfavor the term "global
    > option" because "global" options don't persist beyond a .TS/.TE table,
    > not even in the same document.  I don't think novice users' concept of
    > something "global" stops anywhere short of the entire file they're
    > editing.
    > I ran out of steam on that project because there was just too damn
    > much I wanted to fix about the man page.  Not having a separate
    > document (as AT&T tbl had) to point the user to for practical examples
    > was a problem, hence my request above.  Coming up with a good suite of
    > examples is itself a significant undertaking, and while I found the
    > examples contributed by Bernd to be contrived and meager, I couldn't
    > honestly say that they weren't better than nothing.
    > In my ideal world, tbl(1) would describe the syntax of the command and
    > its input (or the latter could be migrated to a tbl(7) page--I
    > that would win Ingo's support and it wouldn't bother me at all), and
    > we'd have a separate <> document chock full of source alongside
    > rendered examples for users to emulate, experiment with, and build
    > their expertise with.
    > Regards,
    > Branden

T. Kurt Bond
