From: Steve Izma
Subject: Re: groff for epub/e-books (was: groff 1.22.4 mandb 2.11.2 man -H tbl not rendered)
Date: Thu, 23 Feb 2023 22:55:09 -0500

On Wed, Feb 22, 2023 at 08:26:19PM -0600, G. Branden Robinson wrote:
> Subject: groff for epub/e-books (was: groff 1.22.4 mandb 2.11.2 man -H tbl
>  not rendered)

> > [And prefer "open" ePub, vs *proprietary* Adobe PS and PDF,
> PDF is an international standard these days,[1] though I wonder
> if it isn't a captive one by Adobe as "Office Open XML" is by
> Microsoft.[2]

However, being a standard doesn't mean that Adobe's programmers
don't make mistakes when working on this proprietary code. When
trying to work with PDFs using open source software, I often find
that I need to use pdftk or something similar to re-distill PDFs
that don't behave properly. These are always from Microsoft or
Apple

> > as it's zipped html readable in browser addons, or vim or
> > less(/open) if desperate. Get on it FSF!]
> I _do_ think ePub would make a good application for groff, and
> it's something I've given some thought to.  One thing EPUBs
> often need to do is reflow and re-render the text, because
> someone may take a tablet or phone display and rotate it
> frequently.  EPUBs _can_ do this, but in my experience, they
> often do it poorly.

Are there two different discussions going on here? I think I hear
some people talking about converting groff files to epub and
others perhaps hinting that groff should be the engine driving
ebook readers.

In my experience, going from arbitrary groff source files to the
kind of HTML code needed for epubs (especially the increasingly
mandated accessible epubs) is not worth the effort, unless the
groff source code is done in a highly structured way (i.e.,
essentially following XML rules).

I would love it if the second possibility were to be undertaken.
I read a lot of epubs and they are rendered poorly, especially
with respect to hyphenation (if it even exists on the particular
platform you have), but also with respect to control of margins,
line spacing (e.g., if footnote numbers occur in a line), and
whenever anything other than straight, paragraphed text needs to
be displayed. Even pagination itself is usually terrible. If I
see extra white space at the bottom of a page, my subconscious
reading comprehension starts to wind down, assuming I've reached
the end of a chapter or section. Unnecessary and distracting
white space shows up a lot in epub rendering.

Since I still work in publishing, sales of epubs are very
important to me. In Canada, at least, and I think everywhere
else, sales have plateaued at less than 20% of a publisher's
income. In many cases publishers have stopped selling them
actively because the encryption process stymies too many users
and takes up too much staff time for customer service. In other
cases, publishers have started boosting the prices of epubs to
just below that of print books because they can't cover costs.
For a long time, Amazon and Apple (in particular) manipulated the
market to keep prices low and keep the majority of the sales
income, but now their control over this is slipping and prices
are going up for everything other than pulp fiction.

> PDF apparently doesn't handle this well, which is one of the
> reasons a bunch of "e-book" document formats popped up.  I've
> been frustrated with every one I've encountered.

In the academic world, epubs render tables, references,
footnotes, figures, etc. in extremely awkward ways, and in many
cases people prefer PDFs for their increased readability, but on
anything smaller than about a 7-inch screen, PDFs can also be
very painful.

I still prefer reading on paper and I still prefer typesetting
for paper as well.

> I have noticed that groff generally renders so fast on modern
> hardware that I'll wager that a "groff ePub" document could
> ship the document _source_ and an "ePub reader" for it would
> provide the entire groff rendering system.  (For documents that
> are slow to render even with this approach, you could
> straightforwardly cache the intermediate output for each
> display orientation.)  I don't see how this would require any
> architectural changes to groff itself, and would have many
> advantages, particularly for document source accessibility,
> archivability, preservation, and "share-alike" licensing
> properties.
> (So major publishers would probably hate it and oppose it with
> fury.)

Publishers wouldn't even notice. It's the manufacturers of ebook
readers who make this difficult. One way to start, probably,
would be to attach an e-ink screen to a Raspberry Pi and run
groff to display the HTML from the epub, or the native groff
document. That shouldn't be too hard. It might be a lot easier if
someone would convert groff into libraries for something like
Python. That would probably be more efficient in handling the
display than using pipes between processes.
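
As a rough illustration of the caching idea quoted above (one
rendering kept per display orientation), here is a minimal Python
sketch. The names RenderCache and render_fn are hypothetical;
render_fn stands in for whatever actually drives groff and is
deliberately left out, so this shows only a cache keyed on the
document source and the orientation.

```python
import hashlib
from typing import Callable, Dict, Tuple


class RenderCache:
    """Cache rendered output per (document source, orientation).

    render_fn is a hypothetical callable that does the real work
    (for example, by running groff); it is an assumption here.
    """

    def __init__(self, render_fn: Callable[[str, str], bytes]) -> None:
        self._render_fn = render_fn
        self._cache: Dict[Tuple[str, str], bytes] = {}
        self.misses = 0  # number of times render_fn was actually called

    def get(self, source: str, orientation: str) -> bytes:
        # Hash the source so the key stays small for long documents.
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        key = (digest, orientation)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._render_fn(source, orientation)
        return self._cache[key]
```

Rotating the display back and forth would then trigger at most
one rendering per orientation for a given document.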

I'd love to have a python-groff module. It would simplify a lot of
the document transformations I spend a lot of time on, e.g., XML
to PDF, plain text to PDF, etc.
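
Until such a module exists, the pipe-based approach can be
sketched in a few lines of Python. This is only an illustration,
not an existing python-groff API: the function names are made up,
and it assumes a groff installation on the PATH with the ms
macros and the pdf output device. The plain-text conversion is
deliberately crude: paragraphs separated by blank lines become
.PP paragraphs.

```python
import subprocess
from typing import List


def text_to_ms(text: str) -> str:
    """Turn blank-line-separated plain text into simple ms source."""
    out: List[str] = []
    for para in text.split("\n\n"):
        lines = [ln.strip() for ln in para.splitlines() if ln.strip()]
        if not lines:
            continue
        out.append(".PP")
        for ln in lines:
            ln = ln.replace("\\", "\\e")  # \e prints a literal backslash
            if ln[0] in ".'":
                ln = "\\&" + ln  # keep the line from being read as a request
            out.append(ln)
    return "\n".join(out) + "\n"


def groff_command(device: str = "pdf", macro_flag: str = "-ms") -> List[str]:
    """Build the groff command line; -k runs preconv for UTF-8 input."""
    return ["groff", "-k", macro_flag, "-T" + device]


def render(source: str, device: str = "pdf", macro_flag: str = "-ms") -> bytes:
    """Pipe groff source through groff and return the formatted output."""
    result = subprocess.run(
        groff_command(device, macro_flag),
        input=source.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if groff fails
    )
    return result.stdout
```

With that in place, render(text_to_ms(text)) would cover the
plain text to PDF case, with each call paying the cost of
starting a new groff process.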

        -- Steve

Steve Izma
Home: 35 Locust St., Kitchener, Ontario, Canada  N2H 1W6
E-mail:  phone: 519-745-1313
cell (text only; not frequently checked): 519-998-2684

The most erroneous stories are those we think we know best – and
therefore never scrutinize or question.
    -- Stephen Jay Gould, *Full House: The Spread of Excellence
       from Plato to Darwin*, 1996
