Re: Proposed: 3 disruptive changes for groff 1.23.0

From: Ingo Schwarze
Subject: Re: Proposed: 3 disruptive changes for groff 1.23.0
Date: Tue, 29 Jun 2021 02:40:08 +0200
User-agent: Mutt/1.12.2 (2019-09-21)

Hi Doug,

Douglas McIlroy wrote on Mon, Jun 28, 2021 at 06:05:53PM -0400:

>> Not using such a file [a tmac stripper] makes the software less effective;
>> thus such a move ["skip the stripper"] is simply a sabotage.

> I am not at all convinced of the first claim above. Please provide some
> hard evidence for it. (A simple assertion that stripping dramatically
> shortens some tmac files is not evidence for any effect on software
> effectiveness, i.e. correct performance on all inputs, and timely
> performance on realistic inputs.)

I hesitate to spend a lot of time on a rigorous measurement,
but here is one data point:  The OpenBSD ksh(1) manual page is among
the largest and most relevant real-world mdoc(7) manual pages that
i'm aware of.  When it is already in the buffer cache, times for
formatting it with commands like

  time groff -mdoc -Tascii ksh.1
  time mandoc -mdoc -Tascii ksh.1

are, on my notebook, about

  0.69 to 0.73 seconds    with groff and stripped mdoc macros
  0.75 to 0.77 seconds    with groff and unstripped mdoc macros
  about 0.04 seconds      with mandoc
  about 0.10 seconds      with mandoc when the page
                          is not yet in the buffer cache

So, physical reading from disk takes about 1/20 of a second,
mandoc takes about the same time again for formatting,
groff takes about ten times the time of mandoc for formatting
(which surprises me a bit; what i remembered was more like a factor
of three than a factor of ten, but that was years ago, and lots of
things may have happened in the meantime).
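
Single runs of well under a second are noisy, so one way to firm up
such numbers is to time a batch of runs and divide by the count.  The
following is only a sketch: the helper name repeat_quiet is
hypothetical, and the usage line assumes a shell where time is a
keyword (ksh, bash, zsh) and a ksh.1 file in the current directory.

```shell
# Hypothetical helper, not part of the measurements above:
# run a command N times, discarding its output, and stop
# with a non-zero status as soon as one run fails.
repeat_quiet() {
    n=$1
    shift
    while [ "$n" -gt 0 ]; do
        "$@" >/dev/null 2>&1 || return 1
        n=$((n - 1))
    done
}

# Usage sketch: time ten runs, then divide the result by ten.
#   time repeat_quiet 10 groff -mdoc -Tascii ksh.1
```

Warming the buffer cache with one untimed run first keeps disk reads
out of the result.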

The difference between stripped and unstripped macros appears to
be measurable, but likely below 10%, which is a tiny effect compared
to performance differences between different implementations.  Either
way, i don't think the difference between 0.71s and 0.76s is
particularly relevant for any conceivable application.  For typical
interactive use, the difference between a response time of
0.1s and 0.7s may be noticeable for impatient users, but i don't
consider even that a serious issue.

Times for PostScript and PDF are about:

  0.79 to 0.82 seconds    groff -Tps stripped
  0.83 to 0.85 seconds    groff -Tps unstripped

  2.34 to 2.38 seconds    groff -Tpdf stripped
  2.37 to 2.57 seconds    groff -Tpdf unstripped

So surprisingly, even though real typesetting takes longer than
terminal output, the performance loss is harder to measure in the
typesetting case.  Besides, real typesetting is rarely done
interactively, so even a 10% performance loss would be less relevant
than for terminal output.  I'm not providing performance numbers
for mandoc -Tps / -Tpdf because output quality of mandoc in these
modes is so bad that a performance comparison would not make sense.
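
To put the ranges side by side, here is a small sketch that computes
the relative slowdown of unstripped versus stripped macros.  It uses
the midpoint of each range quoted above, which is a simplification
and not a separate measurement, and the helper name slowdown is
hypothetical.

```shell
# Hypothetical helper: relative slowdown of NEW versus OLD,
# in percent, with one decimal place.
slowdown() {
    LC_ALL=C awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Midpoints of the ranges quoted above (an assumption):
slowdown 0.71  0.76    # -Tascii, prints 7.0
slowdown 0.805 0.84    # -Tps,    prints 4.3
slowdown 2.36  2.47    # -Tpdf,   prints 4.7
```

Note that the two -Tpdf ranges overlap, so the -Tpdf figure in
particular is within the noise.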

The other macro sets in question, me and hdtbl, are significantly
simpler than mdoc, and almost never used interactively, so unless
shown otherwise, i think it is reasonable to assume that they do
not suffer in a practically relevant way either.

> Barring a surprise answer above, I vote a vigorous yes. Stripping, I
> believe, gratuitously impairs readability. If an infelicitous tmac file
> deploys so many comments and indenting spaces within time-significant
> macros as to perceptibly affect performance, the right solution is to
> correct, not embalm, these rare stylistic flaws.
> Furthermore, stripping is almost certainly impossible to do right.
> How, for example, do you know that a line in a macro that begins .\" is
> a comment? You have no idea whether .  will be the control character
> when the macro is expanded. Yes, it's a cooked-up example that can
> be overcome by an equally cooked-up -u flag in the source repository.
> Occam would not approve of this multiplication of entities.

Yes.  I believe that is a good summary of the main arguments
against stripping.  Also, it did happen in the past that stripping
introduced bugs, so your argument is not a hypothetical one.

