Re: grog doesn't detect files that contain .so
From: Ingo Schwarze
Subject: Re: grog doesn't detect files that contain .so
Date: Thu, 29 Jun 2017 22:31:35 +0200
User-agent: Mutt/1.6.2 (2016-07-01)
Hi Eli,
Eli Zaretskii wrote on Thu, Jun 29, 2017 at 10:00:10PM +0300:
> If 'grog' is such a terrible hack, may I suggest to at least hint at
> that in the documentation, and perhaps even remove it from the Groff
> package? There's nothing I could find in the current documentation
> that contains even a shred of a hint in that direction. The 'grog'
> man page doesn't even have a BUGS section, but it does say that
> 'groffer' heavily depends on 'grog' doing TRT.
>
> Given all this, how should a casual user of Groff succeed in guessing
> that 'grog' is something to stay away from?
I don't know, except that an experienced user could possibly
anticipate that guessing file formats heuristically is usually
unreliable and sometimes a security risk (for instance, the kernel
doesn't use libmagic, but a well-defined magic number or shebang at
a well-defined place at the beginning of the file).
I would certainly support adding a warning to the grog(1) and
groffer(1) manuals, saying that they do mere guesswork, and that
there are many different reasons why that can go wrong.
> Then why are you distributing such a buggy utility?
I don't know, either. Maybe some feel that occasionally, in
interactive use, it is handy as a first guess at the file format
of a file that is not properly marked up, and quicker than reading
through the whole file by hand. But i'd recommend verifying the
result before relying on it even in that arguably sane use case.
> The problem is not with 'man'. I use this script when I need to
> produce formatted man pages in a batch job, usually when I prepare a
> binary package for people who don't necessarily have Groff installed;
> formatted man pages can be viewed with any text browser, like Less.
I know that historically, manual pages have often been preformatted
before installation, but nowadays, that should be avoided. Even
if you assume that nobody is using any character encoding except
UTF-8 and ASCII any longer, you are in a fix: If you install UTF-8,
people using ASCII will be quite unhappy, and if you install ASCII,
quite a few manual pages are broken - not many as badly as
perlunicook(1), but still.
Besides, installing preformatted manuals hinders semantic searching.
Nowadays, i'd consider a system that has neither roff nor mandoc
installed quite broken - almost certainly in violation of POSIX
because i'm not sure you can have man(1) if you have neither roff
nor mandoc...
Admittedly, in rare exceptional cases, installing a few preformatted
manual pages may still be required.
> Anyway, thanks for confirming my fears about 'grog'. Is there any
> similar (but working) utility anywhere that you can recommend?
Again, just look at the annotations in the first line, and if they
are incorrect, it helps everybody to kindly ask the author to fix
them.
You hardly need a software package for that; it is very simple.
Here is an example implementation, written by Marc Espie for
pkg_create(1), which is quite close to your use case if i understand
correctly:
    open(my $fh, '<', $fname) or die "Can't read $fname";
    my $line = <$fh>;
    close $fh;
    my @extra = ();
    if ($line =~ m/^\'\\\"\s+(.*)$/o) {
        for my $letter (split '', $1) {
            if ($letter =~ m/[ept]/o) {
                push(@extra, "-$letter");
            } elsif ($letter eq 'r') {
                push(@extra, "-R");
            }
        }
    }
Shortly after that, the program calls
    'groff', qw(-mandoc -mtty-char -E -Ww -Tascii -P -c),
        @extra, '--', $file);
Just as a simple example, though even that little is good enough
for our whole ports tree of more than 9000 ports - and only about
25 of those still need to install preformatted manuals anyway.
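To make the two fragments above easier to try out in isolation, here
is a minimal, self-contained sketch of the same idea. It is not the
pkg_create(1) code: the sub names preproc_flags and groff_command are
hypothetical and chosen only for illustration, and the sketch merely
builds the argument list rather than running groff, so how the command
is finally executed is left to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of pkg_create): map the letters of a
# first-line annotation such as   '\" t   to groff options.
sub preproc_flags {
    my ($line) = @_;
    my @extra;
    return @extra unless defined $line;    # empty file: no annotation
    if ($line =~ m/^'\\"\s+(.*)$/) {
        for my $letter (split //, $1) {
            if ($letter =~ m/[ept]/) {
                push @extra, "-$letter";   # -e eqn, -p pic, -t tbl
            } elsif ($letter eq 'r') {
                push @extra, '-R';         # refer
            }
            # any other character is silently ignored
        }
    }
    return @extra;
}

# Hypothetical wrapper: read only the first line of the file and
# return the argument list, using the same fixed groff options as
# the excerpt quoted above.
sub groff_command {
    my ($fname) = @_;
    open(my $fh, '<', $fname) or die "Can't read $fname: $!";
    my $line = <$fh>;
    close $fh;
    return ('groff', qw(-mandoc -mtty-char -E -Ww -Tascii -P -c),
        preproc_flags($line), '--', $fname);
}

# Usage: an annotation requesting tbl and pic.
print join(' ', preproc_flags(q{'\" tp})), "\n";   # prints: -t -p
```

Returning a list instead of a single string keeps file names with
spaces or shell metacharacters intact when the list is later handed
to the list form of system() or exec().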
Yours,
Ingo