[bug #64576] [pdf.tmac] pdf*href option handling insufficiently flexible
From: G. Branden Robinson
Subject: [bug #64576] [pdf.tmac] pdf*href option handling insufficiently flexible
Date: Mon, 21 Aug 2023 05:50:42 -0400 (EDT)
URL:
<https://savannah.gnu.org/bugs/?64576>
Summary: [pdf.tmac] pdf*href option handling insufficiently
flexible
Group: GNU roff
Submitter: gbranden
Submitted: Mon 21 Aug 2023 09:50:40 AM UTC
Category: Macro - others/general
Severity: 3 - Normal
Item Group: Incorrect behaviour
Status: In Progress
Privacy: Public
Assigned to: gbranden
Open/Closed: Open
Discussion Lock: Any
Planned Release: None
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: Mon 21 Aug 2023 09:50:40 AM UTC By: G. Branden Robinson <gbranden>
This code:
.\"
.\" Macros "pdf:href.flag" and "pdf:href.option"
.\" provide a generic mechanism for switching on flag type options,
.\" and for decoding options with arguments, respectively
.\"
.de pdf:href.flag
.\" ----------------------------------------------------------------------
.\" ----------------------------------------------------------------------
.nr pdf:href\\$1 1
.nr pdf:href.argc 1
..
.de pdf:href.option
.\" ----------------------------------------------------------------------
.\" ----------------------------------------------------------------------
.ds pdf:href\\$1 \\$2
.nr pdf:href.argc 2
..
...is insufficiently flexible. It assumes that its inputs will consist only of
ordinary characters, but special characters and escape sequences, particularly
for callers of `pdf:href.option`, are conceivable.
For example, a macro like _groff man_(7)'s `UR`, when used with no link text
(which is a bit lazy, but accepted), will run into problems in cases like the
following.
.P
.I ps2eps
is available from CTAN mirrors, e.g.,
.UR ftp://\:ftp\:.dante\:.de/\:tex\-archive/\:support/\:ps2eps/
.UE .
That's a real example from our _pic_(1) page. One approach to resolving it
implies laboriously walking the arguments to macros that call `pdf:href.flag`
and `pdf:href.option` (which are internals--not externally documented and
therefore not an API), attempting to scrub them of unexpected content, and
getting peevish with other _groff_ developers when encountering arbitrary
_roff_ input that is *unexpectedly* unexpected; see, e.g., bug #64202.
That it is so tedious to iterate through strings in _groff_ (and as I have
said elsewhere, nigh-impossible in AT&T _troff_) is doubtless one of the
factors that turns up the temperature on this problem. See bug #62264 for a
proposed, but not yet implemented, quality-of-life improvement in this area.
Another possibility is simply for _pdf.tmac_- or _pdfmark.tmac_-using
documents and macro packages to be aware of the intolerance/irritability of
its internals, and work around them--for instance, _groff_'s _an.tmac_, when
seeing that a `UR` or `MT` has no link text, could simply inject some known,
well-behaved link text like "(link)" that the aforementioned internals won't
barf on. This works (I tried it), but it is pretty lame.
1. That text isn't localized.
2. That text might not be appropriate or clear in all situations.
Now, one _could_ kick both of the above back into the user's face. ("Just
supply some link text, damn it!") But for another problem...
3. Worst, you can't format punctuation after it without intervening space.
To do that, you need the `\c` escape sequence, which becomes part of one of
`pdfhref`'s arguments, and _pdfmark.tmac_ / _pdf.tmac_ insist on populating
_roff_ register or string names incorporating each such argument, and we're
back to the original problem of escape sequences.
troff:<standard input>:1473: error: an escaped 'c' is not allowed in an
identifier
And in fact use of `\c` is wholly defeated here--you'll get space (and
possibly a break) before the punctuation anyway. So tossing the burden of
specifying link text--which is supposed to be formatted output in the first
place--on the user and then going aggro on them if they dare to use escape
sequences that are wholly valid in formatted output is not a satisfactory
solution.
Intriguingly, the `\A` escape sequence to test a character sequence for
validity as a _groff_ identifier name has been around since 1991, but
_pdfmark.tmac_ and _pdf.tmac_ don't bother to use it. Possibly this problem
would have been recognized and addressed long ago if they had. It certainly
seems to me like a Recommended Best Practice if one is going to be populating
_groff_ identifiers based on user input (or even _any_ external input, like a
macro package written by someone who isn't as careful as you are). But nobody
ever got a fellowship for validating input, did they?
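A minimal sketch of such a guard follows. It is not code from _pdf.tmac_ or _pdfmark.tmac_; the macro `demo:set` and the `demo:opt` name prefix are hypothetical, chosen only for illustration. It uses `\A` to check the would-be identifier before defining a string under that name.

```roff
.\" Hypothetical helper (an assumption, not part of pdf.tmac):
.\" store argument 2 in a string named "demo:opt<argument 1>", but
.\" only if that name passes the \A validity test.
.de demo:set
.  ie \A'demo:opt\\$1' .ds demo:opt\\$1 \\$2
.  el .tm demo:set: argument 1 cannot be used in an identifier
..
.\" An ordinary flag name forms a valid identifier.
.demo:set -D https://example.com/
.\" A URL containing \: escape sequences does not.
.demo:set ftp://\:ftp\:.dante\:.de/ unused
```

The second call should take the `.el` branch and emit a diagnostic rather than attempting to define a string with an invalid name. Note that `\A` delimits its argument with `'` here, so input containing a neutral apostrophe would need a different delimiter.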
Moreover, it appears that the main reason _pdfmark.tmac_ / _pdf.tmac_ are
taking this approach is that the _roff_ language doesn't have a list type,
so it's a pain in the ass to search for things. _pdfmark.tmac_ / _pdf.tmac_'s
solution, to use the macro/request/string name space as a dictionary, with the
identifiers as keys and the string contents as values, does have obvious
appeal given that limitation...but for blundering into the other limitations
of assuming either that (a) any input makes a valid identifier, or (b) your
users won't wander off the lit path of ordinary characters. And as noted
above, scrubbing a character sequence for things that are invalid (in _any_
context)--the "sanitization problem"--is Yet Another pain in the ass. See
bug #62264 again.
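For readers unfamiliar with the idiom, here is a small sketch of the name space used as a dictionary. The names `demo:color-` and `demo:lookup` are hypothetical and do not appear in _pdf.tmac_; the point is only that the key becomes part of an identifier, so any key that is not made of ordinary characters breaks the lookup.

```roff
.\" Hypothetical dictionary: the key is a suffix on a string name,
.\" the value is the string's contents.
.ds demo:color-red ff0000
.ds demo:color-green 00ff00
.\" Look a key up with the "d" conditional, which tests whether a
.\" string, macro, diversion, or request of that name is defined.
.de demo:lookup
.  ie ddemo:color-\\$1 .tm \\$1 maps to \\*[demo:color-\\$1]
.  el .tm \\$1 is not a known key
..
.demo:lookup red
.demo:lookup blue
```

This works as long as the argument consists of ordinary characters; hand `demo:lookup` an argument containing `\:` or `\c` and the identifier it tries to construct is no longer valid.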
Fortunately, the use of this mechanism, in _pdf.tmac_ at least, appears to be
fairly limited.
`pdf:href.flag` would seem to be okay, since its values only ever come from
macro arguments that identify "flags", and these are going to have
straightforward names.
For instance, these seem okay (includes annotations from my working copy).
.\" XXX: predefined flag
.if !dpdf:href-D .pdf:href.option -D \\$1
.if '\\*[pdf:href-D]'' \{\
. pdf:error pdfhref has no destination
. nr pdf:href.ok 0
. \}
[...]
.\" XXX: predefined flag
.if dpdf:href-P \&\\*[pdf:href-P]\c
.ie \\n[pdf:href.ok] \{\
. \"
[~40 lines of brace scope follow]
No, the problem seems to be limited to eating what, on the Unix command line,
we'd call operands and option arguments, but which can be URLs with escape
sequences like \: and \c in them, and spitting them verbatim into suffixes on
_roff_ identifiers, and that just doesn't work in general.
. \"
. \" Handle the case where subcommand is specified as "-class",
. \" setting up appropriate macro aliases for subcommand handlers.
. \"
.\" XXX
. if dpdf*href\\$1 .als pdf*href pdf*href\\$1
. if dpdf*href\\$1.link .als pdf*href.link pdf*href\\$1.link
. if dpdf*href\\$1.file .als pdf*href.file pdf*href\\$1.file
. \"
. \" Repeat macro alias setup
. \" for the case where the subcommand is specified as "class",
. \" (without a leading hyphen)
. \"
.\" XXX
. if dpdf*href-\\$1 .als pdf*href pdf*href-\\$1
. if dpdf*href-\\$1.link .als pdf*href.link pdf*href-\\$1.link
. if dpdf*href-\\$1.file .als pdf*href.file pdf*href-\\$1.file
An immense amount of code in _pdf.tmac_ seems to be dedicated to an
exploration of the question "hey, what if we chucked established _roff_
programming idioms out the window and re-implemented _getopt_long_(3) in it so
that shell script programmers had macro interfaces that looked vaguely
familiar"?