groff
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [groff] [PATCH] new requests to case-transform string register value


From: G. Branden Robinson
Subject: Re: [groff] [PATCH] new requests to case-transform string register values
Date: Mon, 9 Sep 2019 02:35:36 +1000
User-agent: NeoMutt/20180716

At 2019-07-13T20:59:14+0200, Ingo Schwarze wrote:
> Hi Branden,

Hi Ingo!

Following up on this at last.  :)

> Your suggested naming seems good to me.  It is descriptive without
> being excessively long.  As far as i can see, it harmonizes with
> the naming of existing requests.  In particular, the proposed
> syntax, semantics, and naming as similar enough to the existing
> .substring request that the similarity of the name makes sense.

Thanks!  This was the part I was most worried about; we can tweak the
documentation endlessly, but implementing botched requests that we have
to rename or break the compatibility of later is more painful.

> > +.REQ .stringdown "stringvar"
> > +Transform each letter in\~\c
> > +.I stringvar
> > +to its lowercase version.
> 
> This feels unclear in three respects: one might understand that
> 
>   .stringdown Word
> 
> would put "word" into the output document, where actually, the
> user-defined string called "Word" is modified in place and nothing
> is put into the output, and one might wrongly expect that multi-byte
> characters are somehow handled with towlower(3).  I think this would
> be clearer:
> 
>   Change the string named
>   .IR stringvar
>   by replacing each byte with its lower-case version.

I went with something pretty close to your wording, above.

> The word "change" is needed to make it clear that the replacement
> happens in place rather than writing something to the output,
> the word "named" is needed to make it clear that the argument is
> not the string that will be processed, and the word "byte" is
> needed to make it clear that this only works for single-byte
> characters.  That way, the surprising behaviour of changing \['e]
> to \['E] is also implicitly specified.

Yes, that is economical.

> Also, please avoid groffisms and escapes unless they are really needed:
> 
>   in\~\c
>   .I stringvar
> 
> is better written as just:
> 
>   in
>   .I stringvar

Agreed.  This came from copy-and-pasting other material in groff(7) and
groff_diff(7), which do this all over the place--often unnecessarily, as
you note.

> I see no reason to suppress a line break at this point, and even if so,
> 
>   .RI in\~ stringvar
> 
> would be better.

Yup.  The only times I know of that we need constructs like
\~\c<newline> in man pages are:

(1) to extend an input line for a macro that treats the next input line
specially; usually this is .TP, but there are other possibilities which
we fortunately do not often see in the wild.

(2) to extend an input line when filling is disabled.  This should only
happen in .EX/.EE sections for "clean" man pages.

>   If the string contains an escape sequence, the sequence itself
>   will be transformed, not the resulting output; for exmample,
>   .stringup transforms \(or (bitwise or) to \(OR (logical or).

I ended up just letting the example do this work for me in groff_diff(7)
and the Texinfo manual.

> better: "each byte" and s/$/in place/.

I took the former but not the latter, pretty much just because I wanted
to keep the comment length to one line.

> > +void do_string_case_transform(case_xform_mode mode)
> > +{
> > +  if ((mode != STRING_DOWNCASE) && (mode != STRING_UPCASE)) {
> > +    error("impossible string case transformation mode %1!", mode);
> 
> This cannot happen.  Consequently, shouldn't it be assert()
> rather than error()?

Done (with sense of test inverted and De Morgan's Law applied).

[...]
> > +      if (mode == STRING_DOWNCASE)
> > +   nc = tolower(c);
> 
> Indentation looks weird here.

Yes.  The style of the file is to replace every 8 spaces with literal
tabs come hell or high water.  That does not align properly when diffed.
But it's the GNU Coding Style guidelines and it's not a fight I want at
this time.

There's a _lot_ about the C++ style in groff that is at strong variance
with my preferences.

Thanks for the code review!

Stage 2 is add registers to an-old.tmac to configurably drive .stringup
requests on page titles and section headings, for those who insist that
they must appear that way.

Stage 3 is to restore mixed case to all page titles and section headings
in our 61 man page source files.

Regards,
Branden

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]