Re: Warn on mid-input line sentence endings
From: G. Branden Robinson
Subject: Re: Warn on mid-input line sentence endings
Date: Sun, 30 Apr 2023 07:34:57 -0500
At 2023-04-30T03:04:27+0200, Alejandro Colomar wrote:
> On 4/30/23 02:05, G. Branden Robinson wrote:
> > I should have said "_Warn on_ semantic newlines" is a terrible
> > instruction/summary.
>
> That's why I used the phrase (at least I tried to do it consistently
> recently) "warn on S. N. violations".
Alas, it got lost in the most recent thread subject line on this topic
to the groff list...
https://lists.gnu.org/archive/html/groff/2023-04/msg00334.html
Hmm, I see that was Bjarni's doing. Being from Iceland, he perhaps has
more of the spirit of Loki than most...
> > They are what we _don't_ want to warn about upon encountering them.
> >
> > If man-pages(7) or other people continue to call the practice of
> > breaking *roff input lines after sentence-ending punctuation
> > "semantic newlines", I have no complaint. It could also be called
> > "Kernighan breaking", in honor of an early popularizer of the
> > practice.
>
> You could use it for the warning name ;).
Not a chance. :P
As I noted, I want this under the "style" penumbra now, along with some
other bits of weirdness.
https://savannah.gnu.org/bugs/?62776
> > This is categorically not what regular expressions can cope with,
> > formally.
>
> Well, formally yes. And a regex can't find C function definitions in
> a source tree; at least if you try to fool it by writing the most
> horrible code in the universe. But I wrote a relatively small
> script[1] that finds a lot of C code with pcre2grep(1), and works most
> of the time. It has limitations; some of which can be fixed by
> improving the regexes (read: making them even more unreadable); some
> others are likely impossible to fix with a regex. The biggest
> limitation I think I've met is K&R-style functions: I don't think a
> regex can cope with them.
I don't know if you have to cope with "the lexer hack", but you might.
https://en.wikipedia.org/wiki/Lexer_hack
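
For readers who have not met the problem: the C statement "foo * bar;" is a pointer declaration if foo names a type and a multiplication otherwise, so no purely lexical pattern can decide between the two. The toy below is only an illustrative sketch, not how any real compiler is structured. The function name classify and the typedef_names set are hypothetical stand-ins for the symbol-table feedback that "the lexer hack" provides.

```python
import re

# Matches only the ambiguous shape "A * B;" (illustrative, not a C parser).
STMT = re.compile(r"^\s*([A-Za-z_]\w*)\s*\*\s*([A-Za-z_]\w*)\s*;\s*$")


def classify(stmt, typedef_names):
    """Classify "A * B;" using a caller-supplied set of known type names.

    typedef_names plays the role of the symbol table that a C front end
    must consult; the regex alone cannot supply this information.
    """
    m = STMT.match(stmt)
    if m is None:
        return "unknown"
    if m.group(1) in typedef_names:
        return "declaration"  # "foo * bar;" declares bar as pointer to foo
    return "expression"       # "foo * bar;" multiplies two objects


# Identical text, different meaning depending on context:
same_text = "foo * bar;"
as_decl = classify(same_text, {"foo"})
as_expr = classify(same_text, set())
```

The same input string yields two different answers, which is exactly the information a sigil on objects or types would have put into the token itself.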
How much grief might have been saved if objects in C had been prefixed
with a sigil like $, or if types had been prefixed with %?
In my imagination, Thompson vetoed this, but when I consider it more
seriously, I reckon the truth is more complicated, and arises from C's
origins in the wholly untyped B language. The dialect of C we see in
Version 6 Unix (q.v. the Lions book) is shockingly loosely typed to
modern eyes. I once ground the productivity of my workplace to a halt
for an entire afternoon by presenting my colleagues with the attached
exhibit of "legal C". (It remained legal in AT&T USG Unix for many,
many years.)
> I believe a regex-based script can be good enough for some purposes,
> even if it's not perfect.
All of this is true, and I like programming languages that are dead
simple to lexically analyze. (But I spend next to no time working in
them.)
I'm strident on this point because I'm opposed to putting a diagnostic
into the formatter that throws false positives. That would disserve
users.
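
To make the false-positive worry concrete, here is a minimal sketch of the kind of regex check under discussion. It is not groff's code, nor the script mentioned above; the pattern, the name flag_lines, and the sample input are all assumptions made up for illustration.

```python
import re

# Naive rule: sentence-ending punctuation, optional closing quote or
# bracket, then whitespace and more text on the same input line.
MID_LINE_END = re.compile(r"[.?!][\"')\]]*\s+\S")


def flag_lines(text):
    """Return 1-based numbers of lines the naive rule would warn about."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if MID_LINE_END.search(line)
    ]


sample = "\n".join([
    "This is one sentence.",             # line 1: ends at the newline, fine
    "First.  Second on the same line.",  # line 2: genuine violation
    "warn on S. N. violations",          # line 3: abbreviation, not a sentence end
    "See e.g. the Lions book.",          # line 4: abbreviation, not a sentence end
])

flagged = flag_lines(sample)  # [2, 3, 4]
```

Only line 2 is a real violation; lines 3 and 4 are flagged because the pattern cannot tell an abbreviation from the end of a sentence. That may be tolerable in an optional lint script, but it is the behavior I do not want from the formatter itself.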
Regards,
Branden
Attachment: legal_c.jpeg