[bug #62593] clarify description of end-of-sentence detection

From: G. Branden Robinson
Subject: [bug #62593] clarify description of end-of-sentence detection
Date: Wed, 15 Jun 2022 02:09:56 -0400 (EDT)

Follow-up Comment #9, bug #62593 (project groff):

[comment #6:]
> [comment #5:]
> Perhaps for completeness, but I think those two terms are intuitive enough
> to be used before a formal definition.  (And cross references can always point
> forward, in case anyone is confused about the terms.)  The text can say
> "ordinary space characters" without needing to bring in the concept of escapes
> yet; the reader who goes through the manual linearly (and such readers are
> probably rare anyway) may wonder "what's a NONordinary space?," but this won't
> trip up their understanding of the point being made here.  And the reader
> who's looking up this section for reference will understand immediately that
> there are other types of spaces.

I think I've found a way to address both Ingo's and my concerns.
> > page space (in DVI and PDF) is at a premium in this part of
> > the manual.
> Happily, the supply-chain issues limiting the supply of so many other things
> have not hit PDF pages yet.  And I bet the set of people who will print out
> the PDF manual is quite small.

Yes, but I also want the pages to _look_ nice.  I did get sucked into
development work on a typesetting system...
> > A CSTR#54-ish definition of sentence-ending detection that has
> > free recourse to the panoply of *roff jargon is better
> > situated in groff(7).
> Yes, but as long as the Texinfo manual advertises itself as the most
> complete reference, the detail does need to be there as well.

> In any case, I'm not sure how important the first change is; I think most
> readers will assume the unmodified word "spaces" means the spaces you get when
> you hit the keyboard's spacebar.


> The second change ("end of an input line") does seem a useful


Here's what I've got pending.  It's in word diff format.

diff --git a/doc/groff.texi b/doc/groff.texi
index 742f4cedc..56f39d833 100644
--- a/doc/groff.texi
+++ b/doc/groff.texi
@@ -4862,21 +4862,22 @@ inter-sentence space.

When GNU @code{troff} starts up, it obtains information about the device
for which it is preparing output.@footnote{@xref{Device and Font
Description Files}.}  [-A crucial example-]{+An essential property+} is the
length of the output
line, such as ``6.5 inches''.

@cindex word, definition of
@cindex filling
GNU @code{troff} {+interprets plain text files employing the Unix+}
{+line-ending convention.  It+} reads[-its-] input {+a+} character [-by
character,-]{+at a time,+}
collecting words as it goes, and fits as many words together on an
output line as it can---this is known as @dfn{filling}.  To GNU
@code{troff}, a @dfn{word} is any sequence of one or more characters
that aren't spaces, tabs, or newlines.  The exceptions separate
words.@footnote{There are also @emph{escape sequences} which can
function as word characters, word separators, or neither---the last
simply have no effect on GNU @code{troff}'s idea of whether an input
character is within a word or not.}  To disable filling, see
@ref{Manipulating Filling and Adjustment}.

It is a truth universally acknowledged
@@ -5110,8 +5111,8 @@ This is discussed in @ref{Manipulating Filling and
@subsection Adjustment

@cindex extra spaces between words
After GNU @code{troff} performs an automatic[-line-] break, it then tries to
@dfn{adjust} the line: inter-word spaces are widened until the text
reaches the right margin.  Extra spaces between words are preserved.
Leading and trailing spaces are handled as noted above.  Text can be
aligned to the left or right margins or centered; see @ref{Manipulating
@@ -5467,24 +5468,23 @@ traditions have accrued in service of these goals.
@itemize @bullet
Follow sentence endings in input with newlines to ease their
recognition (@pxref{Sentences}).  It is frequently convenient to
{+input lines+} after colons and semicolons as well, as these typically
precede independent clauses.  Consider [-breaking-]{+doing so+} after commas;
they often
occur in lists that become easy to scan when itemized by line, or
constitute supplements to the sentence that are added, deleted, or
updated to clarify it.  Parenthetical and quoted phrases are also good
candidates for placement on input lines by themselves.[-In filled text, spaces and-]
[-newlines are interchangeable; place breaks where it aids your purpose.-]

Set your text editor's line length to 72 characters or
fewer.@footnote{Emacs: @code{fill-column: 72}; Vim: @code{textwidth=72}}
This limit, combined with the previous [-advice regarding breaking around-]
[-punctuation,-]{+item of advice,+} makes it less
common that an input line will wrap in your text editor, and thus will
help you perceive excessively long constructions in your text.  Recall
that natural languages originate in speech, not writing, and that
punctuation is correlated with pauses for breathing and changes in

Use @code{\&} after @samp{!}, @samp{?}, and @samp{.} if they are
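
To make the first hunk's definitions of a word and of filling concrete, here
is a minimal Python sketch.  It is an illustration under simplifying
assumptions, not GNU troff's implementation: the function name `fill` and the
`line_length` parameter are hypothetical, line length is counted in characters
rather than device units, and the sketch ignores escape sequences,
inter-sentence space, hyphenation, and adjustment.  It also collapses runs of
spaces, whereas the manual says extra spaces between words are preserved.

```python
import re


def fill(text, line_length=65):
    """Greedily pack words onto output lines (a simplified model of filling)."""
    # A "word" is any run of characters that are not spaces, tabs, or newlines.
    words = re.findall(r"[^ \t\n]+", text)

    lines = []
    current = ""
    for word in words:
        candidate = word if not current else current + " " + word
        if current and len(candidate) > line_length:
            # The next word does not fit: emit the line and start a new one.
            lines.append(current)
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines
```

For instance, calling `fill("It is a truth\tuniversally\nacknowledged", 12)`
treats the space, the tab, and the newline alike as word separators, and
starts a new output line whenever the next word would not fit.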

