[groff] 02/03: doc/groff.texi: Tweak hyphenation documentation.


From: G. Branden Robinson
Subject: [groff] 02/03: doc/groff.texi: Tweak hyphenation documentation.
Date: Sat, 25 Jul 2020 07:45:59 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit 909f2bf45ea0d667b7a2c8ff54e7bb1f0dd8ce3e
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Sat Jul 25 20:45:34 2020 +1000

    doc/groff.texi: Tweak hyphenation documentation.
    
    I still wasn't quite happy with it.
    
    * doc/groff.texi (Manipulating Hyphenation):
      + (.hw): Add a hyphenation-prevention case to example.  Add forward
        references to hyphenation language and environments.
      + (\%): Replace "on the fly" idiom with plainer wording.  Recast to
        make clearer that \% can be used multiple times within a word.
        Relocate \X and \Y caveat to be contiguous with other paragraphs,
        even though it is pretty far down in the weeds.
      + (\:): Add several conceptual index entries to help users learn what
        this escape is good for.
      + (.hc): Use groff-style special character escapes; there's no reason
        to use AT&T style here.
      + Eliminate a mention in introductory paragraph of hyphenation being
        suppressible at "column" ends--this turned out to require quite a
        bit of explanation.
      + (.hy): Interrupt hyphenation mode table to describe entries > 1 as
        being deltas relative to "1" (using less jargon).
    
        Describe what hyphenation mode 2 actually does in a lengthy
        footnote.  I asked myself, "how is it that there is a hyphenation
        mode that is cognizant of column endings when the core language
        itself isn't?" The answer involves page position traps.  Give
        readers a glimpse of the sausage factory.  Eliminate now-inaccurate
        forward reference to coverage of .hw.
    
        Prefix reference to TeX escape with TeX logo to prevent
        confusion with roff escapes, which also start with backslashes.
      + (.hpf): Recast language making it more obvious that hyphenation
        pattern files are looked up just like macro files.  Recast items
        in list of parsing restrictions on pattern files to be complete
        sentences.
      + (.hpfcode): Try to make it clearer that arguments to this request
        can only be numbers (no concrete examples are shown because ASCII
        vs. EBCDIC--sigh), in contrast to .hcode.  Stop trying to conceal
        the special semantics of a zero hyphenation code behind the veil of
        an "undefined hyphenation code"; the code has explicit tests for
        zero all over the place.
      + Do some minor wordsmithing.
      + Fix typos.
    
    * man/groff.7.man:
    * man/groff_diff.7.man: Sync with changes to Texinfo.  Put two empty
      requests prior to paragraphing macros as that is the house style.
---
 doc/groff.texi       | 167 +++++++++++++++++++++++++++++----------------------
 man/groff.7.man      |  53 +++++++++++++---
 man/groff_diff.7.man |  32 +++++-----
 3 files changed, 156 insertions(+), 96 deletions(-)

diff --git a/doc/groff.texi b/doc/groff.texi
index 558e2a3..ba399f0 100644
--- a/doc/groff.texi
+++ b/doc/groff.texi
@@ -6582,11 +6582,11 @@ right-justified is associated with the current environment
 @codequoteundirected on
 GNU @code{troff} hyphenates words automatically by default.  Automatic
 hyphenation of words in natural languages is a subject requiring
-algorithms and data, and susceptible to conventions and preferences.
+algorithms and data, and is susceptible to conventions and preferences.
 Before tackling automatic hyphenation, let us consider how it can be
 done manually.
 
-Explicitly hyphenated phrases such as ``mother-in-law'' are eligible for
+Explicitly hyphenated words such as ``mother-in-law'' are eligible for
 breaking after each of their hyphens.  Relatively few words in a
 language offer such obvious break points, however.  In a short document
 we may wish to disable automatic hyphenation and explicitly instruct GNU
@@ -6595,12 +6595,17 @@ we may wish to disable automatic hyphenation and explicitly instruct GNU
 @cindex hyphenation exceptions
 @Defreq {hw, word @dots{}}
 Define each @dfn{hyphenation exception} @var{word} with each hyphen `-'
-in the word indicating a hyphenation point.  For example:
+in the word indicating a hyphenation point.  For example, the request
 
 @Example
-.hw in-sa-lub-rious
+.hw in-sa-lub-rious alpha
 @endExample
 
+@c Serendipitously, in PDF output, the "alpha" below gets hyphenated.
+@c Try to preserve this felicity in future edits.
+marks potential hyphenation points in ``insalubrious'', and prevents
+``alpha'' from being hyphenated at all.
+
 @noindent
 Besides the space character, any character whose hyphenation code is
 zero can be used to separate the arguments of @code{hw} (see the
@@ -6612,21 +6617,22 @@ Hyphenation points specified with @code{hw} are not subject to the
 restrictions given by the @code{hy} request (see below).
 
 Hyphenation exceptions specified with the @code{hw} request are
-associated with the hyphenation language and environment; calling the
-@code{hw} request in the absence of a hyphenation language is an error.
+associated with the hyphenation language (see below) and environment
+(@pxref{Environments}); calling the @code{hw} request in the absence of
+a hyphenation language is an error.
 
-The request is ignored if there is no parameter.
+The request is ignored if there are no parameters.
 @endDefreq
 
 These are known as hyphenation @emph{exceptions} in the expectation that
 most users will avail themselves of automatic hyphenation; these
-exceptions override any rules that would nomally apply to a word
+exceptions override any rules that would normally apply to a word
 matching a hyphenation exception defined with @code{hw}.
 
 Situations also arise when only a specific occurrence of a word needs
-its hyphenation altered (or suppressed altogether), or when something
-that is not a word in a natural language, like a URL, needs to be broken
-in sensible places without hyphens.
+its hyphenation altered or suppressed, or when something that is not a
+word in a natural language, like a URL, needs to be broken in sensible
+places without hyphens.
 
 @DefescList {\\%, , , }
 @DefescListEndx {\:, , , }
@@ -6634,30 +6640,35 @@ in sensible places without hyphens.
 @cindex character, hyphenation (@code{\%})
 @cindex disabling hyphenation (@code{\%})
 @cindex hyphenation, disabling (@code{\%})
-To tell GNU @code{troff} how to hyphenate words on the fly, use the
-@code{\%} escape, also known as the @dfn{hyphenation character}.
-Preceding a word with this character prevents it from being
-hyphenated; putting it inside a word indicates to GNU @code{troff} that
-the word may be hyphenated at that point.  Note that this mechanism
-only affects that one occurrence of the word; to change the
-hyphenation of a word for the entire document, use the @code{hw}
-request.
+To tell GNU @code{troff} how to hyphenate words as they occur in input,
+use the @code{\%} escape, also known as the @dfn{hyphenation character}.
+Preceding a word with this character prevents it from being hyphenated;
+each instance within a word indicates to GNU @code{troff} that the word
+may be hyphenated at that point.  Note that this mechanism only affects
+that occurrence of the word; to change the hyphenation of a word for the
+entire document, use the @code{hw} request.
 
+@cindex @code{\X}, followed by @code{\%}
+@cindex @code{\Y}, followed by @code{\%}
+@cindex @code{\%}, following @code{\X} or @code{\Y}
+Note that the escapes @code{\X} and @code{\Y} start a word; that is, the
+@code{\%} escape in (say) @w{@samp{\X'...'\%foobar}} or
+@w{@samp{\Y'...'\%foobar}} no longer prevents hyphenation but inserts a
+hyphenation point at the beginning of @samp{foobar}; most likely this
+isn't what you want to do.  @xref{Postprocessor Access}.
+
+@cindex hyphenless breaks (@code{\:})
+@cindex breaking without hyphens (@code{\:})
+@cindex file names, breaking (@code{\:})
+@cindex breaking file names (@code{\:})
+@cindex URLs, breaking (@code{\:})
+@cindex breaking URLs (@code{\:})
 The @code{\:} escape inserts a zero-width break point; that is, the word
 can break there, but no hyphen is written to the output if it does.
 
 @Example
 @dots{} check the /var/log/\:httpd/\:access_log file @dots{}
 @endExample
-
-@cindex @code{\X}, followed by @code{\%}
-@cindex @code{\Y}, followed by @code{\%}
-@cindex @code{\%}, following @code{\X} or @code{\Y}
-Note that @code{\X} and @code{\Y} start a word; that is, the @code{\%}
-escape in (say) @w{@samp{\X'...'\%foobar}} or
-@w{@samp{\Y'...'\%foobar}} no longer prevents hyphenation but inserts a
-hyphenation point at the beginning of @samp{foobar}; most likely this
-isn't what you want to do.
 @endDefesc
 
 @Defreq {hc, [@Var{char}]}
@@ -6679,32 +6690,33 @@ The hyphenation character is associated with the current environment
 @cindex @code{char} request, and soft hyphen character
 @cindex @code{tr} request, and soft hyphen character
 Set the @dfn{soft hyphen character} to @var{glyph}.@footnote{``Soft
-hyphen @emph{character}'' is a misnomer since it is an output glyph.}
-If the argument is omitted, the soft hyphen character is set to the
-default, @code{\(hy}.  The @dfn{soft hyphen character} is the glyph that
-is inserted when a word is automatically hyphenated at a line
-break.@footnote{It is ``soft'' because it only appears in ouput where
+hyphen @emph{character}'' is a misnomer since it is an output glyph.} If
+the argument is omitted, the soft hyphen character is set to the
+default, @code{\[hy]}.  The @dfn{soft hyphen character} is the glyph
+that is inserted when a word is automatically hyphenated at a line
+break.@footnote{It is ``soft'' because it only appears in output where
 hyphenation is actually performed; a ``hard'' hyphen, as in
 ``long-term'', always appears.}  If the soft hyphen character does not
 exist in the font of the character immediately preceding a potential
 break point, then the line is not broken at that point.  Neither
 definitions (specified with the @code{char} request) nor translations
-(specified with the @code{tr} request) are considered when assigning
-the soft hyphen character.
+(specified with the @code{tr} request) are considered when assigning the
+soft hyphen character.
 @endDefreq
 
 @cindex hyphenation, automatic
 Several requests influence automatic hyphenation.  Because conventions
 vary, a variety of hyphenation modes are available to the @code{hy}
 request; these determine whether automatic hyphenation will apply to a
-word prior to breaking a line at the end of a column or page, and at
-which positions within that word hyphenation is permissible.  The places
-within a word that are eligible for hyphenation are determined by
-language-specific data and lettercase relationships.  Furthermore,
-hyphenation of a word might be suppressed because too many previous
-lines have been hyphenated (@code{hlm}), the line has not reached a
-certain minimum length (@code{hym}), or the line can instead be adjusted
-with up to a certain amount of additional inter-word space (@code{hys}).
+word prior to breaking a line at the end of a page (more or less; see
+below for details), and at which positions within that word hyphenation
+is permissible.  The places within a word that are eligible for
+hyphenation are determined by language-specific data and lettercase
+relationships.  Furthermore, hyphenation of a word might be suppressed
+because too many previous lines have been hyphenated (@code{hlm}), the
+line has not reached a certain minimum length (@code{hym}), or the line
+can instead be adjusted with up to a certain amount of additional
+inter-word space (@code{hys}).
 
 @DefreqList {hy, [@Var{mode}]}
 @DefregListEndx {.hy}
@@ -6728,9 +6740,22 @@ disables hyphenation.
 enables hyphenation except after the first and before the last character
 of a word; this is the default if @var{mode} is omitted and also the
 start-up value of GNU @code{troff}.
+@end table
 
+The remaining values ``imply'' 1; that is, they enable hyphenation
+under the same conditions as @samp{.hy 1}, and then apply or lift
+restrictions relative to that basis.
+
+@table @code
 @item 2
-disables hyphenation of the last word on a page or column.
+disables hyphenation of the last word on a page.@footnote{Technically,
+this value prevents hyphenation if the next page position trap is closer
+than the next line of text would be.  GNU @code{troff} automatically
+inserts an implicit page position trap at the end of each page to cause
+a page transition.  This value can be used in traps planted by users or
+macro packages to prevent hyphenation of the last word in a column in
+multi-column page layouts or before floating figures or tables.
+@xref{Traps}.}
 
 @item 4
 disables hyphenation before the last two characters of a word.
@@ -6748,16 +6773,14 @@ enables hyphenation after the first character of a word.
 Note that any restrictions imposed by the hyphenation mode are
 @emph{not} respected for words whose hyphenations have been explicitly
 specified with the hyphenation character (@samp{\%} by default) or the
-@code{hw} request (see below).
+@code{hw} request.
 
 The nonzero values in the previous table are additive.  For example,
 value@tie{}12 causes GNU @code{troff} to hyphenate neither the last two
-nor the first two characters of a word.  Note that value@tie{}13 would
-do exactly the same; in other words, value@tie{}1 need not be added if
-the value is larger than@tie{}1.
-
-Some values cannot be used together because they contradict; for
-instance, values 4 and@tie{}16, and values 8 and@tie{}32.
+nor the first two characters of a word.  Some values cannot be used
+together because they contradict; for instance, values 4 and@tie{}16,
+and values 8 and@tie{}32.  As noted, it is superfluous to add 1 to any
+other positive value.
 
 @cindex hyphenation pattern files
 @cindex pattern files, for hyphenation
@@ -6810,8 +6833,9 @@ files.
 @end multitable
 
 Hyphenation exceptions within pattern files (i.e., the words within a
-@code{\hyphenation} group) also obey the hyphenation restrictions given
-by @code{hy}.  However, exceptions specified with @code{hw} do not.
+@TeX{} @code{\hyphenation} group) also obey the hyphenation restrictions
+given by @code{hy}.  However, exceptions specified with @code{hw} do
+not.
 
 The hyphenation mode is associated with the current environment
 (@pxref{Environments}).
@@ -6833,8 +6857,8 @@ not remembered.
 @cindex hyphenation patterns (@code{hpf})
 @cindex patterns for hyphenation (@code{hpf})
 Read hyphenation patterns from @var{pattern-file}.  This file is sought
-in the same way as @file{@var{name}.tmac} (or @file{tmac.@var{name}}) if
-the @option{-m@var{name}} option is specified.
+in the same way that macro files are with the @code{mso} request or the
+@option{-m@var{name}} command-line option to @code{groff}.
 
 The @var{pattern-file} should have the same format as (simple) @TeX{}
 pattern files.  More specifically, the following scanning rules are
@@ -6846,7 +6870,7 @@ A percent sign starts a comment (up to the end of the line) even if
 preceded by a backslash.
 
 @item
-No support for `digraphs' like @code{\$}.
+``Digraphs'' like @code{\$} are not supported.
 
 @item
 @code{^^@var{xx}} (where each @var{x} is 0--9 or a--f) and
@@ -6854,7 +6878,7 @@ No support for `digraphs' like @code{\$}.
 decimal) are recognized; other uses of @code{^} cause an error.
 
 @item
-No macro expansion.
+No macro expansion is performed.
 
 @item
 @code{hpf} checks for the expression @code{\patterns@{@dots{}@}}
@@ -6885,7 +6909,8 @@ codes---integers from 0 to@tie{}255.  The request maps character
 code@tie{}@var{a} to code@tie{}@var{b}, code@tie{}@var{c} to
 code@tie{}@var{d}, and so on.  Character codes that would otherwise be
 invalid in GNU @code{troff} can be used.  By default, every code maps to
-itself except letters `A' to `Z', which map to `a' to `z'.
+itself except those for letters `A' to `Z', which map to those for `a'
+to `z'.
 
 @pindex troffrc
 @pindex troffrc-end
@@ -6917,18 +6942,18 @@ input character (not a special character) other than a digit or a space.
 The request is ignored if it has no parameters.
 
 For hyphenation to work, hyphenation codes must be set up.  At
-start-up, GNU @code{troff} assigns hyphenation codes only to the letters
-@samp{a}--@samp{z} (mapped to themselves) and to the letters
-@samp{A}--@samp{Z} (mapped to @samp{a}--@samp{z}); all other characters
-have undefined hyphenation codes.  Normally, hyphenation patterns
-contain only lowercase letters which should be applied regardless of
-case.  In other words, they assume that the words `FOO' and `Foo' should
-be hyphenated exactly as `foo' is.  The @code{hcode} request extends
-this principle to letters outside the Unicode basic Latin alphabet;
-without it, words containing such letters won't be hyphenated properly
-even if the corresponding hyphenation patterns contain them.  For
-example, the following @code{hcode} requests are necessary to assign
-hyphenation codes to the letters @samp{ÄäÖöÜüß} (needed for German):
+start-up, GNU @code{troff} assigns hyphenation codes to the letters
+@samp{a}--@samp{z} (mapped to themselves), to the letters
+@samp{A}--@samp{Z} (mapped to @samp{a}--@samp{z}), and zero to all other
+characters.  Normally, hyphenation patterns contain only lowercase
+letters which should be applied regardless of case.  In other words,
+they assume that the words `FOO' and `Foo' should be hyphenated exactly
+as `foo' is.  The @code{hcode} request extends this principle to letters
+outside the Unicode basic Latin alphabet; without it, words containing
+such letters won't be hyphenated properly even if the corresponding
+hyphenation patterns contain them.  For example, the following
+@code{hcode} requests are necessary to assign hyphenation codes to the
+letters @samp{ÄäÖöÜüß} (needed for German):
 
 @Example
 .hcode ä ä  Ä ä
@@ -7023,7 +7048,7 @@ register.
 Suppress hyphenation of the line in adjustment modes @samp{b} or
 @samp{n} if it can be justified by adding no more than
 @var{hyphenation-space} extra space to each word space.  Without an
-argument, the hyphenation space adjustment threshole is set to its
+argument, the hyphenation space adjustment threshold is set to its
 default value, 0.  The default scaling indicator is @samp{m}.  The
 hyphenation space adjustment threshold is associated with the current
 environment (@pxref{Environments}).
diff --git a/man/groff.7.man b/man/groff.7.man
index 3cc14a8..4b3982c 100644
--- a/man/groff.7.man
+++ b/man/groff.7.man
@@ -4574,55 +4574,90 @@ a variety of hyphenation modes are available to the
 .B .hy
 request;
 these determine whether automatic hyphenation will apply to a word prior
-to breaking a line at the end of a column or page,
+to breaking a line at the end of a page
+(more or less;
+see below for details),
 and at which positions within that word hyphenation is permissible.
 .
 The default is
 .RB \[lq] 1 \[rq]
 for historical reasons,
-but macro packages often override this default.
+but this is not an appropriate value for the U.S.\& English hyphenation
+patterns used by
+.IR groff ,
+and macro packages often override it.
 .
 .
 .TP
 .B 0
 disables hyphenation.
 .
+.
 .TP
 .B 1
 enables hyphenation except after the first and before the last character
 of a word.
 .
+.
+.P
+The remaining values \[lq]imply\[rq]
+.BR 1 ;
+that is,
+they enable hyphenation under the same conditions as
+.RB \[lq] ".hy 1" \[rq],
+and then apply or lift restrictions relative to that basis.
+.
+.
 .TP
 .B 2
-disables hyphenation of the last word on a page or column.
+disables hyphenation of the last word on a page.
+.
+(Technically,
+this value prevents hyphenation if the next page position trap is closer
+than the next line of text would be.
+.
+.I groff
+automatically inserts an implicit page position trap at the end of each
+page to cause a page transition.
+.
+This value can be used in traps planted by users or macro packages to
+prevent hyphenation of the last word in a column in multi-column page
+layouts or before floating figures or tables.)
+.\" If this page ever grows a Traps section, cross-reference it here.
+.
 .
 .TP
 .B 4
 disables hyphenation before the last two characters of a word.
 .
+.
 .TP
 .B 8
 disables hyphenation after the first two characters of a word.
 .
+.
 .TP
 .B 16
 enables hyphenation before the last character of a word.
 .
+.
 .TP
 .B 32
 enables hyphenation after the first character of a word.
 .
+.
 .P
 Note that any restrictions imposed by the hyphenation mode are
 .I not
-respected for words whose hyphenations have been explicitly
-specified with the hyphenation character
+respected for words whose hyphenations have been explicitly specified
+with the hyphenation character
 .RB (\[lq] \[rs]% \[rq]
 by default)
 or the
 .B .hw
 request.
 .
+.
 .P
 The nonzero values above are additive.
 .
@@ -4632,16 +4667,14 @@ value\~12 causes
 to hyphenate neither the last two nor the first two characters of a
 word.
 .
-Note that value\~13 would do exactly the same;
-in other words,
-value\~1 need not be added if the value is larger than\~1.
-.
-.P
 Some values cannot be used together because they contradict;
 for instance,
 values 4 and\~16,
 and values 8 and\~32.
 .
+As noted,
+it is superfluous to add\~1 to any other positive value.
+.
 .
 .P
 The places within a word that are eligible for hyphenation are
diff --git a/man/groff_diff.7.man b/man/groff_diff.7.man
index 2e96c14..4635e2b 100644
--- a/man/groff_diff.7.man
+++ b/man/groff_diff.7.man
@@ -1966,11 +1966,11 @@ hyphenation codes must be set up.
 .
 At start-up,
 .I groff
-assigns hyphenation codes only to the letters \[lq]a\[en]z\[rq]
+assigns hyphenation codes to the letters \[lq]a\[en]z\[rq]
 (mapped to themselves)
 and to the letters \[lq]A\[en]Z\[rq]
-(mapped to \[lq]a\[en]z\[rq]);
-all other characters have undefined hyphenation codes.
+(mapped to \[lq]a\[en]z\[rq])
+and zero to all other characters.
 .
 Normally,
 hyphenation patterns contain only lowercase letters which should be
@@ -2084,14 +2084,13 @@ are counted; explicit hyphens are not.
 Read hyphenation patterns from
 .IR pattern-file .
 .
-This file sought in the same way as
-.IB name .tmac
-(or
-.BI tmac. name\c
-)
-is searched for when the
+This file is sought in the same way that macro files are with the
+.B .mso
+request or the
 .BI \-m name
-option is specified.
+command-line option to
+.IR groff (@MAN1EXT@).
+.
 .
 .IP
 The
@@ -2101,6 +2100,7 @@ should have the same format as (simple) \*[tx] pattern files.
 More specifically,
 the following scanning rules are implemented.
 .
+.
 .RS
 .IP \[bu]
 A percent sign starts a comment
@@ -2109,8 +2109,9 @@ even if preceded by a backslash.
 .
 .
 .IP \[bu]
-No support for \[lq]digraphs\[rq] like
-.BR \[rs]$ .
+\[lq]Digraphs\[rq] like
+.B \[rs]$
+are not supported.
 .
 .
 .IP \[bu]
@@ -2129,7 +2130,7 @@ cause an error.
 .
 .
 .IP \[bu]
-No macro expansion.
+No macro expansion is performed.
 .
 .
 .IP \[bu]
@@ -2266,8 +2267,9 @@ Character codes that would otherwise be invalid in
 can be used.
 .
 By default,
-every code maps to itself except letters \[lq]A\[rq] to \[lq]Z\[rq],
-which map to \[lq]a\[rq] to \[lq]z\[rq].
+every code maps to itself except those for letters \[lq]A\[rq] to
+\[lq]Z\[rq],
+which map to those for \[lq]a\[rq] to \[lq]z\[rq].
 .
 .
 .TP
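
The additive .hy mode values described in the patch above can be
sketched in Python.  This is only an illustration, not groff code: the
names HY_FLAGS, CONFLICTS, and describe_hy_mode are hypothetical, and
treating the additive values as bit flags of a non-negative integer is
an assumption of the sketch.  The flag meanings and the contradictory
pairs (4 with 16, 8 with 32) are taken from the documentation text in
the diff.

```python
# Meanings of the values greater than 1, as worded in the patched table.
# Each of them implies mode 1 and then applies or lifts a restriction.
HY_FLAGS = {
    2: "no hyphenation of the last word on a page",
    4: "no hyphenation before the last two characters of a word",
    8: "no hyphenation after the first two characters of a word",
    16: "hyphenation allowed before the last character of a word",
    32: "hyphenation allowed after the first character of a word",
}

# Pairs the documentation says cannot be used together.
CONFLICTS = [(4, 16), (8, 32)]


def describe_hy_mode(mode):
    """Return (descriptions, contradictory pairs) for a .hy mode value.

    Assumes mode is a non-negative integer (an assumption of this sketch).
    """
    if mode == 0:
        return ["hyphenation disabled"], []
    # Any positive value implies 1, so adding 1 changes nothing.
    parts = ["hyphenation enabled on the basis of mode 1"]
    for bit in sorted(HY_FLAGS):
        if mode & bit:
            parts.append(HY_FLAGS[bit])
    clashes = [pair for pair in CONFLICTS if mode & pair[0] and mode & pair[1]]
    return parts, clashes


if __name__ == "__main__":
    for value in (0, 1, 12, 13, 20):
        print(value, describe_hy_mode(value))
```

Under these assumptions, 12 and 13 produce the same description, which
matches the remark that adding 1 to any other positive value is
superfluous, and 20 (4 plus 16) is reported as a contradictory pair.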


