groff-commit

[groff] 06/06: [docs]: Reduce use of term "entity".


From: G. Branden Robinson
Subject: [groff] 06/06: [docs]: Reduce use of term "entity".
Date: Tue, 25 Apr 2023 01:15:07 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit 0223aef4164a7b07cb933a397894878bb61773b5
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Mon Apr 24 19:58:08 2023 -0500

    [docs]: Reduce use of term "entity".
    
    Doug McIlroy noted this vague term, which groff employs for multiple
    purposes.  Eliminate its application to input processing.  There is now
    no longer such a thing as an "entity" in the groff language.
    
    * doc/groff.texi (Character Translations): Do it.  Also clarify
      "nothing" as "the dummy character".
    
      (Using Symbols): Do it.  Also recast explanation of difference between
      characters and glyphs.  Explicitly state that spaces aren't glyphs.
      Document that `rchar` request can't remove definitions supplied by
      font description files.
    
      (Ligatures and Kerning): Speak of "special characters", not
      "entities".
    
      (Other Differences): Recast discussion of character-to-glyph
      transformation.  Stop qualifying characters as "input".  Recast
      discussion of example.
    
    * font/devutf8/NOTES: Revise use of terminology.  Perform a Kemper
      notectomy.  Wrap long lines.
    
    * man/groff.7.man (Request short reference) <char>: Speak of a "special
      character", not an "entity".
    
      <rchar>: Document that request can't remove definitions supplied by
      font description files.
    
    * man/groff_diff.7.man (Implementation differences): Sync with our
      Texinfo manual.
    
    The use of "entity" to describe how a glyph gets mapped back to a
    character (sequence) for the HTML and terminal output devices is
    retained.  That usage is restricted to discussion of output drivers
    (code comments and function names notwithstanding).
---
 doc/groff.texi       | 86 +++++++++++++++++++++++++++-------------------------
 font/devutf8/NOTES   | 34 ++++++++++-----------
 man/groff.7.man      | 22 +++++++++-----
 man/groff_diff.7.man | 73 ++++++++++++++++++++++++--------------------
 4 files changed, 116 insertions(+), 99 deletions(-)

diff --git a/doc/groff.texi b/doc/groff.texi
index 67f2d31f8..7cf609238 100644
--- a/doc/groff.texi
+++ b/doc/groff.texi
@@ -6360,10 +6360,10 @@ not qualify, so our first attempt got a warning.
 @section Identifiers
 @cindex identifiers
 
-An @dfn{identifier} is a label for an object of syntactical
-importance:@: a register, name (macro, string, or diversion), typeface,
-color, special character, character class, environment, or stream.
-Valid identifiers consist of one or more ordinary characters.
+An @dfn{identifier} labels a GNU @code{troff} datum such as a register,
+name (macro, string, or diversion), typeface, color, special character,
+character class, environment, or stream.  Valid identifiers consist of
+one or more ordinary characters.
 @cindex ordinary character
 @cindex character, ordinary
 An @slanted{ordinary character} is an input character that is not a
@@ -9371,7 +9371,7 @@ foo bar
 @endExample
 
 @noindent
-It is even possible to map the space character to nothing:
+Even the space character can be mapped to the dummy character.
 
 @Example
 .tr aa \&
@@ -9397,8 +9397,8 @@ affected by @code{tr}.
 
 @item
 Translating character to glyphs where one of them or both are undefined
-is possible also; @code{tr} does not check whether the entities in its
-argument do exist.
+is possible also; @code{tr} does not check whether the elements of its
+argument exist.
 
 @xref{Gtroff Internals}.
 
@@ -10527,13 +10527,16 @@ this is font 1 again
 @cindex character, distinguished from glyph
 @cindex ligature
 A @dfn{glyph} is a graphical representation of a @dfn{character}.  While
-a character is an abstract entity containing semantic information, a
-glyph is something that can be actually seen on screen or paper.  It is
-possible that a character has multiple glyph representation forms (for
-example, the character `A' can be either written in a roman or an italic
-font, yielding two different glyphs); sometimes more than one character
-maps to a single glyph (this is a @dfn{ligature}---the most common is
-`fi').
+a character is an abstraction of semantic information, a glyph is
+something that can be seen on screen or paper.  A character has many
+possible representation forms (for example, the character `A' can be
+written in an upright or slanted typeface, producing distinct
+glyphs).  Sometimes, a sequence of characters map to a single glyph:@:
+this is a @dfn{ligature}---the most common is `fi'.
+
+Space characters never become glyphs in GNU @code{troff}.  If not
+discarded (as when trailing on text lines), they are represented by
+horizontal motions in the output.
 
 @cindex symbol
 @cindex special fonts
@@ -11064,16 +11067,15 @@ request, but before the already mounted special fonts.
 @xref{Character Classes}.
 @endDefreq
 
-@DefreqList {rchar, c1 c2 @dots{}}
-@DefreqListEndx {rfschar, f c1 c2 @dots{}}
+@DefreqList {rchar, c @dots{}}
+@DefreqListEndx {rfschar, f c @dots{}}
 @cindex removing glyph definition (@code{rchar}, @code{rfschar})
 @cindex glyph, removing definition (@code{rchar}, @code{rfschar})
 @cindex fallback glyph, removing definition (@code{rchar}, @code{rfschar})
-Remove the definitions of glyphs @var{c1}, @var{c2},@tie{}@dots{},
+Remove definition of each ordinary or special character @var{c},
 undoing the effect of a @code{char}, @code{fchar}, or @code{schar}
-request.
-
-Spaces and tabs are optional between @var{cn}@tie{}arguments.
+request.  Those supplied by font description files cannot be removed.
+Spaces and tabs may separate @var{c}@tie{}arguments.
 
 The request @code{rfschar} removes glyph definitions defined with
 @code{fschar} for font@tie{}@var{f}.
@@ -11399,8 +11401,8 @@ supported `ff', `ffi', and `ffl' ligatures.  Advanced typesetters or
 @code{troff} does not support these (yet).
 
 Only the current font is checked for ligatures and kerns; neither
-special fonts nor entities defined with the @code{char} request (and its
-siblings) are taken into account.
+special fonts nor special characters defined with the @code{char} request
+(and its siblings) are taken into account.
 
 @DefreqList {lg, [@Var{flag}]}
 @DefregListEndx {.lg}
@@ -17217,21 +17219,20 @@ each rounded down to the nearest multiple of@tie{}12.
 @cindex characters, input, and output glyphs, compatibility with @acronym{AT&T} @code{troff}
 @cindex glyphs, output, and input characters, compatibility with @acronym{AT&T} @code{troff}
 In GNU @code{troff} there is a fundamental difference between
-(unformatted) input characters and (formatted) output glyphs.
-Everything that affects how a glyph is output is stored with the glyph
-node; once a glyph node has been constructed, it is unaffected by any
-subsequent requests that are executed, including @code{bd}, @code{cs},
-@code{tkf}, @code{tr}, or @code{fp} requests.  Normally, glyphs are
-constructed from input characters immediately before the glyph is added
-to the current output line.  Macros, diversions, and strings are all, in
-fact, the same type of object; they contain lists of input characters
-and glyph nodes in any combination.  Special characters can be both:
-before being added to the output, they act as input entities;
-afterward, they denote glyphs.  A glyph node does not behave like an
-input character for the purposes of macro processing; it does not
-inherit any of the special properties that the input character from
-which it was constructed might have had.  Consider the following
-example.
+(unformatted) characters and (formatted) glyphs.  Everything that
+affects how a glyph is output is stored with the glyph node; once a
+glyph node has been constructed, it is unaffected by any subsequent
+requests that are executed, including @code{bd}, @code{cs}, @code{tkf},
+@code{tr}, or @code{fp} requests.  Normally, glyphs are constructed from
+characters immediately before the glyph is added to an output line.
+Macros, diversions, and strings are all, in fact, the same type of
+object; they contain a sequence of intermixed character and glyph nodes.
+Special characters transform from one to the other:@: before being added
+to the output, they behave as characters; afterward, they are glyphs.  A
+glyph node does not behave like a character node when it is processed by
+a macro:@: it does not inherit any of the special properties that the
+character from which it was constructed might have had.  For example,
+the input
 
 @Example
 .di x
@@ -17242,11 +17243,12 @@ example.
 @endExample
 
 @noindent
-It prints @samp{\\} in GNU @code{troff}; each pair of input backslashes
-is turned into one output backslash and the resulting output backslashes
-are not interpreted as escape characters when they are reread.
-@acronym{AT&T} @code{troff} would interpret them as escape characters
-when they were reread and would end up printing one @samp{\}.
+produces @samp{\\} in GNU @code{troff}.  Each pair of backslashes
+becomes one backslash @emph{glyph}; the resulting backslashes are thus
+not interpreted as escape @emph{characters} when they are reread as the
+diversion is output.  @acronym{AT&T} @code{troff} @emph{would} interpret
+them as escape characters when rereading them and end up printing one
+@samp{\}.
 
 @cindex printing backslash (@code{\\}, @code{\e}, @code{\E}, @code{\[rs]})
 @cindex backslash, printing (@code{\\}, @code{\e}, @code{\E}, @code{\[rs]})
diff --git a/font/devutf8/NOTES b/font/devutf8/NOTES
index 0906b8ade..c20edebd4 100644
--- a/font/devutf8/NOTES
+++ b/font/devutf8/NOTES
@@ -1,47 +1,47 @@
-Note that all \[charXXX] entity names have been removed from the font files.
-They don't make sense for Unicode.
+All \[charXXX] special character names have been removed from the font
+files.  They don't make sense for Unicode.
 
-The following entity from the original troff manual (by Ossanna and
-Kernighan) is unmapped:
+The following special character name from the AT&T troff manual by
+Ossanna and Kernighan is unmapped:
 
   bs    shaded solid ball (Bell System logo, AT&T logo)
 
-Character 0x002D has not been given a name because its Unicode name
+Code point 0x002D has not been given a name because its Unicode name
 HYPHEN-MINUS is so ambiguous that it is unusable for serious typographic
-use.
+work.
 
 \[wp] has been mapped to 0x2118, because according to Unicode 4.1's
 NamesList.txt, U+2118 SCRIPT CAPITAL P is really a Weierstrass 'p',
 neither SCRIPT nor CAPITAL.
 
 The following line could be added; \[space] is known to devps but is not
-documented and not known to devdvi (actually, there is no space glyph within
-the TeX system).
+documented and not known to devdvi (actually, there is no space glyph
+within the TeX system).
 
   space   24   0   0x0020
 
-devps maps \[*U] to 'Upsilon1', which is equivalent to 0x03D2.  We map it to
-0x03A5 instead.
+devps maps \[*U] to 'Upsilon1', which is equivalent to 0x03D2.  We map
+it to 0x03A5 instead.
 
-devps maps \[*W] to 'Omega', which is equivalent to either 0x2126 or 0x03A9.
-We map it to 0x03A9.
+devps maps \[*W] to 'Omega', which is equivalent to either 0x2126 or
+0x03A9.  We map it to 0x03A9.
 
-devps maps \[*D] to 'Delta', which is equivalent to either 0x2206 or 0x0394.
-We map it to 0x0394.
+devps maps \[*D] to 'Delta', which is equivalent to either 0x2206 or
+0x0394.  We map it to 0x0394.
 
 
 Adding Unicode characters
 -------------------------
 
 Assume you want to use a Unicode character not provided in the list, say
-U+20AC. You need to do two things:
+U+20AC.  You need to do two things:
 
 - Add a line
 
     u20AC   24   0   0x20AC
 
   (the second column is computed as 24 * wcwidth(0x20AC)) to the file
-  R.proto, or, when groff is already installed, to the four fonts files in
-  $(prefix)/share/groff/<version>/font/devutf8/.
+  R.proto, or, when groff is already installed, to the four font
+  description files in $(prefix)/share/groff/<version>/font/devutf8/.
 
 - In your source file, use the notation \[u20AC] to access it.
diff --git a/man/groff.7.man b/man/groff.7.man
index 07944567d..af1ebbbb2 100644
--- a/man/groff.7.man
+++ b/man/groff.7.man
@@ -1254,8 +1254,11 @@ containing them is surrounded by parentheses.
 .\" ====================================================================
 .
 .\" BEGIN Keep (roughly) parallel with groff.texi node "Identifiers".
-An identifier is a label for an object of syntactical importance:
-a register,
+An
+.I identifier
+labels a GNU
+.I troff \" GNU
+datum such as a register,
 name
 (macro,
 string,
@@ -2581,7 +2584,7 @@ Reset no-break control character to
 .REQ .c2 "o"
 Recognize ordinary character
 .I o
-as the no-break control character.
+as no-break control character.
 .
 .TPx
 .REQ .cc
@@ -2646,7 +2649,7 @@ by moving its location to
 .
 .TPx
 .REQ .char "c contents"
-Define entity
+Define ordinary or special character
 .I c
 as
 .IR contents .
@@ -3800,11 +3803,16 @@ Change post-vertical line spacing according to
 .scaleindicator p ).
 .
 .TPx
-.REQ .rchar "c1 c2 \fR\&.\|.\|.\&\fP"
-Remove the definitions of entities
+.REQ .rchar "c1 c2 \fR.\|.\|.\&\fP"
+Remove definition of each ordinary or special character
 .IR c1 ,
 .IR c2 ,
-\&.\|.\|.\&
+\&.\|.\|.\& defined by a
+.request .char ,
+.request .fchar ,
+or
+.request .schar
+request.
 .
 .TPx
 .REQ .rd "prompt"
diff --git a/man/groff_diff.7.man b/man/groff_diff.7.man
index e1a17c598..56770f0ef 100644
--- a/man/groff_diff.7.man
+++ b/man/groff_diff.7.man
@@ -5453,45 +5453,48 @@ each rounded down to the nearest multiple of\~12.
 .
 .
 .P
-In
-.IR groff ,
-there is a fundamental difference between unformatted input
-characters, and formatted output characters (glyphs).
+In GNU
+.I troff \" GNU
+there is a fundamental difference between (unformatted) characters and
+(formatted) glyphs.
 .
-Everything that affects how a glyph is output is stored with the glyph;
-once a glyph has been constructed,
+Everything that affects how a glyph is output is stored with the glyph
+node;
+once a glyph node has been constructed,
 it is unaffected by any subsequent requests that are executed,
-including the
-.BR .bd ,
-.BR .cs ,
-.BR .tkf ,
-.BR .tr ,
+including
+.BR bd ,
+.BR cs ,
+.BR tkf ,
+.BR tr ,
 or
-.B .fp
+.B fp
 requests.
 .
 Normally,
-glyphs are constructed from input characters immediately before the
-glyph is added to the current output line.
+glyphs are constructed from characters immediately before the glyph is
+added to an output line.
 .
 Macros,
 diversions,
 and strings are all,
 in fact,
 the same type of object;
-they contain lists of input characters and glyphs in any combination.
+they contain a sequence of intermixed character and glyph nodes.
 .
-Special characters can be both: before being added to the output,
-they act as input entities;
-afterwards,
-they denote glyphs.
+Special characters transform from one to the other:
+before being added to the output,
+they behave as characters;
+afterward,
+they are glyphs.
 .
-A glyph does not behave like an input character for the purposes of
-macro processing;
-it does not inherit any of the special properties that the input
-character from which it was constructed might have had.
+A glyph node does not behave like a character node when it is processed
+by a macro:
+it does not inherit any of the special properties that the character
+from which it was constructed might have had.
 .
-Consider the following example.
+For example,
+the input
 .
 .RS
 .EX
@@ -5503,17 +5506,21 @@ Consider the following example.
 .EE
 .RE
 .
-It prints
+produces
 .RB \[lq] \[rs]\[rs] \[rq]
-in
-.IR groff ;
-each pair of input backslashes is turned into one output backslash and
-the resulting output backslashes are not interpreted as escape
-characters when they are reread.
+in GNU
+.IR troff . \" GNU
+Each pair of backslashes becomes one backslash
+.I glyph;
+the resulting backslashes are thus not interpreted as escape
+.I characters
+when they are reread as the diversion is output.
 .
-.RI AT&T\~ troff
-would interpret them as escape characters when they were reread and
-would end up printing one
+AT&T
+.I troff \" AT&T
+.I would
+interpret them as escape characters when rereading them and end up
+printing one
 .RB \[lq] \[rs] \[rq].
 .
 .


