groff-commit

[groff] 07/09: [grog]: Simplify parsing.


From: G. Branden Robinson
Subject: [groff] 07/09: [grog]: Simplify parsing.
Date: Thu, 1 Jul 2021 09:21:50 -0400 (EDT)

gbranden pushed a commit to branch master
in repository groff.

commit 3ed8e933f494c18755edf05ee887f1c032289aec
Author: G. Branden Robinson <g.branden.robinson@gmail.com>
AuthorDate: Thu Jul 1 21:59:06 2021 +1000

    [grog]: Simplify parsing.
    
    * src/utils/grog/grog.pl: Simplify parsing.  Dave Kemper pointed out
      that preprocessors like pic(1) use pretty unsophisticated *roff
      parsing to determine where to perform their textual replacements.  My
      enhancements to support input line continuation and cope with brace
      escapes were thus overengineered.  Remove them.
    
      - Drop scalars `is_continued_line` and `logical_line`.
    
      (do_line): Stop performing logical line concatenation and detecting
      input line continuation.  Perform operations on `line` instead of
      `logical_line`.  Stop removing brace escapes.
    
    * src/utils/grog/grog.1.man (Limitations): Update discussion.
    
    Fixes <https://savannah.gnu.org/bugs/?60862>.  Thanks, Dave!
---
 ChangeLog                 | 17 +++++++++++++++++
 src/utils/grog/grog.1.man | 40 ++++++++++++++++++++++++++++++----------
 src/utils/grog/grog.pl    | 42 +++++++++++++-----------------------------
 3 files changed, 60 insertions(+), 39 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b9252c2..3064bd5 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,20 @@
+2021-07-01  G. Branden Robinson <g.branden.robinson@gmail.com>
+
+       * src/utils/grog/grog.pl: Simplify parsing.  Dave Kemper pointed
+       out that preprocessors like pic(1) use pretty unsophisticated
+       *roff parsing to determine where to perform their textual
+       replacements.  My enhancements to support input line
+       continuation and cope with brace escapes were thus
+       overengineered.  Remove them.
+         - Drop scalars `is_continued_line` and `logical_line`.
+         (do_line): Stop performing logical line concatenation and
+         detecting input line continuation.  Perform operations on
+         `line` instead of `logical_line`.  Stop removing brace
+         escapes.
+       * src/utils/grog/grog.1.man (Limitations): Update discussion.
+
+       Fixes <https://savannah.gnu.org/bugs/?60862>.  Thanks, Dave!
+
 2021-06-30  G. Branden Robinson <g.branden.robinson@gmail.com>
 
        * src/roff/troff/reg.cpp (lookup_number_reg, alias_reg): In
diff --git a/src/utils/grog/grog.1.man b/src/utils/grog/grog.1.man
index f28b52d..78c187c 100644
--- a/src/utils/grog/grog.1.man
+++ b/src/utils/grog/grog.1.man
@@ -247,8 +247,8 @@ and no-break control characters.
 .I grog
 does not parse
 .I roff
-control structures
-(the
+input line continuation or control structures
+(brace escape sequences and the
 .RB \[lq] if \[rq],
 .RB \[lq] ie \[rq],
 and
@@ -262,18 +262,38 @@ Thus the input
 .
 .RS
 .EX
-\&.if t .PS
-\&.if t .PE
+\&.if \[rs]
+t .NH 1
+\&.if n .SH
+Introduction
 .EE
 .RE
 .
-will not,
+will conceal the use of the
+.I ms
+macros
+.B NH
+and
+.B SH
+from
+.IR grog .
+.
+Such constructions are regarded by
+.IR grog 's
+implementors as insufficiently common to cause many inference problems;
+further,
+preprocessors are typically even stricter when matching the macro calls
+they use to bracket the regions of an input file they textually replace.
+.
+.IR pic ,
 for example,
-cause
-.I grog
-to infer use of the
-.IR \%@g@pic (1)
-preprocessor.
+requires
+.B PS
+and
+.B PE
+calls to immediately follow the default control character at the
+beginning of a line,
+with no intervening spaces or tabs.
 .
 .
 .P
diff --git a/src/utils/grog/grog.pl b/src/utils/grog/grog.pl
index 5f359c2..18b1bd9 100644
--- a/src/utils/grog/grog.pl
+++ b/src/utils/grog/grog.pl
@@ -185,8 +185,6 @@ my $inside_tbl_table = 0;
 my $man_score = 0;
 my $ms_score = 0;
 
-my $is_continued_line = 0;
-my $logical_line = '';
 my $had_inference_problem = 0;
 my $had_processing_problem = 0;
 my $have_any_valid_arguments = 0;
@@ -329,58 +327,43 @@ sub do_line {
 
   my $line = shift;
 
-  if ($is_continued_line) {
-    $logical_line .= $line;
-  } else {
-    $logical_line = $line;
-  }
-
-  if ($logical_line =~ s/\\$//) {
-    $is_continued_line = 1;
-    return;
-  } else {
-    $is_continued_line = 0;
-  }
-
   # Check for a Perl Pod::Man comment.
   #
   # An alternative to this kludge is noted below: if a "standard" macro
   # is redefined, we could delete it from the relevant lists and
   # hashes.)
-  if ($logical_line =~ /\\\" Automatically generated by Pod::Man/) {
+  if ($line =~ /\\\" Automatically generated by Pod::Man/) {
     $man_score += 100;
   }
 
   # Strip comments.
-  $logical_line =~ s/\\".*//;
-  $logical_line =~ s/\\#.*//;
+  $line =~ s/\\".*//;
+  $line =~ s/\\#.*//;
 
-  return unless ($logical_line =~ /^[.']/);    # Ignore text lines.
+  return unless ($line =~ /^[.']/);    # Ignore text lines.
 
   # Normalize control lines; convert no-break control character to the
   # regular one and remove unnecesssary whitespace.
-  $logical_line =~ s/^['.]\s*/./;
-  $logical_line =~ s/\s+$//;
+  $line =~ s/^['.]\s*/./;
+  $line =~ s/\s+$//;
 
-  return if ($logical_line =~ /^\.$/);         # Ignore empty request.
-  return if ($logical_line =~ /^\.\\?\.$/);    # Ignore macro def ends.
-
-  $logical_line =~ s/\\[{}]//g;                # Remove any brace escapes.
+  return if ($line =~ /^\.$/);         # Ignore empty request.
+  return if ($line =~ /^\.\\?\.$/);    # Ignore macro definition ends.
 
   # Split control line into a request or macro call and its arguments.
 
   # Handle single-letter macro names.
-  if ($logical_line =~ /^\.(\w)(\s+(.*))?$/) {
+  if ($line =~ /^\.(\w)(\s+(.*))?$/) {
     $command = $1;
     $args = $2;
   # Handle two-letter macro/request names in compatibility mode.
   } elsif ($use_compatibility_mode) {
-    $logical_line =~ /^\.(\w\w)\s*(.*)$/;
+    $line =~ /^\.(\w\w)\s*(.*)$/;
     $command = $1;
     $args = $2;
   # Handle multi-letter macro/request names in groff mode.
   } else {
-    $logical_line =~ /^\.(\w+)(\s+(.*))?$/;
+    $line =~ /^\.(\w+)(\s+(.*))?$/;
     $command = $1;
     $args = $3;
   }
@@ -399,11 +382,12 @@ sub do_line {
   # If the line calls a user-defined macro, skip it.
   return if (exists $user_macro{$command});
 
+  # Add user-defined macro names to %user_macros.
+  #
   # Macros can also be defined with .dei{,1}, ami{,1}, but supporting
   # that would be a heavy lift for the benefit of users that probably
   # don't require grog's help.  --GBR
   if ($command =~ /^(de|am)1?$/) {
-    # this line is a macro definition, add it to %user_macro
     my $name = $args;
     # Strip off any end macro.
     $name =~ s/\W*$//;


