bug-gnu-emacs
bug#23019: parse-partial-sexp doesn't output the full state needed for i


From: Alan Mackenzie
Subject: bug#23019: parse-partial-sexp doesn't output the full state needed for its continuance.
Date: Fri, 18 Mar 2016 18:25:47 +0000
User-agent: Mutt/1.5.24 (2015-08-30)

Hello, Stefan.

On Fri, Mar 18, 2016 at 12:23:02PM -0400, Stefan Monnier wrote:

> >> - change element 10 so it's nil if the last char was an "end of
> >> something".  Another way to look at it, is that the element 10 should
> >> only be non-nil if the "next lexeme" might start on that
> >> previous character.

> > I've tried this, and it's somewhat ugly.  Setting the "previous_syntax"
> > to nil is also needed for the asterisk in "/*".  The nil would appear to
> > mean "the syntactic value of the last character has already been used
> > up".  So the "previous_syntax" is nil in the most interesting cases.  It
> > also feels somewhat ad-hoc.

> > How about this idea: element 10 will record the syntax of the previous
> > character ONLY when it is potentially the first character of a two
> > character comment delimiter, otherwise it'll be nil.  At least that's
> > being honest about what the thing's being used for.

> IIUC the only difference between what I (think I) suggested and what
> you're proposing is that you want to return nil for the "prev is
> backslash" whereas I was suggesting to return non-nil in that case.
> [ AFAIK the only two-char elements we handle so far are the comment
> delimiters and the backslash escapes.  ]

We also have Scharquote, which scan_sexps_forward handles identically to
Sescape.
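
To illustrate the point, here is a minimal sketch (not the real syntax.c
code; the enum, the classifier and the function names are all
hypothetical) of a scanner step in which a char-quote shares the escape
branch, so either one consumes the character that follows it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down syntax classes; not the real enum syntaxcode.  */
enum toy_syntax { TOY_WORD, TOY_ESCAPE, TOY_CHARQUOTE };

/* Hypothetical classifier: a backslash is an escape, '?' a char-quote.  */
static enum toy_syntax
toy_classify (char c)
{
  switch (c)
    {
    case '\\': return TOY_ESCAPE;
    case '?':  return TOY_CHARQUOTE;
    default:   return TOY_WORD;
    }
}

/* Return the number of characters consumed by the lexeme starting at
   S[I].  Set *QUOTED when the text ends straight after an escape or
   char-quote, i.e. when a later call would have to finish the pair.  */
static size_t
toy_step (const char *s, size_t len, size_t i, bool *quoted)
{
  assert (i < len);
  *quoted = false;
  switch (toy_classify (s[i]))
    {
    case TOY_ESCAPE:
    case TOY_CHARQUOTE:         /* Same treatment as TOY_ESCAPE.  */
      if (i + 1 >= len)
        {
          *quoted = true;       /* The pair is split by the end of text.  */
          return 1;
        }
      return 2;
    default:
      return 1;
    }
}
```

In this toy, a scan that ends right after either character reports
`quoted`, which is the situation the new state element has to describe.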

> Do I understand this right?

Yes, but I've no strong feelings on the matter.

> > It would appear to be, yes.  We really can't get rid of element 5,
> > though, because there will surely be code out there that uses it.  But
> > if I change element 10 as outlined above, element 5 will no longer be
> > redundant.

> I'd even be tempted to re-use element 5, although it might
> conceivably break some code out there.

I have bad feelings about that.  Is it really worth the risk, just to
save one cons cell on a list of which not many instances exist at
any time?

> But even if we don't re-use element 5, I would actually much prefer to
> render element 5 redundant.

OK.  Here's an updated patch which does just that.  Comments would be
welcome.

>         Stefan


Amend parse-partial-sexp correctly to handle two character comment delimiters

Do this by adding a new field to the parser state: the syntax of the last
character scanned, should that be the first char of a (potential) two char
construct, nil otherwise.
This should make the parser state complete.
Also document element 9 of the parser state.  Also refactor the code a bit.
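
As a rough illustration of why the extra field is needed, the sketch
below shows a toy resumable scanner that only recognizes the opener of
a C-style comment.  It is not the syntax.c implementation, and every
name in it (struct toy_state, prev_was_slash, toy_scan) is hypothetical.
The point is that a scan which stops between the two characters of the
opener can only be continued correctly if the state remembers the first
one:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down parser state.  */
struct toy_state
{
  bool in_comment;      /* Inside a comment.  */
  bool prev_was_slash;  /* Last char scanned was a slash, which may
                           be the first char of a comment opener.  */
};

/* Scan S[0..LEN) and update *ST.  Only comment openers are
   recognized; comment enders are left out to keep the sketch short.  */
static void
toy_scan (struct toy_state *st, const char *s, size_t len)
{
  assert (st && s);
  for (size_t i = 0; i < len; i++)
    {
      if (st->in_comment)
        return;
      if (st->prev_was_slash && s[i] == '*')
        {
          st->in_comment = true;
          st->prev_was_slash = false;   /* The slash has been "used up".  */
        }
      else
        st->prev_was_slash = (s[i] == '/');
    }
}
```

Scanning "a/" and then "*b" with the same state ends inside a comment,
just as a single scan of the whole text does.  Clearing prev_was_slash
between the two calls loses the opener, which is what happens when the
state handed back to the caller has no such field.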

* src/syntax.c (struct lisp_parse_state): Add a new field.
(SYNTAX_FLAGS_COMSTARTEND_FIRST): New function.
(internalize_parse_state): New function, extracted from scan_sexps_forward.
(back_comment): Call internalize_parse_state.
(forw_comment): Return the syntax of the last character scanned to the caller.
(Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
(scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
information via the address parameter `state'.  Remove the code which converts
from external to internal form of `state'.  Access buffer contents only from
`from' onwards.  Reformulate code at the top of the main loop correctly to
recognize comment openers when starting in the middle of one.  Call
forw_comment with extra argument (for return of final syntax value).
(Fparse_partial_sexp): Document elements 9, 10 of the parser state in the
doc string.  Clarify the doc string in general.  Call
internalize_parse_state.  Take account of the new elements when consing up the
output parser state.

* doc/lispref/syntax.texi (Parser State): Document element 9 and the new
element 10.  Minor wording corrections (remove reference to "trivial cases").
(Low Level Parsing): Minor corrections.
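
For readers checking the new SYNTAX_FLAGS_COMSTARTEND_FIRST predicate,
here is a small standalone sketch of its mask.  It assumes the flag
layout used in src/syntax.c, where bit 16 marks the first character of
a two-character comment starter and bit 18 the first character of a
two-character comment ender.  The TOY_ names are made up for this
example:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag bits, following the layout in src/syntax.c.  */
enum
{
  TOY_COMSTART_FIRST = 1 << 16,
  TOY_COMEND_FIRST = 1 << 18
};

/* The two bits together give the literal mask used in the patch.  */
static_assert ((TOY_COMSTART_FIRST | TOY_COMEND_FIRST) == 0x50000,
               "mask matches the literal in the patch");

/* True if FLAGS marks the first char of a two-char comment delimiter,
   whether that delimiter is a starter or an ender.  */
static bool
toy_comstartend_first (int flags)
{
  return (flags & (TOY_COMSTART_FIRST | TOY_COMEND_FIRST)) != 0;
}
```

Bits 17 and 19 (the "second character" flags) do not satisfy the test,
so only a character that can begin a delimiter is recorded.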




diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
index d5a7eba..f81c164 100644
--- a/doc/lispref/syntax.texi
+++ b/doc/lispref/syntax.texi
@@ -791,10 +791,10 @@ Parser State
 @subsection Parser State
 @cindex parser state
 
-  A @dfn{parser state} is a list of ten elements describing the state
-of the syntactic parser, after it parses the text between a specified
-starting point and a specified end point in the buffer.  Parsing
-functions such as @code{syntax-ppss}
+  A @dfn{parser state} is a list of (currently) eleven elements
+describing the state of the syntactic parser, after it parses the text
+between a specified starting point and a specified end point in the
+buffer.  Parsing functions such as @code{syntax-ppss}
 @ifnottex
 (@pxref{Position Parse})
 @end ifnottex
@@ -851,15 +851,20 @@ Parser State
 this element is @code{nil}.
 
 @item
-Internal data for continuing the parsing.  The meaning of this
-data is subject to change; it is used if you pass this list
-as the @var{state} argument to another call.
+The list of the positions of the currently open parentheses, starting
+with the outermost.
+
+@item
+When the last buffer position scanned was the (potential) first
+character of a two character construct (comment delimiter or
+escaped/char-quoted character pair), the @var{syntax-code}
+(@pxref{Syntax Table Internals}) of that position.  Otherwise
+@code{nil}.
 @end enumerate
 
   Elements 1, 2, and 6 are ignored in a state which you pass as an
-argument to continue parsing, and elements 8 and 9 are used only in
-trivial cases.  Those elements are mainly used internally by the
-parser code.
+argument to continue parsing.  Elements 9 and 10 are mainly used
+internally by the parser code.
 
   One additional piece of useful information is available from a
 parser state using this function:
@@ -898,11 +903,11 @@ Low-Level Parsing
 
 If the fourth argument @var{stop-before} is non-@code{nil}, parsing
 stops when it comes to any character that starts a sexp.  If
-@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
-start of an unnested comment.  If @var{stop-comment} is the symbol
+@var{stop-comment} is non-@code{nil}, parsing stops after the start of
+an unnested comment.  If @var{stop-comment} is the symbol
 @code{syntax-table}, parsing stops after the start of an unnested
-comment or a string, or the end of an unnested comment or a string,
-whichever comes first.
+comment or a string, or after the end of an unnested comment or a
+string, whichever comes first.
 
 If @var{state} is @code{nil}, @var{start} is assumed to be at the top
 level of parenthesis structure, such as the beginning of a function
diff --git a/src/syntax.c b/src/syntax.c
index 249d0d5..e6a1942 100644
--- a/src/syntax.c
+++ b/src/syntax.c
@@ -81,6 +81,11 @@ SYNTAX_FLAGS_COMEND_SECOND (int flags)
   return (flags >> 19) & 1;
 }
 static bool
+SYNTAX_FLAGS_COMSTARTEND_FIRST (int flags)
+{
+  return (flags & 0x50000) != 0;
+}
+static bool
 SYNTAX_FLAGS_PREFIX (int flags)
 {
   return (flags >> 20) & 1;
@@ -153,6 +158,10 @@ struct lisp_parse_state
     ptrdiff_t comstr_start;  /* Position of last comment/string starter.  */
     Lisp_Object levelstarts; /* Char numbers of starts-of-expression
                                of levels (starting from outermost).  */
+    int prev_syntax; /* Syntax of previous position scanned, when
+                        that position (potentially) holds the first char
+                        of a 2-char construct, i.e. comment delimiter
+                        or Sescape, etc.  Smax otherwise. */
   };
 
 /* These variables are a cache for finding the start of a defun.
@@ -176,7 +185,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
 static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
 static void scan_sexps_forward (struct lisp_parse_state *,
                                 ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
-                                bool, Lisp_Object, int);
+                                bool, int);
+static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
 static bool in_classes (int, Lisp_Object);
 static void parse_sexp_propertize (ptrdiff_t charpos);
 
@@ -911,10 +921,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
        }
       do
        {
+          internalize_parse_state (Qnil, &state);
          scan_sexps_forward (&state,
                              defun_start, defun_start_byte,
                              comment_end, TYPE_MINIMUM (EMACS_INT),
-                             0, Qnil, 0);
+                             0, 0);
          defun_start = comment_end;
          if (!adjusted)
            {
@@ -2314,7 +2325,9 @@ in_classes (int c, Lisp_Object iso_classes)
    into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
    Else, return false and store the charpos STOP into *CHARPOS_PTR, the
    corresponding bytepos into *BYTEPOS_PTR and the current nesting
-   (as defined for state.incomment) in *INCOMMENT_PTR.
+   (as defined for state->incomment) in *INCOMMENT_PTR.  The
+   SYNTAX_WITH_FLAGS of the last character scanned in the comment is
+   stored into *last_syntax_ptr.
 
    The comment end is the last character of the comment rather than the
    character just after the comment.
@@ -2326,7 +2339,7 @@ static bool
 forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
              EMACS_INT nesting, int style, int prev_syntax,
              ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
-             EMACS_INT *incomment_ptr)
+             EMACS_INT *incomment_ptr, int *last_syntax_ptr)
 {
   register int c, c1;
   register enum syntaxcode code;
@@ -2346,6 +2359,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
          *incomment_ptr = nesting;
          *charpos_ptr = from;
          *bytepos_ptr = from_byte;
+          *last_syntax_ptr = syntax;
          return 0;
        }
       c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
@@ -2415,6 +2429,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
     }
   *charpos_ptr = from;
   *bytepos_ptr = from_byte;
+  *last_syntax_ptr = syntax;
   return 1;
 }
 
@@ -2436,6 +2451,7 @@ between them, return t; otherwise return nil.  */)
   EMACS_INT count1;
   ptrdiff_t out_charpos, out_bytepos;
   EMACS_INT dummy;
+  int dummy2;
 
   CHECK_NUMBER (count);
   count1 = XINT (count);
@@ -2499,7 +2515,7 @@ between them, return t; otherwise return nil.  */)
        }
       /* We're at the start of a comment.  */
       found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
-                           &out_charpos, &out_bytepos, &dummy);
+                           &out_charpos, &out_bytepos, &dummy, &dummy2);
       from = out_charpos; from_byte = out_bytepos;
       if (!found)
        {
@@ -2659,6 +2675,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
   ptrdiff_t from_byte;
   ptrdiff_t out_bytepos, out_charpos;
   EMACS_INT dummy;
+  int dummy2;
   bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
   if (depth > 0) min_depth = 0;
@@ -2755,7 +2772,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
              UPDATE_SYNTAX_TABLE_FORWARD (from);
              found = forw_comment (from, from_byte, stop,
                                    comnested, comstyle, 0,
-                                   &out_charpos, &out_bytepos, &dummy);
+                                   &out_charpos, &out_bytepos, &dummy,
+                                    &dummy2);
              from = out_charpos, from_byte = out_bytepos;
              if (!found)
                {
@@ -3119,7 +3137,7 @@ the prefix syntax flag (p).  */)
 }
 
 /* Parse forward from FROM / FROM_BYTE to END,
-   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
+   assuming that FROM has state STATE,
    and return a description of the state of the parse at END.
    If STOPBEFORE, stop at the start of an atom.
    If COMMENTSTOP is 1, stop at the start of a comment.
@@ -3127,12 +3145,11 @@ the prefix syntax flag (p).  */)
    after the beginning of a string, or after the end of a string.  */
 
 static void
-scan_sexps_forward (struct lisp_parse_state *stateptr,
+scan_sexps_forward (struct lisp_parse_state *state,
                    ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
                    EMACS_INT targetdepth, bool stopbefore,
-                   Lisp_Object oldstate, int commentstop)
+                   int commentstop)
 {
-  struct lisp_parse_state state;
   enum syntaxcode code;
   int c1;
   bool comnested;
@@ -3148,7 +3165,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
   Lisp_Object tem;
   ptrdiff_t prev_from;         /* Keep one character before FROM.  */
   ptrdiff_t prev_from_byte;
-  int prev_from_syntax;
+  int prev_from_syntax, prev_prev_from_syntax;
   bool boundary_stop = commentstop == -1;
   bool nofence;
   bool found;
@@ -3165,6 +3182,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
 do { prev_from = from;                         \
      prev_from_byte = from_byte;               \
      temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte);  \
+     prev_prev_from_syntax = prev_from_syntax;  \
      prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
      INC_BOTH (from, from_byte);               \
      if (from < end)                           \
@@ -3174,88 +3192,38 @@ do { prev_from = from;                          \
   immediate_quit = 1;
   QUIT;
 
-  if (NILP (oldstate))
-    {
-      depth = 0;
-      state.instring = -1;
-      state.incomment = 0;
-      state.comstyle = 0;      /* comment style a by default.  */
-      state.comstr_start = -1; /* no comment/string seen.  */
-    }
-  else
-    {
-      tem = Fcar (oldstate);
-      if (!NILP (tem))
-       depth = XINT (tem);
-      else
-       depth = 0;
-
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      /* Check whether we are inside string_fence-style string: */
-      state.instring = (!NILP (tem)
-                       ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
-                       : -1);
+  depth = state->depth;
+  start_quoted = state->quoted;
+  prev_prev_from_syntax = Smax;
+  prev_from_syntax = state->prev_syntax;
 
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.incomment = (!NILP (tem)
-                        ? (INTEGERP (tem) ? XINT (tem) : -1)
-                        : 0);
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      start_quoted = !NILP (tem);
-
-      /* if the eighth element of the list is nil, we are in comment
-        style a.  If it is non-nil, we are in comment style b */
-      oldstate = Fcdr (oldstate);
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstyle = (NILP (tem)
-                       ? 0
-                       : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
-                          ? XINT (tem)
-                          : ST_COMMENT_STYLE));
-
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      state.comstr_start =
-       RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
-      oldstate = Fcdr (oldstate);
-      tem = Fcar (oldstate);
-      while (!NILP (tem))              /* >= second enclosing sexps.  */
-       {
-         Lisp_Object temhd = Fcar (tem);
-         if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
-           curlevel->last = XINT (temhd);
-         if (++curlevel == endlevel)
-           curlevel--; /* error ("Nesting too deep for parser"); */
-         curlevel->prev = -1;
-         curlevel->last = -1;
-         tem = Fcdr (tem);
-       }
+  tem = state->levelstarts;
+  while (!NILP (tem))          /* >= second enclosing sexps.  */
+    {
+      Lisp_Object temhd = Fcar (tem);
+      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
+        curlevel->last = XINT (temhd);
+      if (++curlevel == endlevel)
+        curlevel--; /* error ("Nesting too deep for parser"); */
+      curlevel->prev = -1;
+      curlevel->last = -1;
+      tem = Fcdr (tem);
     }
-  state.quoted = 0;
-  mindepth = depth;
-
   curlevel->prev = -1;
   curlevel->last = -1;
 
-  SETUP_SYNTAX_TABLE (prev_from, 1);
-  temp = FETCH_CHAR (prev_from_byte);
-  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
-  UPDATE_SYNTAX_TABLE_FORWARD (from);
+  state->quoted = 0;
+  mindepth = depth;
+
+  SETUP_SYNTAX_TABLE (from, 1);
 
   /* Enter the loop at a place appropriate for initial state.  */
 
-  if (state.incomment)
+  if (state->incomment)
     goto startincomment;
-  if (state.instring >= 0)
+  if (state->instring >= 0)
     {
-      nofence = state.instring != ST_STRING_STYLE;
+      nofence = state->instring != ST_STRING_STYLE;
       if (start_quoted)
        goto startquotedinstring;
       goto startinstring;
@@ -3266,11 +3234,8 @@ do { prev_from = from;                           \
   while (from < end)
     {
       int syntax;
-      INC_FROM;
-      code = prev_from_syntax & 0xff;
 
-      if (from < end
-         && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
+      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
          && (c1 = FETCH_CHAR (from_byte),
              syntax = SYNTAX_WITH_FLAGS (c1),
              SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
@@ -3280,32 +3245,39 @@ do { prev_from = from;                          \
          /* Record the comment style we have entered so that only
             the comment-end sequence of the same style actually
             terminates the comment section.  */
-         state.comstyle
+         state->comstyle
            = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
          comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
                       | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
-         state.incomment = comnested ? 1 : -1;
-         state.comstr_start = prev_from;
+         state->incomment = comnested ? 1 : -1;
+         state->comstr_start = prev_from;
          INC_FROM;
+          prev_from_syntax = Smax; /* the syntax has already been
+                                      "used up". */
          code = Scomment;
        }
-      else if (code == Scomment_fence)
-       {
-         /* Record the comment style we have entered so that only
-            the comment-end sequence of the same style actually
-            terminates the comment section.  */
-         state.comstyle = ST_COMMENT_STYLE;
-         state.incomment = -1;
-         state.comstr_start = prev_from;
-         code = Scomment;
-       }
-      else if (code == Scomment)
-       {
-         state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
-         state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
-                            1 : -1);
-         state.comstr_start = prev_from;
-       }
+      else
+        {
+          INC_FROM;
+          code = prev_from_syntax & 0xff;
+          if (code == Scomment_fence)
+            {
+              /* Record the comment style we have entered so that only
+                 the comment-end sequence of the same style actually
+                 terminates the comment section.  */
+              state->comstyle = ST_COMMENT_STYLE;
+              state->incomment = -1;
+              state->comstr_start = prev_from;
+              code = Scomment;
+            }
+          else if (code == Scomment)
+            {
+              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
+              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
+                                 1 : -1);
+              state->comstr_start = prev_from;
+            }
+        }
 
       if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
        continue;
@@ -3350,25 +3322,28 @@ do { prev_from = from;                          \
 
        case Scomment_fence: /* Can't happen because it's handled above.  */
        case Scomment:
-         if (commentstop || boundary_stop) goto done;
+          if (commentstop || boundary_stop) goto done;
        startincomment:
          /* The (from == BEGV) test was to enter the loop in the middle so
             that we find a 2-char comment ender even if we start in the
             middle of it.  We don't want to do that if we're just at the
             beginning of the comment (think of (*) ... (*)).  */
          found = forw_comment (from, from_byte, end,
-                               state.incomment, state.comstyle,
-                               (from == BEGV || from < state.comstr_start + 3)
-                               ? 0 : prev_from_syntax,
-                               &out_charpos, &out_bytepos, &state.incomment);
+                               state->incomment, state->comstyle,
+                               from == BEGV ? 0 : prev_from_syntax,
+                               &out_charpos, &out_bytepos, &state->incomment,
+                                &prev_from_syntax);
          from = out_charpos; from_byte = out_bytepos;
-         /* Beware!  prev_from and friends are invalid now.
-            Luckily, the `done' doesn't use them and the INC_FROM
-            sets them to a sane value without looking at them. */
+         /* Beware!  prev_from and friends (except prev_from_syntax)
+            are invalid now.  Luckily, the `done' doesn't use them
+            and the INC_FROM sets them to a sane value without
+            looking at them. */
          if (!found) goto done;
          INC_FROM;
-         state.incomment = 0;
-         state.comstyle = 0;   /* reset the comment style */
+         state->incomment = 0;
+         state->comstyle = 0;  /* reset the comment style */
+          prev_from_syntax = Smax; /* Ensure "*|*" can't open a spurious new
+                                      comment. */
          if (boundary_stop) goto done;
          break;
 
@@ -3396,16 +3371,16 @@ do { prev_from = from;                          \
 
        case Sstring:
        case Sstring_fence:
-         state.comstr_start = from - 1;
+         state->comstr_start = from - 1;
          if (stopbefore) goto stop;  /* this arg means stop at sexp start */
          curlevel->last = prev_from;
-         state.instring = (code == Sstring
+         state->instring = (code == Sstring
                            ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
                            : ST_STRING_STYLE);
          if (boundary_stop) goto done;
        startinstring:
          {
-           nofence = state.instring != ST_STRING_STYLE;
+           nofence = state->instring != ST_STRING_STYLE;
 
            while (1)
              {
@@ -3419,7 +3394,7 @@ do { prev_from = from;                            \
                /* Check C_CODE here so that if the char has
                   a syntax-table property which says it is NOT
                   a string character, it does not end the string.  */
-               if (nofence && c == state.instring && c_code == Sstring)
+               if (nofence && c == state->instring && c_code == Sstring)
                  break;
 
                switch (c_code)
@@ -3442,7 +3417,7 @@ do { prev_from = from;                            \
              }
          }
        string_end:
-         state.instring = -1;
+         state->instring = -1;
          curlevel->prev = curlevel->last;
          INC_FROM;
          if (boundary_stop) goto done;
@@ -3461,25 +3436,96 @@ do { prev_from = from;                          \
  stop:   /* Here if stopping before start of sexp. */
   from = prev_from;    /* We have just fetched the char that starts it; */
   from_byte = prev_from_byte;
+  prev_from_syntax = prev_prev_from_syntax;
   goto done; /* but return the position before it. */
 
  endquoted:
-  state.quoted = 1;
+  state->quoted = 1;
  done:
-  state.depth = depth;
-  state.mindepth = mindepth;
-  state.thislevelstart = curlevel->prev;
-  state.prevlevelstart
+  state->depth = depth;
+  state->mindepth = mindepth;
+  state->thislevelstart = curlevel->prev;
+  state->prevlevelstart
     = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
-  state.location = from;
-  state.location_byte = from_byte;
-  state.levelstarts = Qnil;
+  state->location = from;
+  state->location_byte = from_byte;
+  state->levelstarts = Qnil;
   while (curlevel > levelstart)
-    state.levelstarts = Fcons (make_number ((--curlevel)->last),
-                              state.levelstarts);
+    state->levelstarts = Fcons (make_number ((--curlevel)->last),
+                                state->levelstarts);
+  state->prev_syntax = (SYNTAX_FLAGS_COMSTARTEND_FIRST (prev_from_syntax)
+                        || state->quoted) ? prev_from_syntax : Smax;
   immediate_quit = 0;
+}
+
+/* Convert a (lisp) parse state to the internal form used in
+   scan_sexps_forward.  */
+static void
+internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
+{
+  Lisp_Object tem;
+
+  if (NILP (external))
+    {
+      state->depth = 0;
+      state->instring = -1;
+      state->incomment = 0;
+      state->quoted = 0;
+      state->comstyle = 0;     /* comment style a by default.  */
+      state->comstr_start = -1;        /* no comment/string seen.  */
+      state->levelstarts = Qnil;
+      state->prev_syntax = Smax;
+    }
+  else
+    {
+      tem = Fcar (external);
+      if (!NILP (tem))
+       state->depth = XINT (tem);
+      else
+       state->depth = 0;
+
+      external = Fcdr (external);
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      /* Check whether we are inside string_fence-style string: */
+      state->instring = (!NILP (tem)
+                         ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
+                         : -1);
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->incomment = (!NILP (tem)
+                          ? (INTEGERP (tem) ? XINT (tem) : -1)
+                          : 0);
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->quoted = !NILP (tem);
 
-  *stateptr = state;
+      /* if the eighth element of the list is nil, we are in comment
+        style a.  If it is non-nil, we are in comment style b */
+      external = Fcdr (external);
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstyle = (NILP (tem)
+                         ? 0
+                         : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
+                            ? XINT (tem)
+                            : ST_COMMENT_STYLE));
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->comstr_start =
+       RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->levelstarts = tem;
+
+      external = Fcdr (external);
+      tem = Fcar (external);
+      state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
+    }
 }
 
 DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
@@ -3488,6 +3534,7 @@ Parsing stops at TO or when certain criteria are met;
  point is set to where parsing stops.
 If fifth arg OLDSTATE is omitted or nil,
  parsing assumes that FROM is the beginning of a function.
+
 Value is a list of elements describing final state of parsing:
  0. depth in parens.
  1. character address of start of innermost containing list; nil if none.
@@ -3501,16 +3548,22 @@ Value is a list of elements describing final state of parsing:
  6. the minimum paren-depth encountered during this scan.
  7. style of comment, if any.
  8. character address of start of comment or string; nil if not in one.
- 9. Intermediate data for continuation of parsing (subject to change).
+ 9. List of positions of currently open parens, outermost first.
+10. When the last position scanned holds the first character of a
+    (potential) two character construct, the syntax of that position,
+    otherwise nil.  That construct can be a two character comment
+    delimiter or an Escaped or Char-quoted character.
+11..... Possible further internal information used by `parse-partial-sexp'.
+
 If third arg TARGETDEPTH is non-nil, parsing stops if the depth
 in parentheses becomes equal to TARGETDEPTH.
-Fourth arg STOPBEFORE non-nil means stop when come to
+Fourth arg STOPBEFORE non-nil means stop when we come to
  any character that starts a sexp.
 Fifth arg OLDSTATE is a list like what this function returns.
  It is used to initialize the state of the parse.  Elements number 1, 2, 6
  are ignored.
-Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
- If it is symbol `syntax-table', stop after the start of a comment or a
+Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
+ If it is the symbol `syntax-table', stop after the start of a comment or a
  string, or after end of a comment or a string.  */)
   (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
    Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
@@ -3527,15 +3580,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
     target = TYPE_MINIMUM (EMACS_INT); /* We won't reach this depth.  */
 
   validate_region (&from, &to);
+  internalize_parse_state (oldstate, &state);
   scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
                      XINT (to),
-                     target, !NILP (stopbefore), oldstate,
+                     target, !NILP (stopbefore),
                      (NILP (commentstop)
                       ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
 
   SET_PT_BOTH (state.location, state.location_byte);
 
-  return Fcons (make_number (state.depth),
+  return
+    Fcons (make_number (state.depth),
           Fcons (state.prevlevelstart < 0
                  ? Qnil : make_number (state.prevlevelstart),
             Fcons (state.thislevelstart < 0
@@ -3553,11 +3608,15 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
                                  ? Qsyntax_table
                                  : make_number (state.comstyle))
                               : Qnil),
-                             Fcons (((state.incomment
-                                      || (state.instring >= 0))
-                                     ? make_number (state.comstr_start)
-                                     : Qnil),
-                                    Fcons (state.levelstarts, Qnil))))))))));
+                        Fcons (((state.incomment
+                                  || (state.instring >= 0))
+                                 ? make_number (state.comstr_start)
+                                 : Qnil),
+                          Fcons (state.levelstarts,
+                             Fcons (state.prev_syntax == Smax
+                                    ? Qnil
+                                    : make_number (state.prev_syntax),
+                                Qnil)))))))))));
 }
 
 void



-- 
Alan Mackenzie (Nuremberg, Germany).




