emacs-devel

From: Stefan Monnier
Subject: Re: [Emacs-diffs] master 9dcf599: Amend parse-partial-sexp correctly to handle two character comment delimiters
Date: Sun, 20 Mar 2016 09:47:58 -0400
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux)

What was John's opinion on reusing nth 5?


        Stefan


>>>>> "Alan" == Alan Mackenzie <address@hidden> writes:

> branch: master
> commit 9dcf5998935c8aaa846d7585b81f0dcfe1935b3d
> Author: Alan Mackenzie <address@hidden>
> Commit: Alan Mackenzie <address@hidden>

>     Amend parse-partial-sexp correctly to handle two character comment delimiters
    
>     Do this by adding a new field to the parser state: the syntax of the last
>     character scanned, should that be the first char of a (potential) two char
>     construct, nil otherwise.
>     This should make the parser state complete.
>     Also document element 9 of the parser state.  Also refactor the code a bit.
    
>     * src/syntax.c (struct lisp_parse_state): Add a new field.
>     (SYNTAX_FLAGS_COMSTARTEND_FIRST): New function.
>     (internalize_parse_state): New function, extracted from scan_sexps_forward.
>     (back_comment): Call internalize_parse_state.
>     (forw_comment): Return the syntax of the last character scanned to the caller
>     when that character might be the first of a two character construct.
>     (Fforward_comment, scan_lists): New dummy variables, passed to forw_comment.
>     (scan_sexps_forward): Remove a redundant state parameter.  Access all `state'
>     information via the address parameter `state'.  Remove the code which converts
>     from external to internal form of `state'.  Access buffer contents only from
>     `from' onwards.  Reformulate code at the top of the main loop correctly to
>     recognize comment openers when starting in the middle of one.  Call
>     forw_comment with extra argument (for return of syntax value of possible first
>     char of a two char construct).
>     (Fparse_partial_sexp): Document elements 9, 10 of the parser state in the
>     doc string.  Clarify the doc string in general.  Call
>     internalize_parse_state.  Take account of the new elements when consing up the
>     output parser state.
    
>     * doc/lispref/syntax.texi: (Parser State): Document element 9 and the new
>     element 10.  Minor wording corrections (remove reference to "trivial
>     cases").
>     (Low Level Parsing): Minor corrections.
    
>     * etc/NEWS: Note new element 10, and documentation of element 9 of parser
>     state.
> ---
>  doc/lispref/syntax.texi |   33 +++--
>  etc/NEWS                |   12 ++
>  src/syntax.c            |  372 ++++++++++++++++++++++++++++-------------------
>  3 files changed, 252 insertions(+), 165 deletions(-)

> diff --git a/doc/lispref/syntax.texi b/doc/lispref/syntax.texi
> index d5a7eba..f81c164 100644
> --- a/doc/lispref/syntax.texi
> +++ b/doc/lispref/syntax.texi
> @@ -791,10 +791,10 @@ Hooks}).
>  @subsection Parser State
>  @cindex parser state
 
> -  A @dfn{parser state} is a list of ten elements describing the state
> -of the syntactic parser, after it parses the text between a specified
> -starting point and a specified end point in the buffer.  Parsing
> -functions such as @code{syntax-ppss}
> +  A @dfn{parser state} is a list of (currently) eleven elements
> +describing the state of the syntactic parser, after it parses the text
> +between a specified starting point and a specified end point in the
> +buffer.  Parsing functions such as @code{syntax-ppss}
>  @ifnottex
>  (@pxref{Position Parse})
>  @end ifnottex
> @@ -851,15 +851,20 @@ position where the string began.  When outside of strings and comments,
>  this element is @code{nil}.
 
>  @item
> -Internal data for continuing the parsing.  The meaning of this
> -data is subject to change; it is used if you pass this list
> -as the @var{state} argument to another call.
> +The list of the positions of the currently open parentheses, starting
> +with the outermost.
> +
> +@item
> +When the last buffer position scanned was the (potential) first
> +character of a two character construct (comment delimiter or
> +escaped/char-quoted character pair), the @var{syntax-code}
> +(@pxref{Syntax Table Internals}) of that position.  Otherwise
> +@code{nil}.
>  @end enumerate
 
>    Elements 1, 2, and 6 are ignored in a state which you pass as an
> -argument to continue parsing, and elements 8 and 9 are used only in
> -trivial cases.  Those elements are mainly used internally by the
> -parser code.
> +argument to continue parsing.  Elements 9 and 10 are mainly used
> +internally by the parser code.
 
>    One additional piece of useful information is available from a
>  parser state using this function:
> @@ -898,11 +903,11 @@ The depth starts at 0, or at whatever is given in @var{state}.
 
>  If the fourth argument @var{stop-before} is non-@code{nil}, parsing
>  stops when it comes to any character that starts a sexp.  If
> -@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the
> -start of an unnested comment.  If @var{stop-comment} is the symbol
> +@var{stop-comment} is non-@code{nil}, parsing stops after the start of
> +an unnested comment.  If @var{stop-comment} is the symbol
>  @code{syntax-table}, parsing stops after the start of an unnested
> -comment or a string, or the end of an unnested comment or a string,
> -whichever comes first.
> +comment or a string, or after the end of an unnested comment or a
> +string, whichever comes first.
 
>  If @var{state} is @code{nil}, @var{start} is assumed to be at the top
>  level of parenthesis structure, such as the beginning of a function
> diff --git a/etc/NEWS b/etc/NEWS
> index d963dee..ea32153 100644
> --- a/etc/NEWS
> +++ b/etc/NEWS
> @@ -175,6 +175,18 @@ a new window when opening man pages when there's already one, use
>          (inhibit-same-window . nil)
>          (mode . Man-mode))))
 
> ++++
> +** `parse-partial-sexp' state has a new element.  Element 10 is
> +non-nil when the last character scanned might be the first character
> +of a two character construct, i.e. a comment delimiter or escaped
> +character.  Its value is the syntax of that last character.
> +
> ++++
> +** `parse-partial-sexp''s state, element 9, has now been confirmed as
> +permanent and documented, and may be used by Lisp programs.  Its value
> +is a list of currently open parenthesis positions, starting with the
> +outermost parenthesis.
> +
>  
>  * Changes in Emacs 25.2 on Non-Free Operating Systems
 
> diff --git a/src/syntax.c b/src/syntax.c
> index fdcfdfc..ffe0ea5 100644
> --- a/src/syntax.c
> +++ b/src/syntax.c
> @@ -81,6 +81,11 @@ SYNTAX_FLAGS_COMEND_SECOND (int flags)
>    return (flags >> 19) & 1;
>  }
>  static bool
> +SYNTAX_FLAGS_COMSTARTEND_FIRST (int flags)
> +{
> +  return (flags & 0x50000) != 0;
> +}
> +static bool
>  SYNTAX_FLAGS_PREFIX (int flags)
>  {
>    return (flags >> 20) & 1;
> @@ -153,6 +158,10 @@ struct lisp_parse_state
>      ptrdiff_t comstr_start;  /* Position of last comment/string starter.  */
>      Lisp_Object levelstarts; /* Char numbers of starts-of-expression
>                               of levels (starting from outermost).  */
> +    int prev_syntax; /* Syntax of previous position scanned, when
> +                        that position (potentially) holds the first char
> +                        of a 2-char construct, i.e. comment delimiter
> +                        or Sescape, etc.  Smax otherwise. */
>    };
>  
>  /* These variables are a cache for finding the start of a defun.
> @@ -176,7 +185,8 @@ static Lisp_Object skip_syntaxes (bool, Lisp_Object, Lisp_Object);
>  static Lisp_Object scan_lists (EMACS_INT, EMACS_INT, EMACS_INT, bool);
>  static void scan_sexps_forward (struct lisp_parse_state *,
>                                  ptrdiff_t, ptrdiff_t, ptrdiff_t, EMACS_INT,
> -                                bool, Lisp_Object, int);
> +                                bool, int);
> +static void internalize_parse_state (Lisp_Object, struct lisp_parse_state *);
>  static bool in_classes (int, Lisp_Object);
>  static void parse_sexp_propertize (ptrdiff_t charpos);
 
> @@ -911,10 +921,11 @@ back_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>       }
>        do
>       {
> +          internalize_parse_state (Qnil, &state);
>         scan_sexps_forward (&state,
>                             defun_start, defun_start_byte,
>                             comment_end, TYPE_MINIMUM (EMACS_INT),
> -                           0, Qnil, 0);
> +                           0, 0);
>         defun_start = comment_end;
>         if (!adjusted)
>           {
> @@ -2310,11 +2321,15 @@ in_classes (int c, Lisp_Object iso_classes)
>     PREV_SYNTAX is the SYNTAX_WITH_FLAGS of the previous character
>       (or 0 If the search cannot start in the middle of a two-character).
 
> -   If successful, return true and store the charpos of the comment's end
> -   into *CHARPOS_PTR and the corresponding bytepos into *BYTEPOS_PTR.
> -   Else, return false and store the charpos STOP into *CHARPOS_PTR, the
> -   corresponding bytepos into *BYTEPOS_PTR and the current nesting
> -   (as defined for state.incomment) in *INCOMMENT_PTR.
> +   If successful, return true and store the charpos of the comment's
> +   end into *CHARPOS_PTR and the corresponding bytepos into
> +   *BYTEPOS_PTR.  Else, return false and store the charpos STOP into
> +   *CHARPOS_PTR, the corresponding bytepos into *BYTEPOS_PTR and the
> +   current nesting (as defined for state->incomment) in
> +   *INCOMMENT_PTR.  Should the last character scanned in an incomplete
> +   comment be a possible first character of a two character construct,
> +   we store its SYNTAX_WITH_FLAGS into *last_syntax_ptr.  Otherwise,
> +   we store Smax into *last_syntax_ptr.
 
>     The comment end is the last character of the comment rather than the
>     character just after the comment.
> @@ -2326,7 +2341,7 @@ static bool
>  forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>             EMACS_INT nesting, int style, int prev_syntax,
>             ptrdiff_t *charpos_ptr, ptrdiff_t *bytepos_ptr,
> -           EMACS_INT *incomment_ptr)
> +           EMACS_INT *incomment_ptr, int *last_syntax_ptr)
>  {
>    register int c, c1;
>    register enum syntaxcode code;
> @@ -2337,7 +2352,8 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>    /* Enter the loop in the middle so that we find
>       a 2-char comment ender if we start in the middle of it.  */
>    syntax = prev_syntax;
> -  if (syntax != 0) goto forw_incomment;
> +  code = syntax & 0xff;
> +  if (syntax != 0 && from < stop) goto forw_incomment;
 
>    while (1)
>      {
> @@ -2346,6 +2362,12 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>         *incomment_ptr = nesting;
>         *charpos_ptr = from;
>         *bytepos_ptr = from_byte;
> +          *last_syntax_ptr =
> +            (code == Sescape || code == Scharquote
> +             || SYNTAX_FLAGS_COMEND_FIRST (syntax)
> +             || (nesting > 0
> +                 && SYNTAX_FLAGS_COMSTART_FIRST (syntax)))
> +            ? syntax : Smax ;
>         return 0;
>       }
>        c = FETCH_CHAR_AS_MULTIBYTE (from_byte);
> @@ -2386,7 +2408,9 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>              SYNTAX_FLAGS_COMMENT_NESTED (other_syntax))
>             ? nesting > 0 : nesting < 0))
>       {
> -       if (--nesting <= 0)
> +       syntax = Smax;        /* So that "|#" (lisp) can not return
> +                                   the syntax of "#" in *last_syntax_ptr. */
> +          if (--nesting <= 0)
>           /* We have encountered a comment end of the same style
>              as the comment sequence which began this comment section.  */
>           break;
> @@ -2408,6 +2432,7 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>       /* We have encountered a nested comment of the same style
>          as the comment sequence which began this comment section.  */
>       {
> +          syntax = Smax; /* So that "#|#" isn't also a comment ender. */
>         INC_BOTH (from, from_byte);
>         UPDATE_SYNTAX_TABLE_FORWARD (from);
>         nesting++;
> @@ -2415,6 +2440,8 @@ forw_comment (ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t stop,
>      }
>    *charpos_ptr = from;
>    *bytepos_ptr = from_byte;
> +  *last_syntax_ptr = Smax; /* Any syntactic power the last byte had is
> +                              used up. */
>    return 1;
>  }
 
> @@ -2436,6 +2463,7 @@ between them, return t; otherwise return nil.  */)
>    EMACS_INT count1;
>    ptrdiff_t out_charpos, out_bytepos;
>    EMACS_INT dummy;
> +  int dummy2;
 
>    CHECK_NUMBER (count);
>    count1 = XINT (count);
> @@ -2499,7 +2527,7 @@ between them, return t; otherwise return nil.  */)
>       }
>        /* We're at the start of a comment.  */
>        found = forw_comment (from, from_byte, stop, comnested, comstyle, 0,
> -                         &out_charpos, &out_bytepos, &dummy);
> +                         &out_charpos, &out_bytepos, &dummy, &dummy2);
>        from = out_charpos; from_byte = out_bytepos;
>        if (!found)
>       {
> @@ -2659,6 +2687,7 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
>    ptrdiff_t from_byte;
>    ptrdiff_t out_bytepos, out_charpos;
>    EMACS_INT dummy;
> +  int dummy2;
>    bool multibyte_symbol_p = sexpflag && multibyte_syntax_as_symbol;
 
>    if (depth > 0) min_depth = 0;
> @@ -2755,7 +2784,8 @@ scan_lists (EMACS_INT from, EMACS_INT count, EMACS_INT depth, bool sexpflag)
>             UPDATE_SYNTAX_TABLE_FORWARD (from);
>             found = forw_comment (from, from_byte, stop,
>                                   comnested, comstyle, 0,
> -                                 &out_charpos, &out_bytepos, &dummy);
> +                                 &out_charpos, &out_bytepos, &dummy,
> +                                    &dummy2);
>             from = out_charpos, from_byte = out_bytepos;
>             if (!found)
>               {
> @@ -3119,7 +3149,7 @@ the prefix syntax flag (p).  */)
>  }
>  
>  /* Parse forward from FROM / FROM_BYTE to END,
> -   assuming that FROM has state OLDSTATE (nil means FROM is start of function),
> +   assuming that FROM has state STATE,
>     and return a description of the state of the parse at END.
>     If STOPBEFORE, stop at the start of an atom.
>     If COMMENTSTOP is 1, stop at the start of a comment.
> @@ -3127,12 +3157,11 @@ the prefix syntax flag (p).  */)
>     after the beginning of a string, or after the end of a string.  */
 
>  static void
> -scan_sexps_forward (struct lisp_parse_state *stateptr,
> +scan_sexps_forward (struct lisp_parse_state *state,
>                   ptrdiff_t from, ptrdiff_t from_byte, ptrdiff_t end,
>                   EMACS_INT targetdepth, bool stopbefore,
> -                 Lisp_Object oldstate, int commentstop)
> +                 int commentstop)
>  {
> -  struct lisp_parse_state state;
>    enum syntaxcode code;
>    int c1;
>    bool comnested;
> @@ -3148,7 +3177,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
>    Lisp_Object tem;
>    ptrdiff_t prev_from;               /* Keep one character before FROM.  */
>    ptrdiff_t prev_from_byte;
> -  int prev_from_syntax;
> +  int prev_from_syntax, prev_prev_from_syntax;
>    bool boundary_stop = commentstop == -1;
>    bool nofence;
>    bool found;
> @@ -3165,6 +3194,7 @@ scan_sexps_forward (struct lisp_parse_state *stateptr,
>  do { prev_from = from;                               \
>       prev_from_byte = from_byte;             \
>       temp = FETCH_CHAR_AS_MULTIBYTE (prev_from_byte);        \
> +     prev_prev_from_syntax = prev_from_syntax;  \
>       prev_from_syntax = SYNTAX_WITH_FLAGS (temp); \
>       INC_BOTH (from, from_byte);             \
>       if (from < end)                         \
> @@ -3174,88 +3204,38 @@ do { prev_from = from;                                \
>    immediate_quit = 1;
>    QUIT;
 
> -  if (NILP (oldstate))
> -    {
> -      depth = 0;
> -      state.instring = -1;
> -      state.incomment = 0;
> -      state.comstyle = 0;    /* comment style a by default.  */
> -      state.comstr_start = -1;       /* no comment/string seen.  */
> -    }
> -  else
> -    {
> -      tem = Fcar (oldstate);
> -      if (!NILP (tem))
> -     depth = XINT (tem);
> -      else
> -     depth = 0;
> -
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      /* Check whether we are inside string_fence-style string: */
> -      state.instring = (!NILP (tem)
> -                     ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
> -                     : -1);
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.incomment = (!NILP (tem)
> -                      ? (INTEGERP (tem) ? XINT (tem) : -1)
> -                      : 0);
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      start_quoted = !NILP (tem);
> +  depth = state->depth;
> +  start_quoted = state->quoted;
> +  prev_prev_from_syntax = Smax;
> +  prev_from_syntax = state->prev_syntax;
 
> -      /* if the eighth element of the list is nil, we are in comment
> -      style a.  If it is non-nil, we are in comment style b */
> -      oldstate = Fcdr (oldstate);
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.comstyle = (NILP (tem)
> -                     ? 0
> -                     : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
> -                        ? XINT (tem)
> -                        : ST_COMMENT_STYLE));
> -
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      state.comstr_start =
> -     RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
> -      oldstate = Fcdr (oldstate);
> -      tem = Fcar (oldstate);
> -      while (!NILP (tem))            /* >= second enclosing sexps.  */
> -     {
> -       Lisp_Object temhd = Fcar (tem);
> -       if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
> -         curlevel->last = XINT (temhd);
> -       if (++curlevel == endlevel)
> -         curlevel--; /* error ("Nesting too deep for parser"); */
> -       curlevel->prev = -1;
> -       curlevel->last = -1;
> -       tem = Fcdr (tem);
> -     }
> +  tem = state->levelstarts;
> +  while (!NILP (tem))                /* >= second enclosing sexps.  */
> +    {
> +      Lisp_Object temhd = Fcar (tem);
> +      if (RANGED_INTEGERP (PTRDIFF_MIN, temhd, PTRDIFF_MAX))
> +        curlevel->last = XINT (temhd);
> +      if (++curlevel == endlevel)
> +        curlevel--; /* error ("Nesting too deep for parser"); */
> +      curlevel->prev = -1;
> +      curlevel->last = -1;
> +      tem = Fcdr (tem);
>      }
> -  state.quoted = 0;
> -  mindepth = depth;
> -
>    curlevel->prev = -1;
>    curlevel->last = -1;
 
> -  SETUP_SYNTAX_TABLE (prev_from, 1);
> -  temp = FETCH_CHAR (prev_from_byte);
> -  prev_from_syntax = SYNTAX_WITH_FLAGS (temp);
> -  UPDATE_SYNTAX_TABLE_FORWARD (from);
> +  state->quoted = 0;
> +  mindepth = depth;
> +
> +  SETUP_SYNTAX_TABLE (from, 1);
 
>    /* Enter the loop at a place appropriate for initial state.  */
 
> -  if (state.incomment)
> +  if (state->incomment)
>      goto startincomment;
> -  if (state.instring >= 0)
> +  if (state->instring >= 0)
>      {
> -      nofence = state.instring != ST_STRING_STYLE;
> +      nofence = state->instring != ST_STRING_STYLE;
>        if (start_quoted)
>       goto startquotedinstring;
>        goto startinstring;
> @@ -3266,11 +3246,8 @@ do { prev_from = from;                         \
>    while (from < end)
>      {
>        int syntax;
> -      INC_FROM;
> -      code = prev_from_syntax & 0xff;
 
> -      if (from < end
> -       && SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
> +      if (SYNTAX_FLAGS_COMSTART_FIRST (prev_from_syntax)
>         && (c1 = FETCH_CHAR (from_byte),
>             syntax = SYNTAX_WITH_FLAGS (c1),
>             SYNTAX_FLAGS_COMSTART_SECOND (syntax)))
> @@ -3280,32 +3257,39 @@ do { prev_from = from;                                \
>         /* Record the comment style we have entered so that only
>            the comment-end sequence of the same style actually
>            terminates the comment section.  */
> -       state.comstyle
> +       state->comstyle
>           = SYNTAX_FLAGS_COMMENT_STYLE (syntax, prev_from_syntax);
>         comnested = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax)
>                      | SYNTAX_FLAGS_COMMENT_NESTED (syntax));
> -       state.incomment = comnested ? 1 : -1;
> -       state.comstr_start = prev_from;
> +       state->incomment = comnested ? 1 : -1;
> +       state->comstr_start = prev_from;
>         INC_FROM;
> +          prev_from_syntax = Smax; /* the syntax has already been
> +                                      "used up". */
>         code = Scomment;
>       }
> -      else if (code == Scomment_fence)
> -     {
> -       /* Record the comment style we have entered so that only
> -          the comment-end sequence of the same style actually
> -          terminates the comment section.  */
> -       state.comstyle = ST_COMMENT_STYLE;
> -       state.incomment = -1;
> -       state.comstr_start = prev_from;
> -       code = Scomment;
> -     }
> -      else if (code == Scomment)
> -     {
> -       state.comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
> -       state.incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
> -                          1 : -1);
> -       state.comstr_start = prev_from;
> -     }
> +      else
> +        {
> +          INC_FROM;
> +          code = prev_from_syntax & 0xff;
> +          if (code == Scomment_fence)
> +            {
> +              /* Record the comment style we have entered so that only
> +                 the comment-end sequence of the same style actually
> +                 terminates the comment section.  */
> +              state->comstyle = ST_COMMENT_STYLE;
> +              state->incomment = -1;
> +              state->comstr_start = prev_from;
> +              code = Scomment;
> +            }
> +          else if (code == Scomment)
> +            {
> +              state->comstyle = SYNTAX_FLAGS_COMMENT_STYLE (prev_from_syntax, 0);
> +              state->incomment = (SYNTAX_FLAGS_COMMENT_NESTED (prev_from_syntax) ?
> +                                 1 : -1);
> +              state->comstr_start = prev_from;
> +            }
> +        }
 
>        if (SYNTAX_FLAGS_PREFIX (prev_from_syntax))
>       continue;
> @@ -3350,26 +3334,28 @@ do { prev_from = from;                                \
 
>       case Scomment_fence: /* Can't happen because it's handled above.  */
>       case Scomment:
> -       if (commentstop || boundary_stop) goto done;
> +          if (commentstop || boundary_stop) goto done;
>       startincomment:
>         /* The (from == BEGV) test was to enter the loop in the middle so
>            that we find a 2-char comment ender even if we start in the
>            middle of it.  We don't want to do that if we're just at the
>            beginning of the comment (think of (*) ... (*)).  */
>         found = forw_comment (from, from_byte, end,
> -                             state.incomment, state.comstyle,
> -                             (from == BEGV || from < state.comstr_start + 3)
> -                             ? 0 : prev_from_syntax,
> -                             &out_charpos, &out_bytepos, &state.incomment);
> +                             state->incomment, state->comstyle,
> +                             from == BEGV ? 0 : prev_from_syntax,
> +                             &out_charpos, &out_bytepos, &state->incomment,
> +                                &prev_from_syntax);
>         from = out_charpos; from_byte = out_bytepos;
> -       /* Beware!  prev_from and friends are invalid now.
> -          Luckily, the `done' doesn't use them and the INC_FROM
> -          sets them to a sane value without looking at them. */
> +       /* Beware!  prev_from and friends (except prev_from_syntax)
> +          are invalid now.  Luckily, the `done' doesn't use them
> +          and the INC_FROM sets them to a sane value without
> +          looking at them. */
>         if (!found) goto done;
>         INC_FROM;
> -       state.incomment = 0;
> -       state.comstyle = 0;   /* reset the comment style */
> -       if (boundary_stop) goto done;
> +       state->incomment = 0;
> +       state->comstyle = 0;  /* reset the comment style */
> +       prev_from_syntax = Smax; /* For the comment closer */
> +          if (boundary_stop) goto done;
>         break;
 
>       case Sopen:
> @@ -3396,16 +3382,16 @@ do { prev_from = from;                                \
 
>       case Sstring:
>       case Sstring_fence:
> -       state.comstr_start = from - 1;
> +       state->comstr_start = from - 1;
>         if (stopbefore) goto stop;  /* this arg means stop at sexp start */
>         curlevel->last = prev_from;
> -       state.instring = (code == Sstring
> +       state->instring = (code == Sstring
>                           ? (FETCH_CHAR_AS_MULTIBYTE (prev_from_byte))
>                           : ST_STRING_STYLE);
>         if (boundary_stop) goto done;
>       startinstring:
>         {
> -         nofence = state.instring != ST_STRING_STYLE;
> +         nofence = state->instring != ST_STRING_STYLE;
 
>           while (1)
>             {
> @@ -3419,7 +3405,7 @@ do { prev_from = from;                          \
>               /* Check C_CODE here so that if the char has
>                  a syntax-table property which says it is NOT
>                  a string character, it does not end the string.  */
> -             if (nofence && c == state.instring && c_code == Sstring)
> +             if (nofence && c == state->instring && c_code == Sstring)
>                 break;
 
>               switch (c_code)
> @@ -3442,7 +3428,7 @@ do { prev_from = from;                          \
>             }
>         }
>       string_end:
> -       state.instring = -1;
> +       state->instring = -1;
>         curlevel->prev = curlevel->last;
>         INC_FROM;
>         if (boundary_stop) goto done;
> @@ -3461,25 +3447,96 @@ do { prev_from = from;                                \
>   stop:   /* Here if stopping before start of sexp. */
>    from = prev_from;    /* We have just fetched the char that starts it; */
>    from_byte = prev_from_byte;
> +  prev_from_syntax = prev_prev_from_syntax;
>    goto done; /* but return the position before it. */
 
>   endquoted:
> -  state.quoted = 1;
> +  state->quoted = 1;
>   done:
> -  state.depth = depth;
> -  state.mindepth = mindepth;
> -  state.thislevelstart = curlevel->prev;
> -  state.prevlevelstart
> +  state->depth = depth;
> +  state->mindepth = mindepth;
> +  state->thislevelstart = curlevel->prev;
> +  state->prevlevelstart
>      = (curlevel == levelstart) ? -1 : (curlevel - 1)->last;
> -  state.location = from;
> -  state.location_byte = from_byte;
> -  state.levelstarts = Qnil;
> +  state->location = from;
> +  state->location_byte = from_byte;
> +  state->levelstarts = Qnil;
>    while (curlevel > levelstart)
> -    state.levelstarts = Fcons (make_number ((--curlevel)->last),
> -                            state.levelstarts);
> +    state->levelstarts = Fcons (make_number ((--curlevel)->last),
> +                                state->levelstarts);
> +  state->prev_syntax = (SYNTAX_FLAGS_COMSTARTEND_FIRST (prev_from_syntax)
> +                        || state->quoted) ? prev_from_syntax : Smax;
>    immediate_quit = 0;
> +}
> +
> +/* Convert a (lisp) parse state to the internal form used in
> +   scan_sexps_forward.  */
> +static void
> +internalize_parse_state (Lisp_Object external, struct lisp_parse_state *state)
> +{
> +  Lisp_Object tem;
> +
> +  if (NILP (external))
> +    {
> +      state->depth = 0;
> +      state->instring = -1;
> +      state->incomment = 0;
> +      state->quoted = 0;
> +      state->comstyle = 0;   /* comment style a by default.  */
> +      state->comstr_start = -1;      /* no comment/string seen.  */
> +      state->levelstarts = Qnil;
> +      state->prev_syntax = Smax;
> +    }
> +  else
> +    {
> +      tem = Fcar (external);
> +      if (!NILP (tem))
> +     state->depth = XINT (tem);
> +      else
> +     state->depth = 0;
> +
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      /* Check whether we are inside string_fence-style string: */
> +      state->instring = (!NILP (tem)
> +                         ? (CHARACTERP (tem) ? XFASTINT (tem) : ST_STRING_STYLE)
> +                         : -1);
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->incomment = (!NILP (tem)
> +                          ? (INTEGERP (tem) ? XINT (tem) : -1)
> +                          : 0);
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->quoted = !NILP (tem);
 
> -  *stateptr = state;
> +      /* if the eighth element of the list is nil, we are in comment
> +      style a.  If it is non-nil, we are in comment style b */
> +      external = Fcdr (external);
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->comstyle = (NILP (tem)
> +                         ? 0
> +                         : (RANGED_INTEGERP (0, tem, ST_COMMENT_STYLE)
> +                            ? XINT (tem)
> +                            : ST_COMMENT_STYLE));
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->comstr_start =
> +     RANGED_INTEGERP (PTRDIFF_MIN, tem, PTRDIFF_MAX) ? XINT (tem) : -1;
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->levelstarts = tem;
> +
> +      external = Fcdr (external);
> +      tem = Fcar (external);
> +      state->prev_syntax = NILP (tem) ? Smax : XINT (tem);
> +    }
>  }
 
>  DEFUN ("parse-partial-sexp", Fparse_partial_sexp, Sparse_partial_sexp, 2, 6, 0,
> @@ -3488,6 +3545,7 @@ Parsing stops at TO or when certain criteria are met;
>   point is set to where parsing stops.
>  If fifth arg OLDSTATE is omitted or nil,
>   parsing assumes that FROM is the beginning of a function.
> +
>  Value is a list of elements describing final state of parsing:
>   0. depth in parens.
>   1. character address of start of innermost containing list; nil if none.
> @@ -3501,16 +3559,22 @@ Value is a list of elements describing final state of parsing:
>   6. the minimum paren-depth encountered during this scan.
>   7. style of comment, if any.
>   8. character address of start of comment or string; nil if not in one.
> - 9. Intermediate data for continuation of parsing (subject to change).
> + 9. List of positions of currently open parens, outermost first.
> +10. When the last position scanned holds the first character of a
> +    (potential) two character construct, the syntax of that position,
> +    otherwise nil.  That construct can be a two character comment
> +    delimiter or an Escaped or Char-quoted character.
> +11..... Possible further internal information used by `parse-partial-sexp'.
> +
>  If third arg TARGETDEPTH is non-nil, parsing stops if the depth
>  in parentheses becomes equal to TARGETDEPTH.
> -Fourth arg STOPBEFORE non-nil means stop when come to
> +Fourth arg STOPBEFORE non-nil means stop when we come to
>   any character that starts a sexp.
>  Fifth arg OLDSTATE is a list like what this function returns.
>   It is used to initialize the state of the parse.  Elements number 1, 2, 6
>   are ignored.
> -Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
> - If it is symbol `syntax-table', stop after the start of a comment or a
> +Sixth arg COMMENTSTOP non-nil means stop after the start of a comment.
> + If it is the symbol `syntax-table', stop after the start of a comment or a
>   string, or after end of a comment or a string.  */)
>    (Lisp_Object from, Lisp_Object to, Lisp_Object targetdepth,
>     Lisp_Object stopbefore, Lisp_Object oldstate, Lisp_Object commentstop)
> @@ -3527,15 +3591,17 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
>      target = TYPE_MINIMUM (EMACS_INT);       /* We won't reach this depth.  */
 
>    validate_region (&from, &to);
> +  internalize_parse_state (oldstate, &state);
>    scan_sexps_forward (&state, XINT (from), CHAR_TO_BYTE (XINT (from)),
>                     XINT (to),
> -                   target, !NILP (stopbefore), oldstate,
> +                   target, !NILP (stopbefore),
>                     (NILP (commentstop)
>                      ? 0 : (EQ (commentstop, Qsyntax_table) ? -1 : 1)));
 
>    SET_PT_BOTH (state.location, state.location_byte);
 
> -  return Fcons (make_number (state.depth),
> +  return
> +    Fcons (make_number (state.depth),
>          Fcons (state.prevlevelstart < 0
>                 ? Qnil : make_number (state.prevlevelstart),
>            Fcons (state.thislevelstart < 0
> @@ -3553,11 +3619,15 @@ Sixth arg COMMENTSTOP non-nil means stop at the start of a comment.
>                                 ? Qsyntax_table
>                                 : make_number (state.comstyle))
>                              : Qnil),
> -                           Fcons (((state.incomment
> -                                    || (state.instring >= 0))
> -                                   ? make_number (state.comstr_start)
> -                                   : Qnil),
> -                                  Fcons (state.levelstarts, Qnil))))))))));
> +                      Fcons (((state.incomment
> +                                  || (state.instring >= 0))
> +                                 ? make_number (state.comstr_start)
> +                                 : Qnil),
> +                        Fcons (state.levelstarts,
> +                             Fcons (state.prev_syntax == Smax
> +                                    ? Qnil
> +                                    : make_number (state.prev_syntax),
> +                                Qnil)))))))))));
>  }
>  
>  void

> _______________________________________________
> Emacs-diffs mailing list
> address@hidden
> https://lists.gnu.org/mailman/listinfo/emacs-diffs


