bug-grep
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 1/4] dfa: change meaning of a state context


From: Jim Meyering
Subject: Re: [PATCH 1/4] dfa: change meaning of a state context
Date: Sat, 25 Feb 2012 19:21:14 +0100

Paolo Bonzini wrote:
> * src/dfa.c (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT): New.
> (state_separate_contexts): Remove second argument.
> (state_index): Do not mask away CTX_NONE.
> (dfaanalyze): Adjust call to state_index and state_separate_contexts.
> (dfastate): Adjust calls to state_index and state_separate_contexts.
> ---

Hi Paolo,
Sorry about the delay.
The substantial parts of this patch are impeccable and pass my tests.
Please make the "const"-adding change mentioned below and consider
the style comments.

> -  context &= ~CTX_NONE;
>    for (i = 0; i < s->nelem; ++i)
>      hash ^= s->elems[i].index + s->elems[i].constraint;
>
> @@ -2102,12 +2105,12 @@ epsclosure (position_set *s, struct dfa const *d)
>     character included in C.  */

Several of these changes are purely stylistic, merely inserting
a space before an opening parenthesis.  They tend to obscure the
real changes you are making and belong in a separate change set.
You may go ahead and commit these, but going forward, please avoid
mixing stylistic and substantive changes in the same c-set.

>  static int
> -charclass_context(charclass c)
> +charclass_context (charclass c)
>  {
>    int context = 0;
>    unsigned int j;
>
> -  if (tstbit(eolbyte, c))
> +  if (tstbit (eolbyte, c))
>      context |= CTX_NEWLINE;
>
>    for (j = 0; j < CHARCLASS_INTS; ++j)
> @@ -2121,29 +2124,27 @@ charclass_context(charclass c)
>    return context;
>  }
>
> -/* Returns the subset of POSSIBLE_CONTEXTS on which the position set S
> -   depends.  Each context in the set of returned contexts (let's call it
> -   SC) may have a different follow set than other contexts in SC, and
> -   also different from the follow set of the complement set.  However,
> -   all contexts in the complement set will have the same follow set.  */
> +/* Returns the contexts on which the position set S depends.  Each context
> +   in the set of returned contexts (let's call it SC) may have a different
> +   follow set than other contexts in SC, and also different from the
> +   follow set of the complement set (sc ^ CTX_ANY).  However, all contexts
> +   in the complement set will have the same follow set.  */
>
>  static int
> -state_separate_contexts(position_set *s, int possible_contexts)
> +state_separate_contexts(position_set *s)
>  {
> -  int separate_context = 0;
> +  int separate_contexts = 0;
>    unsigned int j;
>
>    for (j = 0; j < s->nelem; ++j)
>      {
> -      if ((possible_contexts & CTX_NEWLINE)
> -          && PREV_NEWLINE_DEPENDENT(s->elems[j].constraint))
> -        separate_context |= CTX_NEWLINE;
> -      if ((possible_contexts & CTX_LETTER)
> -          && PREV_LETTER_DEPENDENT(s->elems[j].constraint))
> -        separate_context |= CTX_LETTER;
> +      if (PREV_NEWLINE_DEPENDENT (s->elems[j].constraint))
> +        separate_contexts |= CTX_NEWLINE;
> +      if (PREV_LETTER_DEPENDENT (s->elems[j].constraint))
> +        separate_contexts |= CTX_LETTER;
>      }
>
> -  return separate_context;
> +  return separate_contexts;
>  }

As you can see, the above change no longer applies, since
in the interim, gcc noticed that this function deserves the "pure"
attribute, so I added _GL_ATTRIBUTE_PURE to quiet its warning.
By the same token, for readability's sake, the pointer
argument should be declared "const".  Please adjust your patch
like this:

diff --git a/src/dfa.c b/src/dfa.c
index 649c3a7..69792e5 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -2131,7 +2131,7 @@ charclass_context (charclass c)
    in the complement set will have the same follow set.  */

 static int _GL_ATTRIBUTE_PURE
-state_separate_contexts (position_set *s)
+state_separate_contexts (position_set const *s)
 {
   int separate_contexts = 0;
   unsigned int j;

This is the reason I plan to reformat dfa.c once your changes are in.

> @@ -2668,17 +2670,20 @@ dfastate (int s, struct dfa *d, int trans[])
>            insert(d->states[0].elems.elems[j], &follows);
>
>        /* Find out if the new state will want any context information. */
> -      possible_contexts = charclass_context(labels[i]);
> -      separate_contexts = state_separate_contexts(&follows, 
> possible_contexts);
> +      possible_contexts = charclass_context (labels[i]);
> +      separate_contexts = state_separate_contexts (&follows);
>
>        /* Find the state(s) corresponding to the union of the follows. */
> -      state = state_index(d, &follows, 0);
> -      if (separate_contexts & CTX_NEWLINE)
> -        state_newline = state_index(d, &follows, CTX_NEWLINE);
> +      if ((separate_contexts & possible_contexts) != possible_contexts)
> +        state = state_index (d, &follows, separate_contexts ^ CTX_ANY);
> +      else
> +        state = -1;
> +      if (separate_contexts & possible_contexts & CTX_NEWLINE)
> +        state_newline = state_index (d, &follows, CTX_NEWLINE);
>        else
>          state_newline = state;
> -      if (separate_contexts & CTX_LETTER)
> -        state_letter = state_index(d, &follows, CTX_LETTER);
> +      if (separate_contexts & possible_contexts & CTX_LETTER)
> +        state_letter = state_index (d, &follows, CTX_LETTER);
>        else
>          state_letter = state;
>
> @@ -2975,7 +2980,7 @@ match_mb_charset (struct dfa *d, int s, position pos, 
> int idx)
>
>    /* Match in range 0-255?  */
>    if (wc < NOTCHAR && work_mbc->cset != -1
> -      && tstbit((unsigned char)wc, d->charclasses[work_mbc->cset]))
> +      && tstbit ((unsigned char)wc, d->charclasses[work_mbc->cset]))
>      goto charset_matched;
>
>    /* match with a character class?  */



reply via email to

[Prev in Thread] Current Thread [Next in Thread]