[RFC/RFT PATCH 00/11] Differentiate ^/$ from \` and \' in grep -z mode

From: Paolo Bonzini
Subject: [RFC/RFT PATCH 00/11] Differentiate ^/$ from \` and \' in grep -z mode
Date: Wed, 4 Jan 2012 11:59:41 +0100

Hi all,

this series is a pretty heavy refactoring of how anchors work in dfa.c.
The main objective is to implement ^, $, \` and \' correctly when grep
-z is in use.  In particular, ^ and $ will match at a newline character in
the middle of a NUL-delimited sequence.  This is backwards-incompatible.
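To make the intended distinction concrete, here is a minimal sketch of
the four anchors as position predicates over one NUL-delimited record.
It is my reading of the description above, not code from dfa.c: the
helper names (at_begbuf, at_endbuf, at_begline, at_endline) are
hypothetical, and the real matcher encodes this in DFA state contexts
rather than in per-position checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers, not taken from dfa.c.  REC points to one record
   of LEN bytes, i.e. the text between two NUL delimiters under -z.
   POS is an offset in the range 0..LEN.  */

/* \` : matches only at the very start of the record.  */
static bool
at_begbuf (size_t pos)
{
  return pos == 0;
}

/* \' : matches only at the very end of the record.  */
static bool
at_endbuf (size_t pos, size_t len)
{
  return pos == len;
}

/* ^ : matches at the start of the record, or just after a newline
   embedded in the record.  */
static bool
at_begline (const char *rec, size_t pos)
{
  return pos == 0 || rec[pos - 1] == '\n';
}

/* $ : matches at the end of the record, or just before a newline
   embedded in the record.  */
static bool
at_endline (const char *rec, size_t len, size_t pos)
{
  return pos == len || rec[pos] == '\n';
}
```

For the record "ab\ncd", offset 3 satisfies at_begline but not
at_begbuf; that difference is what the series is meant to expose.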

It is still not ready for committing; in particular, I have not yet added
tests and I haven't worked out how the period character should work.
However, having other people hammering on it would be very useful,
for both "grep -z" and regular grep.

Patch 1 fixes an unrelated bug that I reported yesterday.

Patch 2 introduces symbolic names for the values of "sbit" and
"d->success", and patches 3/4 use those names extensively throughout
dfa.c, replacing separate variables or arguments.  This is because a
later patch in the series adds an additional value.
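As a rough illustration of why a single symbolic context value helps,
the sketch below folds the separate "newline" and "letter" flags into
one bitmask.  The constant names and the char_context helper are
assumptions made for this example; the names and values actually used
in the patches may differ.

```c
#include <ctype.h>

/* Illustrative context bits (assumed names and values).  One bit per
   kind of preceding character, so a set of acceptable contexts is a
   plain bitwise OR.  */
enum
{
  CTX_NONE = 1,     /* neither a delimiter nor a word constituent */
  CTX_LETTER = 2,   /* a word constituent */
  CTX_NEWLINE = 4,  /* the end-of-line delimiter */
  CTX_ANY = 7       /* all of the above */
};

/* Classify character C, given EOL as the delimiter byte
   ('\n' normally, '\0' under -z).  Returns exactly one context bit.  */
static int
char_context (unsigned char c, unsigned char eol)
{
  if (c == eol)
    return CTX_NEWLINE;
  if (isalnum (c) || c == '_')
    return CTX_LETTER;
  return CTX_NONE;
}

/* Nonzero if context CTX is one of the contexts allowed by MASK.  */
static int
context_allowed (int mask, int ctx)
{
  return (mask & ctx) != 0;
}
```

With this shape, adding another kind of context means adding one more
bit and widening CTX_ANY, instead of threading an extra boolean through
every function that took "newline" and "letter" separately.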

Patch 5 gives a more easily defined meaning to the context field of a
DFA state, and one that I'm more comfortable with hacking on.

Patches 6 and 7 are simplifications in the code.

Patch 8 reimplements constraints so that I can get room for buffer
constraints (\` and \').

Patch 9 renames the "newline character" concept to "buffer delimiter",
since later patches modify ^ and $ to anchor against a hardcoded \n.

Patches 10 and 11 introduce the new feature, respectively in the matcher
and in the regex parser.

Paolo Bonzini (11):
  dfa: fix corner case with anchors
  dfa: introduce contexts for the values in d->success
  dfa: change newline/letter to a single context value
  dfa: refactor common context computations
  dfa: change meaning of a state context
  dfa: remove useless check
  dfa: make repetitive code *really* repetitive
  dfa: remove redundant line constraints
  dfa: rename "newline" to "buffer delimiter"
  dfa: introduce bufdelim context
  dfa: introduce BEGBUF/ENDBUF

 NEWS                 |    5 +
 src/dfa.c            |  508 ++++++++++++++++++++++++++++++--------------------
 tests/spencer1.tests |   12 ++
 3 files changed, 323 insertions(+), 202 deletions(-)

