bug-grep

[RFC PATCH] fall back to glibc matcher if a multibyte match is found


From: Paolo Bonzini
Subject: [RFC PATCH] fall back to glibc matcher if a multibyte match is found
Date: Fri, 30 Apr 2010 10:34:01 +0200

This patch works around the performance problems that are still in
current grep.  Red Hat will probably be using it in its own 2.6.x.

For UTF-8 it should trigger only in the presence of MBCSET, e.g. [a-z]
or [à] (and the latter case could be avoided).

For other character sets all brackets, and `.' as well, will trigger it.
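
To make the control flow concrete, here is a rough, self-contained sketch of
the "give up and let the slower matcher take over" pattern.  It is not grep
code: fast_scan, slow_scan, search and the needs_slow flag are hypothetical
names invented for illustration, a single-byte needle stands in for a real
pattern, and memchr stands in for the glibc matcher.  The only part taken
from the patch is the shape: the fast path sets a flag through a pointer and
returns the position where it stopped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fast path: look for the byte `needle` in buf[0..len).
   On meeting a non-ASCII byte before any match, it sets *needs_slow and
   returns the position where it gave up, mirroring how the patch sets
   *backref and returns p.  */
static const char *
fast_scan (const char *buf, size_t len, char needle, int *needs_slow)
{
  for (size_t i = 0; i < len; i++)
    {
      unsigned char c = (unsigned char) buf[i];
      if (c >= 0x80 && needs_slow)
        {
          *needs_slow = 1;
          return buf + i;
        }
      if (buf[i] == needle)
        return buf + i;
    }
  return NULL;
}

/* Hypothetical slow path; memchr is only a stand-in for a real
   multibyte-aware matcher.  */
static const char *
slow_scan (const char *buf, size_t len, char needle)
{
  return memchr (buf, needle, len);
}

/* Caller: try the fast path, then rerun the slow path from the point
   where the fast path stopped (an assumption of this sketch; nothing
   before that point matched).  *used_slow reports which path decided.  */
static const char *
search (const char *buf, size_t len, char needle, int *used_slow)
{
  int needs_slow = 0;
  const char *hit = fast_scan (buf, len, needle, &needs_slow);

  *used_slow = needs_slow;
  if (needs_slow)
    return slow_scan (hit, len - (size_t) (hit - buf), needle);
  return hit;
}
```

With pure ASCII input the fast path answers on its own; as soon as a
multibyte sequence shows up before a match, the slow path decides.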

Thoughts?
---
 src/dfa.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index 2bc0c0e..775943c 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -3213,6 +3213,15 @@ dfaexec (struct dfa *d, char const *begin, char *end,
                 continue;
               }
 
+           if (backref)
+              {
+                *backref = 1;
+                free(mblen_buf);
+                free(inputwcs);
+                *end = saved_end;
+                return (char *) p;
+              }
+
             /* Can match with a multibyte character (and multi character
                collating element).  Transition table might be updated.  */
             s = transit_state(d, s, &p);
-- 
1.6.6.1




