bug-grep
bug#22655: grep -Pz '^' now fails!


From: Stephane Chazelas
Subject: bug#22655: grep -Pz '^' now fails!
Date: Sat, 19 Nov 2016 10:41:27 +0000
User-agent: Mutt/1.5.21 (2010-09-15)

2016-11-18 15:37:16 -0800, Paul Eggert:
[...]
> >That might have been the case a long time ago, as I remember
> >some discussion about it as it explained some wrong information
> >in the documentation, but as far as I and gdb can tell, grep
> >2.26 at least call pcre_exec for every line of the input with
> >grep -P.
> >
> 
> Although that was true starting with commit
> a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became
> false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5
> (2014-09-16).
[...]

OK, it looks like I don't have the full story, and my multiple
calls to pcre_exec() seem to point to something else:

$ seq 10 | ltrace  -e '*pcre*' ./src/grep -P .
grep->pcre_maketables(0x221e2f0, 0x221e240, 1, 2) = 0x221e310
grep->pcre_compile(0x221e2f0, 2050, 0x7ffe943ec6f8, 0x7ffe943ec6f4) = 0x221e760
grep->pcre_study(0x221e760, 1, 0x7ffe943ec6f8, 0x7ffe943eb490) = 0x221e7b0
grep->pcre_fullinfo(0x221e760, 0x221e7b0, 16, 0x7ffe943ec6f4) = 0
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 128, 0x7ffe943ec700, 300) = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 0, 0x7ffe943ec700, 300) = -1
grep->pcre_exec(0x221e760, 0x221e7b0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 20, 0, 8192, 0x7ffe943ec4e0, 300) = 1
1
grep->pcre_exec(0x221e760, 0x221e7b0, "2\n3\n4\n5\n6\n7\n8\n9\n10\n", 18, 0, 8192, 0x7ffe943ec4e0, 300) = 1
2
grep->pcre_exec(0x221e760, 0x221e7b0, "3\n4\n5\n6\n7\n8\n9\n10\n", 16, 0, 8192, 0x7ffe943ec4e0, 300) = 1
3
grep->pcre_exec(0x221e760, 0x221e7b0, "4\n5\n6\n7\n8\n9\n10\n", 14, 0, 8192, 0x7ffe943ec4e0, 300) = 1
4
grep->pcre_exec(0x221e760, 0x221e7b0, "5\n6\n7\n8\n9\n10\n", 12, 0, 8192, 0x7ffe943ec4e0, 300) = 1
5
grep->pcre_exec(0x221e760, 0x221e7b0, "6\n7\n8\n9\n10\n", 10, 0, 8192, 0x7ffe943ec4e0, 300) = 1
6
grep->pcre_exec(0x221e760, 0x221e7b0, "7\n8\n9\n10\n", 8, 0, 8192, 0x7ffe943ec4e0, 300) = 1
7
grep->pcre_exec(0x221e760, 0x221e7b0, "8\n9\n10\n", 6, 0, 8192, 0x7ffe943ec4e0, 300) = 1
8
grep->pcre_exec(0x221e760, 0x221e7b0, "9\n10\n", 4, 0, 8192, 0x7ffe943ec4e0, 300) = 1
9
grep->pcre_exec(0x221e760, 0x221e7b0, "10\n", 2, 0, 8192, 0x7ffe943ec4e0, 300) = 1
10
+++ exited (status 0) +++

I don't know the details of why it's done that way, but I'm not
sure I can see how calling pcre_exec that way can be quicker
than calling it on each individual line/record.
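One possible reading of the trace above is that each call receives the rest of the buffer, and the next call starts just after the line that matched. The following is only a toy model of that call pattern, written with Python's re module as a stand-in for PCRE; the function name search_rest_of_buffer is hypothetical and nothing here is taken from grep's source.

```python
import re

def search_rest_of_buffer(pattern, buf):
    """Toy model: search the remaining buffer, emit the matching line,
    then resume right after that line. Returns (lines, number_of_calls)."""
    rx = re.compile(pattern, re.MULTILINE)  # stand-in for PCRE_MULTILINE
    pos, calls, lines = 0, 0, []
    while pos < len(buf):
        calls += 1                          # one simulated pcre_exec call
        m = rx.search(buf, pos)
        if m is None:
            break
        start = buf.rfind("\n", 0, m.start()) + 1   # start of matching line
        end = buf.find("\n", m.end())               # end of matching line
        if end == -1:
            end = len(buf)
        lines.append(buf[start:end])
        pos = end + 1                       # continue after this line
    return lines, calls

buf = "".join(f"{i}\n" for i in range(1, 11))   # same input as `seq 10`
all_lines, all_calls = search_rest_of_buffer(".", buf)
few_lines, few_calls = search_rest_of_buffer("7", buf)
print(all_calls, few_calls)
```

In this toy model the number of calls follows the number of matching lines rather than the number of input lines, so it only equals a per-line count when every line matches, as in the trace. Whether that is the actual motivation in grep is not something this sketch establishes.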


Note that this is still wrong:

$ printf 'a\nb\0' | ./src/grep -zxP a
a
b

Removing PCRE_MULTILINE (and going back to calling pcre_exec on
every record separately) would help, except in cases where the
user does:

grep -xzP '(?m)a'
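As a rough illustration of why the multiline flag matters here, the sketch below applies the "^(?:" ... ")$" wrapping to the single record from the printf example. It uses Python's re module as an analogue of PCRE, with re.MULTILINE playing the role of PCRE_MULTILINE; that mapping is an assumption made for illustration, and the user-supplied (?m) case is not covered.

```python
import re

# The one NUL-terminated record produced by: printf 'a\nb\0'
record = "a\nb"

# Pattern "a" wrapped the way -x wraps it with the current prefix/suffix.
wrapped = "^(?:" + "a" + ")$"

# With multiline semantics, ^ and $ also match around the embedded newline,
# so the record matches even though it is not exactly "a".
with_multiline = re.search(wrapped, record, re.MULTILINE)

# Without multiline semantics, ^ and $ anchor to the whole record.
without_multiline = re.search(wrapped, record)

print(bool(with_multiline), bool(without_multiline))  # True False
```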

You'd want to change:

  static char const xprefix[] = "^(?:";
  static char const xsuffix[] = ")$";

To:

  static char const xprefix[] = "\\A(?:";
  static char const xsuffix[] = ")\\z";
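A small sketch of the effect of the proposed anchors, again using Python's re module in place of PCRE. One assumption to note: Python's \Z matches only at the very end of the subject, so it is used here as the counterpart of PCRE's \z. The helper name x_wrap is hypothetical.

```python
import re

def x_wrap(pattern):
    """Wrap a pattern so it must match the whole record.
    Python's \\Z is used as the counterpart of PCRE's \\z."""
    return r"\A(?:" + pattern + r")\Z"

flags = re.MULTILINE  # keep multiline semantics switched on deliberately

# \A and \Z ignore the multiline flag, so the embedded newline no longer
# lets a partial record slip through.
multi_line_record = re.search(x_wrap("a"), "a\nb", flags)
exact_record = re.search(x_wrap("a"), "a", flags)

# The (?: ... ) group keeps alternations inside the anchors.
alternation = re.search(x_wrap("a|ab"), "ab", flags)

print(bool(multi_line_record), bool(exact_record), bool(alternation))
# True would be wrong for the first value; this prints: False True True
```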


-- 
Stephane




