bug#22655: grep -Pz '^' now fails!
From: Stephane Chazelas
Subject: bug#22655: grep -Pz '^' now fails!
Date: Sat, 19 Nov 2016 10:41:27 +0000
User-agent: Mutt/1.5.21 (2010-09-15)
2016-11-18 15:37:16 -0800, Paul Eggert:
[...]
> >That might have been the case a long time ago, as I remember
> >some discussion about it as it explained some wrong information
> >in the documentation, but as far as I and gdb can tell, grep
> >2.26 at least call pcre_exec for every line of the input with
> >grep -P.
> >
>
> Although that was true starting with commit
> a14685c2833f7c28a427fecfaf146e0a861d94ba (2010-03-04), it became
> false starting with commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5
> (2014-09-16).
[...]
OK, it looks like I don't have the full story, and my multiple
calls to pcre_exec() seem to point to something else:
$ seq 10 | ltrace -e '*pcre*' ./src/grep -P .
grep->pcre_maketables(0x221e2f0, 0x221e240, 1, 2)
= 0x221e310
grep->pcre_compile(0x221e2f0, 2050, 0x7ffe943ec6f8, 0x7ffe943ec6f4)
= 0x221e760
grep->pcre_study(0x221e760, 1, 0x7ffe943ec6f8, 0x7ffe943eb490)
= 0x221e7b0
grep->pcre_fullinfo(0x221e760, 0x221e7b0, 16, 0x7ffe943ec6f4)
= 0
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 128, 0x7ffe943ec700, 300)
= -1
grep->pcre_exec(0x221e760, 0x221e7b0, "", 0, 0, 0, 0x7ffe943ec700, 300)
= -1
grep->pcre_exec(0x221e760, 0x221e7b0, "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n", 20, 0,
8192, 0x7ffe943ec4e0, 300) = 1
1
grep->pcre_exec(0x221e760, 0x221e7b0, "2\n3\n4\n5\n6\n7\n8\n9\n10\n", 18, 0,
8192, 0x7ffe943ec4e0, 300) = 1
2
grep->pcre_exec(0x221e760, 0x221e7b0, "3\n4\n5\n6\n7\n8\n9\n10\n", 16, 0, 8192,
0x7ffe943ec4e0, 300) = 1
3
grep->pcre_exec(0x221e760, 0x221e7b0, "4\n5\n6\n7\n8\n9\n10\n", 14, 0, 8192,
0x7ffe943ec4e0, 300) = 1
4
grep->pcre_exec(0x221e760, 0x221e7b0, "5\n6\n7\n8\n9\n10\n", 12, 0, 8192,
0x7ffe943ec4e0, 300) = 1
5
grep->pcre_exec(0x221e760, 0x221e7b0, "6\n7\n8\n9\n10\n", 10, 0, 8192,
0x7ffe943ec4e0, 300) = 1
6
grep->pcre_exec(0x221e760, 0x221e7b0, "7\n8\n9\n10\n", 8, 0, 8192,
0x7ffe943ec4e0, 300) = 1
7
grep->pcre_exec(0x221e760, 0x221e7b0, "8\n9\n10\n", 6, 0, 8192, 0x7ffe943ec4e0,
300) = 1
8
grep->pcre_exec(0x221e760, 0x221e7b0, "9\n10\n", 4, 0, 8192, 0x7ffe943ec4e0,
300) = 1
9
grep->pcre_exec(0x221e760, 0x221e7b0, "10\n", 2, 0, 8192, 0x7ffe943ec4e0, 300)
= 1
10
+++ exited (status 0) +++
I don't know the details of why it's done that way, but I'm not
sure I can see how calling pcre_exec that way can be quicker
than calling it on each individual line/record.
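As a rough model of the two call patterns (not grep's actual code), here is a sketch using Python's re module as a stand-in for PCRE. The function names are made up for illustration. It only counts search calls and says nothing about real timing: one version searches the remaining buffer in multiline mode, as the trace above suggests, and the other searches each line separately.

```python
import re

def matching_lines_buffer(pattern, buf):
    """Search the rest of the buffer repeatedly; return (lines, search calls)."""
    rx = re.compile(pattern, re.MULTILINE)
    out, pos, calls = [], 0, 0
    while pos < len(buf):
        calls += 1
        m = rx.search(buf, pos)
        if m is None:
            break
        # Widen the match to the line that contains it.
        start = buf.rfind("\n", 0, m.start()) + 1
        end = buf.find("\n", m.end())
        if end == -1:
            end = len(buf)
        out.append(buf[start:end])
        pos = end + 1  # resume at the start of the next line
    return out, calls

def matching_lines_per_record(pattern, buf):
    """Search each line on its own; return (lines, search calls)."""
    rx = re.compile(pattern)
    lines = buf.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    return [line for line in lines if rx.search(line)], len(lines)

buf = "".join(f"{i}\n" for i in range(1, 11))  # same input as `seq 10`
all_buffer = matching_lines_buffer(".", buf)
all_record = matching_lines_per_record(".", buf)
few_buffer = matching_lines_buffer("5", buf)
few_record = matching_lines_per_record("5", buf)
```

In this model the two approaches make the same number of calls when every line matches, and the buffer approach makes fewer only when matches are sparse. The sketch assumes the pattern never matches across a newline.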
Note that this is still wrong:
$ printf 'a\nb\0' | ./src/grep -zxP a
a
b
Removing PCRE_MULTILINE (and going back to calling pcre_exec on
every record separately) would help, except in the cases where the
user does:
grep -xzP '(?m)a'
You'd want to change:
static char const xprefix[] = "^(?:";
static char const xsuffix[] = ")$";
To:
static char const xprefix[] = "\\A(?:";
static char const xsuffix[] = ")\\z";
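To illustrate why the anchors matter, here is a minimal sketch in Python rather than C, so it uses Python's re module and not PCRE itself. Python's \Z (end of string only) plays the role of PCRE's \z, and the helper name whole_match is hypothetical. The record "a\nb" is what grep -z would see in the printf example above.

```python
import re

def whole_match(prefix, suffix, pattern, text):
    """True if prefix + pattern + suffix matches text in multiline mode."""
    return re.search(prefix + pattern + suffix, text, re.MULTILINE) is not None

record = "a\nb"  # one NUL-delimited record containing a newline

# Line anchors: in multiline mode, ^ and $ also match around the embedded
# newline, so the -x style wrapper accepts a record that is not exactly "a".
line_anchored = whole_match(r"^(?:", r")$", "a", record)

# Subject anchors: \A and \Z ignore multiline mode, so only a record that
# is exactly "a" is accepted.
text_anchored = whole_match(r"\A(?:", r")\Z", "a", record)
exact_record = whole_match(r"\A(?:", r")\Z", "a", "a")

print(line_anchored, text_anchored, exact_record)
```

Because \A and \Z are unaffected by multiline mode, the same wrapper keeps working when multiline matching is switched on, which is the case the line anchors get wrong.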
--
Stephane