--- Begin Message ---
Subject: grep -wP and backreferences
Date: Mon, 24 Feb 2014 10:01:54 +0000
User-agent: Mutt/1.5.21 (2010-09-15)
Hello,
Backreferences don't work with -w or -x in combination with -P:
$ echo aa | grep -Pw '(.)\1'
$
Or they work in an unexpected way, with \2 referring to the first group:
$ echo aa | grep -Pw '(.)\2'
aa
The fix is simple:
--- src/pcresearch.c~ 2014-02-24 09:59:56.864374362 +0000
+++ src/pcresearch.c 2014-02-24 07:33:04.666398105 +0000
@@ -75,9 +75,9 @@ Pcompile (char const *pattern, size_t si
   *n = '\0';
   if (match_lines)
-    strcpy (n, "^(");
+    strcpy (n, "^(?:");
   if (match_words)
-    strcpy (n, "\\b(");
+    strcpy (n, "\\b(?:");
   n += strlen (n);
   /* The PCRE interface doesn't allow NUL bytes in the pattern, so
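
To illustrate why the capturing wrapper shifts backreferences, here is a
minimal sketch that uses Python's re module as a stand-in for PCRE. That is
an assumption: the two engines number groups the same way, but they differ
when \1 refers to a group that is still open (PCRE simply fails to match,
Python rejects the pattern). The wrap() helper is hypothetical and only
mimics the prefix and suffix that Pcompile adds for -w.

```python
import re

def wrap(pattern, capturing):
    # Hypothetical helper: imitate the -w wrapping around the user's pattern.
    # capturing=True  -> old behaviour, wrapper is group 1
    # capturing=False -> patched behaviour, wrapper is (?:...)
    opener = "(" if capturing else "(?:"
    return r"\b" + opener + pattern + r")\b"

# Old wrapper: the user's (.) is renumbered to group 2, so \2 is what matches.
old_match = re.search(wrap(r"(.)\2", capturing=True), "aa")

# Patched wrapper: (.) keeps number 1, so \1 works as the user wrote it.
new_match = re.search(wrap(r"(.)\1", capturing=False), "aa")

# Old wrapper with the user's \1: it points at the wrapper itself, which is
# still open at that point. Python refuses to compile such a pattern.
try:
    re.compile(wrap(r"(.)\1", capturing=True))
    old_compiles = True
except re.error:
    old_compiles = False
```

With the non-capturing wrapper the user's group numbers are left untouched,
which is the whole point of the two-line change.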
--- End Message ---
--- Begin Message ---
Subject: Re: bug#16865: grep -wP and backreferences
Date: Tue, 25 Feb 2014 10:03:28 -0800
On Tue, Feb 25, 2014 at 8:08 AM, Stephane Chazelas
<address@hidden> wrote:
> 2014-02-24 20:55:42 -0800, Jim Meyering:
>> On Mon, Feb 24, 2014 at 1:20 PM, Stephane Chazelas
>> <address@hidden> wrote:
>> > A last note: with -w, pcregrep wraps the regexp in \b...\b
>> > instead of \b(?:...)\b, so it could be that those brackets are
>> > not necessary in the first place.
>
> The brackets are actually needed in cases like:
>
> grep -Pw 'foo|bar'
>
> (pcregrep has a bug there).
>
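
The alternation case can be sketched as follows, again with Python's re
module standing in for PCRE (an assumption; both give | the lowest
precedence). Without a group, \bfoo|bar\b parses as "\bfoo" or "bar\b", so
each alternative gets only one of the two boundaries.

```python
import re

bare = r"\bfoo|bar\b"          # parsed as (\bfoo) | (bar\b)
grouped = r"\b(?:foo|bar)\b"   # both boundaries apply to every alternative

line = "foobar"                # neither foo nor bar appears as a whole word

bare_hit = re.search(bare, line)        # matches the leading "foo"
grouped_hit = re.search(grouped, line)  # no match
```

The ungrouped form reports a match on a line that has no whole-word foo or
bar, which is why some form of grouping has to stay.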
>
>> > Maybe instead of \b(?:...)\b, we could use (?<!\w)...(?!\w)
>> >
>> > $ echo a%%b | grep -P '(?<!\w)%%(?!\w)'
>> > $ echo %aa% | grep -P '(?<!\w)aa(?!\w)'
>> > %aa%
>>
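
A small sketch of how the two wrappers differ, using Python's re module as a
stand-in for PCRE (an assumption; both support \b and these lookarounds).
The helper names are hypothetical, and the sketch assumes the non-capturing
group is kept inside the lookarounds.

```python
import re

def with_boundaries(pattern):
    # Hypothetical helper: the \b(?:...)\b style of wrapping.
    return r"\b(?:" + pattern + r")\b"

def with_lookarounds(pattern):
    # Hypothetical helper: the (?<!\w)...(?!\w) style proposed above.
    return r"(?<!\w)(?:" + pattern + r")(?!\w)"

# \b only asks for a word/non-word transition, and a|% and %|b both qualify,
# so %% is accepted even though it is glued to word characters.
boundary_hit = re.search(with_boundaries("%%"), "a%%b")

# The lookarounds ask that no word character touches the match on either side.
lookaround_miss = re.search(with_lookarounds("%%"), "a%%b")
lookaround_hit = re.search(with_lookarounds("aa"), "%aa%")
```

The two forms agree when the match starts and ends with word characters and
diverge only when it starts or ends with a non-word character, as in the
a%%b example.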
>> I like both suggestions. Making -wP work like grep's -w makes perfect sense.
>> Care to prepare a patch to make it do that, with a separate test case?
>> "git format-patch ..." output preferred, if you're game.
>>
>> I pushed the above patch, but would welcome another one.
>
> Please find the patch attached.
Thank you very much. Nearly perfect.
I've uncapitalized the 1-line summary, changed a "That" to "This"
in the log, added examples to NEWS, and added an empty
line to restore the 2-empty-line section delimiter.
> (note that tests/word-delim-multibyte fails for me, but it's not
> my doing, it was failing before).
That's an XFAIL test (as noted in tests/Makefile.am), hence expected
to fail; as long as it fails as expected, "make check" can still succeed.
I've closed this ticket, and will push once you ack these changes.
Attachment: 0001-align-grep-Pw-with-grep-w.patch
--- End Message ---