emacs-bug-tracker

bug#51458: closed (grep PCRE - mean)


From: GNU bug Tracking System
Subject: bug#51458: closed (grep PCRE - mean)
Date: Tue, 09 Nov 2021 18:06:02 +0000

Your message dated Tue, 9 Nov 2021 10:05:31 -0800
with message-id <4791ff36-8afe-5aad-f4b3-c02b5948acc1@cs.ucla.edu>
and subject line Re: bug#51458: grep PCRE - '^' and '$' are not recognized as 
begin and end of line for multiline strings
has caused the debbugs.gnu.org bug report #51458,
regarding grep PCRE - mean
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
51458: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=51458
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: grep PCRE - mean
Date: Thu, 28 Oct 2021 08:23:08 +0000

Hello Grep Team,

While updating grep from version 2.20 to 3.1, I noticed that grep with the -P option no longer matches the regular expression below:

 

cat SomeTestFile.cpp | sed -r -e 's:\/(\*([^*]|\*[^\/])*[*]\/|\/.*)::g' -e 's:\"[^"]*\"::g' |

grep -ozPLq '\A(?:\s*^(?:#\w+.*\s*|extern\s+.+)$)*+(?<namespace>\s*namespace(?:\s+ utTestNamespace \s*(?>(?<block>{(?:[^{}]*(?&block)*)*}))|(\s*[\w:]*\s*{)(?&namespace)\s*}))\s*\z'; echo "retcode $?"

 

Content of file SomeTestFile.cpp:

#include <memory>
#include <vector>
#include <gtest/gtest.h>

namespace utTestNamespace
{
using ::testing::NiceMock;
# some code here
}
//end of file

I checked the regular expression on regex101.com and found that it works there for both PCRE and PCRE2, but it stopped working in grep 3.1 and later versions (versions between 2.20 and 3.1 were not checked).

See link:

https://regex101.com/r/9NwluI/1/

 

Investigation shows that grep 3.1, and the later versions 3.6 and 3.7, handle "^" and "$" differently from 2.20 when the -P option is used.

It looks like "^" does not match at the beginning of every line and "$" does not match at the end of every line.

It seems that "^" is treated as the beginning of the whole test string, not of each line.

"$" appears to match only at the end of the whole test string, not at the end of each line.
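
The anchor behaviour described above is what PCRE-style engines do when multiline mode is off. As a minimal sketch, the following uses Python's `re` module as a stand-in for grep -P (an assumption made purely for illustration; the sample text is hypothetical and only loosely modelled on SomeTestFile.cpp). It shows the same kind of anchored pattern failing without multiline mode and matching with it. Whether a missing multiline mode is the actual cause in grep 3.1 is not confirmed in this report.

```python
import re

# Hypothetical sample text, loosely modelled on the start of SomeTestFile.cpp.
text = "#include <memory>\n#include <vector>\nnamespace utTestNamespace\n"

# A whole preprocessor line, anchored at both ends.
pattern = r"^#\w+.*$"

# Default mode: '^' matches only at the start of the string and
# '$' only at the very end (or just before a final newline).
single = re.findall(pattern, text)

# Multiline mode: '^' and '$' also match at every line boundary.
multi = re.findall(pattern, text, flags=re.MULTILINE)

# The same mode requested inline at the start of the pattern.
inline = re.findall(r"(?m)" + pattern, text)

print("default:  ", single)
print("multiline:", multi)
print("inline:   ", inline)
```

PCRE pattern syntax also accepts the inline form `(?m)` at the start of a pattern. Whether prefixing the grep -zP pattern with it restores the 2.20 behaviour is an assumption to verify against the installed grep version.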

 

I would like to ask whether this is intended behavior or an issue in grep.

 

Useful commands for testing:

cat SomeTestFile.cpp | sed -r -e 's:\/(\*([^*]|\*[^\/])*[*]\/|\/.*)::g' -e 's:\"[^"]*\"::g' | grep -zP '(?:\s*^(?:\#\w+.*\s*|extern\s+.+)$)*+'

cat SomeTestFile.cpp | sed -r -e 's:\/(\*([^*]|\*[^\/])*[*]\/|\/.*)::g' -e 's:\"[^"]*\"::g' | grep -zP '(?:\s*^(?:\#\w+.*\s*|extern\s+.+)\s*)*+'

 

 

Best Regards,

Sławek

 


--- End Message ---
--- Begin Message ---
Subject: Re: bug#51458: grep PCRE - '^' and '$' are not recognized as begin and end of line for multiline strings
Date: Tue, 9 Nov 2021 10:05:31 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.2.1

On 11/8/21 22:48, Skrzyniarz, Slawomir (Nokia - PL/Krakow) wrote:
> Solve my issue.

Thanks for letting us know; closing the bug report.


--- End Message ---
