bug-grep

GNU grep back references


From: Jan Schampera
Subject: GNU grep back references
Date: Mon, 10 Oct 2005 07:12:13 +0200

Hello there,

Grep seems to forget back references to \(\)-expressions whenever it
reads a new line of input.
I'm sure this is useful sometimes, but it's not the behaviour I'd
expect from reading IEEE Std 1003.1, 2004 Edition (I'm sure the same
wording appeared earlier, perhaps in the basic POSIX documents too, but
I can't find it right now):

"The back-reference expression '\n' shall match the same (possibly
empty) string of characters as was matched by a subexpression enclosed
between "\(" and "\)" preceding the '\n'." [IEEE1003.1-BRE]

The SVR4 grep utility (usually /usr/xpg4/bin/grep) acts as I expected:
it "remembers" the first \(\)-expression for its back reference,
regardless of how much input it reads.

You can see it, for example, when grep is given a pattern like
"\([[:digit:]]\)\{2\}\1"
GNU grep matches every line of the following form (letters stand for
digits): "ABAB", "CDCD", "EFEF".
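
A minimal sketch of the per-line behaviour described above, written in
Python rather than grep. It assumes the intended pattern is a two-digit
group followed by \1, i.e. (\d{2})\1 in Python syntax, since that is what
produces the "ABAB" shape. In the BRE as quoted, the interval \{2\}
applies to the whole group, and POSIX 9.3.6 says a back-reference to a
repeated group matches its last iteration. The function name and sample
lines are illustrative only.

```python
import re

# Python analogue of a two-digit group followed by a back-reference.
# Assumption: this mirrors the intent of the pattern in the mail,
# not its exact BRE spelling.
PATTERN = re.compile(r"(\d{2})\1")

def grep_per_line(lines):
    """Return matching lines; group 1 is bound afresh on every line."""
    return [line for line in lines if PATTERN.search(line)]

sample = ["1212", "3434", "1234", "5656"]
print(grep_per_line(sample))  # ['1212', '3434', '5656']
```

Each line is matched independently, so "3434" and "5656" match even
though their captured text differs from that of "1212".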

The grep from the xpg4 package matches only the lines that contain (via
the back reference) the literal content of the very first
\(\)-expression match.

Summary:
GNU grep "forgets" back references on every new line of input
xpg4 grep "remembers" the very first matched content over all its input
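
The "remembering" behaviour can be emulated roughly as follows. This is
only one possible reading of the observation above, not how any real grep
is implemented: the hypothetical function grep_sticky keeps the text
captured on the first matching line and afterwards accepts only lines
containing that text twice in a row. The Python pattern (\d{2})\1 is an
assumed analogue of a two-digit group followed by \1.

```python
import re

# Assumed analogue of a two-digit group followed by a back-reference.
GROUP = re.compile(r"(\d{2})\1")

def grep_sticky(lines):
    """Hypothetical emulation: reuse the first captured text for all later lines."""
    remembered = None  # text captured by the first successful match
    matched = []
    for line in lines:
        if remembered is None:
            m = GROUP.search(line)
            if m:
                remembered = m.group(1)
                matched.append(line)
        elif remembered * 2 in line:
            # Later lines must contain the remembered text, doubled.
            matched.append(line)
    return matched

print(grep_sticky(["1212", "3434", "x1212y", "5656"]))  # ['1212', 'x1212y']
```

Under this reading "3434" and "5656" are rejected, because the capture
stays fixed at "12" once the first line has matched.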

Any comments on this behaviour?

Best regards and thanks for your work,
Jan

[IEEE1003.1-BRE]
The Open Group Base Specifications Issue 6
IEEE Std 1003.1, 2004 Edition
Chapter 9.3.6, Paragraph 3

-- 
dreaming in digital
living in realtime
thinking in binary
talking in IP

WELCOME TO OUR WORLD




