bug-gnu-utils

Re: grep bug ??


From: Bob Proulx
Subject: Re: grep bug ??
Date: Sun, 9 Mar 2003 10:44:19 -0700
User-agent: Mutt/1.3.28i

josX wrote:
> -FILE-
> `a
> ``a
>  `a
>  ``a
> -ENDFILE-
> % egrep '[^`]`a' FILE
>  `a

Thank you for your report.  With enough eyes all bugs are transparent.

First let me start small and work up to your problem.  The character
class [^`] means match any single character which is not a '`'
character.  The [`] class would, of course, match a '`' character, and
the '^' at the start of the class inverts it to mean the opposite.

Therefore [^`] _must_ match exactly one character; it cannot match
nothing.  The first line of your file, which you show later as wanting
to match, begins with the '`' itself, so there is no preceding
character for [^`] to match.
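
To see this against the sample data, here is a small sketch that
recreates the four lines from the report and runs the original pattern.
The helper name "sample" and the variable "original" are only for
illustration; they are not part of the report.

```shell
# Recreate the four sample lines from the report.
sample() {
  printf '%s\n' '`a' '``a' ' `a' ' ``a'
}

# [^`] has to consume one character that is not a backquote, so a line
# that starts with `a has nothing in front of the backquote to match.
original=$(sample | grep -E '[^`]`a')
printf '%s\n' "$original"
```

Only the line with a leading space should be printed, since that
space is what [^`] matches.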

> I've worked around this by doing:
> % egrep '[^`]`a|^`a' FILE
> `a
>  `a

That's fine.  However, I would not repeat the bulk of the pattern.
Let me suggest that it would be better to say directly what you want.
You appear to want to match your pattern when it is preceded either by
the beginning of the line or by a character which is not a '`'
character.  (And since I am a fan of 'grep -E' instead of 'egrep' I
will show it that way.  They are equivalent.)

  grep -E '(^|[^`])`a' FILE
  `a
   `a

However, my own thoughts work on the problem this way.  You appear to
want to ignore leading whitespace and then match your pattern.  Is
this really what you are looking for?

  grep '^ *`a' FILE
  `a
   `a

But I can go one better than ' *' by using the [:space:] character
class.

  grep '^[[:space:]]*`a' FILE
  `a
   `a

I expect the [[:space:]] will throw some people, especially the number
of brackets, so let me explain that construct in a little detail.
Saying '[[:space:]]*' is almost the same as saying ' *'.  The outside
brackets introduce a character class.  '[:space:]' is the name of a
predefined character class meaning whitespace: space, tab, and any
other whitespace defined by your locale, so it also supports non-ASCII
character sets.  It is a more general way of saying ' '.  By using a
character class '[]' and putting the name '[:space:]' inside it you
get '[[:space:]]', which matches any single whitespace character.
There are several predefined character classes.  You will find them
listed in the documentation.  It is often a good idea to use them in
your patterns.
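
The practical difference shows up with a line that is indented by a
tab rather than a space.  The sketch below assumes such a line (it is
not in the original FILE) and counts matches with each pattern; the
variable names are only for illustration.

```shell
# A line indented with a tab instead of a space (assumed example input).
tabbed=$(printf '\t`a')

# grep -c prints the number of matching lines.  The "|| true" keeps a
# zero count, which grep reports with a non-zero exit status, from
# being treated as a failure by the calling shell.
space_only=$(printf '%s\n' "$tabbed" | grep -c '^ *`a' || true)
posix_class=$(printf '%s\n' "$tabbed" | grep -c '^[[:space:]]*`a' || true)

echo "' *' matched $space_only line(s)"
echo "'[[:space:]]*' matched $posix_class line(s)"
```

The ' *' pattern only skips over spaces and so misses the tab-indented
line, while '[[:space:]]*' accepts it.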

Hope that helps.

Bob



