Re: [bug-gawk] (regex) nested complemented character class list
From: Stepan Kasal
Subject: Re: [bug-gawk] (regex) nested complemented character class list
Date: Fri, 28 Mar 2003 21:30:25 +0100
User-agent: Mutt/1.2.5.1i
Hello,
thank you for caring about GNU awk.
On Fri, Mar 28, 2003 at 12:46:00AM -0500, Aaron S. Hawley wrote:
> known regex bug(s)?
As far as I know, char lists cannot be nested.
A character class (e.g. `[:alpha:]') is syntactically treated like a
single letter, and must appear inside the brackets of a char list.
So /^[[:alpha:][:digit:]]$/ is roughly equivalent to
/^[a-zA-Z0-9]$/ but doesn't contain any ``nested char lists''
or ``nested char classes.''
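A small sketch of that equivalence, for anyone who wants to try it.
The function name `classify' is made up for this illustration, and the
example assumes an awk that supports POSIX character classes (gawk does)
and the C locale, where [a-zA-Z0-9] and the two classes agree:

```shell
# Hypothetical helper: prints the input followed by two flags (1 = match).
# First flag: char list built from character classes.
# Second flag: the roughly equivalent char list built from ranges.
classify() {
  printf '%s\n' "$1" | LC_ALL=C awk '{
    with_classes = ($0 ~ /^[[:alpha:][:digit:]]$/)
    with_ranges  = ($0 ~ /^[a-zA-Z0-9]$/)
    print $0, with_classes, with_ranges
  }'
}

classify a    # a 1 1
classify 7    # 7 1 1
classify _    # _ 0 0
classify ab   # ab 0 0  (the list matches exactly one character)
```

Both regexes contain a single flat char list; nothing is nested.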
> %> gawk --version
> GNU Awk 3.1.2
>
> %> echo 'foobar' | gawk '{ sub( /[[^a]r]*$/, ""); print; }'
> foob
The regex means: "[a^[]" (one of letter `a', `^' or `['), followed by "r",
followed by "]*" (zero or more `]') and end-of-line.
Observe:
$ echo 'foobar]]]' | ./gawk '{ sub( /[[^a]r]*$/, ""); print; }'
foob
>
> %> echo 'foobar' | gawk '{ sub( /[r[^a]]*$/, ""); print; }'
> fooba
Again, this is the list [ar[^] (one of `a', `r', `[' or `^'), followed by
optional ]'s and end-of-line.
This explains the following example:
> %> echo 'foobarr' | gawk '{ sub(/[r[^a]]*$/, ""); print; }'
> foobar
> %> echo 'foo://food/)' | gawk '{ sub( /[[^/][:punct:]]*$/, ""); print; }'
Again, no nesting is possible, but the [: and :] which delimit the
char class inside the char list are different from [ and ] as char list
delimiters.
Correct syntax for ``punctuation or A'' would be /[A[:punct:]]/, not
/[[A][:punct:]]/.
I'm afraid there is no easy way to express ``punctuation but not /''
which is what you probably wanted.
(Of course there are ways to achieve the desired substitution in awk,
but I realize that is not the topic of this mail.)
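For completeness, one possible workaround is sketched below. It is not
from the original report: it assumes the goal was to strip trailing
punctuation other than `/', and the function name `strip_trailing_punct'
is invented for the example. Instead of a single regex, it tests the last
character in a loop, again assuming an awk with POSIX character classes:

```shell
# Hypothetical helper: remove trailing punctuation, but stop at a `/'.
strip_trailing_punct() {
  printf '%s\n' "$1" | LC_ALL=C awk '{
    s = $0
    while (length(s) > 0) {
      c = substr(s, length(s), 1)             # last character
      # Stop at the first non-punctuation character, or at a slash.
      if (c !~ /[[:punct:]]/ || c == "/") break
      s = substr(s, 1, length(s) - 1)         # drop the last character
    }
    print s
  }'
}

strip_trailing_punct 'foo://food/)'   # foo://food/
```

The loop trades brevity for clarity; the ``but not /'' condition becomes
an ordinary string comparison rather than part of the char list.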
Hope this clarifies it,
Stepan Kasal