bug-gnu-utils

Re: [bug-gawk] (regex) nested complemented character class list


From: Stepan Kasal
Subject: Re: [bug-gawk] (regex) nested complemented character class list
Date: Fri, 28 Mar 2003 21:30:25 +0100
User-agent: Mutt/1.2.5.1i

Hello,
        thank you for caring about GNU awk.

On Fri, Mar 28, 2003 at 12:46:00AM -0500, Aaron S. Hawley wrote:
> known regex bug(s)?

As far as I know, char lists cannot be nested.
A character class (e.g. `[:alpha:]') is syntactically like a single
letter, and it must appear inside the brackets of a char list.
So /^[[:alpha:][:digit:]]$/ is roughly equivalent to
/^[a-zA-Z0-9]$/ but doesn't contain any ``nested char lists''
or ``nested char classes.''
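
A small sketch of the point above, assuming an awk that supports POSIX
character classes (gawk does). The helper name `classify' is made up for
illustration only:

```shell
# classify: hypothetical helper, not part of the original report.
# Reports whether the whole input is exactly one letter or one digit.
classify() {
  printf '%s\n' "$1" |
    awk '{ print (/^[[:alpha:][:digit:]]$/ ? "match" : "no match") }'
}

classify a    # match
classify 7    # match
classify _    # no match
classify ab   # no match (two characters, but the char list matches one)
```

The two classes sit side by side in a single char list; nothing is nested.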

> %> gawk --version
> GNU Awk 3.1.2
> 
> %> echo 'foobar' | gawk '{ sub( /[[^a]r]*$/, ""); print; }'
> foob

The regex means: the char list "[[^a]", which is the same as "[a^[]"
(letter `a', `^' or `['), followed by "r", followed by "]*" (zero or
more `]') and end-of-line.

Observe:
$ echo 'foobar]]]' | ./gawk '{ sub( /[[^a]r]*$/, ""); print; }'
foob
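
To check that reading, here is a sketch comparing the regex as written
with a spelled-out version in which the trailing `]' is escaped. The
function names are invented for this example, and plain `awk' is assumed
to behave like gawk here:

```shell
# Hypothetical helpers: the regex as reported, and the same regex
# written so that its three parts are easier to see.
as_written()  { printf '%s\n' "$1" | awk '{ sub(/[[^a]r]*$/, ""); print }'; }
spelled_out() { printf '%s\n' "$1" | awk '{ sub(/[a^[]r\]*$/, ""); print }'; }

as_written  'foobar'      # foob
spelled_out 'foobar'      # foob
as_written  'foobar]]]'   # foob
spelled_out 'foobar]]]'   # foob
as_written  'x^r]'        # x
spelled_out 'x^r]'        # x
```

Both forms remove the same text, which is what one would expect if the
first `]' closes the char list instead of closing a nested one.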

> 
> %> echo 'foobar' | gawk '{ sub( /[r[^a]]*$/, ""); print; }'
> fooba

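Again: the char list [ar[^] (one of `a', `r', `[' or `^'), followed by
optional ]'s and end-of-line. With no `]' in the input, only the final
character can match, so just the last `r' is removed.
This explains the following example: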

> %> echo 'foobarr' | gawk '{ sub(/[r[^a]]*$/, ""); print; }'
> foobar
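
The same kind of comparison for this second regex, again only as a
sketch with made-up function names and assuming plain `awk' behaves
like gawk here:

```shell
# Hypothetical helpers: the regex as reported, and an equivalent
# spelling with the char list and the optional ]'s separated visually.
as_written()  { printf '%s\n' "$1" | awk '{ sub(/[r[^a]]*$/, ""); print }'; }
spelled_out() { printf '%s\n' "$1" | awk '{ sub(/[ar[^]\]*$/, ""); print }'; }

as_written  'foobar'    # fooba
spelled_out 'foobar'    # fooba
as_written  'foobarr'   # foobar
spelled_out 'foobarr'   # foobar
as_written  'fooba]]'   # foob
spelled_out 'fooba]]'   # foob
```

Only one character from the list is consumed each time, plus any `]'
characters that happen to follow it.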

> %> echo 'foo://food/)' | gawk '{ sub( /[[^/][:punct:]]*$/, ""); print; }'

Again, no nesting is possible, but the [: and :] which delimit the
char class inside the char list are different from [ and ] as char list
delimiters.

Correct syntax for ``punctuation or A'' would be /[A[:punct:]]/, not
/[[A][:punct:]]/.
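
A brief illustration of the correct form only, assuming POSIX character
class support in awk; `strip_punct_or_A' is a name invented for this
example:

```shell
# strip_punct_or_A: hypothetical helper that deletes every character
# that is either the letter A or a punctuation character.
strip_punct_or_A() {
  printf '%s\n' "$1" | awk '{ gsub(/[A[:punct:]]/, ""); print }'
}

strip_punct_or_A 'A-b!c/d'   # bcd
strip_punct_or_A 'abc'       # abc
```

The literal `A' and the class [:punct:] are simply listed next to each
other inside one pair of brackets.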

I'm afraid there is no easy way to express ``punctuation but not /''
which is what you probably wanted.

(Of course there are ways to achieve the desired substitution in awk,
but I realize that is not the topic of this mail.)
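
For completeness, one possible sketch of such a workaround, assuming the
goal really was to strip trailing punctuation other than `/'. The
function name `strip_tail' and the loop approach are my own choices for
illustration, not something from the original report:

```shell
# strip_tail: hypothetical helper. Removes trailing punctuation
# characters one at a time, stopping at a `/' or at any
# non-punctuation character.
strip_tail() {
  printf '%s\n' "$1" | awk '{
    s = $0
    while (length(s) > 0) {
      last = substr(s, length(s), 1)
      if (last == "/" || last !~ /[[:punct:]]/) break
      s = substr(s, 1, length(s) - 1)
    }
    print s
  }'
}

strip_tail 'foo://food/)'   # foo://food/
strip_tail 'bar!?.'         # bar
```

The test for `/' is done outside the regex, which avoids having to
express ``punctuation but not /'' as a single char list.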

Hope this clarifies it,
        Stepan Kasal



