Re: [bug-gawk] 4.0 beta1, character lists broken
From: Aharon Robbins
Subject: Re: [bug-gawk] 4.0 beta1, character lists broken
Date: Sun, 29 May 2011 22:46:05 +0300
User-agent: Heirloom mailx 12.4 7/29/08
Hi. Thanks for the bug report.
> Date: Fri, 27 May 2011 16:45:33 +0200
> From: Juergen Daubert <address@hidden>
> To: address@hidden
>
> Hello,
>
> first a big thanks for the new gawk with lots of nice new features.
>
> I've done some first tests, it looks like the handling of character
> lists is partially broken with the new 4.0 beta:
>
> $:~> echo 'a' | awk '/[\001-\177]/'
> awk: cmd. line:1: fatal: add_char: *bufp: can't allocate -1879052298 bytes of
> memory (Cannot allocate memory)
This is a bug. See the patch below, which will shortly be in the
git repo.
> $:~> echo 'a' | awk '/[\134]/'
> awk: cmd. line:1: error: Unmatched [ or [^: /[\]/
This is not a bug:
$ echo 'a' | gawk-3.1.8 '/[\134]/'
gawk-3.1.8: fatal: Unmatched [ or [^: /[\]/
Octal 134 is a backslash, and thus the diagnostic is correct; there is
no closing ] character.
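To illustrate the point, here is a minimal sketch in C of what happens to the pattern text once the octal escape is turned into the byte it names. The function name decode_octal_escapes is hypothetical and is not taken from the gawk sources; it only shows that [\134] ends up as [\], which is the form printed in the diagnostic above.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not gawk code): copy src to dst, replacing each
 * \ooo octal escape (1 to 3 octal digits) with the single byte it names.
 * Any other backslash pair is copied through unchanged.
 * dst must have room for at least strlen(src) + 1 bytes.
 */
void decode_octal_escapes(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] >= '0' && src[1] <= '7') {
            int value = 0;
            int digits = 0;

            src++;  /* skip the backslash */
            while (digits < 3 && *src >= '0' && *src <= '7') {
                value = value * 8 + (*src - '0');
                src++;
                digits++;
            }
            *dst++ = (char) value;
        } else if (src[0] == '\\' && src[1] != '\0') {
            /* some other escape: keep both characters as they are */
            *dst++ = *src++;
            *dst++ = *src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Feeding it the six characters [\134] yields the three characters [\], where the backslash now escapes the ] and leaves the opening [ without a partner.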
> $:~> echo '\' | awk '/[\001-\176]/'
> $:~>
Fixed, now.
> All of the above works with gawk 3.1.8 as expected.
Well, except for the case above with \134.
> This is on an
> almost up-to-date Linux system with glibc 2.12.2 and gcc 4.5.3.
>
> thanks
> Juergen
Thanks for the report. Here is a patch, which fixes an additional
problem reported by John Haque.
diff --git a/re.c b/re.c
index 691955f..b317b09 100644
--- a/re.c
+++ b/re.c
@@ -643,6 +643,7 @@ add_char(char **bufp, size_t *lenp, char ch, char **ptr)
erealloc(*bufp, char *, newlen + 2, "add_char");
*ptr = *bufp + offset;
**ptr = ch;
+ *lenp = newlen + 2;
(*ptr)++;
}
@@ -714,7 +715,7 @@ again:
/* inside [...] but not inside [[:...:]] */
if (*sp == '-') {
int start, end;
- char i;
+ int i;
if (sp[1] == ']') { /* also literal */
copy();
@@ -728,8 +729,18 @@ again:
len--;
}
end = sp[1];
- for (i = start + 1; i <= end; i++)
+ if (end < start)
+ fatal(_("Invalid range end: /%.*s/"),
+ *lenp, s);
+ for (i = start + 1; i < end; i++) {
+ /*
+ * Will the special cases never end?
+ */
+ if (i == '\\' || i == ']') {
+ copych('\\');
+ }
copych(i);
+ }
sp++;
len--;
continue;
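
For readers who want to try the range logic outside of gawk, here is a small stand-alone sketch in C that mirrors the loop in the second and third hunks. The function name expand_range_interior and the flat output buffer are assumptions made for illustration; the real code uses copych() and a growing buffer. The counter is an int for the same reason as in the patch: on a system where char is a signed 8-bit type, a char counter can presumably never exceed an end value of \177, so a loop with the old `i <= end` test would not terminate.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the patched loop (not gawk code): write every
 * character strictly between start and end into out, putting a backslash
 * in front of '\\' and ']' so they stay literal inside a bracket list.
 * Returns the number of bytes written, or -1 if end < start.
 * out must have room for 2 * (end - start) bytes in the worst case.
 */
int expand_range_interior(int start, int end, char *out)
{
    int i;      /* int, not char: it can hold values past the end point */
    int n = 0;

    if (end < start)
        return -1;  /* the patch calls fatal() at this point */

    for (i = start + 1; i < end; i++) {
        if (i == '\\' || i == ']')
            out[n++] = '\\';
        out[n++] = (char) i;
    }
    return n;
}
```

Calling it with start 'Z' and end '^' covers '[', '\' and ']', so both escape cases are exercised; calling it with 001 and 177 covers the range from the original report.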