[bug-gawk] [BugReport] GAWK 4.1.1 memory leak
From: Anurag Dubey
Subject: [bug-gawk] [BugReport] GAWK 4.1.1 memory leak
Date: Mon, 22 Dec 2014 04:43:22 +0530
Hey,
When I pipe a file into awk and count occurrences of a particular
pattern, the awk process grows to more than 8-10 GB of memory, even
though the input file is less than 50 MB of text. Here is the code:
sudo zcat $fullpath | sudo awk '
BEGIN { FS="@#@" }
NF != 24 {
    err = sprintf("%s:%d: skipped: NF != 24\n", "'"$filename"'", FNR);
    print err > "/var/log/error.log";
    next
}
{
    split($6,a," ");
    date=a[1];
    hour=substr(a[2],1,2);
    cnt[date"@#@"hour"@#@"$17"@#@"$16"@#@"$22"@#@"$18"@#@"$5"@#@"$23"@#@"$24"@#@"$19"@#@"$4]++;
    sum[date"@#@"hour"@#@"$17"@#@"$16"@#@"$22"@#@"$18"@#@"$5"@#@"$23"@#@"$24"@#@"$19"@#@"$4] += $14
}
END {
    for ( key in cnt )
        { print key"~`~"cnt[key]"~`~"sum[key]; }
}
' > $dirname"rts/"$filename.temp
Here, cnt counts the occurrences of each key and sum adds up the value
of column 14 for the same key. When the script runs, memory usage
starts at a few hundred MB and later grows to more than 10 GB.
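To illustrate the counting and summing described above, here is a reduced sketch. It assumes a hypothetical three-field input (id, timestamp, value) instead of the real 24-field records. The function name aggregate and the sample rows are made up for illustration and are not part of the original script.

```shell
# Reduced sketch of the aggregation: group by date and hour, then
# count the records and sum a numeric column for each group.
# The three-field layout is an assumption for illustration only.
aggregate() {
    awk '
    BEGIN { FS = "@#@" }
    {
        split($2, a, " ")                  # a[1] = date, a[2] = time
        key = a[1] FS substr(a[2], 1, 2)   # date plus two-digit hour
        cnt[key]++                         # occurrences per key
        sum[key] += $3                     # running total per key
    }
    END {
        for (key in cnt)
            print key "~`~" cnt[key] "~`~" sum[key]
    }'
}

# Hypothetical sample rows; sort only makes the output order stable,
# since "for (key in cnt)" visits keys in an unspecified order.
printf '%s\n' \
    'a@#@2014-12-21 04:10:00@#@5' \
    'b@#@2014-12-21 04:50:00@#@7' \
    'c@#@2014-12-21 05:01:00@#@1' | aggregate | sort
```

Each distinct key adds one entry to both arrays, so the number of array elements follows the number of distinct keys in the input, not the number of lines.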
The OS is CentOS 5 and the awk version is GNU Awk 4.1.1, API: 1.1.
All earlier awk versions also show the same problem.
Regards
Anurag Dubey