
[bug-gawk] [BugReport] GAWK 4.1.1 memory leak


From: Anurag Dubey
Subject: [bug-gawk] [BugReport] GAWK 4.1.1 memory leak
Date: Mon, 22 Dec 2014 04:43:22 +0530

Hey,

While piping a file into awk to count occurrences of a particular
pattern, my awk process grows to more than 8-10GB of memory, even
though the input file is less than 50MB of text. Here is the code:

sudo zcat "$fullpath" | sudo awk -v filename="$filename" '
BEGIN { FS = "@#@" }

# Log and skip records that do not have exactly 24 fields.
NF != 24 {
        err = sprintf("%s:%d: skipped: NF != 24\n", filename, FNR)
        print err > "/var/log/error.log"
        next
}

{
        # Split $6 on spaces: a[1] is used as the date, and the
        # first two characters of a[2] as the hour.
        split($6, a, " ")
        date = a[1]
        hour = substr(a[2], 1, 2)

        # Build the grouping key once (FS is "@#@") and use it for both arrays.
        k = date FS hour FS $17 FS $16 FS $22 FS $18 FS $5 FS $23 FS $24 FS $19 FS $4
        cnt[k]++
        sum[k] += $14
}

END {
        for (key in cnt)
                print key "~`~" cnt[key] "~`~" sum[key]
}
' > "${dirname}rts/${filename}.temp"

Here, cnt counts the records seen for each key, and sum adds up the
values of column 14 for the same key. When I run the script, memory
usage starts at a few hundred MB and later grows to more than 10GB.
The OS is CentOS 5 and the awk version is GNU Awk 4.1.1, API: 1.1.
All earlier awk versions also show the same problem.
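
For reference, the grouping pattern itself can be reduced to a few
lines. The following is only a minimal sketch, assuming a made-up
three-field record instead of the real 24-field input; the aggregate
function name and the sample data are hypothetical and are not taken
from the real script. The output goes through sort only because the
order of for (key in cnt) is unspecified.

```shell
# Hypothetical helper: group on the first two fields, count the records
# per key, and sum the third field per key.
aggregate() {
  awk '
    BEGIN { FS = "@#@" }
    {
      key = $1 FS $2      # composite key, joined with the same separator
      cnt[key]++          # number of records seen for this key
      sum[key] += $3      # running total of the third field
    }
    END {
      for (key in cnt)
        print key "~`~" cnt[key] "~`~" sum[key]
    }
  ' | sort
}

# Made-up sample input: two records share a key, one does not.
printf '%s\n' 'a@#@x@#@1' 'a@#@x@#@2' 'b@#@y@#@5' | aggregate
```

With this sample the two arrays hold one entry per distinct key, so
there are two output lines: one for a@#@x with count 2 and sum 3, and
one for b@#@y with count 1 and sum 5.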

Regards
Anurag Dubey


