[bug-gawk] [BugReport] GAWK 4.1.1 memory leak

From:	Anurag Dubey
Subject:	[bug-gawk] [BugReport] GAWK 4.1.1 memory leak
Date:	Mon, 22 Dec 2014 04:43:22 +0530

Hey,

When I pipe a file into awk to count occurrences of a particular
pattern, the awk process ends up consuming 8-10GB of memory or more,
even though the input file is less than 50MB of text. Here is the
code:

sudo zcat $fullpath | sudo awk '
BEGIN { FS = "@#@" }

# Log and skip records that do not have exactly 24 fields.
NF != 24 {
    err = sprintf("%s:%d: skipped: NF != 24\n", "'"$filename"'", FNR)
    print err > "/var/log/error.log"
    next
}

{
    split($6, a, " ")
    date = a[1]
    hour = substr(a[2], 1, 2)
    key = date "@#@" hour "@#@" $17 "@#@" $16 "@#@" $22 "@#@" $18 "@#@" $5 "@#@" $23 "@#@" $24 "@#@" $19 "@#@" $4
    cnt[key]++
    sum[key] += $14
}

END {
    for (key in cnt)
        print key "~`~" cnt[key] "~`~" sum[key]
}
' > $dirname"rts/"$filename.temp

Here, cnt counts the occurrences of each key and sum adds up the
values of column 14 for that key. When the script runs, memory usage
starts at a few hundred MB and keeps growing, eventually exceeding
10GB.
The OS is CentOS 5 and the version is GNU Awk 4.1.1, API: 1.1.
Earlier awk versions show the same problem.
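
For anyone who wants to see the count-and-sum pattern without the real data, here is a minimal sketch of it. It is a reduced illustration, not the production script: it assumes 3 fields per record instead of 24, uses field 1 alone as the key and field 3 as the value to sum, and the function name and the sample rows are made up.

```shell
# Hypothetical, reduced version of the aggregation (3 fields, not 24).
# Field 1 is the grouping key, field 3 is the value that gets summed.
aggregate() {
    awk '
    BEGIN { FS = "@#@" }
    NF != 3 { next }            # skip malformed records
    {
        cnt[$1]++               # occurrences per key
        sum[$1] += $3           # running total per key
    }
    END {
        for (key in cnt)
            print key "~`~" cnt[key] "~`~" sum[key]
    }'
}

# Made-up sample input; sort is only there because the order of
# "for (key in cnt)" is unspecified in awk.
printf '%s\n' 'a@#@x@#@2' 'b@#@y@#@5' 'a@#@z@#@3' 'bad line' | aggregate | sort
```

Both arrays hold one entry per distinct key, so in this sketch their size depends on the number of distinct keys and not on the number of input lines.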

Regards
Anurag Dubey
