[bug-gawk] Apparently buggy associative array behaviour
From: Blaise LI
Subject: [bug-gawk] Apparently buggy associative array behaviour
Date: Tue, 1 Mar 2016 15:45:52 +0000
While making a histogram of one column in a large file, I came across a
case that looks like a bug in awk.
Counting with sort | uniq -c gives several values for the fifth column
of my file:
$ awk '{print $5}' awk_bug_test.txt | sort | uniq -c
60906306 0
6342558 1
16874518 3
74186425 50
But using an associative array within awk only reports the counts for
one of the values:
$ awk '{hist[$5]++} END {for (score in hist); print hist[score],score}' awk_bug_test.txt
74186425 50
Am I misusing awk, or is this really a bug?
I'm using awk version "GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU
MP 6.0.0)" on Debian.
The file is huge (53G), so I cannot attach it to this mail.
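Since the original file cannot be attached, here is a minimal sketch of a reproduction on a tiny input. The four sample rows and the helper name `sample` are made up for illustration and are not taken from awk_bug_test.txt. It runs the command exactly as written above next to a variant without the `;` after `for (score in hist)`. In awk, that `;` is an empty statement that becomes the whole loop body, which may well be what is happening here.

```shell
# Hypothetical five-column sample rows; only the fifth column matters.
sample() {
    printf '%s\n' 'a b c d 0' 'a b c d 50' 'a b c d 0' 'a b c d 3'
}

# Same END block as in the report: the ';' right after for (...) is an
# empty statement, so the loop body is empty and print runs only once,
# after the loop has finished.
buggy=$(sample | awk '{hist[$5]++} END {for (score in hist); print hist[score],score}')

# Without the ';', print is the loop body and runs once per key.
# The for-in traversal order is unspecified, so sort numerically on the
# second field to get stable output.
fixed=$(sample | awk '{hist[$5]++} END {for (score in hist) print hist[score],score}' | sort -k2,2n)

printf 'with semicolon:\n%s\n' "$buggy"
printf 'without semicolon:\n%s\n' "$fixed"
```

On this input the first command prints a single line, for whichever key the loop happened to visit last, while the second prints one line per distinct value, matching what sort | uniq -c reports.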