bug-coreutils

uniq Bug


From: Ryan Helinski
Subject: uniq Bug
Date: Tue, 27 Jun 2006 19:36:18 -0400
User-agent: Thunderbird 1.5.0.4 (Windows/20060516)

Hello,

I'm not sure whether this has already been reported, but I found a problem with uniq. If I sat down and looked at the code, I could probably see how to fix it. It seems to occur consistently with very large unsorted streams (files).

Below are the commands I ran to reproduce the bug (which I originally thought was my error). Sorting the stream before removing duplicate lines gives a different line count than removing duplicate lines alone:

address@hidden srv]# find ./ -printf "%i\n" -type f > ./srv_inodes.txt
address@hidden srv]# cat srv_inodes.txt | wc -l
65678
address@hidden srv]# cat srv_inodes.txt | uniq | wc -l
65488
address@hidden srv]# less srv_inodes.txt
address@hidden srv]# cat srv_inodes.txt | sort | uniq | wc -l
57046

Note that srv_inodes.txt as generated contains 65678 inode numbers, one per line. I've attached this file.
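
The same discrepancy can be seen on a much smaller input. The following is only a sketch: the five-line sample is made up for illustration (it is not taken from srv_inodes.txt), and its duplicate values are deliberately non-adjacent. It compares the line count from uniq alone with the count from sort followed by uniq:

```shell
# Hypothetical five-line sample (an assumption, not data from srv_inodes.txt).
# The repeated values 101 and 102 are deliberately not next to each other.
sample=$(printf '%s\n' 101 102 101 103 102)

# Count lines after uniq alone, as in the second pipeline of the transcript.
# tr strips the padding that some wc implementations add.
unsorted_count=$(printf '%s\n' "$sample" | uniq | wc -l | tr -d ' ')

# Count lines after sort followed by uniq, as in the last pipeline.
sorted_count=$(printf '%s\n' "$sample" | sort | uniq | wc -l | tr -d ' ')

echo "uniq only:   $unsorted_count"
echo "sort | uniq: $sorted_count"
```

On this sample, uniq alone leaves all 5 lines while sort | uniq leaves 3, which would be consistent with uniq comparing only adjacent lines, as its documentation describes.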

Let me know the status of this bug (or limitation),

Ryan Helinski

Attachment: srv_inodes.zip
Description: Zip compressed data

