bug#22357: grep -f not only huge memory usage, but also huge time cost


From: JQK
Subject: bug#22357: grep -f not only huge memory usage, but also huge time cost
Date: Fri, 11 Mar 2016 14:05:47 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 03/11/2016 01:26 AM, Jim Meyering wrote:
> On Thu, Mar 10, 2016 at 3:00 AM, JQK <address@hidden> wrote:
>> Consider the following situation:
>>
>> ===========
>> file1 has the numbers from 1 to 200000, one per line (200000 lines);
>> file2 has a few hundred lines (about 200-300) of random numbers in
>> the range 1-200000.
>> ===========
>>
>> The following command can take over 15 minutes to finish on Linux,
>> which seems excessive.
>>
>> $ grep -v -f file1 file2
>>
>> (FYI, on AIX the same command takes less than 1 second.)
>>
>> Maybe there is room for optimization not only in memory usage but
>> also in time cost.
> 
> What version of grep are you using?
> With the latest (grep-2.23), this takes
> less than 1.5 seconds on a Core i7-4770S-based system:
> 
>   $ env time grep -v -f <(seq 200000) <(shuf -i 1-200000 -n 250)
>   1.27user 0.16system 0:01.43elapsed 100%CPU (0avgtext+0avgdata 839448maxresident)k
>   0inputs+0outputs (0major+233108minor)pagefaults 0swaps
> 


Sorry, in my situation the grep command is slightly different:

# grep -w -f file1 file2
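
For reference, the test files can be generated with the same tools used
in the timing above; a minimal sketch, assuming file2 should hold about
250 random numbers (within the 200-300 range described):

  $ seq 200000 > file1
  $ shuf -i 1-200000 -n 250 > file2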

Even with the latest grep-2.23, it is still slow:

# env time grep -w -f <(seq 200000) <(shuf -i 1-200000 -n 250)
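
As a possible workaround (my own sketch, not something discussed in
this thread), a hash lookup in awk avoids grep's pattern matching
entirely and runs in roughly linear time. Since every line here is a
single number, whole-line equality is equivalent to grep -w; negate the
membership test with ! for the -v case:

  # NR==FNR is true only while reading the first file: load each of its
  # lines into the array "seen", then test each line of the second file
  $ time awk 'NR==FNR { seen[$0]; next } ($0 in seen)' \
        <(seq 200000) <(shuf -i 1-200000 -n 250)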

-- 
Junkui Quan (JQK)
www.redhat.com
