bug-grep

bug#32073: Improvements in Grep


From: Jim Meyering
Subject: bug#32073: Improvements in Grep
Date: Fri, 6 Jul 2018 17:33:08 -0700

On Fri, Jul 6, 2018 at 9:26 AM, Sergiu Hlihor <address@hidden> wrote:
> Hello,
>      I'm using grep on Ubuntu Server 14.04 (grep version 2.16). While
> grepping large files I've noticed that grep is painfully slow. The
> bottleneck seems to be the read block size, which is extremely small (it
> looks like 64KB). For large files on big HDD RAID arrays, such a request
> barely reaches one drive, and judging by CPU usage, grep is mostly idle.
> Based on my tests of such scenarios, a read block size of at least
> 512KB would be far more efficient. The optimum is very likely 1MB+.
> A larger buffer would also slightly benefit SSDs, where maximum
> sequential throughput is usually reached when reading with block sizes
> of 256KB+.
>      If this is already possible or configurable in newer versions, I'd
> appreciate pointers to the version that supports it, or to how I can
> configure grep to increase the read block size.

Thanks for raising the issue.
This makes me think we should follow Coreutils' lead[0] and increase
grep's initial buffer size from 32KiB, probably to 128KiB. I will time
with the attached diff on a few systems.
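
For anyone who wants to compare read sizes on their own hardware in the
meantime, here is a rough sketch. It is not grep's code and not the attached
diff; it is a standalone Python script, and the helper name `time_reads` and
the list of sizes are assumptions chosen for illustration. It reads a file
sequentially with unbuffered reads of a given size and reports the number of
reads and the elapsed time. The page cache will dominate on a second run over
the same file, so timings are only meaningful on a cold cache or on a file
much larger than RAM.

```python
import sys
import time


def time_reads(path, block_size):
    """Read `path` sequentially using unbuffered reads of `block_size` bytes.

    Returns (total_bytes, read_calls, elapsed_seconds).
    """
    total = 0
    calls = 0
    start = time.perf_counter()
    # buffering=0 yields a raw file object, so each read() issues
    # at most one read system call of the requested size.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes means end of file
                break
            total += len(chunk)
            calls += 1
    return total, calls, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical usage: python readbench.py FILE
    if len(sys.argv) == 2:
        for kib in (32, 64, 128, 512, 1024):
            total, calls, secs = time_reads(sys.argv[1], kib * 1024)
            print(f"{kib:5d} KiB: {calls} reads, {total} bytes, {secs:.3f} s")
```

This measures only raw read cost, not pattern matching, so it shows at best
where the I/O side stops improving as the block size grows.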

[0] https://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=v8.22-103-g74ca6e84c

Attachment: grep-bufsize-increase.diff
Description: Binary data

