bug#32073: Improvements in Grep (Bug#32073)

From: Sergiu Hlihor
Subject: bug#32073: Improvements in Grep (Bug#32073)
Date: Thu, 2 Jan 2020 02:03:58 +0100

Hi Jim,
The system where this hurts me the most runs Ubuntu 14.04, where I'd
need to run it as a separate binary. As I'm not familiar with the way it's
built, are there any guidelines on how to build it from source? I'd happily
build it with ever-larger block sizes and test.

On Thu, 2 Jan 2020 at 01:51, Jim Meyering <address@hidden> wrote:

> On Wed, Jan 1, 2020 at 12:04 PM Sergiu Hlihor <address@hidden> wrote:
> > Paul, I have to correct you. On a production server you have usually a
> > mix of applications many times including databases. For databases, having a
> > read ahead means one IO less since usually database access patterns are
> > random reads. Here actually best is to disable completely read ahead. In
> > fact, I do have to say that probably best is to disable completely read
> > ahead and let applications deal with it, either in an automatic fashion,
> > like reading the optimal IO block size from device or in a configurable
> > way with defaults good enough for today's servers. If you now configure the
> > OS to do a read ahead hitting all HDDs then you induce potentially
> > unnecessary IO load for all applications which use it, which when having
> > HDDs is totally unacceptable. That's why the best is to be application
> > specific and ideally configured to use optimal IO block size.
> >
> > So no, letting OS to do it is stupid.
> >
> > On Wed, 1 Jan 2020 at 20:42, Paul Eggert <address@hidden> wrote:
> >>
> >> On 1/1/20 1:15 AM, Sergiu Hlihor wrote:
> >> > If you rely on OS, then
> >> > you are at the mercy of whatever read ahead configuration you have.
> >>
> >> Right, and whatever changes you make to the OS and its read-ahead configuration
> >> will work for all applications, not just for 'grep'. So, change the OS to do
> >> that. There shouldn't be a need to change 'grep' in particular (or 'cp' in
> >> particular, or 'awk' in particular, etc.).
> >>
> >> > The issue of large
> >> > block sizes for IO operations is widespread across all tools from Linux,
> >> > like rsync or cp and its only getting worse
> >>
> >> Quite right. And it would be painful to have to modify all those tools, and to
> >> maintain those modifications. So modify the OS instead. Scheduling read-ahead is
> >> really the OS's job anyway.
>
> Hi Sergiu,
> If you would like to help make grep use larger buffer sizes, please
> run and report benchmarks measuring how much of a difference it would
> make, at least for your hardware. Here are some of the tests I ran to
> justify raising it from ~32k to ~96k:
> https://lists.gnu.org/archive/html/grep-devel/2018-10/msg00002.html
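
For readers who want to collect the kind of numbers requested above, here is a minimal sketch of a sequential-read timing comparison. It is not grep's code and not the benchmark from the linked message. The helper names `time_sequential_read` and `compare_buffer_sizes`, the temporary test file, and the chosen sizes (32 KiB, 96 KiB, 1 MiB) are assumptions made for illustration. It also prints `st_blksize`, the preferred I/O block size the OS reports for the file, where the platform provides it. A small, freshly written file will be served from the page cache, so meaningful results would need a large file on the actual HDD or array, read with a cold cache.

```python
import os
import tempfile
import time


def time_sequential_read(path, buffer_size):
    """Read the whole file with read() calls of at most buffer_size bytes.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 returns an unbuffered file object, so each read() call
    # is passed to the OS with the requested size instead of being
    # reshaped by Python's own buffer.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(buffer_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def compare_buffer_sizes(path, sizes):
    """Return a list of (buffer_size, bytes_read, seconds) tuples."""
    results = []
    for size in sizes:
        total, elapsed = time_sequential_read(path, size)
        results.append((size, total, elapsed))
    return results


if __name__ == "__main__":
    # Hypothetical test data: 8 MiB of random bytes in a temporary file.
    # Replace with a large file on the device under test for real numbers.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(8 * 1024 * 1024))
        path = tmp.name
    try:
        # st_blksize is not available on every platform, hence getattr.
        blksize = getattr(os.stat(path), "st_blksize", None)
        print("preferred I/O block size reported by the OS:", blksize)
        sizes = [32 * 1024, 96 * 1024, 1024 * 1024]
        for size, total, elapsed in compare_buffer_sizes(path, sizes):
            print(f"{size:>8} byte reads: {total} bytes in {elapsed:.4f} s")
    finally:
        os.unlink(path)
```

Timings from a single run are noisy; repeating each size several times and reporting the spread would make a report more convincing.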
