bug#32073: Improvements in Grep (Bug#32073)
From: arnold
Subject: bug#32073: Improvements in Grep (Bug#32073)
Date: Wed, 01 Jan 2020 04:19:22 -0700
User-agent: Heirloom mailx 12.5 7/5/10

As a quite serious question, how is someone writing user-level code
supposed to be able to figure out the right buffer size for a particular
file, and to do so portably? ("Show me the code.")

Gawk bases its reads on the st_blksize member in struct stat. That will
typically be something like 4K - not nearly enough, given your description
below.

Arnold
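
For readers unfamiliar with st_blksize, the approach described above can be sketched roughly as follows. This is not gawk's actual code: the function name read_buf_size, the min_size parameter, and the 4096-byte fallback are assumptions made for illustration. Only fstat() and the st_blksize member come from POSIX.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical fallback used when fstat() fails or reports no block size. */
#define FALLBACK_BLKSIZE 4096

/* Return a read-buffer size for the open descriptor fd.
   Starts from the preferred I/O block size reported in st_blksize and
   raises it to min_size when the reported value is smaller.
   Pass min_size = 0 to use st_blksize as-is. */
size_t read_buf_size(int fd, size_t min_size)
{
    struct stat sb;
    size_t size = FALLBACK_BLKSIZE;

    if (fstat(fd, &sb) == 0 && sb.st_blksize > 0)
        size = (size_t) sb.st_blksize;

    return size < min_size ? min_size : size;
}
```

A caller would pass the descriptor it is about to read from, for example read_buf_size(fd, 128 * 1024). The floor only papers over a small st_blksize; it does not answer the portability question, because nothing in struct stat describes the RAID geometry or the device's preferred request size.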
Sergiu Hlihor <address@hidden> wrote:
> This topic is getting more and more frustrating. If you rely on OS, then
> you are at the mercy of whatever read ahead configuration you have. And
> read ahead is typically 128KB so does not help that much. A HDD RAID 10
> array with 12 disks and a strip size of 128KB reaches the maximum read
> throughput if read block size is 6 * 128 = 768KB. When issuing read
> requests with 128KB, you only hit one HDD, having 1/6 read throughput.
> With flash the same. A state of the art SSD that can do 5GB/s reads can
> actually do around 1GB/s or less at 128KB block size. Why is it so hard to
> understand how hardware works and the fact that you need huge block sizes
> to actually read at full speed? Why not just expose the read buffer size
> as a configurable parameter, so that anyone can just tune it as needed? 96KB
> is purely retarded.
>
> On Wed, 1 Jan 2020 at 08:52, Paul Eggert <address@hidden> wrote:
>
> > > This makes me think we should follow Coreutils' lead[0] and increase
> > > grep's initial buffer size from 32KiB, probably to 128KiB.
> >
> > I see that Jim later installed a patch increasing it to 96 KiB.
> >
> > Whatever number is chosen, it's "wrong" for some configuration. And I suppose
> > the particular configuration that Sergiu Hlihor mentioned could be tweaked so
> > that it worked better with grep (and with other programs).
> >
> > I'm inclined to mark this bug report as a wishlist item, in the sense that it'd
> > be nice if grep and/or the OS could pick buffer sizes more intelligently (though
> > it's not clear how grep and/or the OS could go about this).
> >
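
The arithmetic in the quoted message, and the suggestion of a tunable buffer size, can be sketched as follows. This is only an illustration under stated assumptions: raid10_full_stripe_bytes and parse_buf_size are hypothetical helpers, and grep has no such option or environment variable according to anything in this thread. The stripe calculation assumes a RAID 10 layout in which mirrored pairs leave half of the disks holding distinct data.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Bytes needed for one read to touch every data-bearing disk of a
   RAID 10 array: disks/2 mirrored pairs, one strip from each. */
size_t raid10_full_stripe_bytes(unsigned disks, size_t strip_bytes)
{
    return (size_t) (disks / 2) * strip_bytes;
}

/* Parse a user-supplied buffer size in bytes (for instance the value of
   a hypothetical environment variable). Returns fallback when text is
   NULL, not a plain positive decimal number, or out of range. */
size_t parse_buf_size(const char *text, size_t fallback)
{
    char *end;
    unsigned long long n;

    /* Reject NULL, empty strings, signs, and leading whitespace. */
    if (text == NULL || !isdigit((unsigned char) text[0]))
        return fallback;

    errno = 0;
    n = strtoull(text, &end, 10);
    if (errno != 0 || *end != '\0' || n == 0 || n > (size_t) -1)
        return fallback;

    return (size_t) n;
}
```

With the figures from the message, raid10_full_stripe_bytes(12, 128 * 1024) gives 6 * 128 KiB = 768 KiB. A program could then call parse_buf_size(getenv("SOME_VARIABLE"), default_size), where the variable name is whatever the program chooses to document.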