grep-devel

How to profile the grep engine despite the "/dev/null optimization"?


From: alexandre . ferrieux
Subject: How to profile the grep engine despite the "/dev/null optimization"?
Date: Sat, 27 May 2023 23:20:02 +0200

Hello,

It is sometimes useful to profile the grep engine, or to run perf analysis on it.
This happens, for example, when you need to tune your regexps for performance.
In the old days, we simply did:

    time grep PATTERN < FILE > /dev/null

But nowadays, grep *detects* /dev/null and behaves differently in that case: it sets the done_on_match flag and accordingly exits on the first matching line of your hundreds-of-gigabytes test file, so the timing no longer reflects a full scan.
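
To make the detection concrete, here is a rough shell illustration of the kind of check involved, assuming it boils down to comparing the device and inode numbers of standard output with those of /dev/null (see PS2). This is not grep's source; the function name stdout_is_dev_null is made up, and GNU stat plus /dev/fd are assumed.

```shell
# Hypothetical helper, not part of grep: succeed if stdout is /dev/null.
stdout_is_dev_null() {
    # Inside $(...) fd 1 is a pipe, so keep a copy of the real stdout on fd 3.
    {
        out_id=$(stat -L -c '%d:%i' /dev/fd/3)
    } 3>&1
    null_id=$(stat -L -c '%d:%i' /dev/null)
    # Same device and same inode means the same file.
    [ "$out_id" = "$null_id" ]
}

# Usage: report on stderr, since stdout is the thing being tested.
if stdout_is_dev_null; then
    echo "stdout is /dev/null" >&2
else
    echo "stdout is something else" >&2
fi
```

Run as "sh script > /dev/null", this should take the first branch; with any other redirection it should take the second.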

It's not immediately obvious why grep tries to be "smarter than the human", as "-q" is an explicit way for the human to request this behavior.
Anyway, I realize I'm pretty late to complain, as this happened 7 years ago:

    af6af28 Paul Eggert     Sun May 1 22:56:39 2016 -0700           grep: /dev/null output speedup

So, what is today's recommended idiom for timing a full grep run?

Thanks in advance,

-Alex


PS: Note that "grep ... | cat > /dev/null" is a VERY poor approximation, as the backpressure from the pipe and the scheduler hits pretty hard. Even with enlarged pipe buffers, grep runs slower with a pipe than with a redirection to a RAM-disk file (/dev/shm/foo), and the latter unfortunately does not scale to hundreds of gigabytes on most machines.
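
For anyone wanting to reproduce the comparison, a minimal sketch follows. The function names and the PATTERN, FILE and OUT arguments are placeholders of mine; /dev/shm is assumed to be a tmpfs mount, as on most Linux systems.

```shell
# Sink 1: pipe into cat. grep's stdout is then a pipe, not /dev/null,
# but a second process and the pipe traffic are added.
grep_via_pipe() {   # usage: grep_via_pipe PATTERN FILE
    grep -e "$1" < "$2" | cat > /dev/null
}

# Sink 2: redirect to a regular file, e.g. on a RAM disk such as /dev/shm.
grep_via_file() {   # usage: grep_via_file PATTERN FILE OUT
    grep -e "$1" < "$2" > "$3"
}

# Example (bash, for the `time` keyword):
#   time grep_via_pipe 'PATTERN' FILE
#   time grep_via_file 'PATTERN' FILE /dev/shm/foo; rm -f /dev/shm/foo
```

The second variant needs as much free RAM as grep produces output, which is exactly the scalability problem mentioned above.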

PS2: I am aware that I can fool grep's detection method, which is to compare inodes, by creating a "/dev/null2" with the same device number (but a different inode). However, I dearly hope one doesn't need to resort to such horrendous hacks for simple perf tuning...
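
For completeness, the hack would look roughly as follows. The helper name and the /dev/null2 path are only illustrative; the sketch assumes GNU stat (whose %t and %T print the device major and minor numbers in hex) and merely prints the mknod command, which itself needs root.

```shell
# Hypothetical helper: print the mknod command that would create a second
# character device with the same major/minor numbers as /dev/null.
null_twin_cmd() {   # usage: null_twin_cmd NEW_PATH
    hex=$(stat -c '%t %T' /dev/null) || return 1
    major=$((0x${hex% *}))    # hex major -> decimal
    minor=$((0x${hex#* }))    # hex minor -> decimal
    printf 'mknod %s c %d %d\n' "$1" "$major" "$minor"
}

null_twin_cmd /dev/null2
# Then, as root, run the printed command and time against the new node:
#   time grep PATTERN < FILE > /dev/null2
```

The new node discards writes like /dev/null does, but has its own inode, so an inode comparison against /dev/null should not match.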
