[Zutils-bug] zgrep performance long line

From: Walter Anema
Subject: [Zutils-bug] zgrep performance long line
Date: Wed, 15 Aug 2018 15:47:17 +0000

Hi Antonio,


You have made a nice package with zutils.

I am using it in a Docker container (Alpine) to analyse JSON logs.


I have a performance problem with one particular file. It contains logging in JSON format on a single line, without any \n.
Because of that, I need to append an `echo` before `wc` reports a line count.


(zcat /logs/s3/2018/04/11/08/prod-kinesis-firehose-stream-1-2018-04-11-08-05-23-bcdf3841-52b5-47eb-bf85-c36dfa2d0d55;echo ) | wc

      1 2145643 37786248
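
The `echo` is needed because `wc` counts newline characters rather than lines of text. A minimal sketch with a made-up one-record JSON string (not taken from the real log) shows the effect:

```shell
# wc -l counts newline characters, so input without any "\n" reports 0 lines.
# The JSON string below is a made-up stand-in for the real log content.
# $(( ... )) strips the padding some wc implementations put before the number.
no_newline=$(( $(printf '{"event":"connect"}' | wc -l) ))
with_echo=$(( $( (printf '{"event":"connect"}'; echo) | wc -l) ))

echo "without newline: $no_newline, with echo appended: $with_echo"
```

The first count is 0 and the second is 1, which matches the single line reported for the real file above.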


Somehow zgrep from zutils takes a long time on this file:

# /usr/bin/zgrep -V
zgrep (zutils) 1.7
Copyright (C) 2018 Antonio Diaz Diaz.
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

# time (/usr/bin/zgrep -o connect largefile_with_one_json_line | wc)
     97      97     776

real   0m19.320s
user   0m19.317s
sys    0m0.078s


With GNU zgrep (from gzip) the same search is more than 20 times faster (0.83 s versus 19.3 s):

# zgrep -H
zgrep (gzip) 1.5
Copyright (C) 2010-2012 Free Software Foundation, Inc.
This is free software.  You may redistribute copies of it under the terms of
the GNU General Public License <http://www.gnu.org/licenses/gpl.html>.
There is NO WARRANTY, to the extent permitted by law.

Written by Jean-loup Gailly.

# time (/usr/bin/zgrep -o connect largefile_with_one_json_line | wc)
     97      97     776

real   0m0.830s
user   0m0.964s
sys    0m0.044s


Can you explain the difference?
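
In case it helps to reproduce this without the original log, the following is a rough sketch that builds a comparable test file. The record layout, the record count and the temporary file are assumptions for illustration only; the real file is about 37 MB, so `records` would need to be much larger to show a similar slowdown.

```shell
# Hypothetical reproduction: write many JSON-like records on ONE line with no
# trailing newline, compress the result, and search it with gzip + grep.
records=2000            # made-up size; increase to stress the tools
sample=$(mktemp)        # temporary file holding the compressed test data

i=0
while [ "$i" -lt "$records" ]; do
  printf '{"id":%d,"event":"connect"}' "$i"   # note: no "\n" is ever written
  i=$((i + 1))
done | gzip -c > "$sample"

# Baseline that depends on neither zgrep implementation.
matches=$(( $(gzip -dc "$sample" | grep -o connect | wc -l) ))
newlines=$(( $(gzip -dc "$sample" | wc -l) ))
echo "matches: $matches, newlines: $newlines"

# To compare implementations, time each zgrep on the same file, e.g.:
#   time (zgrep -o connect "$sample" | wc)
rm -f "$sample"
```

The baseline pipeline gives a reference match count, so each zgrep can be checked for both correctness and run time on identical input.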


Best regards,


Walter Anema

Technical Application Management


be smart. get connected.



Blaak 16 3011 TA Rotterdam The Netherlands

+31 (0)88 625 25 37 +31 (0)6 54 32 76 70




The Portbase e-mail disclaimer applies to this message.

Please consider the environment before printing this e-mail.

