bug#30719: Progressively compressing piped input
From: Mark Adler
Subject: bug#30719: Progressively compressing piped input
Date: Tue, 6 Mar 2018 18:11:51 -0800

> On Mar 6, 2018, at 1:58 PM, Garreau, Alexandre <address@hidden> wrote:
>
> On 05/03/2018 at 14:54, Mark Adler wrote:
>> deflate has an inherent latency that accumulates enough data in order
>> to efficiently emit each deflate block. You can deliberately flush
>> (with zlib, not gzip), but if you do that too frequently, e.g. each
>> line, then you will get lousy compression or even expansion.
>
> Even if the main repetition is between the lines? Like if 80% of
> half the line, and 70% of the other half of the line, are the same? Like
> in a while loop with only ping and date? I thought of it as a very lazy
> way of not having to remove all the redundant output caused by the use
> of ASCII and the repetition of words or similar patterns occurring over
> and over.

Alexandre,

It has nothing to do with how much or how little or how often there is
repetition. It has to do with the overhead of the header of a dynamic block
that is required to describe the Huffman codes used therein. You need several
thousand symbols in order to pay for the bits required for the header.

Mark
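
The cost of flushing on every line can be measured with a short sketch like
the one below. It is an illustration, not part of the original message: the
ping-like sample lines, the helper names compress_once and
compress_flush_each_line, and the choice of compression level 6 with
Z_SYNC_FLUSH are all assumptions. Note also that for very small blocks zlib
may choose fixed Huffman codes or stored blocks instead of a dynamic block,
so the measured difference reflects per-block overhead in general (including
the flush marker), not only the dynamic block header.

```python
import zlib


def compress_once(lines):
    """Feed every line to one compressor and flush only at the end."""
    comp = zlib.compressobj(6)
    out = b"".join(comp.compress(line) for line in lines)
    return out + comp.flush()  # Z_FINISH: terminates the stream


def compress_flush_each_line(lines):
    """Same stream, but force a sync flush after every line."""
    comp = zlib.compressobj(6)
    out = bytearray()
    for line in lines:
        out += comp.compress(line)
        # Z_SYNC_FLUSH ends the current deflate block and emits all
        # pending output, so a reader can decode the line immediately.
        out += comp.flush(zlib.Z_SYNC_FLUSH)
    out += comp.flush()  # Z_FINISH: terminates the stream
    return bytes(out)


# Hypothetical, highly repetitive input resembling a ping loop.
lines = [
    b"64 bytes from 192.0.2.1: icmp_seq=%d ttl=64 time=0.%03d ms\n"
    % (i, (i * 37) % 1000)
    for i in range(2000)
]

raw_size = sum(len(line) for line in lines)
once = compress_once(lines)
flushed = compress_flush_each_line(lines)

print("raw bytes:          ", raw_size)
print("single final flush: ", len(once))
print("flush on every line:", len(flushed))
```

Both outputs are valid zlib streams that decompress to the same data. The
per-line variant is larger because every flush closes a block early and
pays the per-block cost again, which is the trade-off described above.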