Understanding lzma-302eos

lzip-bug


From:	Hoël Bézier
Subject:	Understanding lzma-302eos
Date:	Tue, 23 Aug 2022 22:15:25 +0200

Hi,

I’ve recently decided to learn the Hare language, and figured that implementing lzip support for it would be a good way to start.

Reading the IETF draft regarding the lzip format and the source code of lzd, and filling holes with the Wikipedia page on LZMA, I managed to grasp most of the process going on when decompressing lzip. Compression will certainly be another challenge, but I’ll see when I get to it.

One thing (amongst many!) that I fail to figure out is why the range decoder skips the first five bytes of the lzma stream. This happens in the Range_decoder constructor in lzd code:


```cpp
  Range_decoder() : member_pos( 6 ), code( 0 ), range( 0xFFFFFFFFU )
    {
    for( int i = 0; i < 5; ++i ) code = ( code << 8 ) | get_byte();
    }
```

This is also confirmed by the IETF draft:
   The range encoder produces a first 0 byte that must be ignored by the
   range decoder.  This is done by shifting 5 bytes in the
   initialization of 'code' instead of 4.

This tells me why it should skip five bytes instead of four, but what I cannot understand is why we need to skip four bytes in the first place. I guess I’m missing some more general knowledge about range encoding, which is why I’m sending this email in the hope that some of you might enlighten me.

On a side note, the Range_decoder constructor quoted above shows that the first five bytes are used to update code, which is the current point in the range according to the IETF draft, but range is not updated. I don’t understand why, and this tells me I do not properly understand what these variables represent. Any insight is welcome on that matter too.


Thanks,
Hoël

