
Re: [bug-grep] [bug #12660] byte-offset has an off-by-one error


From: Julian Foad
Subject: Re: [bug-grep] [bug #12660] byte-offset has an off-by-one error
Date: Tue, 12 Apr 2005 16:00:07 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b) Gecko/20050217

anonymous wrote:
URL:
  <http://savannah.gnu.org/bugs/?func=detailitem&item_id=12660>

                 Summary: byte-offset has an off-by-one error

Thanks for trying to help, but this is user error, not a bug.

                 Project: grep
            Submitted by: None
            Submitted on: Tue 04/12/2005 at 14:22

[...]
the byte offset reported when using the -b option is incorrect.  It is off by
one *per line*.  (except of course the first line, which correctly reports 0
offset). So the byte offset reported on the Nth line is incorrect by (N-1).

I reproduced this with grep v2.5 on a Solaris system and grep v2.5.1 on a
Linux system. This can be easily demonstrated.

1. create a small file. e.g.
   echo '0123456789abcdefghijklmnopqrstuv' > foo
   foo will now contain 32 bytes (including the final cr) on a unix system. (DOS will add a lf)

No, that's 32 printable characters, so 33 bytes including the final LF on a Unix system (and DOS will add a CR).
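
The count is easy to check directly. A minimal sketch, assuming a POSIX shell with wc, tail and od available; the temporary directory and the variable names size and last are only for illustration:

```shell
# Recreate the reporter's file in a scratch directory.
tmp=$(mktemp -d)
echo '0123456789abcdefghijklmnopqrstuv' > "$tmp/foo"

# Total size in bytes: 32 printable characters plus the newline echo appends.
size=$(wc -c < "$tmp/foo" | tr -d ' ')
echo "size: $size"

# Hex value of the last byte; 0a is LF, not CR (0d).
last=$(tail -c 1 "$tmp/foo" | od -An -tx1 | tr -d ' \n')
echo "last byte: $last"

rm -rf "$tmp"
```

This should report a size of 33 and a last byte of 0a.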

2. view foo with xxd to satisfy self that this is true. use 'xxd -c 8 foo' - this will show four rows of 16

16 whats? You mean 16 hexadecimal digits, plus the other information that it shows. And it shows a fifth row showing the final LF byte:

0000000: 3031 3233 3435 3637  01234567
0000008: 3839 6162 6364 6566  89abcdef
0000010: 6768 696a 6b6c 6d6e  ghijklmn
0000018: 6f70 7172 7374 7576  opqrstuv
0000020: 0a                   .

    (xxd will display 2 characters per byte)
3. view output of 'xxd -p -c 8 foo' -- satisfy self that this outputs only the contents of the file in rows of 16 chars

It outputs this:

3031323334353637
3839616263646566
6768696a6b6c6d6e
6f70717273747576
0a

which is four rows each of sixteen printable characters plus one LF character, and one row of two printable characters plus one LF character.
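
To see the per-line byte counts, here is a small sketch, assuming awk and paste are available. The five lines are copied from the 'xxd -p -c 8 foo' output above so that the example does not depend on xxd being installed; the variable name lengths is illustrative:

```shell
# For each line, print its length plus one for the terminating LF.
lengths=$(printf '%s\n' \
    3031323334353637 \
    3839616263646566 \
    6768696a6b6c6d6e \
    6f70717273747576 \
    0a |
  awk '{ print length($0) + 1 }' | paste -sd ' ' -)
echo "$lengths"
```

Each full row comes to 17 bytes and the last row to 3.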

4. pipe it to grep as follows:
   'xxd -p -c 8 foo | grep -b ".*"'
5. view output - note that although there are only 16 chars per line, grep reports offsets of 0,17,34,51 ...

No, there are 17 characters per line including the LF, so the output is correct.
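
The same arithmetic can be checked against grep itself. A rough sketch, assuming a grep that supports -b (GNU grep does) plus awk, cut and paste; hex_lines is a hypothetical helper that replays the 'xxd -p -c 8 foo' output quoted above, and reported and expected are illustrative names:

```shell
# Hypothetical helper: emits the five lines that 'xxd -p -c 8 foo' printed.
hex_lines() {
  printf '%s\n' \
    3031323334353637 \
    3839616263646566 \
    6768696a6b6c6d6e \
    6f70717273747576 \
    0a
}

# Offsets as reported by grep -b (the field before the first colon).
reported=$(hex_lines | grep -b '.*' | cut -d: -f1 | paste -sd ' ' -)

# Offsets computed by hand: running total of line length plus one LF.
expected=$(hex_lines |
  awk 'BEGIN { off = 0 } { print off; off += length($0) + 1 }' |
  paste -sd ' ' -)

echo "grep -b:  $reported"
echo "by hand:  $expected"
```

If the two lists agree (0 17 34 51 68), the offsets grep prints are simply the starting byte of each line, with every preceding LF counted.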

I have closed this bug as invalid.

Please discuss problems on this mailing list before filing a bug in the tracker.

- Julian
