bug-coreutils

Re: [OT] Is od broken?


From: Eric Blake
Subject: Re: [OT] Is od broken?
Date: Wed, 11 Jun 2008 07:32:14 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[adding bug-coreutils; this was originally reported on the cygwin mailing
list]

According to Gary Johnson on 6/11/2008 1:26 AM:

Gary noticed an issue with the indentation of multi-specifier od:

$ od -t cx1  abc.txt
    0000000   T   h   i   s       i   s       a   b   c       f   i   l   e
            54 68 69 73 20 69 73 20 61 62 63 20 66 69 6c 65
    0000020  \n
            0a
    0000021

| On 2008-06-10, Eric Blake wrote:
|> According to Gary Johnson on 6/10/2008 5:42 PM:
|> | That looks horrible!  The results are the same on my Red Hat
|> | Enterprise Linux WS release 4 box, but on both my SunOS 5.8 and
|> | HP-UX 11.11 machines I see the neatly-aligned outputs I'm used to:
|>
|> POSIX allows either behavior, and GNU coreutils has behaved that way for
|> years.
|>
|> |
|> | This looks like a defect in the upstream od code for Linux.  Should
|> | I report it (to bug DASH coreutils AT gnu DOT org), or is this
|> | misalignment of the character and hex values a new "feature"?
|>
|> Not a compliance bug, but you are certainly welcome to report it upstream
|> as a QoI enhancement request.  I've taken a first shot at looking at the
|> code, to see what it would take, and it seems a bit complicated.  GNU
|> coreutils intentionally minimizes the padding so that if only a single -t
|> option is present, all entries are aligned with only one space in between.
|> Now consider 'od -to4 -tu2'; the minimal alignment for o4 is 11 and for
|> u2 is 5.  It takes two u2 entries to match o4, so that's 10 columns.  But
|> how do you align 10 vs. 11 columns?  Alternate between 1 and 2 spaces per
|> u2 output chunk?
|
| FWIW, here's how "od -to4 -tu2 abc.txt" looks on various systems I
| have access to.
|
| On an HP 9000/785 running HP-UX 11.11:
|
|    <fwcomp1> od -to4 -tu2 abc.txt
|    0000000  000012432064563 000004032271440 000014130461440 000014632266145
|               21608   26995    8297   29472   24930   25376   26217   27749
|    0000020  000001200000000
|                2560
|    0000021
|
| On a SUNW,Sun-Fire-V240 SPARC running SunOS 5.8:
|
|    <suncomp3> od -to4 -tu2 abc.txt
|    0000000 12432064563 04032271440 14130461440 14632266145
|             21608 26995 08297 29472 24930 25376 26217 27749
|    0000020 01200000000
|             02560
|    0000021
|
| On some sort of HP PC running Red Hat Enterprise Linux WS release 4:
|
|    <whiffle> od -to4 -tu2 abc.txt
|    0000000 16332264124 04034664440 04030661141 14533064546
|            26708 29545 26912  8307 25185  8291 26982 25964
|    0000020 00000000012
|               10     0
|    0000021
|
| On some sort of Dell PC running Windows XP and Cygwin:
|
|    address@hidden ~
|    $ od -to4 -tu2 //rdfs1/garyjohn/abc.txt
|    0000000 16332264124 04034664440 04030661141 14533064546
|            26708 29545 26912  8307 25185  8291 26982 25964
|    0000020 00000000012
|               10     0
|    0000021
|
| That particular case was executed pretty well by everyone except
| Sun.  Their indentation may be deliberate, but it looks weird.  I
| would expect the Linux and Cygwin results to be the same, but I
| wanted to check anyway.
|
| Having looked at those, I think the answer to your question is that
| the minimal alignment for o4 is 11 digits + 1 space = 12 columns
| and for u2 it's 2 * (5 digits + 1 space) = 12 columns.  That isn't
| to say that it's not a messy problem in general, though.  I guess if
| o4 happened to be one digit wider, I'd opt for putting two spaces
| between each pair of u2s.
|
| Regards,
| Gary
|
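
To double-check the column counts quoted above, here is a small Python
sketch.  It is not od's code, and the helper name min_width is made up for
illustration.  It derives the minimal digit width of an unsigned field from
its size in bytes and its radix, then counts columns per 4 input bytes
assuming one space in front of each field:

```python
def min_width(nbytes, radix):
    """Digits needed to print the largest unsigned nbytes-byte value in radix."""
    largest = (1 << (8 * nbytes)) - 1
    digits = 0
    while largest:
        largest //= radix
        digits += 1
    return digits


o4 = min_width(4, 8)    # octal digits in a 4-byte field
u2 = min_width(2, 10)   # decimal digits in a 2-byte field

# Columns per 4 input bytes, with one space in front of each field:
# one o4 field versus two u2 fields.
print(o4 + 1, 2 * (u2 + 1))   # prints "12 12"
```

So for this particular pair the two rows already line up once the
separating space is counted.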

I gave this more thought.  Since od enforces that each output line consume
a multiple of the lcm of input bytes per field, and it already knows how
many bytes are output per field, it should be possible to compute the
amount of padding per field necessary to make all fields right-justified.
I'm working on a patch for this.
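
As a rough illustration of that idea, here is a Python sketch under my own
assumptions; it is not the patch, and padded_widths and its argument layout
are hypothetical.  Each -t spec is given as (bytes per field, minimal digit
width).  One block is the lcm of the field sizes; the widest block is found
assuming one separating space per field, rounded up so that it divides
evenly among the fields of every spec, and each spec's fields are widened
to fill it:

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    return a * b // gcd(a, b)


def padded_widths(specs):
    """specs: list of (bytes_per_field, min_digits) pairs, one per -t option.

    Returns the number of columns (separator included) to give each field
    of each spec so that every spec uses the same columns per block.
    """
    block_bytes = reduce(lcm, (size for size, _ in specs))
    counts = [block_bytes // size for size, _ in specs]  # fields per block
    widest = max(n * (digits + 1) for n, (_, digits) in zip(counts, specs))
    step = reduce(lcm, counts)
    block_cols = -(-widest // step) * step  # round up to a multiple of step
    return [block_cols // n for n in counts]


print(padded_widths([(1, 3), (1, 2)]))    # -t c -t x1   -> [4, 4]
print(padded_widths([(4, 11), (2, 5)]))   # -t o4 -t u2  -> [12, 6]
```

Printing each field right-justified in the returned width would put the c
and x1 rows on the same 4-column grid.  A real implementation might prefer
to spread the padding unevenly rather than round the block up.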

In looking at the code, there are also some simplifications available if we
heed autoconf's advice that HAVE_LONG_DOUBLE is an obsolete construct (i.e., all
reasonable porting targets support compilation of long double, and we use
gnulib's printf to make up for libc deficiencies in printing long double,
so that code does not need to be conditionally compiled).  I'll include a
patch for that in my series.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkhP090ACgkQ84KuGfSFAYCP/wCeKGmsrk4zQgE6vkDjEmweKEXY
UwQAn1ewwfcPZnFesGE4CvWkSy7iewI7
=F7zo
-----END PGP SIGNATURE-----



