poke-devel

Re: Byte endianess vs Bit endianess


From: egeyar
Subject: Re: Byte endianess vs Bit endianess
Date: Mon, 18 Nov 2019 16:57:31 +0100

Hi John!
Hi Jose, as we'll need your input here too!

On Mon, Nov 18, 2019 at 3:33 PM John Darrington <address@hidden> wrote:
Further to my discussions with José on IRC ...

Perhaps there needs to be some more powerful way of dealing with
situations where byte-endianness is distinct from bit-endianness.  As
José says, bit-endianness is (almost) always big-endian.

Always, in the current ios_read_(u)int.
 
  That is to
say, that the MSB of a byte appears earlier in memory than the LSB (it
would be an unusual machine which did it the other way around, but who
knows ...)

If the byte-endianness is the same as the bit-endianness (i.e. big-endian)
there is no problem. However, in the little-endian case, things get
messy when dealing with bit fields which do not start/end on byte
boundaries.

Yeah tell me about it :)
 
This raised its ugly head in the context of implementing IEEE 754
floating point numbers. Consider a C program fragment:

extern FILE *fp;
double dp = -100;
fwrite (&dp, sizeof dp, 1, fp);

This writes a double precision floating point number to a file.  On a
big-endian machine, it'll appear in the file as:

C059 0000 0000 0000

and this can be easily parsed using the Poke fragment

deftype ieee754_double = struct
{
  int<1>  sign;
  int<11> exponent;
  int<52> fraction;
};

(ieee754_double @ NULL)

Here sign=1 (the MSB of C0), and exponent is 0x405 (the lower 7 bits
of C0 juxtaposed with the upper four bits of 59).
fraction=0x9000000000000 (the lower four bits of 59 followed by 48 zero bits)
With the exponent bias removed (0x405 - 1023 = 6), this means
-1.1001 (base 2) x 2^6 == -100 (base 10).  No problem.
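The big-endian layout and the three field values quoted above can be checked with a minimal C sketch. It assumes a 64-bit IEEE 754 double and that bit offset 0 is the MSB of the first byte; the helper names double_to_be_bytes and read_bits_msb_first are made up for illustration and are not part of poke.

```c
#include <stdint.h>
#include <string.h>

/* Store the bit pattern of d most-significant byte first, which is
   what fwrite would produce on a big-endian host.
   Assumption: double is a 64-bit IEEE 754 value. */
void double_to_be_bytes(double d, uint8_t out[8])
{
  uint64_t bits;
  memcpy(&bits, &d, sizeof bits);
  for (int i = 0; i < 8; i++)
    out[i] = (uint8_t) (bits >> (56 - 8 * i));
}

/* Read nbits (1..64) starting at bit offset off, where offset 0 is
   the MSB of buf[0].  The first bit read becomes the MSB of the
   result, i.e. a plain big-endian bit-field read. */
uint64_t read_bits_msb_first(const uint8_t *buf, unsigned off, unsigned nbits)
{
  uint64_t value = 0;
  for (unsigned i = 0; i < nbits; i++)
    {
      unsigned o = off + i;
      unsigned bit = (buf[o / 8] >> (7 - o % 8)) & 1u;
      value = (value << 1) | bit;
    }
  return value;
}
```

For -100.0 the buffer starts with C0 59, and reading 1, 11 and 52 bits at offsets 0, 1 and 12 yields sign=1, exponent=0x405 and fraction=0x9000000000000.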


However, on a little endian machine, the C program above will write:

0000 0000 0000 59c0

This alone shows that bit-endianness is big, i.e. the sign bit ends up at offset 56, not at offset 63.

However, I would expect a struct to be written in the given order. Otherwise, we open a can of worms. If we are going to write it 8 bytes at a time as little-endian, we have two options. First, we can read it *only* as a whole, to preserve the same order. Writing a struct whose members are unaligned bit fields in little-endian fashion will almost always screw things up, because the bits are read in a different order in little-endian. If the ios is asked to read 8 or more unaligned bits, it'll assume that the first bit read is always the 7th bit. This won't be the case while reading a struct.
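As a rough illustration of the "read it only as a whole" option, here is a C sketch that loads all 8 bytes as one little-endian integer and only then splits the fields by shifting. The names load_le64, split_double_bits and struct ieee754_fields are hypothetical; this is not how poke or the ios does it.

```c
#include <stdint.h>

/* Hypothetical container for the three IEEE 754 double fields. */
struct ieee754_fields
{
  unsigned sign;       /* 1 bit   */
  unsigned exponent;   /* 11 bits */
  uint64_t fraction;   /* 52 bits */
};

/* Assemble 8 bytes stored least-significant byte first. */
uint64_t load_le64(const uint8_t buf[8])
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | buf[i];
  return v;
}

/* Split an already byte-swapped 64-bit pattern into its fields.
   No bit ever has to be fetched from an unaligned stream offset. */
struct ieee754_fields split_double_bits(uint64_t bits)
{
  struct ieee754_fields f;
  f.sign = (unsigned) (bits >> 63);
  f.exponent = (unsigned) ((bits >> 52) & 0x7FF);
  f.fraction = bits & ((UINT64_C(1) << 52) - 1);
  return f;
}
```

Applied to the bytes 00 00 00 00 00 00 59 c0, this gives sign=1, exponent=0x405 and fraction=0x9000000000000, the same values as in the big-endian case.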
 

Now it would be tempting to think that poke can parse this with

deftype ieee754_double = struct
{
  little int<52> fraction;
  little int<11> exponent;
  little int<1>  sign;
};

Is it tempting to think this? Not to me. My intuition tells me that this should never happen. Let me know what I'm missing here please.
 
But that gives a totally wrong parse:

fraction=0x5000000000000
exponent=0x074
sign=0

I believe what's happening here, for example for "sign", is that the ios is asked to read one bit at offset 63, and it returns the bit at offset 63. The ios is not context-aware; it does not know any better.
 
So it seems that IOS needs to be a bit more intelligent when dealing
with little-endianness?

I think it is better to handle this at a higher level, such as the poke VM. What do you say, Jose?

In particular, it needs to be aware that
whilst bytes might be ordered little endian, the bits within each byte
remain big-endian.

The ios always assumes this, but only for the offsets that it is asked to read. It does not make assumptions about the greater context.

For example, if you ask it to read 11 bits starting from offset 52, it assumes the bit read at offset 52 is the 7th bit, the bit at offset 53 is the 6th bit, and so on until the bit at offset 59, which is the 0th bit. Then the bit at offset 60 is the 10th bit, and lastly the bit at offset 62 is the 8th bit. If you want me to change this logic, I am not sure what to replace it with.
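The mapping described above can be written down as a small C sketch. It assumes the rule generalizes as "every 8 bits read form one byte, earlier bytes are less significant, and a final partial group supplies the top bits"; that generalization and the names le_dest_bit_current and read_le_current are my assumptions for illustration, not the actual ios code.

```c
#include <stdint.h>

/* Significance (0 = LSB) given to the k-th bit read (k = 0 is the
   first bit) in a little-endian read of nbits bits.
   Assumed rule: bits are grouped 8 at a time in read order. */
unsigned le_dest_bit_current(unsigned k, unsigned nbits)
{
  unsigned group = k / 8;                  /* which group of 8 reads */
  unsigned pos = k % 8;                    /* position inside the group */
  unsigned remaining = nbits - 8 * group;
  unsigned size = remaining < 8 ? remaining : 8;
  return 8 * group + (size - 1 - pos);
}

/* Read nbits (1..64) at bit offset off using the mapping above.
   Offset 0 is the MSB of buf[0]. */
uint64_t read_le_current(const uint8_t *buf, unsigned off, unsigned nbits)
{
  uint64_t value = 0;
  for (unsigned k = 0; k < nbits; k++)
    {
      unsigned o = off + k;
      uint64_t bit = (buf[o / 8] >> (7 - o % 8)) & 1u;
      value |= bit << le_dest_bit_current(k, nbits);
    }
  return value;
}
```

For an 11-bit read the first bit maps to bit 7, the eighth to bit 0, the ninth to bit 10 and the last to bit 8. With the bytes 00 00 00 00 00 00 59 c0, read_le_current(buf, 0, 52) gives 0x5000000000000 and read_le_current(buf, 63, 1) gives 0, the fraction and sign values reported above.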

Shall I assume that each byte I read is ordered in itself? Then the same case of reading 11 bits starting from offset 52 would become the following:
At offset 52 we read the 3rd bit. At offset 53, we read the 2nd, and so on until offset 55, where we read the 0th bit.
At offset 56 we read the 10th bit, and so on until offset 62, where we read the 4th bit.
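This alternative can be sketched in C as well. Here bits are grouped by the stream byte they live in rather than by read count, and each group is MSB-first inside itself. The function name le_dest_bit_bytewise and the exact generalization are assumptions made for illustration only.

```c
#include <stdint.h>

/* Significance (0 = LSB) for the bit at absolute offset o, in a read
   of nbits bits starting at offset start (start <= o < start + nbits).
   Assumed rule: bits sharing a stream byte form one group, earlier
   groups are less significant, and each group is MSB-first. */
unsigned le_dest_bit_bytewise(unsigned o, unsigned start, unsigned nbits)
{
  unsigned end = start + nbits;          /* one past the last offset */
  unsigned byte_lo = o - o % 8;          /* first offset of this stream byte */
  unsigned byte_hi = byte_lo + 8;        /* one past its last offset */
  unsigned gs = byte_lo > start ? byte_lo : start;  /* group start */
  unsigned ge = byte_hi < end ? byte_hi : end;      /* group end */
  return (gs - start) + (ge - 1 - o);
}
```

For 11 bits starting at offset 52, offsets 52..55 map to bits 3..0 and offsets 56..62 map to bits 10..4, as in the walk-through above.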

Lastly, even if I change the logic of the ios, reading the bit at offset 63 of 0000 0000 0000 59c0 will always return 0, so there are other things we need to consider here too.
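A tiny C sketch makes this concrete. It assumes offset 0 is the MSB of the first byte, and bit_at is a hypothetical helper, not an ios function.

```c
#include <stdint.h>

/* Return the single bit at bit offset off, where offset 0 is the
   MSB of buf[0] (bits inside a byte are big-endian). */
unsigned bit_at(const uint8_t *buf, unsigned off)
{
  return (buf[off / 8] >> (7 - off % 8)) & 1u;
}
```

With the bytes 00 00 00 00 00 00 59 c0, bit_at(buf, 63) is 0 while bit_at(buf, 56) is 1, so the sign can only be found by something that knows it sits at offset 56.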

Regards
Egeyar

