axiom-developer

## [Axiom-developer] [#167 Infinite Floats Domain] Float and DFloat

From: wyscc<br>
Subject: [Axiom-developer] [#167 Infinite Floats Domain] Float and DFloat<br>
Date: Mon, 13 Jun 2005 07:49:16 -0500

Changes
http://page.axiom-developer.org/zope/mathaction/167InfiniteFloatsDomain/diff
--

??changed:
-What is the domain FLOAT if not already "infinite but bounded
What is the domain 'FLOAT' if not already "infinite but bounded


<hr>
From Tim Daly Sun, 12 Jun 2005 17:17:54 -0500<br>
From: Tim Daly<br>
Date: Sun, 12 Jun 2005 17:17:54 -0500<br>
Subject: infinite floats domain<br>

you're right, of course.<br>
my mind is mush at the moment.<br>
i've been driving all week.<br>

t

<hr>
From Bill Page, Sun, 12 Jun 2005 21:24:48 -0400<br>
Subject: infinite floats domain<br>

On June 12, 2005 6:18 PM Tim Daly wrote:

> you're right, of course.
> my mind is mush at the moment.
> i've been driving all week.

Yah, that'll do it to ya ... :) How's CMU?

Seriously, I think that there *are* some issues that
should be dealt with here. There may very well be some
situations that could make more efficient and/or effective
use of Axiom's "infinite precision" floats. I expect that
just as in the case of "infinite" integers, there may be
some better ways and some worse ways of doing calculations
with these objects.

For that matter, mathematically speaking just what is Axiom's
'Float' type? Is it formally equivalent (though obviously not
computationally equivalent) to 'FRAC INT'? How does it relate
to standard (IEEE ?) definitions of floating point? How does
it differ mathematically from the reals, cf. 'RealClosure', etc.?

Regards,
Bill Page.

<hr>
From wyscc Mon, 13 Jun 2005 07:00:00 -04:00

Axiom has 'DoubleFloat' (used to be called 'SmallFloat', hence 'sf.spad', I
think), and 'Float' ('float.spad'). 'DoubleFloat' is IEEE or "native precision"
(according to 'sf.spad'). Any floating point system is, mathematically
speaking, only a small subset of the reals (and, for that matter, of the
rationals), and not an evenly distributed one at that. It is definitely not
'FRAC INT', which <B>is</B> mathematically equivalent to the field of rational
numbers.

In a fixed precision floating point system, there are only finitely many
numbers, so it cannot be 'FRAC INT'. In such a system, a floating point number
has the form $m \times b^e$, where $m$ is the mantissa (assume finite
precision), $b$ is the base (fixed), and $e$ is the exponent (assume also
finite precision). In practice, there is an offset when $m$ and $e$ are stored
so that $m$ is usually interpreted with a "base-point"  and the most
significant digit is normalized to be non-zero, and $e$ has an offset depending
on the machine word length and how a word is divided between $m, e$ (and the
signs, which for simplicity, we will assume are included in $m$ and $e$).

Axiom implementation uses the simplified representation of $m \times b^e$ as it
mathematically stands ($m$ is still normalized to give a unique representation,
but the base point is to the right of the least significant digit). The
intrinsic properties do not depend on the base or the word length, so let's
assume $b = 10$ for discussion purposes.

Let's say the exponent has precision one (that is, only one digit plus one sign
bit), so there are 19 exponents (from $-9$ to $9$). Say the mantissa $m$ has
precision two (that is, only two digits plus one sign bit), so there are 181
possible mantissas (from $-99$ to $-10$, 0, and 10 to 99). For simplicity, we
will only discuss positive numbers. So one sees that there can be only
$19 \times 90$ positive numbers. But these are not evenly distributed. There
are 90 positive numbers for each exponent: 90 positive numbers between 10 and
99, 90 between 100 and 990, and 90 between 1000 and 9900. It is clear that the
accuracy with which a real number can be approximated by a floating point
number in this system decreases as the exponent goes up.
This is the source of round-off errors.
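The toy system above is small enough to enumerate directly. A minimal sketch in Python (not Axiom; the names below are purely illustrative, and 'Fraction' is used so the arithmetic is exact):

```python
from fractions import Fraction

# Enumerate the positive numbers of the toy decimal system described above:
# base b = 10, normalized two-digit mantissa (10..99),
# one-digit signed exponent (-9..9).
values = sorted({m * Fraction(10) ** e
                 for m in range(10, 100)
                 for e in range(-9, 10)})

print(len(values))  # 19 exponents x 90 mantissas = 1710 distinct values

# The spacing between neighbours grows with the exponent: it is 1
# throughout [10, 99] but 10 throughout [100, 990].
in_first_decade = [v for v in values if 10 <= v <= 99]
print(len(in_first_decade))  # 90
```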

This is basically the situation in DFLOAT (on Intel processors), where $b = 2$,
$m$ has 53 bits stored in a 52 bit field (not including the sign; note that in
base 2 the most significant digit of a normalized mantissa must be 1, so there
is no need to store it!) and $e$ has 11 bits (including the sign, ranging from
$-1075$ to $971$, which accounts for an offset of 53 because of the base point
location). This is exactly equivalent to the IEEE-754 standard for 64 bit
floating point. The actual arithmetic is done via Lisp, which I assume calls
the hardware.
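The same 53-bit-mantissa structure can be inspected in any IEEE-754 double; a Python sketch (not Axiom) using 'math.frexp', which splits a float into $m \times 2^e$ with $0.5 \le |m| < 1$:

```python
import math
import sys

assert sys.float_info.mant_dig == 53  # 53 significant bits in a double

m, e = math.frexp(math.pi)            # pi = m * 2**e, 0.5 <= m < 1
mantissa = int(m * 2 ** 53)           # recover the 53-bit integer mantissa

# pi is exactly mantissa * 2**(e - 53): an integer times a power of two.
print(mantissa, e - 53)
assert math.pi == mantissa * 2.0 ** (e - 53)
```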

'FLOAT', conceptually the infinite precision floating point system, is
basically also a finite precision floating point system, with the ability to
increase precision on request. However, this promotion is not automatic. In
other words, every 'FLOAT' number has its own precision, implicit in the length
of its mantissa. In effect, 'FLOAT' is the union of floating point systems of
various precisions, but without any coercion between them. So Tim is not
entirely wrong in saying we don't have infinite precision floating point in the
spirit of 'INT'. The real question for the developers is then:

<B>Shall we improve 'FLOAT' or shall we implement a new domain of infinite
precision float (and if so, how, or rather, what are the specifications of such
a domain)?</B>

Example. Whether the addition of two 'FLOAT' values with very different orders
of magnitude gives the correct answer depends on your point of view: if the
precision of the number is what matters, then the result is correct. If exact
arithmetic with floating point is what you want, it is not.

If 'x' and 'y' are 'FLOAT', but 'y' is much larger than 'x', say the exponent
of 'y' exceeds that of 'x' (when both 'x' and 'y' are normalized) by the
current precision of 'FLOAT', then 'x+y' would be 'y', not 'x+y' with precision
extended (see routine 'plus' in 'float.spad'). Note that 68 bits is
approximately 20 decimal digits. The display for 'y' should have 9 trailing
zeros after the decimal.
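The absorption just described can be sketched with Python's 'decimal' module standing in for 'FLOAT' (an assumption for illustration: 'decimal' mimics a fixed working precision, it is not Axiom's 'plus' routine):

```python
from decimal import Decimal, getcontext

# 20 decimal digits ~ 68 bits, roughly FLOAT's default working precision.
getcontext().prec = 20
x = Decimal(2) ** -34   # about 5.8e-11
y = Decimal(2) ** 35    # 34359738368, exact
print(x + y == y)       # True: x is absorbed entirely into y

# Raising the working precision *before* adding retains x's contribution
# (46 digits is enough to hold the sum exactly).
getcontext().prec = 46
print(Decimal(2) ** -34 + Decimal(2) ** 35 == Decimal(2) ** 35)  # False
```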

\begin{axiom}
bits()$Float
x:Float:=2^(-34)
y:Float:=2^35
x+y
\end{axiom}

By the way, in interpreting correctness of displayed answers for floating
point computation, there is another source of error besides round off. For
example,

\begin{axiom}
x:Float:=sqrt(2)
bits(100)$Float
z:Float:=sqrt(2)
z - x
\end{axiom}

Most people would expect the answer of 'z-x' to be '0.16887242 E-20' but this
ignores the fact that the display is converted from an internal binary
representation to a decimal one. During the conversion there is truncation
error (think of 0.1 in base 3 converted to decimal 0.3333... with output at any
finite precision). So 'x' is not what it seems, and neither is 'z'. Below, the
constants are converted to binary internally before computation, at a higher
precision than x (resp. z) to bring out the differences. We can now see that
'x' is larger than 'z'. So Axiom is correct and the expected answer is wrong.

\begin{axiom}
x - 1.4142135623730950488
bits(120)$Float
z - 1.4142135623730950488016887242
\end{axiom}
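The binary-to-decimal conversion error described above is easy to reproduce outside Axiom; a Python sketch (53-bit doubles rather than 'FLOAT', with 'Decimal(float)' exposing the exact value of the stored binary approximation):

```python
import math
from decimal import Decimal

# The nearest 53-bit binary value to the literal 0.1 is slightly larger
# than one tenth; Decimal(float) shows its exact decimal expansion.
print(Decimal(0.1))   # 0.1000000000000000055511151231257827...

# Likewise the 53-bit approximation of sqrt(2) is not the 20-digit decimal
# value one might type in; here it happens to be slightly larger.
x = math.sqrt(2)
print(Decimal(x) - Decimal("1.4142135623730950488"))
```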

--