
Re: A problem with range objects and floating point numbers


From: Daniel J Sebald
Subject: Re: A problem with range objects and floating point numbers
Date: Thu, 25 Sep 2014 10:46:28 -0500
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

On 09/25/2014 12:54 AM, Oliver Heimlich wrote:
On 24.09.2014 20:06, Daniel J Sebald wrote:
On 09/24/2014 12:29 PM, Oliver Heimlich wrote:

[snip]

That's true, but only for binary numbers.  My point was that no matter the
number representation system, the underlying arithmetic logic unit (ALU)
should have mathematical consistency.  That is, if the ALU carries out
an operation, the result should be the equivalent of what is expected in
mathematics, number representation aside.  I'm wondering if there is
consistency across hardware architectures.  It may not matter that "0.1"
(which actually equals 0.10000000000xxx) doesn't equal 1/10, so long as
identities like the rem() examples quoted further below hold.
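
For reference, the value the machine actually stores for "0.1" is easy to display (a quick Octave check of my own):

octave:1> printf ("%.20f\n", 0.1)
0.10000000000000000555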

If you want to have mathematical consistency, in the sense that a
number's internal representation never differs from its mathematical
value, then (at least) two problems arise: (1) you want to do decimal
arithmetic and not binary arithmetic, and (2) you want infinite
precision for your (intermediate) results.

I'm not sure (1) is true, or perhaps what you mean by it.  If you mean

0.1 = ... + 0*10^-2 + 1*10^-1 + 0*10^0 + ...

I don't see that as a necessary requirement; division can behave consistently with base-10 format limits and step sizes even though the underlying arithmetic is binary.

For (2), you might be referring to something that appears in the reference Markus pointed to, something like:

"The IEEE standard requires that the result of addition, subtraction, multiplication and division be exactly rounded. That is, the result must be computed exactly and then rounded to the nearest floating-point number (using round to even)."

Is that correct? I don't think that implies infinite precision. The Goldberg reference hints at how to use guard bits to carry out the computation and meet the standard requirement.
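
As an illustration (my own example, not one from the paper): the sum of the doubles nearest 0.1 and 0.2 is computed exactly and then rounded to the nearest double, which happens not to be the double nearest 0.3:

octave:1> printf ("%.20f\n", 0.1 + 0.2)
0.30000000000000004441
octave:2> (0.1 + 0.2) == 0.3
ans = 0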


IEEE 754 is a very popular standard and is implemented both in software
and hardware. As long as you are fine with binary floating point
arithmetic of finite precision (mostly 64 bit) you will see consistency
amongst all standard-compliant systems and get decent performance.

That's good to know.


octave-cli:1>  rem(-2,.1) == 0
ans =  1

See the definition of rem(x,y): x - y .* fix (x ./ y)

The division results in exactly -20; the relative error of 0.1 is too
small to survive the rounding and is lost. Then you multiply 0.1 by -20.
Again, the result is rounded, giving exactly -2, so the remainder is 0
and you get exactly what you wanted.

Well, it exactly results in whatever the machine representation for -20 is. Nonetheless, I'm starting to get the feeling that division and multiplication behave a little better under floating point than addition and subtraction do.
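
Tracing the intermediate values of the rem() definition above makes that concrete (a quick check):

octave:1> x = -2; y = 0.1;
octave:2> q = fix (x ./ y)     # -2/0.1 rounds to exactly -20
q = -20
octave:3> x - y .* q           # 0.1 * -20 rounds back to exactly -2
ans = 0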


octave-cli:2>  rem(0,.1) == 0
ans =  1

Any inaccuracy is lost when you divide 0 by anything.

Try it with some numbers that we know can be represented exactly with
base 10 or base 2:

octave-cli:2>  (pi/pi) == 1.0
ans =  1

This is because x/x == 1 for any x. I do not have to emphasize that both
πs are equal.

octave-cli:3>  ((50*pi)/pi) == 50.0
ans =  1
octave-cli:4>  ((pi*pi)/pi) == pi
ans =  1

Both 50 and the π constant are binary floating point numbers, so the
results may be exact. Additionally, the π constant's very last binary
digits are zero, so there is some protection against errors. Try the
following:

octave:1>  x = pi + eps * 2;
octave:2>  x * 50 / 50 == x
ans = 0

So if the user chooses something like the following:

[pi+eps*2:pi:5*pi+eps]

then that range can't be "integerized".
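
Right; the whole-number test on A/s sketched below would reject that case:

octave:1> x = pi + eps*2;
octave:2> q = x / pi           # q is 1+eps, not exactly 1
q = 1.0000
octave:3> q == fix (q)
ans = 0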


The approach I'm pondering is this: for

[A:s:B]

if A/s is a whole number (by machine representation) and B/s is a whole number, is [A/s:1:B/s]*s, where the first range is done using integer ranging, better or worse than [A:s:B] as done by the floating-point ranging algorithm? What are the good points and what are the bad points?
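
In code form, the idea might look something like the following sketch (a hypothetical helper of my own, not Octave's actual range code):

function r = factored_range (a, s, b)
  ## If a/s is a whole number in machine arithmetic, build the
  ## range over integers and scale once by the step at the end.
  k = a / s;
  if (k == fix (k))
    r = (k : 1 : floor (b / s)) * s;   # integer-based ranging
  else
    r = a : s : b;                     # fall back to float ranging
  endif
endfunction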

From what you pointed out, we know

octave-cli:34> (2+eps)/0.1 == 20
ans =  1

so if the user were to input

[2+eps:0.1:10+eps]

the algorithm would treat this as factorable and generate [20:1:100]*0.1. In other words, the user would get the same as if he or she entered

[2:0.1:10]
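
The scale-back in this case is exact, too:

octave-cli:35> ((2+eps)/0.1) * 0.1 == 2
ans =  1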

Is that a bad result? (Personally, I don't think it is a really bad result, because it says that the discrepancy in the limit w.r.t. the number representation is insignificant relative to the step size.) If so, how about the slightly more stringent requirements:

Given [A:s:B], if

  1) A is integer, and
  2) A/s is integer

then [A:s:B] can be implemented as

  [A/s:1:floor(B/s)] * s

where the range operation is integer-based.

The requirements are more stringent than originally posed, but still looser than A integer and s integer, which Octave currently recognizes.
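
For a concrete instance (numbers of my choosing) where both conditions hold:

octave:1> A = 2; s = 0.1; B = 10;
octave:2> A == fix (A)                # condition 1: A is integer
ans =  1
octave:3> (A/s) == fix (A/s)          # condition 2: 2/0.1 rounds to exactly 20
ans =  1
octave:4> r = ((A/s) : 1 : floor (B/s)) * s;
octave:5> r(1) == 2 && r(end) == 10   # endpoints scale back exactly
ans =  1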

Dan


