[Bug-gsl] Division by zero in "gnewton" when "f" and "fdf" differ

bug-gsl

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bug-gsl] Division by zero in "gnewton" when "f" and "fdf" differ

From:	Curran D. Muhlberger
Subject:	[Bug-gsl] Division by zero in "gnewton" when "f" and "fdf" differ
Date:	Sun, 27 Apr 2014 14:37:45 -0400

Hello GSL team,

The "gnewton" multiroot solver has a possibility of dividing by zero if the
provided functions "f" and "fdf" do not produce bitwise-identical output
given the same "x".  This results either in non-convergence with a result
of "-nan" or, when common floating point exceptions are trapped, a floating
point exception.

The division takes place on line 177 of "multiroots/gnewton.c".  This line
will be executed with "phi0=0" if the first call to "fdf" lands precisely
on the root, but the subsequent call to "f" at the same "x" yields a
residual greater than zero.  (The bug may be triggered in other
circumstances as well, but this is the case we encountered in our code.)

Why would "f" and "fdf" produce different results?  The manual states that
"fdf" provides an optimization by computing both "f(x)" and "J(x)"
simultaneously.  Some optimizations (such as subexpression elimination) can
lead to roundoff differences.  More generally, with optimizing compilers
and new instruction sets, it is difficult to guarantee bitwise-identical
results.  In this case, it was FMA instructions (both FMA4 and FMA3) that
triggered the bug, but we have also encountered reproducibility problems
related to AVX alignment.  For the sake of example, here is the
optimization that yielded roundoff differences with FMA:

    const double f0 = w*w - c1_/(h*h) - 1.0;

vs.

    const double w2 = w*w;
    const double h2 = h*h;
    const double f0 = w2 - c1_/h2 - 1.0;


I have attached a small program (bug_gnewton.c) that triggers this bug,
based on the manual's example code for multiroot solvers.  Currently, it
fails either with:

    Type: gnewton
      status = the iteration has not converged yet
      root = -nan

or, if GSL_IEEE_MODE="trap-common", with:

    GSL_IEEE_MODE="trap-common"
    Type: gnewton
    Floating point exception (core dumped)

The expected output is:

    Type: gnewton
      status = success
      root = 1.000000e+00


I can work around this bug for now (by having "fdf" defer to "f" and "df"),
but it would be great if "gnewton" could be made more robust.  We'd hate
for other clients to trigger this halfway into a multi-month simulation,
and unfortunately the other multiroot solvers have their own
division-by-zero problems (bug report coming soon).  Thank you for looking
into this, and if you have any questions, just let me know.

 - Curran

bug_gnewton.c
Description: Text Data

[Prev in Thread]

Current Thread

[Next in Thread]

[Bug-gsl] Division by zero in "gnewton" when "f" and "fdf" differ, Curran D. Muhlberger <=

Prev by Date: [Bug-gsl] Failed with output: Hydra job gnu:gsl-master:tarball on x86_64-linux
Next by Date: [Bug-gsl] Division by zero in "hybrid*" when initial guess is root
Previous by thread: [Bug-gsl] Failed with output: Hydra job gnu:gsl-master:tarball on x86_64-linux
Next by thread: [Bug-gsl] Division by zero in "hybrid*" when initial guess is root
Index(es):
- Date
- Thread