Re: [lmi] Implement loading of XML actuarial tables.


From: Greg Chicares
Subject: Re: [lmi] Implement loading of XML actuarial tables.
Date: Tue, 29 May 2012 20:23:01 +0000
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:10.0.2) Gecko/20120216 Thunderbird/10.0.2

On 2012-05-23 15:35Z, Vaclav Slavik wrote:
[...]
> +    inline bool almost_equal_doubles(double a, double b)
> +    {
> +        return std::abs(a - b) < 0.00000001;
> +    }

I'm replacing this with a test that requires |relative error| <= 2.0E-15 .
All tests pass at that tolerance. The curious thing is that some tests fail
with half that tolerance (1.0E-15): value_cast<> should give fifteen digits
of decimal accuracy, so I'd naively expect relative errors within 1.0E-15 .
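
For concreteness, the replacement amounts to something like this (a sketch
only, assuming a symmetric relative-error test; the committed code may differ
in details such as the treatment of zeros):

    #include <algorithm>
    #include <cmath>

    // Sketch: require |a - b| <= 2.0E-15 * max(|a|, |b|),
    // treating a pair of exact zeros as equal.
    inline bool almost_equal_doubles(double a, double b)
    {
        double const scale = std::max(std::abs(a), std::abs(b));
        return 0.0 == scale || std::abs(a - b) <= 2.0E-15 * scale;
    }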

Here's a "failure" using some tight tolerance that I didn't record:

                       0000000001111111        [tens]
                       1234567890123456        [units]
                    0.045513944584838698926    xml
                    0.045513944584838747498    soa
    <value age="39">0.0455139445848387</value> line in xml file

That's really close to the reciprocal of DBL_EPSILON (0.4503599627370496e16)
with the decimal point moved one position to the left. When I saw that, I began
to wonder whether there might be a flaw in value_cast<>, especially because
there's a unit test that fails with como for this most difficult decimal number:

    //                                             000000000111111111
    //                                             123456789012345678
    BOOST_TEST_EQUAL(16, floating_point_decimals(0.450359962737049596));
    BOOST_TEST_EQUAL(16, floating_point_decimals(0.4503599627370495));
    BOOST_TEST_EQUAL(16, floating_point_decimals(0.4503599627370496));
    // TODO ?? Fails for como with mingw, but succeeds with 0.45036 .
    BOOST_TEST_EQUAL(15, floating_point_decimals(0.4503599627370497));
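
For reference, the reciprocal of DBL_EPSILON mentioned above is easy to
confirm (a trivial standalone check, not part of lmi):

    #include <cfloat>
    #include <cstdio>

    int main()
    {
        // 1/DBL_EPSILON is 2^52 = 4503599627370496 exactly; the constants
        // in the test above cluster around that value scaled by 1e-16.
        std::printf("%.17g\n", 1.0 / DBL_EPSILON);
        return 0;
    }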

However, in practice actuarial tables are stored with finite precision, so the
value above is strange:
                       0000000001111111        [tens]
                       1234567890123456        [units]
                    0.045513944584838698926    xml
                    0.045513944584838747498    soa
Now, value_cast<> by design gives only fifteen decimal digits of accuracy for
type double, which would be:
                    0.0455139445848387
and the "xml" value is *closer* to that than the "soa" value. But it's a weird
table in that it stores unrounded numbers, and that's because it's a "sample"
table that I made up: such a thing would never be used in the real world.
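
To make that concrete, here's a tiny standalone check, using printf's %.15g as
a stand-in for value_cast<>'s fifteen-digit conversion (an assumption for
illustration only); the long literals are the values quoted above:

    #include <cstdio>

    int main()
    {
        double const xml = 0.045513944584838698926; // "xml" value above
        double const soa = 0.045513944584838747498; // "soa" value above
        double const d15 = 0.0455139445848387;      // fifteen-digit decimal
        // Both print 0.0455139445848387 at fifteen significant digits:
        std::printf("%.15g\n%.15g\n", xml, soa);
        // The "xml" double is the one nearest that decimal (difference 0);
        // the "soa" double is about 5e-17 away from it:
        std::printf("%g\n%g\n", xml - d15, soa - d15);
        return 0;
    }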

Let's look at another example, which used a 2.0E-16L tolerance (a little
tighter than DBL_EPSILON, so some "failures" are almost assured):
  0.00054111999999999996908 xml: |error| = ...3092
  0.0005411200000000000775  soa: |error| = ...775
  0.0000000000000002003626131884802 relative error
Assuming those values bracket the "true" value of 0.00054112, we'd prefer the
"xml" value because it's more accurate. Why would the less-good "soa" value
ever have been used? Well...this table (sample.dat) dates from 2003 at the
latest, and may indeed be five years older than that...so it's fairly likely
that the tools used to create it were built with a compiler other than gcc,
and a different C runtime.
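
That failing relative error is easy to reproduce from the two doubles alone
(a sketch; the exact denominator convention used in the test isn't recorded
here, but any reasonable choice gives essentially the same figure):

    #include <cmath>
    #include <cstdio>

    int main()
    {
        double const xml = 0.00054111999999999996908;
        double const soa = 0.0005411200000000000775;
        // These doubles differ by about one ulp at this magnitude (~1.1e-19).
        double const rel = std::fabs(xml - soa) / std::fabs(soa);
        std::printf("%.16g\n", rel); // about 2.004e-16, just over 2.0E-16
        return 0;
    }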

(Here are two other value-pairs with an unrecorded tight tolerance:
  0.0005274499999999999458  xml: |error| = ...542
  0.00052745000000000005422 soa: |error| = ...542
  0.0064509499999999995665  xml: |error| = ...4335
  0.0064509500000000004338  soa: |error| = ...4338
In both pairs, the "xml" and "soa" values are about equally close to the "true"
decimal value. I'm not sure those pairs mean anything; I record them here only
because it's a shame to throw away a measurement.)

Anyway, we have the paradoxical result that changing the file format actually
"improves" the accuracy of some regression-test results. Three of the first
four discrepancies in initial target premium seem "better" because they'd
probably be whole-dollar amounts if calculated exactly:
  "xml"                   "soa"
  905                     904.990000000000009095
  4074.98999999999978172  4075
  995000                  994999.989999999990687
  1054700                 1054699.98999999999069
"Three out of four" isn't the strongest statistic, but I've looked at many
other discrepancies and believe this pattern is pervasive: "xml" values are
generally more accurate than "soa".

Apparently the explanation is that the tools used to create the "soa" tables
don't choose the closest floating-point approximations to the intended finite-
precision decimal fractions, whereas conversion through value_cast<> discards
those very slight errors and yields a closer approximation.
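
A small experiment consistent with that explanation, again using %.15g and
strtod() as stand-ins for the conversions involved (an assumption for
illustration): write the slightly-off "soa" double from the 0.00054112 example
to fifteen significant digits and read it back.

    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        double const soa = 0.0005411200000000000775; // "soa" double above
        char buf[32];
        std::snprintf(buf, sizeof buf, "%.15g", soa);
        double const round_tripped = std::strtod(buf, nullptr);
        // The fifteen-digit string is "0.00054112", and reading it back
        // yields the double nearest that decimal--the more accurate "xml"
        // value quoted above.
        std::printf("%s\n%.21g\n", buf, round_tripped);
        return 0;
    }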


