Re: binocdf inaccuracy in Octave

octave-maintainers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: binocdf inaccuracy in Octave

From:	Daniel J Sebald
Subject:	Re: binocdf inaccuracy in Octave
Date:	Mon, 08 Jul 2013 15:10:06 -0500
User-agent:	Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.24) Gecko/20111108 Fedora/3.1.16-1.fc14 Thunderbird/3.1.16

On 07/08/2013 12:33 PM, Daniel J Sebald wrote:

On 07/08/2013 12:29 PM, Daniel J Sebald wrote:

On 07/08/2013 11:38 AM, Rik wrote:

7/8/13

Dr. Klein,

I think this is actually a much easier problem to solve that it first
appeared. In the the file binocdf.m the formula used to calculate the
CDF is

cdf(k) = 1 - betainc (p, tmp + 1, n - tmp);

According to Wikipedia
(http://en.wikipedia.org/wiki/Binomial_distribution) the CDF for the
binomial distribution is

\textstyle I_{1-p}(n - k, 1 + k)

or

I(1-p, n-k, 1+k)

So it appears that we simply have the arguments wrong to the betainc
function.


But it is also true that

\textstyle I_{x}(a, b) = I_{1-x}(b, a)


Oops, make that

I_{x}(a,b) = 1 - I_{1-x}(b,a)

Dan

http://en.wikipedia.org/wiki/Regularized_incomplete_beta_function#Incomplete_beta_function



In other words, the two expressions you are comparing are mathematically
equivalent. Where is the discrepancy then? Numerical issues?

We should be a little careful here. There was a discrepancy at one ofthe extreme ends of the function. Your modification may have just movedthat discrepancy to the other end of the range, i.e., when the argumentis 1e-3 as opposed to 1-1e-3. And we aren't 100% sure which is correct(but it is looking like Octave because of the strange string of zeros atthe front), Erlang or Octave. Octave is using a FORTRAN routine:


double
betainc (double x, double a, double b)
{
  double retval;
  F77_XFCN (xdbetai, XDBETAI, (x, a, b, retval));
  return retval;
}

So, we should introduce some tests, if possible, that check resultsinherent to Octave's results. There are some tests, which I haven'texplored, as part of the betainc involving "eps".

There doesn't seem to be any tests as part of binocdf that check foraccuracy, mainly just sanity of the inputs. One or more tests along thelines of what Alex referred to should be useful. That is, compute theCDF two ways, one from binocdf and one from cumulating the probabilitymass function


p = 1-1e-3;
pval = binopdf ([0:50]', 50, p);
cumsum (pval) - binocdf ([0:50]', 50, p)

and check those results.  Here is what I'm seeing:

p = 1-1e-3;
pval = binopdf ([0:50]', 50, p);
cumsum (pval) - binocdf ([0:50]', 50, p)

ans =

   1.00000000000007e-150
   4.99510000000035e-146
   1.22260117600006e-141
   1.95424813815765e-137
   2.29399723360430e-133
   2.10841676614637e-129
   1.57977022596915e-125
   9.92031009822021e-122
   5.32697826475548e-118
   2.48350747999872e-114
   1.01724999177793e-110
   3.69550684776946e-107
   1.19987796084329e-103
   3.50394897537906e-100
    9.25151215525190e-97
    2.21822759875234e-93
    4.84771698248160e-90
    9.68615422923810e-87
    1.77409990314262e-83
    2.98511415284120e-80
    4.62253801912459e-77
    6.59738446954312e-74
    8.68836583529609e-71
    1.05672286748147e-67
    1.18770467184289e-64
    1.23406745773931e-61
    1.18550990718313e-58
    1.05282245558334e-55
    8.64033611769894e-53
    6.54884440079011e-50
    4.58011354682620e-47
    2.95231633143266e-44
    1.75142047732177e-41
    9.54507616566026e-39
    4.76856227562884e-36
    2.17814373131045e-33
    9.06845557842062e-31
    3.42871128091703e-28
    1.17213664214074e-25
    3.60414974466429e-23
    9.90534035957436e-21
    2.41464425822922e-18
   -3.79109725646771e-17
    5.02325782207112e-17
   -7.98300326864362e-18
    5.36716049465165e-17
   -1.89714998454670e-17
    2.55465136891897e-18
    1.04083408558608e-17
    1.94289029309402e-16
    4.44089209850063e-16

Rik, that string of zeros at the front of binocdf () has me wondering ifthe issue here is that the FORTRAN routine computing betainc () issingle precision float and not double precision float. Note that thebinopdf.m script uses the following computations:


    pdf(k) = exp (gammaln (n+1) - gammaln (x(k)+1) - gammaln (n-x(k)+1)
                  + x(k)*log (p) + (n-x(k))*log (1-p));

Are these all double precision operations?

Dan

[Prev in Thread]

Current Thread

[Next in Thread]

Re: binocdf inaccuracy in Octave, Rik, 2013/07/08
- Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08
  - Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Daniel J Sebald <=
    - Re: binocdf inaccuracy in Octave, Pascal Dupuis, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Dr. Alexander Klein, 2013/07/09
    - Re: binocdf inaccuracy in Octave, Rik, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08
    - Re: binocdf inaccuracy in Octave, Rik, 2013/07/08
- Re: binocdf inaccuracy in Octave, Daniel J Sebald, 2013/07/08

Prev by Date: Re: default constructor and class
Next by Date: Re: default constructor and class
Previous by thread: Re: binocdf inaccuracy in Octave
Next by thread: Re: binocdf inaccuracy in Octave
Index(es):
- Date
- Thread