Re: why there are multiple functions for CDF of beta distribution?
From: Vasu Jaganath
Subject: Re: why there are multiple functions for CDF of beta distribution?
Date: Thu, 9 Jan 2020 21:45:10 -0700

After quite a bit of trial and error, I can say fairly confidently that
gsl_cdf_beta_Pinv is worse than scipy.stats.beta.ppf.
For ppf, scipy uses a Smirnov inverse function.
I fiddled with the implementation of gsl_cdf_beta_Pinv quite a bit,
trying different parameters, and tried manually setting the value in the
failing case to x, 0, 1, 0.5, the mean, and so on. Nothing worked
(basically I get an FPE at run time).
This is sadly inadequate for my purposes, and I don't want to dig deep
and implement a better algorithm for the inverse function.
For now I will calculate the CDF numerically and use linear interpolation
to compute its inverse. That's a pretty pedestrian way to go
about it.
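
A minimal sketch of that tabulate-and-interpolate approach, in pure Python
so it runs without GSL or scipy. It assumes a >= 1 and b >= 1 (so the
density is finite on [0, 1]) and uses a trapezoid rule on a uniform grid.
The names beta_pdf, build_cdf_table and ppf_interp are made up for
illustration; they are not GSL or scipy functions.

```python
import math
from bisect import bisect_left


def beta_pdf(x, a, b):
    """Beta(a, b) density. Assumes a >= 1 and b >= 1, so it is finite on [0, 1]."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1) * (1.0 - x) ** (b - 1)


def build_cdf_table(a, b, n=2000):
    """Tabulate the CDF on a uniform grid of n intervals (trapezoid rule)."""
    h = 1.0 / n
    xs = [i / n for i in range(n + 1)]
    pdf = [beta_pdf(x, a, b) for x in xs]
    cdf = [0.0]
    for i in range(1, n + 1):
        cdf.append(cdf[-1] + 0.5 * h * (pdf[i - 1] + pdf[i]))
    total = cdf[-1]
    # Renormalize so the table ends at exactly 1.0 despite quadrature error.
    return xs, [c / total for c in cdf]


def ppf_interp(p, xs, cdf):
    """Approximate inverse CDF by linear interpolation in the table."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    i = bisect_left(cdf, p)  # first index with cdf[i] >= p, so i >= 1 here
    c0, c1 = cdf[i - 1], cdf[i]
    t = (p - c0) / (c1 - c0)
    return xs[i - 1] + t * (xs[i] - xs[i - 1])


# Example: approximate median of Beta(2, 3).
xs, cdf = build_cdf_table(2.0, 3.0)
median = ppf_interp(0.5, xs, cdf)
print(median)
```

A uniform grid like this is coarse in the extreme tails. For p very close
to 1, tabulating the upper tail Q = 1 - P separately is probably the safer
choice, for the precision reasons Martin describes.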
I am open to other suggestions.
Thanks,
Vasu
On Thu, Jan 9, 2020 at 6:30 PM Vasu Jaganath <address@hidden>
wrote:
> Martin and others,
>
> Following up, I have a particular example for which both the Q and P
> variants of the inverse functions fail,
>
> whereas scipy's beta.ppf (equivalent to gsl_cdf_beta_Pinv) does the right
> thing and converges to 1 or the nearest double.
>
> I am attaching a very simple example with exactly the same data: the gsl
> function fails, whereas the scipy function does the right thing.
>
> Any thoughts? Any workarounds? Maybe there is a way I can specify
> convergence criteria?
>
> Thanks,
> Vasu
>
>
> On Wed, Jan 8, 2020 at 12:42 PM Vasu Jaganath <address@hidden>
> wrote:
>
>> Thanks Martin,
>>
>> I will test it out.
>>
>> On Tue, Jan 7, 2020 at 11:16 PM Martin Jansche <address@hidden>
>> wrote:
>>
>>> There are many more floating point values between 0.0 and 0.001 than
>>> there are between 0.999 and 1.0. The difference between 1.0 and the next
>>> smaller double value is only around 1e-16, but the next larger normal
>>> double value after 0.0 is about 2e-308 (subnormals go smaller still).
>>> So beta_P(0.9, 1, 17) will necessarily be equal to 1.0 due to lack of
>>> precision, whereas beta_Q(0.9, 1, 17) will be 1e-17. (Haven't tried
>>> this in GSL. You may want to try and report back.)
>>>
>>> On Wed, Jan 8, 2020 at 1:35 AM Vasu Jaganath <address@hidden>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> This is probably a very silly question.
>>>>
>>>> I don't understand why there are two separate P and Q variants for
>>>> the CDFs, particularly for the beta distribution.
>>>>
>>>>
>>>> https://www.gnu.org/software/gsl/doc/html/randist.html#the-beta-distribution
>>>>
>>>> Thanks,
>>>> Vasu
>>>>
>>>
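
Martin's point about the spacing of doubles (quoted above) can be checked
without GSL. A small sketch, assuming Python and using the closed form for
Beta(1, b), where P(x) = 1 - (1 - x)^b and Q(x) = (1 - x)^b. The helper
names beta_1_b_P and beta_1_b_Q are made up for illustration and are not
GSL or scipy functions.

```python
import math
import sys


def beta_1_b_P(x, b):
    """Lower tail of Beta(1, b), closed form: P(x) = 1 - (1 - x)**b."""
    return 1.0 - (1.0 - x) ** b


def beta_1_b_Q(x, b):
    """Upper tail of Beta(1, b), closed form: Q(x) = (1 - x)**b."""
    return (1.0 - x) ** b


p = beta_1_b_P(0.9, 17)
q = beta_1_b_Q(0.9, 17)

# The lower tail rounds to exactly 1.0; the upper tail keeps the information.
print("P == 1.0:", p == 1.0)
print("Q:", q)

# Spacing of doubles just below 1.0 versus the smallest normal double.
print("gap below 1.0:", sys.float_info.epsilon / 2)
print("smallest normal double:", sys.float_info.min)
```

This is why an inverse taken from P cannot tell apart inputs this close to
1, while one taken from Q still can.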