[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [OctDev] Fwd: Re: Problems with Statistics package
From: |
Arno Onken |
Subject: |
Re: [OctDev] Fwd: Re: Problems with Statistics package |
Date: |
Thu, 31 May 2012 11:11:12 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:10.0.4) Gecko/20120510 Icedove/10.0.4 |
On Wed, 30 May 2012 11:25:14, marco atzeri wrote:
> Hi Hannes, octave-forge mailing list is the right place for bugs and
> patches on the statistics package
anova.m is not part of the Octave Forge Statistics package. It is part
of Octave core. I am cc'ing this mail along with the patch to the Octave
maintainers list.
Arno
> -------- Original Message --------
> Subject: Re: Problems with Statistics package
> Date: Wed, 30 May 2012 11:06:42 +0200
> From: Hannes <address@hidden>
> To: address@hidden
>
> Ok, I think I got it pinned downed and solved. I have attached a patch.
>
> Heres the explenation:
>
>> total_mean = mean (y(:));
>> SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
> is always positive [sum of squares between]
>
>> SST = sumsq (reshape (y, n, 1) - total_mean);
> is always positive and >= SSB [sum of squares total]
>
>> SSW = SST - SSB;
> => should always be >=0
> but it is an ill conditioned operation if SST==SSB resulting in
> possibly negative error. Note that if the error is positive, it
> doesn't matter because it will produce f == Inf and p = 0 as expected.
>
>> df_b = k - 1;
>> df_w = n - k;
>> v_b = SSB / df_b;
>> v_w = SSW / df_w;
> this can become negative, if the SSW is negative or 0 if SSW is zero
> or close to zero.
>
>
>> f = v_b / v_w;
> this results in divide by zero if v_w is zero and results in *wrong
> result without warning* if v_w is negative.
>
>
> My patch changes this whole block to:
>
> SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
> SST = sumsq (reshape (y, n, 1) - total_mean);
> SSW = SST - SSB;
> if (SSW < 0) # machine error if SSB == SSW
> SSW = 0;
> endif
> df_b = k - 1;
> df_w = n - k;
> v_b = SSB / df_b;
> v_w = SSW / df_w;
> if (v_w != 0)
> f = v_b / v_w;
> pval = 1 - fcdf (f, df_b, df_w);
> elseif (v_b == 0)
> f = NaN;
> pval = 1;
>
> 0 / 0 is NaN and the hypothesis mus be rejected
>
> else
> f = Inf;
> pval = 0;
>
> x / 0 is Inf and hypothesis must be accepted (because there is no
> variance inside the sample but there is outside
>
> endif
>
> HTH,
> Hannes
>
patch_anova.m
Description: Text Data
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: [OctDev] Fwd: Re: Problems with Statistics package,
Arno Onken <=