octave-maintainers
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [OctDev] Fwd: Re: Problems with Statistics package


From: Arno Onken
Subject: Re: [OctDev] Fwd: Re: Problems with Statistics package
Date: Thu, 31 May 2012 11:11:12 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.4) Gecko/20120510 Icedove/10.0.4

On Wed, 30 May 2012 11:25:14, marco atzeri wrote:
> Hi Hannes, octave-forge mailing list is the right place for bugs and
> patches on the statistics package

anova.m is not part of the Octave Forge Statistics package. It is part
of Octave core. I am cc'ing this mail along with the patch to the Octave
maintainers list.

Arno

> -------- Original Message --------
> Subject: Re: Problems with Statistics package
> Date: Wed, 30 May 2012 11:06:42 +0200
> From: Hannes <address@hidden>
> To: address@hidden
>
> Ok, I think I got it pinned downed and solved. I have attached a patch.
>
> Heres the explenation:
>
>>  total_mean = mean (y(:));
>>  SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
> is always positive [sum of squares between]
>
>>  SST = sumsq (reshape (y, n, 1) - total_mean);
> is always positive and >= SSB [sum of squares total]
>
>>  SSW = SST - SSB;
> => should always be >=0
> but it is an ill conditioned operation if SST==SSB resulting in
> possibly negative error. Note that if the error is positive, it
> doesn't matter because it will produce f == Inf and p = 0 as expected.
>
>>  df_b = k - 1;
>>  df_w = n - k;
>>  v_b = SSB / df_b;
>>  v_w = SSW / df_w;
> this can become negative, if the SSW is negative or 0 if SSW is zero
> or close to zero.
>
>
>>  f = v_b / v_w;
> this results in divide by zero if v_w is zero and results in *wrong
> result without warning* if v_w is negative.
>
>
> My patch changes this whole block to:
>
>    SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
>    SST = sumsq (reshape (y, n, 1) - total_mean);
>    SSW = SST - SSB;
>    if (SSW < 0) # machine error if SSB == SSW
>      SSW = 0;
>    endif
>    df_b = k - 1;
>    df_w = n - k;
>    v_b = SSB / df_b;
>    v_w = SSW / df_w;
>    if (v_w != 0)
>      f = v_b / v_w;
>      pval = 1 - fcdf (f, df_b, df_w);
>    elseif (v_b == 0)
>      f = NaN;
>      pval = 1;
>
> 0 / 0 is  NaN and the hypothesis mus be rejected
>
>    else
>      f = Inf;
>      pval = 0;
>
> x / 0 is Inf and hypothesis must be accepted (because there is no
> variance inside the sample but there is outside
>
>    endif
>
> HTH,
> Hannes
>

Attachment: patch_anova.m
Description: Text Data


reply via email to

[Prev in Thread] Current Thread [Next in Thread]