Re: Problems with Statistics package
From: Hannes
Subject: Re: Problems with Statistics package
Date: Wed, 30 May 2012 10:26:14 +0200
User-agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4)
There is an overflow:
octave:4> anova(samples,grps)
warning: division by zero
Note that I do not get a division by 0 warning on this example. But I
do get quite a few division by zero warnings on other examples (always
when the sample values are all identical). Because some of the results
seemed right, I just disabled the warnings, but I don't feel I can
trust the results at all right now.
One-way ANOVA Table:

Source of Variation   Sum of Squares    df   Empirical Var
*********************************************************
Between Groups              944.4969     1        944.4969
Within Groups                 0.0000   159          0.0000
---------------------------------------------------------
Total                       944.4969   160

Test Statistic f    Inf
p-value             0.0000

ans = 0
At least you get the desired result (pval==0), while I get the opposite...
I have looked into the source code, which states the following:
total_mean = mean (y(:));
SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
SST = sumsq (reshape (y, n, 1) - total_mean);
SSW = SST - SSB;
df_b = k - 1;
df_w = n - k;
v_b = SSB / df_b;
v_w = SSW / df_w;
f = v_b / v_w;
pval = 1 - fcdf (f, df_b, df_w);
Now the place where the problematic division takes place is
f = v_b / v_w
which divides the between-group variance by the within-group variance.
This is the right thing to do according to Wikipedia, but of course the
within-group variance can easily be 0, for example when all sample
values within each group are identical.
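
To make the failing division concrete, here is a small sketch that repeats the quoted computation on made-up data (two groups, each holding identical values). The vectors y and g and the manual group statistics are my own illustration and are not taken from the package:

```octave
% Made-up data: every value inside a group is identical
y = [1 1 1 2 2 2]';          % samples
g = [1 1 1 2 2 2]';          % group labels
n = numel (y);               % total number of samples
k = 2;                       % number of groups

group_count = [sum(g == 1), sum(g == 2)];
group_mean  = [mean(y(g == 1)), mean(y(g == 2))];

% Same formulas as in the quoted source
total_mean = mean (y(:));
SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
SST = sumsq (reshape (y, n, 1) - total_mean);
SSW = SST - SSB;             % exactly 0 here
v_b = SSB / (k - 1);
v_w = SSW / (n - k);         % 0 / 4 = 0
f   = v_b / v_w;             % nonzero / 0 gives Inf
```

If all six values were the same, v_b would be 0 as well and the division would become 0 / 0, which evaluates to NaN instead of Inf.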
Now from my statistical intuition the following results are expected:
v_b != 0 && v_w == 0 -> pval = 0
v_b == 0 && v_w == 0 -> pval = 1
But I am no expert on this... also I don't know what to do if v_w is
really close to 0 but not 0.
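
One possible way to guard the division is sketched below. It is only a sketch under assumptions of mine: the function name guarded_anova_pval, the tolerance of 100 * eps * SST, and the choice to return NaN when both variances vanish are all hypothetical and not part of anova.m. For the case v_b != 0 with v_w == 0 it follows the arithmetic limit (f goes to Inf, so 1 - fcdf gives 0), which is what the table quoted above shows. The regular branch uses the core function betainc for the upper tail of the F distribution, which is mathematically the same quantity as 1 - fcdf (f, df_b, df_w).

```octave
1;  % mark this file as a script so the function can be defined inline

function pval = guarded_anova_pval (SSB, SST, df_b, df_w)
  % Hypothetical helper, not part of Octave's anova.m.
  tol = 100 * eps * SST;       % assumed threshold for "numerically zero"
  SSW = max (SST - SSB, 0);    % clamp negative roundoff from the subtraction
  if (SSB <= tol && SSW <= tol)
    pval = NaN;                % 0/0: undefined; a caller could map this to 1
  elseif (SSW <= tol)
    pval = 0;                  % f -> Inf, upper tail probability is 0
  else
    f = (SSB / df_b) / (SSW / df_w);
    % P(F > f), equal to 1 - fcdf (f, df_b, df_w)
    pval = betainc (df_w / (df_w + df_b * f), df_w / 2, df_b / 2);
  endif
endfunction

% Values from the table above: SSB equals SST, so SSW is 0
p_table = guarded_anova_pval (944.4969, 944.4969, 1, 159);
```

Whether a relative tolerance like this is appropriate for values of v_w that are close to 0 but not 0 is exactly the open question; the threshold here is a placeholder.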
Does anyone know more?
octave:5> version
ans = 3.6.1
I updated to 3.6.1 as well, no change.
Best,
Hannes