bug-gsl
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bug-gsl] Hypergeometric distribution


From: Jussi Piitulainen
Subject: [Bug-gsl] Hypergeometric distribution
Date: 08 Mar 2004 15:35:09 +0200
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

This concerns mainly the documentation but also the actual behaviour
of gsl_ran_hypergeometric_pdf in GSL version 1.4.

The info page says p(k) = C(n_1,k) C(n_2, t-k) / C(n_1 + n_2,k). It
does not tell what the parameters are, but I believe the very last
symbol is wrong, and p(k) = C(n_1,k) C(n_2, t-k) / C(n_1 + n_2,t) is
meant. It also says the domain of k is max(0,t-n_2),...,max(t,n_1).
If "domain" means the possible values, I believe 0,...,min(t,n_1) is
better. (No definition is given for C(n_2, t-k) with t < k.)

Also, the documentation mentions an N3 where T must be meant.

Here are the values for the given expression, with N1 = 10, N2 = 20
and T = 5, computed exactly with a Scheme program and then made
floating point. Note that p(0) to p(3) are not probabilities.

(/ (* (bin 10 0) (bin 20 (- 5 0))) (bin (+ 10 20) 0)) = 15504.0
(/ (* (bin 10 1) (bin 20 (- 5 1))) (bin (+ 10 20) 1)) = 1615.0
(/ (* (bin 10 2) (bin 20 (- 5 2))) (bin (+ 10 20) 2)) = 117.93103448275862
(/ (* (bin 10 3) (bin 20 (- 5 3))) (bin (+ 10 20) 3)) = 5.615763546798029
(/ (* (bin 10 4) (bin 20 (- 5 4))) (bin (+ 10 20) 4)) = 0.1532567049808429
(/ (* (bin 10 5) (bin 20 (- 5 5))) (bin (+ 10 20) 5)) = 0.0017683465959328027
(/ (* (bin 10 6) (bin 20 (- 5 6))) (bin (+ 10 20) 6)) = 0.0
(/ (* (bin 10 7) (bin 20 (- 5 7))) (bin (+ 10 20) 7)) = 0.0
(/ (* (bin 10 8) (bin 20 (- 5 8))) (bin (+ 10 20) 8)) = 0.0
(/ (* (bin 10 9) (bin 20 (- 5 9))) (bin (+ 10 20) 9)) = 0.0
(/ (* (bin 10 10) (bin 20 (- 5 10))) (bin (+ 10 20) 10)) = 0.0

Here are the values for the expression that I think is meant. These
seem good to me.

(/ (* (bin 10 0) (bin 20 (- 5 0))) (bin (+ 10 20) 5)) = 0.10879541914024672
(/ (* (bin 10 1) (bin 20 (- 5 1))) (bin (+ 10 20) 5)) = 0.339985684813271
(/ (* (bin 10 2) (bin 20 (- 5 2))) (bin (+ 10 20) 5)) = 0.35998484274346343
(/ (* (bin 10 3) (bin 20 (- 5 3))) (bin (+ 10 20) 5)) = 0.1599932634415393
(/ (* (bin 10 4) (bin 20 (- 5 4))) (bin (+ 10 20) 5)) = 0.029472443265546714
(/ (* (bin 10 5) (bin 20 (- 5 5))) (bin (+ 10 20) 5)) = 0.0017683465959328027
(/ (* (bin 10 6) (bin 20 (- 5 6))) (bin (+ 10 20) 5)) = 0.0
(/ (* (bin 10 7) (bin 20 (- 5 7))) (bin (+ 10 20) 5)) = 0.0
(/ (* (bin 10 8) (bin 20 (- 5 8))) (bin (+ 10 20) 5)) = 0.0
(/ (* (bin 10 9) (bin 20 (- 5 9))) (bin (+ 10 20) 5)) = 0.0
(/ (* (bin 10 10) (bin 20 (- 5 10))) (bin (+ 10 20) 5)) = 0.0

However, the actual gsl_ran_hypergeometric_pdf seems to return
something else altogether, neither as documented nor what I think it
should.

kivilampi-20$ make
cc -I /home/jpiitula/local/include roska.c -static \
-L /home/jpiitula/local/lib -lgsl -lgslcblas -lm -o roska
kivilampi-20$ ./roska
gsl_ran_hypergeometric_pdf(0, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(1, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(2, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(3, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(4, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(5, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(6, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(7, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(8, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(9, 10, 20, 5) = 2.803496e-309
gsl_ran_hypergeometric_pdf(10, 10, 20, 5) = 2.803496e-309

Here is the program. I must be doing something very stupid, but I do
not see what it is:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <gsl/gsl_cdf.h>

int
main (void) {
  unsigned int N1 = 10;
  unsigned int N2 = 20;
  unsigned int T  = 5;
  unsigned int K;
  for (K = 0 ; K <= 10 ; ++ K) {
    printf("gsl_ran_hypergeometric_pdf(%u, %u, %u, %u)"
           " = %e\n",
           K, N1, N2, T,
           gsl_ran_hypergeometric_pdf(K, N1, N2, T));
  }
}

Here is the system:

gsl-1.4 configured with --prefix=~/local
i686
uname -rs: Linux 2.4.20-30.9.cslsmp
cc --version: cc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5)




reply via email to

[Prev in Thread] Current Thread [Next in Thread]