pspp-dev

Re: struct covariance


From: John Darrington
Subject: Re: struct covariance
Date: Thu, 2 Jun 2011 06:59:03 +0000
User-agent: Mutt/1.5.18 (2008-05-17)

Hi Jason, see my inline comments below:

On Wed, Jun 01, 2011 at 05:53:24PM -0400, Jason Stover wrote:
     I'm working on the type 3 sums of squares and have a question
     about the covariance struct.
     
     First: my data look like this:
     
     data list list / v0 to v2.
     begin data
     3.2 1 1
     3.1 1 1
     3.3 1 2
     3.4 1 2
     3.2 1 3
     3.3 1 3
     3.3 1 4
     3.2 1 4
     2.8 2 1
     2.9 2 1
     3.3 2 2
     3.0 2 2
     3.1 2 3
     3.2 2 3
     3.2 2 4
     3.1 2 4
     end data
     GLM v0 by v1 v2
         /INTERCEPT = include.
     
     Currently in run_glm we have:
     
     struct covariance *cov = covariance_2pass_create (cmd->n_dep_vars, cmd->dep_vars,...);
     
     Then later we have
     
         gsl_matrix *cm = covariance_calculate_unnormalized (cov);
     
     Later, when I need to find dimensions for another matrix, cov tells me
     there is only 1 variable represented in the struct, but cm->size1 is
     5. Shouldn't it be 1, based on the above call to
     covariance_2pass_create?

5 is what I would expect in this example.
You have 1 dependent variable (v0) and two categorical variables (v1 and v2).
v1 has 2 distinct values and v2 has 4 distinct values.  Therefore, the size of
the covariance matrix is 1 + (2 - 1) + (4 - 1) = 5.
     
     Anyway, I created a list of all the variables (dependent and explanatory)
     and did this:
     
     struct covariance *cov = covariance_2pass_create (n_vars, vars,...);
     
     vars contains v0, v1 and v2. And now, cm->size1 is 7. But shouldn't it be
     3?

Assuming that you left the third argument to covariance_2pass_create as it
currently is in glm.c, then 7 is also what I would expect.  You now have 3
dependent variables and the two categorical variables as before.  Hence the
size of the covariance matrix is now 3 + (2 - 1) + (4 - 1) = 7.
     
     If I'm missing something, please let me know. I need to find a way to
     correctly count the number of dimensions in the covariance matrix,
     either from the struct or cm.
     
There is no way to tell from the covariance matrix alone how many dependent
variables are involved, but that number is simply n_vars, which you already
have.  If you want the number of columns which relate to the categorical
variables, then you can call categoricals_total.

In general, cm->size1 = n_vars + the value returned by categoricals_total.
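
To make the arithmetic concrete, here is a minimal sketch of that count.
It assumes, as in the two examples above, that each categorical variable
contributes (number of distinct values - 1) columns.  The function
expected_cov_dim and its parameters are hypothetical names used only for
illustration; they are not part of PSPP.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PSPP.  Returns the expected number of
   rows (equivalently, columns) of the covariance matrix for N_VARS
   ordinary variables plus N_CATVARS categorical variables, where
   N_CATS[i] is the number of distinct values of the i-th categorical
   variable.  Assumes each categorical variable contributes
   (distinct values - 1) columns. */
size_t
expected_cov_dim (size_t n_vars, const size_t *n_cats, size_t n_catvars)
{
  size_t dim = n_vars;
  for (size_t i = 0; i < n_catvars; i++)
    {
      /* A categorical variable must have at least one distinct value. */
      assert (n_cats[i] >= 1);
      dim += n_cats[i] - 1;
    }
  return dim;
}
```

With category counts {2, 4} for v1 and v2, expected_cov_dim (1, ...)
gives 5 and expected_cov_dim (3, ...) gives 7, matching the two values of
cm->size1 you observed.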

From the description of your problem, it sounds as if you need to pass some
additional parameters to your routine: either n_vars or the struct
categoricals (or both).

(Note that at some point I plan to extend the notion of "categoricals" to
include interactions.  But for now, let's concentrate on GLM examples which
have no interactions.)

I hope this makes sense.

John

-- 
PGP Public key ID: 1024D/2DE827B3 
fingerprint = 8797 A26D 0854 2EAB 0285  A290 8A67 719C 2DE8 27B3
See http://pgp.mit.edu or any PGP keyserver for public key.
