pspp-dev

interactions


From: Jason Stover
Subject: interactions
Date: Tue, 25 Nov 2008 16:28:40 -0500

I've been thinking about how to code interactions for GLM.
An "interaction", in terms of a linear model, can be thought
of as another variable whose values are combinations of values
of other variables. For example:

   var1  var2       interaction
   a     c          ac
   a     d          ad
   b     c          bc
   b     d          bd

An interaction should be represented in the covariance
matrix. Therefore its values need to be computed during the data pass
that creates the covariance matrix.
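
To make the idea concrete, here is a rough sketch of assigning each distinct
combination of two categorical values its own level as rows are seen during
a single pass. Everything in it (struct pair_level, struct pair_table,
pair_table_index, MAX_LEVELS) is a hypothetical name made up for
illustration, not an existing PSPP type or function, and the linear search
stands in for whatever hash the real code would use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEVELS 64   /* Hypothetical cap, for this sketch only. */

/* One observed combination of values of two categorical variables.
   The strings are not copied, so the caller must keep them alive. */
struct pair_level
  {
    const char *v1;
    const char *v2;
  };

/* Hypothetical table of the distinct combinations seen so far. */
struct pair_table
  {
    struct pair_level levels[MAX_LEVELS];
    size_t n;
  };

/* Returns the index of the combination (V1, V2) in T, adding it
   if it has not been seen before.  The two values are compared
   separately, never as a single concatenated string. */
static size_t
pair_table_index (struct pair_table *t, const char *v1, const char *v2)
{
  size_t i;

  for (i = 0; i < t->n; i++)
    if (!strcmp (t->levels[i].v1, v1) && !strcmp (t->levels[i].v2, v2))
      return i;

  assert (t->n < MAX_LEVELS);
  t->levels[t->n].v1 = v1;
  t->levels[t->n].v2 = v2;
  return t->n++;
}
```

Calling pair_table_index once per case with the values of var1 and var2
would give the four rows of the table above four distinct levels, and a
repeated combination would get back the level it was given the first time.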

So I'm wondering how to compute the interactions. I thought of three
approaches:

1. Extend the covariance matrix struct to hold the interaction. I'm
not sure whether the interaction should be stored as a pointer to a
struct variable. Maybe the best way here is to make an interaction
structure, and an extension of the covariance matrix struct that
knows about interactions.

2. Append the variables to "the data" during the data pass, without
adding the new variables and values to the dictionary. In this
approach, I was thinking of just creating the new variables and values
during the data pass, after calling case_data(), then storing them in
the covariance hash. They would then disappear after the data pass,
leaving only the needed information in the covariance matrix
hash. This approach has the advantage of not requiring any big change
to the covariance matrix struct. It has the possible disadvantage of
requiring me to make a variable that isn't in the dictionary and
compute its values. Are there functions to support this kind of thing
in src/data or src/libpspp (I mean without putting them in any
permanent place)? I have a vague memory of someone telling me there
were such functions, but that may be a false memory. One possible
problem with this approach is collisions when computing the
interaction's values. For example:

           var1   var2  interaction
           aa     c     aac
           a      ac    aac /* oops */

Then there's:

3. That much better approach that I didn't think of. 

If anyone can explain number 3 to me, please do. Otherwise, which sounds
best: 1 or 2?
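
For what it's worth, the collision in approach 2 comes from plain
concatenation losing the boundary between the two values. One possible
workaround, sketched below, is to prefix the key with the length of the
first value. The function names naive_key and prefixed_key are made up
for this example and are not existing PSPP functions; this is only a
sketch, assuming the values are NUL-terminated strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Naive key: plain concatenation.  ("aa", "c") and ("a", "ac")
   both yield "aac". */
static void
naive_key (char *buf, size_t size, const char *v1, const char *v2)
{
  int n = snprintf (buf, size, "%s%s", v1, v2);
  assert (n >= 0 && (size_t) n < size);   /* Key must fit in BUF. */
}

/* Length-prefixed key: the length of V1 is written first, so the
   boundary between the two values can always be recovered and two
   different pairs cannot produce the same key. */
static void
prefixed_key (char *buf, size_t size, const char *v1, const char *v2)
{
  int n = snprintf (buf, size, "%zu:%s%s", strlen (v1), v1, v2);
  assert (n >= 0 && (size_t) n < size);   /* Key must fit in BUF. */
}
```

With this scheme the two rows in the collision example get the keys
"2:aac" and "1:aac". Keeping the component values separate, as an
interaction structure in approach 1 presumably would, sidesteps the
problem without any string encoding at all.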

-Jason



