Re: contribution and cvs

From: Jason Stover
Subject: Re: contribution and cvs
Date: Thu, 1 Sep 2005 19:11:44 +0000
User-agent: Mutt/

I was fixing the recoding for the categorical variables and
have a question about auxiliary data. Ben wrote:

> I think that you might have missed one of the mechanisms that
> PSPP has for adding `hooks' into `struct variable'.  The `aux'
> and `aux_dtor' members of `struct variable' are set aside for
> procedures to tag variables with auxiliary data.  The idea is
> that you attach a structure with var_attach_aux(), then access it
> as needed.  Later you can clear the aux data with
> var_clear_aux(), or you can clear it from the entire dictionary
> with dict_clear_aux().

I would like to append the recoded binary vectors to the
corresponding variable structure, but the comment in var-atr.c says:

/* Assign auxiliary data AUX to variable V, which must not
   already have auxiliary data.  Before V's auxiliary data is
   cleared, AUX_DTOR(V) will be called. */

So I recommend we add a gsl_matrix * with binary entries to the
definition of struct variable. This gsl_matrix * would hold the recoded
values permanently. Here is why I want to do this:

Recoding a categorical variable's values as vectors with binary
entries is a basic necessity for most statistical procedures that
use categorical data. PSPP must make one pass over the data to recode
those values, so it would be nice if the struct variable held those
binary vectors even after the procedure that created them exits, thereby
making the vectors available to the next procedure. There would be one
binary vector per distinct value.
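
To make the recoding concrete, here is a minimal sketch of the idea in
plain C. It is not the code in PSPP: struct indicator_matrix and the
indicator_matrix_* functions are hypothetical names made up for this
example, and a plain byte array stands in for the gsl_matrix so the
sketch has no dependencies. It makes one pass over the values to find
the distinct ones, then fills one binary column per distinct value.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container, not a PSPP type.  One row per case, one
   column per distinct value, entries 0 or 1, stored row-major. */
struct indicator_matrix
  {
    size_t n_rows;        /* Number of cases. */
    size_t n_cols;        /* Number of distinct values. */
    double *levels;       /* Distinct values, in order of first appearance. */
    unsigned char *data;  /* n_rows * n_cols binary entries. */
  };

/* Returns the index of VALUE among the first N_LEVELS entries of
   LEVELS, or N_LEVELS if VALUE is not there.  Exact comparison is
   assumed to be acceptable because the values are category codes. */
static size_t
find_level (const double *levels, size_t n_levels, double value)
{
  size_t i;

  for (i = 0; i < n_levels; i++)
    if (levels[i] == value)
      return i;
  return n_levels;
}

/* Frees M and everything it owns.  M may be NULL. */
void
indicator_matrix_destroy (struct indicator_matrix *m)
{
  if (m != NULL)
    {
      free (m->levels);
      free (m->data);
      free (m);
    }
}

/* Recodes the N values in VALUES as binary indicator columns.
   Returns NULL if memory cannot be allocated. */
struct indicator_matrix *
indicator_matrix_create (const double *values, size_t n)
{
  struct indicator_matrix *m;
  size_t *col;
  size_t i;

  assert (values != NULL && n > 0);

  m = calloc (1, sizeof *m);
  col = malloc (n * sizeof *col);
  if (m == NULL || col == NULL)
    goto error;
  m->levels = malloc (n * sizeof *m->levels);
  if (m->levels == NULL)
    goto error;
  m->n_rows = n;

  /* Single pass over the data: collect the distinct values and
     remember which column each case belongs to. */
  for (i = 0; i < n; i++)
    {
      size_t j = find_level (m->levels, m->n_cols, values[i]);
      if (j == m->n_cols)
        m->levels[m->n_cols++] = values[i];
      col[i] = j;
    }

  /* Fill step: every entry starts at 0, then one 1 per row. */
  m->data = calloc (n * m->n_cols, sizeof *m->data);
  if (m->data == NULL)
    goto error;
  for (i = 0; i < n; i++)
    m->data[i * m->n_cols + col[i]] = 1;

  free (col);
  return m;

 error:
  free (col);
  indicator_matrix_destroy (m);
  return NULL;
}
```

For the values 3, 1, 3, 2 this gives three columns (for 3, 1 and 2, in
that order) and four rows, each with a single 1.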

But, by the comment above, v->aux can hold the binary vectors only until
someone else needs to hold other auxiliary data.
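
A toy model of that single-slot behaviour, as I read the comment, is
below. The names toy_variable, toy_attach_aux, toy_clear_aux and
toy_free_aux are hypothetical stand-ins invented for this sketch; they
are not PSPP's actual definitions, and the real struct variable has
many more members.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for the two members under discussion. */
struct toy_variable
  {
    void *aux;                                  /* Auxiliary data, or NULL. */
    void (*aux_dtor) (struct toy_variable *);   /* Called before clearing. */
  };

/* Attaches AUX to V.  V must not already have auxiliary data, so a
   second procedure cannot attach its own data until the first
   procedure's data has been cleared. */
void
toy_attach_aux (struct toy_variable *v, void *aux,
                void (*aux_dtor) (struct toy_variable *))
{
  assert (v->aux == NULL);
  assert (aux != NULL);
  v->aux = aux;
  v->aux_dtor = aux_dtor;
}

/* Clears V's auxiliary data, calling the destructor first. */
void
toy_clear_aux (struct toy_variable *v)
{
  if (v->aux != NULL)
    {
      if (v->aux_dtor != NULL)
        v->aux_dtor (v);
      v->aux = NULL;
      v->aux_dtor = NULL;
    }
}

/* Example destructor: the auxiliary data was allocated with malloc. */
void
toy_free_aux (struct toy_variable *v)
{
  free (v->aux);
}
```

In this model the recoded vectors survive only until the next call to
toy_clear_aux, which is what makes the slot unsuitable for data that
should outlive the procedure that created it.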

The code I wrote before did not add anything to struct variable,
but to make it work I had to create a struct
recoded_categorical_array. The recoded_categorical_array is cumbersome
and would be unnecessary if the recoded values could be stored inside
the struct variable.  So may I/we/someone add a gsl_matrix * to the
definition of struct variable? Doing so will make a lot of numerical
routines easier to write.

I'm not set on using a gsl_matrix * as the way to recode the values.
There are other, equivalent ways, if anyone prefers them instead.
