pspp-dev

Re: linreg


From: Jason Stover
Subject: Re: linreg
Date: Tue, 31 Jan 2006 09:38:52 -0500
User-agent: Mutt/1.4.2.1i

On Tue, Jan 31, 2006 at 02:12:28PM +0800, John Darrington wrote:
> On Mon, Jan 30, 2006 at 09:38:17AM -0500, Jason Stover wrote:
>      I originally meant lib/linreg to be a general purpose linear
>      regression library, ignorant of variable structures and other features
>      specific to PSPP. But that approach made regression.q too complicated,
>      so I made the linreg library aware of variable/value structure and
>      xnmalloc. That makes coding other procedures that use linreg
>      easier. The approach also makes linreg more dependent on PSPP, but
>      only through the variable/value structures and xnmalloc.
>      
>      I intend to use linreg in other procedures, too. I'm not sure where it
>      belongs in the new directory tree. Maybe there should be a 'statlib'
>      library with subdirectories for statistical libraries; something
>      eventually designed to make contributing easier for statistical
>      programmers who do not know about PSPP internals. 
> 
> 
> I think we need to decide whether or not linreg is going to be pspp
> aware. If it is, then it can go in a 'statlib' library, otherwise it
> can stay where it is.

It could stay where it is if the code could be written in a pspp-ignorant
way, without making the calling procedures too complicated. I don't know
how to do that.

> 
> If the model is chosen right, then the API can be designed without
> making things overly complicated for code which uses it.  It does
> however mean extra work getting it right.   I had a brief look at the
> code in linreg, and it seems that it's only coefficients.c which is
> very closely dependent upon other parts of pspp.

I have been thinking about this. I can think of several types of
models, including localized regression, generalized linear models and
others, where the model code needs to know about pspp via the
coefficients only.  Classification and regression trees are the only
outstanding example I can think of offhand that violates this rule.

There are enough models like this to justify a 'statlib' directory
full of models of this type, which do not know about pspp, and a
single module available to them for coefficient data structures, which
knows about pspp's variable and value structures.
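
A rough sketch of the split, to make the idea concrete. Every name here
(line_fit, fit_line, coefficient, coefficient_create) is hypothetical and
is not the current lib/linreg API. The fitting code sees only arrays of
doubles; the coefficient module is the one place that holds a reference
to a variable, shown as an opaque pointer standing in for pspp's variable
structure. A one-predictor least-squares fit stands in for a real model.

```c
#include <assert.h>
#include <stddef.h>

/* pspp-ignorant part: the model code sees only plain doubles. */
struct line_fit
{
  double intercept;
  double slope;
};

/* Ordinary least squares for y = intercept + slope * x.
   Requires at least two observations and non-constant x. */
struct line_fit
fit_line (const double *x, const double *y, size_t n)
{
  struct line_fit fit;
  double mean_x = 0.0, mean_y = 0.0, sxx = 0.0, sxy = 0.0;
  size_t i;

  assert (n >= 2);
  for (i = 0; i < n; i++)
    {
      mean_x += x[i];
      mean_y += y[i];
    }
  mean_x /= (double) n;
  mean_y /= (double) n;

  for (i = 0; i < n; i++)
    {
      double dx = x[i] - mean_x;
      sxx += dx * dx;
      sxy += dx * (y[i] - mean_y);
    }
  assert (sxx > 0.0);

  fit.slope = sxy / sxx;
  fit.intercept = mean_y - fit.slope * mean_x;
  return fit;
}

/* pspp-aware part: the only module that ties an estimate to a variable.
   ASSUMPTION: 'var' stands in for a pointer to pspp's variable
   structure; it is kept opaque so the model code never looks inside. */
struct coefficient
{
  double estimate;
  const void *var;
};

struct coefficient
coefficient_create (double estimate, const void *var)
{
  struct coefficient c;
  c.estimate = estimate;
  c.var = var;
  return c;
}
```

A calling procedure would extract doubles from its cases, call fit_line,
and then wrap each estimate with coefficient_create so that output code
can find the matching variable.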

Comments?

> 
> Are there currently any tests for the regression command?

No. I have been testing with my own 'fake' data.

-Jason



