pspp-dev
Re: regression lib


From: Jason H. Stover
Subject: Re: regression lib
Date: Sun, 1 May 2005 11:43:07 -0400
User-agent: Mutt/1.4.2.1i

I got started on a regression lib. You can find it
at 

        www.sakla.net/linreg.tar.bz2

Let me know if it looks offensive. I just dropped it into lib/ and
compiled it. It doesn't contain much yet, but I thought I should give
people a chance to critique its design before going much further.

I called it 'linreg' because 'regression' could mean 'non-linear
regression'. I also created a struct which can contain a lot of
relevant information about estimation for a linear model, including
coefficients, residuals, sums of squares and whatever else becomes
necessary later.  That information can be passed to other procedures,
making extra data passes unnecessary for some analyses.
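
To make the idea concrete, here is a minimal sketch of what such a struct might look like. Every name in it (linreg_cache_sketch and its fields and helpers) is an assumption made for illustration; it is not the layout used in linreg.tar.bz2. The point is only that a later procedure can read a quantity such as R-squared from the cached sums of squares without touching the data again.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for linear-model estimates.  The field names
   are assumptions for this sketch, not those of the actual linreg code. */
struct linreg_cache_sketch
  {
    size_t n_obs;       /* Number of observations. */
    size_t n_coeffs;    /* Number of coefficients, intercept included. */
    double *coeff;      /* n_coeffs estimated coefficients. */
    double *residual;   /* n_obs residuals. */
    double sse;         /* Error (residual) sum of squares. */
    double sst;         /* Total sum of squares. */
  };

/* Allocates a zeroed cache, or returns NULL if memory runs out. */
struct linreg_cache_sketch *
linreg_cache_sketch_alloc (size_t n_obs, size_t n_coeffs)
{
  struct linreg_cache_sketch *c;

  assert (n_obs > 0 && n_coeffs > 0);
  c = calloc (1, sizeof *c);
  if (c == NULL)
    return NULL;
  c->n_obs = n_obs;
  c->n_coeffs = n_coeffs;
  c->coeff = calloc (n_coeffs, sizeof *c->coeff);
  c->residual = calloc (n_obs, sizeof *c->residual);
  if (c->coeff == NULL || c->residual == NULL)
    {
      free (c->coeff);
      free (c->residual);
      free (c);
      return NULL;
    }
  return c;
}

/* Frees a cache obtained from linreg_cache_sketch_alloc(). */
void
linreg_cache_sketch_free (struct linreg_cache_sketch *c)
{
  if (c != NULL)
    {
      free (c->coeff);
      free (c->residual);
      free (c);
    }
}

/* Computes R-squared from the cached sums of squares alone,
   so no pass over the data is needed. */
double
linreg_cache_sketch_r_squared (const struct linreg_cache_sketch *c)
{
  return c->sst > 0.0 ? 1.0 - c->sse / c->sst : 0.0;
}
```

A procedure that fits the model would fill in the struct once and hand it on; anything downstream then calls helpers like linreg_cache_sketch_r_squared() on it.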

On this topic of caching statistics: it would be nice if pspp_linreg()
could accept as an argument the means and standard deviations of all
model variables. That would spare pspp_linreg() a pass through the
data to get those values. Under this design, pspp_linreg() would
compute a mean or standard deviation only for those model variables
for which the caller did not supply one.
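
The "compute only what was not supplied" rule could be sketched as follows. The struct moments_sketch and the function moments_sketch_fill() are hypothetical names invented for this example; nothing like them is claimed to exist in PSPP. The sketch caches the variance rather than the standard deviation so that it needs nothing beyond the standard headers; the standard deviation is its square root.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-variable cache of moments.  The flags record
   which values the caller has already supplied. */
struct moments_sketch
  {
    bool have_mean;
    bool have_variance;
    double mean;
    double variance;    /* Sample variance, with n - 1 denominator. */
  };

/* Fills in whichever of the mean and variance of X[0..N-1] is missing
   from M, trusting any value that is already flagged as present.
   Returns the number of passes over the data that were needed. */
int
moments_sketch_fill (struct moments_sketch *m, const double *x, size_t n)
{
  size_t i;
  int passes = 0;

  assert (n > 1);
  if (!m->have_mean)
    {
      double sum = 0.0;
      for (i = 0; i < n; i++)
        sum += x[i];
      m->mean = sum / n;
      m->have_mean = true;
      passes++;
    }
  if (!m->have_variance)
    {
      /* Deviations are taken around m->mean, whether it was
         supplied by the caller or computed just above. */
      double ss = 0.0;
      for (i = 0; i < n; i++)
        {
          double d = x[i] - m->mean;
          ss += d * d;
        }
      m->variance = ss / (n - 1);
      m->have_variance = true;
      passes++;
    }
  return passes;
}
```

With both flags set on entry the function returns 0 and never reads the data, which is the saving described above. A real implementation would presumably fold the remaining work into a single pass over all model variables.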

If some PSPP procedure has already computed means and standard
deviations by the time pspp_linreg() is called, can PSPP pass those
values to pspp_linreg()? If so, where does PSPP store that information? What
structure should I look in to figure this all out? I see the variable
structure contains information about a variable like its label and
number of values. Can it also contain a variable's mean and standard
deviation?

-Jason

On Tue, Apr 26, 2005 at 10:35:46AM -0700, Ben Pfaff wrote:
> Jason Stover <address@hidden> writes:
> 
> > I started writing some backend regression routines, with the intent of
> > creating a regression procedure. Since least-squares fitting is a
> > cliche in statistics, these routines should be as general-purpose as
> > possible, so I created a library in lib/regression.  I haven't gotten
> > far yet: just sweep.c, regression.h and Makefile.am, but
> > it does compile to libregression.a.
> >
> > Before I go any farther, is this an acceptable approach?
> > libregression.a will know almost nothing about PSPP, but different
> > PSPP procedures will be able to call its functions 'easily'.  There
> > are a lot of computational routines for linear models.  If possible,
> > such routines should be separated from the particularities of the many
> > PSPP procedures that eventually will need those computations. (A
> > semi-independent regression library might be useful in other
> > programs.)
> 
> I think that a library designed to be easily used by PSPP, but
> separable from it, is a reasonable candidate for lib/.  Your
> approach sounds good to me.
> -- 
> Ben Pfaff 
> email: address@hidden
> web: http://benpfaff.org
