pspp-dev

Re: Covariance Matrices


From: Jason Stover
Subject: Re: Covariance Matrices
Date: Thu, 21 Aug 2008 10:28:43 -0400
User-agent: Mutt/1.5.18 (2008-05-17)

On Thu, Aug 21, 2008 at 08:29:32PM +0800, John Darrington wrote:
> On Wed, Aug 20, 2008 at 10:57:33AM -0400, Jason Stover wrote:
>      
>      I had planned to add a one-pass algorithm, but used a two-pass
>      algorithm first because, "usually", two-pass algorithms have lower
>      relative errors than one-pass algorithms, according to
>      
>      "Algorithms for Computing the Sample Variance: Analysis and
>      Recommendations", T. F. Chan, G. Golub, R. LeVeque. The American
>      Statistician, v37 n3, 1983, pp. 242-247.
>      
>      So I had planned to add a one-pass algorithm, based on the algorithms
>      in that paper, but never got around to it.
>      
> 
>      My only suggestion is that the code in covariance-matrix.[ch] should
>      have both one- and two-pass algorithms. So maybe add
>      covariance_accumulate() and change covariance_pass_two () to
>      incorporate your changes, but passing an argument like double *means
>      to use the means.
> 
> 
> My opinion is that we should prefer speed rather than precision.  So
> all things being equal, I would use the single pass method.   However,
> in cases where there is a compelling reason to need another pass, then
> the more accurate method can be used.

I agree. Do you want to change the code in covariance-matrix.c, or should
I do it? 
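
To make the trade-off concrete, here is a rough sketch of the two
approaches for a single pair of variables. It is not the code in
covariance-matrix.c: the names cov_two_pass and cov_one_pass are made up
for illustration, and the one-pass version uses a standard updating
formula for the running means and co-moment, which may or may not match
what the paper recommends for a given data set.

```c
#include <stddef.h>

/* Two-pass sample covariance (hypothetical helper, not PSPP code).
   Pass 1 computes the means; pass 2 sums products of deviations.
   Requires n >= 2. */
double
cov_two_pass (const double *x, const double *y, size_t n)
{
  double mean_x = 0.0, mean_y = 0.0, sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      mean_x += x[i];
      mean_y += y[i];
    }
  mean_x /= n;
  mean_y /= n;

  for (i = 0; i < n; i++)
    sum += (x[i] - mean_x) * (y[i] - mean_y);

  return sum / (n - 1);
}

/* One-pass sample covariance (hypothetical helper, not PSPP code).
   Updates the running means and the co-moment as each case arrives,
   so the data is read only once.  Requires n >= 2. */
double
cov_one_pass (const double *x, const double *y, size_t n)
{
  double mean_x = 0.0, mean_y = 0.0, comoment = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      double dx = x[i] - mean_x;          /* Deviation from the old mean of x. */
      mean_x += dx / (i + 1);
      mean_y += (y[i] - mean_y) / (i + 1);
      comoment += dx * (y[i] - mean_y);   /* Uses the updated mean of y. */
    }

  return comoment / (n - 1);
}
```

Both functions return the same value up to rounding; the one-pass
version saves a second read of the data, which is the speed argument
above.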

> However, in many PSPP commands, the logic required to determine how
> many passes are necessary is quite nasty; it can depend on exactly
> which options are selected.   For  some time now, I've been thinking
> of a scheme where each statistic is aware of its own dependencies.
> With such a scheme, it would be possible to specify a set of
> statistics, then the minimum number of passes would be automatically
> determined, and the most accurate method for that number would be
> automatically selected.
> 
> This scheme would take a bit of thought, and a lot of recoding.  But
> if the routines to calculate these statistics have a similar
> interface, then that'll be the first step.

Sounds good.
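
One way the dependency idea might look, purely as a sketch: each
statistic lists the statistics whose final values it needs before its
own pass can start, and the number of passes falls out of the longest
dependency chain. Everything here (struct statistic, pass_of,
passes_required, the example table) is hypothetical and not an existing
PSPP interface; it also does not attempt the "most accurate method for
that number of passes" selection.

```c
#include <stddef.h>

#define MAX_DEPS 4

/* A statistic and the statistics it depends on (hypothetical type). */
struct statistic
  {
    const char *name;
    size_t n_deps;
    size_t deps[MAX_DEPS];  /* Indexes of statistics that must be final
                               before this one's pass can begin. */
  };

enum { STAT_MEAN, STAT_VARIANCE_ONE_PASS, STAT_VARIANCE_TWO_PASS };

/* Example table: only the two-pass variance needs the mean first. */
static const struct statistic example_table[] =
  {
    { "mean", 0, { 0 } },
    { "variance (one-pass)", 0, { 0 } },
    { "variance (two-pass)", 1, { STAT_MEAN } },
  };

/* Returns the pass in which statistic IDX can be computed: 1 if it has
   no dependencies, otherwise one more than its latest dependency.
   Assumes the dependency graph has no cycles. */
int
pass_of (const struct statistic *table, size_t idx)
{
  int pass = 1;
  size_t i;

  for (i = 0; i < table[idx].n_deps; i++)
    {
      int p = pass_of (table, table[idx].deps[i]) + 1;
      if (p > pass)
        pass = p;
    }
  return pass;
}

/* Returns the minimum number of passes needed for the WANTED set. */
int
passes_required (const struct statistic *table,
                 const size_t *wanted, size_t n_wanted)
{
  int max = 0;
  size_t i;

  for (i = 0; i < n_wanted; i++)
    {
      int p = pass_of (table, wanted[i]);
      if (p > max)
        max = p;
    }
  return max;
}
```

With this table, asking for the mean and the one-pass variance needs one
pass, while asking for the two-pass variance needs two.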

-Jason



