Re: [Help-gsl] use of recurrence relation while computing mean, variance etc.
Juan Pablo Amorocho D.
Wed, 10 Aug 2011 17:08:30 +0200
Hi Brian and Awhan,
just a quick follow-up on the recurrence formula of the variance. Awhan, if
you are interested in a deeper understanding of this issue and have access
to the following book, I would recommend you take a look at N. J. Higham,
*Accuracy and Stability of Numerical Algorithms*, Second Edition, SIAM, 2002.
Section 1.9 is called "Computing the Sample Variance".
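
To make the concern concrete, here is a minimal sketch (not the GSL source)
that contrasts the one-pass "textbook" formula with a two-pass computation
written in the same recurrence style as the snippet quoted below. The function
names textbook_variance and two_pass_variance are made up for illustration,
both assume n > 0, and both return the mean squared deviation (dividing by n,
not n - 1).

```c
#include <stddef.h>

/* Hypothetical helper, for comparison only: one-pass "textbook" formula,
   sum(x^2)/n - mean^2.  For data with a large mean and a small spread the
   final subtraction can cancel most of the significant digits. */
double
textbook_variance (const double data[], size_t stride, size_t n)
{
  double sum = 0.0, sum_sq = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      const double x = data[i * stride];
      sum += x;
      sum_sq += x * x;
    }

  {
    const double mean = sum / n;
    return sum_sq / n - mean * mean;
  }
}

/* Hypothetical helper: two passes, each written as a recurrence.  The
   accumulators stay on the scale of the data, and the squared deviations
   are formed from (x - mean), so no large quantities are subtracted. */
double
two_pass_variance (const double data[], size_t stride, size_t n)
{
  long double mean = 0.0L, variance = 0.0L;
  size_t i;

  /* first pass: running mean */
  for (i = 0; i < n; i++)
    mean += (data[i * stride] - mean) / (i + 1);

  /* second pass: running mean of the squared deviations */
  for (i = 0; i < n; i++)
    {
      const long double delta = data[i * stride] - mean;
      variance += (delta * delta - variance) / (i + 1);
    }

  return (double) variance;
}
```

Feeding both functions values such as 1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16 is
a quick way to see the difference: the squares are of order 1e18, where
adjacent doubles are hundreds apart, while the deviations from the mean are
single digits.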
Cheers,
-- Juan
> I didn't write the code, but I think it's safe to say that the
> recurrence relations and use of long double is indeed motivated by
> numerical concerns. Running sums are a common cause of overflow.
> Variables inside a loop have local scope but otherwise behave like
> other variables. IMO this coding style is cleaner, and it also makes
> parallelism simpler to implement.
> Brian
>
> > Hello all,
> >
> > I have 3 questions to bother you with.
> >
> > 1) What is the motivation behind using the recurrence relation for the
> > computation of the mean? It seems to me that a single division by the
> > number of elements *after* the for loop would need fewer division
> > operations than the current implementation. I have noticed the same
> > recurrence-style computation of the variance as well.
> >
> > 2) Many functions have a return type of double, but the quantity of
> > interest that is to be returned is declared as a long double inside
> > the function body. Why is this done? Is it that conversion from a long
> > double to double results in less loss of precision?
> >
> > 3) Declaration of variables inside the for loop body, e.g. delta in the
> > following snippet from variance_source.c:
> >
> > /* find the sum of the squares */
> > for (i = 0; i < n; i++)
> >   {
> >     const long double delta = (data[i * stride] - mean);
> >     variance += (delta * delta - variance) / (i + 1);
> >   }
> >
> > Does the compiler create a new local delta each time control enters
> > the loop body, or is it created once but treated as a local variable
> > that is valid only within the scope of the for loop?
>
>
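
The three questions above can be illustrated in one place. What follows is a
minimal sketch rather than the GSL implementation: running_mean and naive_mean
are hypothetical names, and both assume n > 0. It shows the recurrence for the
mean next to the sum-then-divide version, a long double accumulator that is
rounded to double only once on return, and a variable declared inside the loop
body.

```c
#include <stddef.h>

/* Hypothetical helper, for contrast: sum first, divide once.  Only one
   division, but the running sum can grow far beyond the scale of the
   data and may overflow even when the mean itself is representable. */
double
naive_mean (const double data[], size_t stride, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += data[i * stride];

  return sum / n;
}

/* Hypothetical helper: recurrence form, mean_i = mean_{i-1} + (x_i - mean_{i-1}) / i.
   One division per element, but the accumulator stays on the scale of
   the data instead of the scale of the sum. */
double
running_mean (const double data[], size_t stride, size_t n)
{
  /* Accumulate in long double so intermediate rounding errors are no
     larger (and on many platforms smaller) than in double.  On some
     platforms long double has the same format as double. */
  long double mean = 0.0L;
  size_t i;

  for (i = 0; i < n; i++)
    {
      /* delta has block scope and automatic storage: conceptually it is a
         fresh object on every iteration, which is why it can be const.
         Compilers typically just reuse one register or stack slot. */
      const long double delta = data[i * stride] - mean;
      mean += delta / (i + 1);
    }

  /* a single rounding from long double to double happens here */
  return (double) mean;
}
```

For two elements that are both equal to 1.0e308, the recurrence returns
1.0e308, whereas the sum formed by naive_mean exceeds the range of double.
The recurrence reduces the risk of overflow but does not remove it entirely,
since x_i - mean can still overflow for extreme values of opposite sign.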