bug-gsl
[Top][All Lists]

## Re: [Bug-gsl] Incorrect equation in the manual: Statistics->Autocorrelat

 From: Patrick Alken Subject: Re: [Bug-gsl] Incorrect equation in the manual: Statistics->Autocorrelation gsl_stats_lag1_autocorrelation Date: Tue, 5 Jul 2016 15:12:04 -0600 User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

Thanks for your report! I've changed the manual, using limits of [2,n] in the upper sum - I believe this is equivalent to what you wrote but it follows the code a little more closely.

Thanks,
Patrick

On 06/30/2016 03:55 PM, Kirill Sokolovsky wrote:

Dear GSL developers,


I think there is an error in equation defining the lag-1 autocorrelation in the manual (Statistics->Autocorrelation)
https://www.gnu.org/software/gsl/manual/html_node/Autocorrelation.html

Currently it is written as:

a_1 = {\sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i-1} - \Hat\mu)
\over
\sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i} - \Hat\mu)}


and since the summation starts from 1, for i=1 the numerator refers to x_{1-1} which is undefined (if x_0 is defined it will be ignored by summation in the denominator, which is strange).

I think instead the equation should be written as:


a_1 = {\sum\limits_{i = 1}^{N-1} (x_{i} - \Hat\mu) (x_{i+1} - \Hat\mu)
           \over
\sum\limits_{i = 1}^{N} (x_{i} - \Hat\mu)^2}

The error is in the manual, not in the code.


The pice of code below shows that the suggested equation is consistent with what is actually implemented in the GSL code. The difference between values returned by my naive implementation and the library implementation seems to be <1e-16 which is expected for double precision computations.

With best wishes,
Kirill Sokolovsky

//################ compare_lag1.c ################//

#include <stdio.h>
#include <math.h>

#include <gsl/gsl_statistics_double.h>

double my_lag1_autocorrelation(double * data, int N){
int i;
double u,d,mean;
mean=gsl_stats_mean( data, 1, N);
u=d=0.0;
for(i=0;i<N-1;i++)u+=(data[i]-mean)*(data[i+1]-mean);
for(i=0;i<N;i++)d+=(data[i]-mean)*(data[i]-mean);
return u/d;
}

int main(){
double r_my,r_gsl;
double data[10000];
int i,N;
N=0;
while( -1<fscanf(stdin,"%lf",&data[N]) )N++;

r_gsl=gsl_stats_lag1_autocorrelation( data, 1, N);
r_my=my_lag1_autocorrelation( data, N);

fprintf(stdout,"%lf - %lf = %lg\n",r_gsl,r_my,r_gsl-r_my);

return 0;
}