[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Help-gsl] Non-linear Multi-parameter Least-Squares Example

From: Gordan Bobic
Subject: Re: [Help-gsl] Non-linear Multi-parameter Least-Squares Example
Date: Fri, 20 Jul 2007 09:30:14 +0100 (BST)

On Fri, 20 Jul 2007, John Gehman wrote:

> In using integers from 0 to 40 for your xdata, and computing data  
> with (b/c) as part of your function, you're introducing an undefined  
> number into the whole process. Hence the residual will be undefined,  
> and the sum of the squares of the residuals will be undefined, and  
> the gradients at that point, etc etc. At a minimum try float f = i 
> +1.0 where the data is built, and in the _f and _df functions.

Yeah, somebody else pointed it out to me yesterday. I can't believe I 
didn't spot that. What a schoolboy error. Worse, I only re-wrote it in the 
period/t instead of frequency format because I was finding it easier to 
read while I was getting it to work.

I now changed it using the frequency, so it both runs faster (due to 
multiply being roughly 4 clock cycles instead of about 80 for a divide) 
and avoids the div/0 problems. :-)

> That's surely the biggest problem. But you also might want to be  
> careful about mixing your float data types with the doubles assumed  
> by gsl structures. I didn't look closely, but was left with the  
> impression that they were mixed a bit in your code. You might be  
> getting away with it anyway, but just a word of caution.

The only thing that seems to require doubles is the x_init array. 
Everything else seems to work OK with floats. My main motivations for 
using floats are:

1) Smaller - more data fits in the cache
2) Faster
2.1) SSE1 cannot vectorize doubles
2.2) SSE2+ can vectorize doubles but only 2 at a time instead of 4 at a 

> Also, with regard to the ultimate intent of fitting the data, I  
> presume this was just an incremental step in your development of the  
> fitting procedure, but insofar as this goes, once you clean up the  
> above, you'll still get a shite fit, as the harmonic function, is not  
> described well by the data sampling.

Indeed I am aware of this, but it depends on the data, and it depends on 
the initial guesstimate. I have a reasonably good guesstimator 
implemented, and I am finding that typically the error between the fit and 
the data falls to within 5-10% in most cases. But this may just mean that 
my data is quite clean (which, arguably, I expected). But you do have a 
point - my worst case data may well be much more noisy.

Which brings me to the next question. I have my own home-brewed solver 
implemented and that works by steepest descent annealing (i.e. when it 
finds a local minimum it ups the resolution). I'm finding that overall it 
fits better solutions - possibly because noise in the data somtimes upsets 
the algebraic gradient descent and it gets stuck in a local minimum, and 
the local-inspection approach with increasing precision may be more 
robust. The GSL implementation also seems slower on my current platform, 
despite generally managing to descend in considerably fewer iterations (up 
to the point where it stops being able to fit a better solution - but it 
stops earlier than my own algorithm).

Now, granted, the speed could be an implementation issue (my code was made 
with ICC's auto-vectorization in mind), but are there parameters on the 
solver function that could be tweaked, other than the delta checker 

Also, how does sigma fit into all this? I have tried it with various 
settings and it didn't seem to make any difference, so I just set it to 1 
and removed it from all the equations to save the seemingly needless 

Finally - where is gsl_matrix_float_set() implemented? grep only seems to 
find it in the header files. I wanted to have a closer look at the code to 
see if I could do anything to help it vectorize with ICC (assuming it 
doesn't already).

Thank you all for helping, I really appreciate it.


reply via email to

[Prev in Thread] Current Thread [Next in Thread]