[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Help-gsl] strange NLS behaviour

From: Patrick Alken
Subject: Re: [Help-gsl] strange NLS behaviour
Date: Tue, 13 Sep 2016 09:16:23 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2


Glad you're using the new interface - it is much more powerful than the old one. I will try to address your questions, though you didn't give any source code so it is difficult to diagnose exactly what's going on.

1. Since you're using a finite-difference approximation to the Jacobian, the output should say that there are no Jacobian evaluations. The counter is incremented only when your _df function is called. One thing you might try is to use centered differences to compute J instead of forward. This will result in a more accurate J (the old multifit_fdfsolver interface only allowed forward differences). It will cost you about twice as many function evaluations, but it may converge in fewer iterations due to a more accurate Jacobian matrix. Based on your "time elapsed" it looks like you can compute your residuals pretty quickly, so doubling the number of function calls may not matter much. You can set this via:


2. chisq/dof - I personally don't use this statistic since I find it difficult to interpret. chisq is the sum of squares of your residuals, which often have units. Therefore the chisq/dof depends on the units of the problem, so you can always scale your units in a way which will make chisq/dof as small or as big as you want. Perhaps it is more useful if you happen to have dimensionless residuals.

I think its better to plot the residuals themselves to determine the quality of the fit. If you see any systematic structure in the residuals then there could be problems with the fit (or with the underlying model). But you're obtaining parameter values very close to the true values so it seems ok to me.

3. Condition number of J - just curious why you think its too big? Do you have reason to suspect it should be close to 1? Since you're using a finite-difference Jacobian, that will probably make the conditioning worse - I think the conditioning of a finite difference operator goes something like O(1/h^2), so the conditioning gets worse as you reduce the step size. But in any case a condition number of 1e6 is reasonable and shouldn't affect the accuracy of the method too much, especially if you're using the QR (default) solver.

4. Error estimates - I don't see where you list the error estimates on your parameters, just a correlation matrix. Are you using the same error estimation as in the "Exponential Fit" example program? That program uses the chisq/dof to estimate the parameter error. In your case, chisq is tiny so its not surprising you end up with tiny errors. If your residuals are all really small, then it might be true that the parameter errors are also small.

So, it seems to me your program is working correctly. Are you getting different results when using the old interface (fdfsolver)?

Another thing I'd recommend is to run your program with Levenberg-Marquardt and then one of the dogleg methods. You might find that one method signficantly outperforms the other. For example, in a inverse problem I'm working on, I found that LM takes over 20 iterations to converge, while dogleg only takes 8, so its worth trying different methods.


On 09/13/2016 05:59 AM, viktor drobot wrote:
Hi, community!

I model kinetics of enzymatic reactions. There are problems for which it's
necessary to find suitable values of kinetic parameters so the real kinetic
data could be described by our model.

To facilitate my test runs I generate some "real kinetic data" by solving
the system of ODEs with known kinetic parameters and then add some Gaussian
noise. This is the reference curves. After that I try to solve the inverse
problem - find out the most appropriate values of these parameters with
initial approximation that is close enough to the true values. I use GLS
NLS routines (v 2.2.1) for that.

First of all I solve ODEs with another approximation and then find the
differences (Y[t] - y[t]) between reference points Y[t] (generated as
stated above) and the new calculated y[t], where 't' is the time point.
Usually I process several kinetic curves at once. I store these differences
in 'f' vector of function to be minimized. Because of the nature of the
problem there is no analytical Jacobian possible so I set corresponding
field of gsl_multifit_nlinear_fdf structure to NULL. In case of unweighted
NLS I multiply the diagonal elements of the covariance matrix by the
variance of the residuals. Then the NLS solver does all work for me and I
get some strange results.

The output says that there was no Jacobian evaluations. Kinetic parameters
obtained seems to be quiet right but other statistics weird enough. Look at
the example below.

InverseSolver v0.1.0
Run ended at Thu Sep  8 17:13:27 2016

Optimized parameters
Kcat = 2.4849549376350680e+01 +/- 2.3301006770134422e-01 (true value is
Km   = 2.5962066490476204e-05 +/- 1.2805709675629306e-06 (true value is
Kp   = 5.6728528418318680e-05 +/- 3.3698382714302584e-06 (true value is

Minimization details
The fit converged after 13 iteration(s)
Reason for stopping: small step size
Objective function evaluations = 69
Number of data points          = 380
Model parameters               = 3
dof                            = 377
chi                            = 2.7339109755226917e-05
chisq                          = 7.4742692220834363e-10
chisq/dof                      = 1.9825647803934845e-12
Time elapsed: 0.05957 secs

Correlation matrix of fit parameters
      Kcat        Km          Kp
Kcat  1.0000e+00  8.5928e-01  6.5048e-01
Km    8.5928e-01  1.0000e+00  9.4124e-01
Kp    6.5048e-01  9.4124e-01  1.0000e+00

Iterations history
Iter # Kcat                    Km
Kp                      cond(J)                 chi
0       2.0000000000000000e+01  2.0000000000000002e-05
2.0000000000000002e-05                     inf  3.1567955122657597e-04
1       2.7173704932869438e+01  2.6087621471921223e-05
3.0462571586587205e-05  2.9025496731552132e+06  8.3486976629633576e-05
2       2.4543231868632770e+01  2.5481466984423840e-05
4.6317587677535904e-05  3.0659026221618010e+06  4.9189276130628142e-05
3       2.4799130142472123e+01  2.5333806359622357e-05
5.3688047773966246e-05  2.1896626012014118e+06  2.7592054291596016e-05
4       2.4830032619351048e+01  2.5841007936620306e-05
5.6370153753761361e-05  2.0525957706617371e+06  2.7340238653871502e-05
5       2.4848466472445963e+01  2.5956371903529538e-05
5.6716387131234307e-05  2.0040307592909317e+06  2.7339110581465998e-05
6       2.4849535469371190e+01  2.5961998929997054e-05
5.6728382061723916e-05  1.9990772957287871e+06  2.7339109755370303e-05
7       2.4849547296667978e+01  2.5962054330720485e-05
5.6728499949112989e-05  1.9989372038075817e+06  2.7339109755230966e-05
8       2.4849547539278458e+01  2.5962056427776209e-05
5.6728505891823464e-05  1.9989391076721402e+06  2.7339109755230180e-05
9       2.4849549814327389e+01  2.5962073280806688e-05
5.6728551401861829e-05  1.9989361463323494e+06  2.7339109755228415e-05
10      2.4849550175380344e+01  2.5962072172440512e-05
5.6728543239809625e-05  1.9989390874822082e+06  2.7339109755227405e-05
11      2.4849549199258245e+01  2.5962065484919628e-05
5.6728525792610483e-05  1.9989382142781706e+06  2.7339109755227290e-05
12      2.4849549376350691e+01  2.5962066490476238e-05
5.6728528418318701e-05  1.9989384807737938e+06  2.7339109755226951e-05
13      2.4849549376350680e+01  2.5962066490476204e-05
5.6728528418318680e-05  1.9989391375842493e+06  2.7339109755226917e-05

Per-file Fischer's statistics
                   dof = 51
Number of data points = 54
          F-statistics = 3.2567463900742809e+04

                   dof = 62
Number of data points = 65
          F-statistics = 1.0932769130444922e+04

                   dof = 72
Number of data points = 75
          F-statistics = 2.2345670716799857e+04

                   dof = 84
Number of data points = 87
          F-statistics = 4.4888120828034567e+04

                   dof = 96
Number of data points = 99
          F-statistics = 3.9235545459615321e+04

chisq/dof value is too small. It's much smaller than 1. Condition number of
Jacobian is too big, much bigger than 1. Error estimates are too small
despite of some Gaussian noise in my generated data. Seems like the
calculation is unstable.

I completely stuck with these strange results. What I'm doing wrong? Any
help is greatly appreciated!

reply via email to

[Prev in Thread] Current Thread [Next in Thread]