commons-user mailing list archives

From Gilles <>
Subject Re: [math] Large-Scale Optimization
Date Mon, 29 Jul 2013 11:28:37 GMT

> Sorry for the long problem description.
> I implemented a radial basis function network for non-linear
> regression with adaptive centers and adaptive basis shapes (diagonal
> covariance matrix) using the Levenberg-Marquardt solver
> (org.apache.commons.math3.optim.nonlinear.vector.jacobian.LevenbergMarquardtOptimizer)
> and the ModelFunction's
> DerivativeStructure[] value(DerivativeStructure[] x)
> function using the DerivativeStructure API so that the derivatives
> are computed analytically.
> For a reasonably sized network with 200 radial bases, the number of
> parameters is
> (200 /* # bases */ +1 /* bias */ +((dim /* center of 1 basis */ + dim
> /* shape parameters of 1 basis */)*200))
> where "dim" is the dimension of the input vectors. This results in a
> few hundred free parameters. For small amounts of data, everything
> works fine. But in problems with high-dimensional input, I sometimes
> use tens of thousands (or even hundreds of thousands) of training
> samples. Unfortunately, with this much training data, I receive either
> a Java Heap Error or a Garbage Collection Error (in the middle of
> differentiation).
> The main problem seems to be that the optimizer expects the
> ModelFunction to return a vector evaluating all of the training
> samples to compare with the Target instance passed in as
> OptimizationData. For regular evaluation this isn't too much of a
> problem, but the memory used by the DerivativeStructure instances
> (spread out over a few hundred parameters times 10,000 evaluations)
> is massive.

I am not sure I understand what you mean by "times 10000 evaluations":
Only a few evaluations (2, I think, for the LM algorithm) are kept in
memory at each iteration (then discarded at the next iteration).

I think that the "DerivativeStructure" is pretty much optimized (if you
store only what you really need).
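
For example, if you only need the gradient, a differentiation order of 1
is enough; each "DerivativeStructure" then carries the value plus one
partial derivative per free parameter, and nothing more. A minimal
sketch (the parameter count and the model term below are made up for
illustration):

  import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;

  public class OrderOneExample {
      public static void main(String[] args) {
          int nParams = 3; // illustrative; yours is a few hundred

          // Variable #0 among nParams free parameters, order 1 (gradient only).
          DerivativeStructure w = new DerivativeStructure(nParams, 1, 0, 1.5);
          // A constant (e.g. a training input value).
          DerivativeStructure x = new DerivativeStructure(nParams, 1, 2.0);

          DerivativeStructure y = w.multiply(x).exp(); // some model term

          System.out.println(y.getValue());
          System.out.println(y.getPartialDerivative(1, 0, 0)); // d y / d w
      }
  }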

The problem, as you indicate, probably comes from the large number of
observations ("target"), which is of course required given the large
number of parameters.

> Is there any way to get the solver to evaluate the residuals/gradient
> incrementally?

The LM algorithm uses the Jacobian matrix, whose number of entries is
the product of the number of elements in "target" and the number of
parameters. IIUC, what you suggest amounts to changing the algorithm
(so that it would use only part of the observations at a time).
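
To give an order of magnitude (the figures below are illustrative only,
not taken from your description):

  public class JacobianSize {
      public static void main(String[] args) {
          int nParams = 600;    // "a few hundred free parameters" (illustrative)
          int nObs = 100000;    // "hundreds of thousands of samples" (illustrative)
          long bytes = (long) nObs * nParams * 8L; // 8 bytes per double
          // Roughly half a gigabyte for the dense Jacobian alone, before
          // any DerivativeStructure intermediates are accounted for.
          System.out.println(bytes / (1024 * 1024) + " MB");
      }
  }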

Could you perhaps try the "NonLinearConjugateGradientOptimizer"?
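
Something along these lines, as a minimal sketch with the 3.x "optim"
classes (ObjectiveFunction, ObjectiveFunctionGradient, etc.): you
minimize the sum of squared residuals directly, providing a scalar cost
and its gradient. Since the gradient can be accumulated one sample at a
time, memory stays proportional to the number of parameters instead of
observations times parameters. The quadratic cost below is only a
placeholder for your RBF residual sum:

  import org.apache.commons.math3.analysis.MultivariateFunction;
  import org.apache.commons.math3.analysis.MultivariateVectorFunction;
  import org.apache.commons.math3.optim.InitialGuess;
  import org.apache.commons.math3.optim.MaxEval;
  import org.apache.commons.math3.optim.PointValuePair;
  import org.apache.commons.math3.optim.SimpleValueChecker;
  import org.apache.commons.math3.optim.nonlinear.scalar.GoalType;
  import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunction;
  import org.apache.commons.math3.optim.nonlinear.scalar.ObjectiveFunctionGradient;
  import org.apache.commons.math3.optim.nonlinear.scalar.gradient.NonLinearConjugateGradientOptimizer;

  public class CgSketch {
      public static void main(String[] args) {
          // Placeholder cost: sum of squares of the parameters.
          // Replace with your sum of squared residuals over all samples.
          MultivariateFunction cost = new MultivariateFunction() {
              public double value(double[] p) {
                  double s = 0;
                  for (double v : p) {
                      s += v * v;
                  }
                  return s;
              }
          };
          // Placeholder gradient; in your case, accumulate it sample by sample.
          MultivariateVectorFunction grad = new MultivariateVectorFunction() {
              public double[] value(double[] p) {
                  double[] g = new double[p.length];
                  for (int i = 0; i < p.length; i++) {
                      g[i] = 2 * p[i];
                  }
                  return g;
              }
          };

          NonLinearConjugateGradientOptimizer optimizer =
              new NonLinearConjugateGradientOptimizer(
                  NonLinearConjugateGradientOptimizer.Formula.POLAK_RIBIERE,
                  new SimpleValueChecker(1e-8, 1e-8));

          PointValuePair result = optimizer.optimize(
              new MaxEval(10000),
              new ObjectiveFunction(cost),
              new ObjectiveFunctionGradient(grad),
              GoalType.MINIMIZE,
              new InitialGuess(new double[] { 1, 2, 3 }));

          System.out.println(java.util.Arrays.toString(result.getPoint()));
      }
  }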


