commons-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: [math] Large-Scale Optimization
Date Mon, 29 Jul 2013 03:27:52 GMT
I think that you probably want an SGD-based optimizer, possibly with a
final step based on update averaging or L-BFGS.

A full conjugate gradient method gives good performance on small problems,
but just isn't going to scale.
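
To make the memory argument concrete, here is a minimal sketch of SGD
with iterate averaging over the final pass. It is an illustration under
stated assumptions, not Commons Math API: the class AveragedSgdSketch,
its methods, the constant step size, and the linear stand-in model are
all hypothetical. For the RBF network, predict() and the per-sample
gradient would be replaced by the network's own. The point is that each
update touches one training sample, so memory grows with the number of
parameters rather than with parameters times samples.

```java
import java.util.Random;

/** Sketch only: a linear model stands in for the RBF network (an assumption). */
final class AveragedSgdSketch {

    /** Prediction of the stand-in linear model for a single sample. */
    static double predict(double[] w, double[] x) {
        double sum = 0.0;
        for (int j = 0; j < w.length; j++) {
            sum += w[j] * x[j];
        }
        return sum;
    }

    /**
     * Minimizes the squared error one sample at a time and returns the
     * average of the iterates visited during the last epoch.
     * The constant step size is an assumption; a decaying schedule is common.
     */
    static double[] fit(double[][] xs, double[] ys, int epochs, double step, long seed) {
        int n = xs.length;
        int d = xs[0].length;
        double[] w = new double[d];    // current iterate
        double[] avg = new double[d];  // running mean of last-epoch iterates
        Random rng = new Random(seed);
        long averaged = 0;
        for (int e = 0; e < epochs; e++) {
            boolean lastEpoch = (e == epochs - 1);
            for (int k = 0; k < n; k++) {
                int i = rng.nextInt(n);  // draw one training sample
                double residual = predict(w, xs[i]) - ys[i];
                // Gradient of 0.5 * residual^2 with respect to w is residual * x.
                for (int j = 0; j < d; j++) {
                    w[j] -= step * residual * xs[i][j];
                }
                if (lastEpoch) {
                    averaged++;
                    for (int j = 0; j < d; j++) {
                        avg[j] += (w[j] - avg[j]) / averaged;
                    }
                }
            }
        }
        return avg;
    }
}
```

A call such as AveragedSgdSketch.fit(xs, ys, 50, 0.1, 42L) keeps only two
parameter-sized arrays alive, however many rows xs has.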



On Sun, Jul 28, 2013 at 7:22 AM, Timothy Mann <mann.timothy@gmail.com> wrote:

> Hi users,
>
> Sorry for the long problem description.
>
> I implemented a radial basis function network for non-linear
> regression with adaptive centers and adaptive basis shapes (diagonal
> covariance matrix) using the Levenberg-Marquardt solver
>
> (org.apache.commons.math3.optim.nonlinear.vector.jacobian.LevenbergMarquardtOptimizer)
> and the ModelFunction's
> DerivativeStructure[] value(DerivativeStructure[] x)
> function using the DerivativeStructure API so that the derivatives are
> computed analytically.
>
> For a reasonably sized network with 200 radial bases, the number of
> parameters is
>
> (200 /* # bases */ +1 /* bias */ +((dim /* center of 1 basis */ + dim
> /* shape parameters of 1 basis */)*200))
>
> where "dim" is the dimension of the input vectors. This results in a
> few hundred free parameters. For small amounts of data, everything
> works fine. But in problems with high-dimensional input, I sometimes
> use tens of thousands (or even hundreds of thousands) of training
> samples. Unfortunately, with this much training data, I receive either
> a Java Heap Error or a Garbage Collection Error (in the middle of
> differentiation).
>
> The main problem seems to be that the optimizer expects the
> ModelFunction to return a vector evaluating all of the training
> samples to compare with the Target instance passed in as
> OptimizationData. For regular evaluation this isn't too much of a
> problem, but the memory used by the DerivativeStructure instances
> (spread out over a few hundred parameters times 10,000 evaluations) is
> massive.
>
> Is there any way to get the solver to evaluate the residuals/gradient
> incrementally?
>
> Thank you for the advice in advance.
>
> -Timothy A. Mann
>
