In my view the framework should be as simple as possible.
interface OptimizationFunction
{
    // Returns the value and its derivatives together, in one evaluation.
    DiffValue value(double[] x);
}
where
class DiffValue
{
    double val;
    double[] gradient;
}
class DiffValueHessian
{
    double val;
    double[] gradient;
    double[][] hessian;
}
or for least squares
class DiffValueLeastSquares
{
    double[] values;   // residuals
    double[][] J;      // Jacobian
}
This is all that any Newton-based optimizer would need. It does not require a large number
of new objects per evaluation, and it allows one to simply reuse "val" when computing
"gradient", etc. If the user wants to do automatic differentiation, we should provide them
with a wrapper function. Please, let's keep the optimization function as clean and simple
as possible.
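
To make the point about reusing "val" concrete, here is a minimal sketch of what I have
in mind. The function f(x) = g(x)^2 with g(x) = sum(x_i^2) - 1 is only an illustration,
and "SquaredResidual" is a made-up name, not an existing Commons Math class:

```java
// Hypothetical sketch of the proposed interface; not Commons Math API.
class DiffValue {
    final double val;
    final double[] gradient;

    DiffValue(double val, double[] gradient) {
        this.val = val;
        this.gradient = gradient;
    }
}

interface OptimizationFunction {
    DiffValue value(double[] x);
}

// Illustrative objective: f(x) = g(x)^2 with g(x) = sum(x_i^2) - 1.
class SquaredResidual implements OptimizationFunction {
    @Override
    public DiffValue value(double[] x) {
        // g(x) is computed once and shared by the value and the gradient.
        double g = -1.0;
        for (double xi : x) {
            g += xi * xi;
        }
        double[] grad = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            // Chain rule: df/dx_i = 2 * g(x) * dg/dx_i, with dg/dx_i = 2 * x_i.
            grad[i] = 2.0 * g * 2.0 * x[i];
        }
        return new DiffValue(g * g, grad);
    }
}
```

One call returns both pieces, so nothing has to be cached between a separate value
function and a separate derivative function.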
On Nov 30, 2012, at 1:12 PM, Gilles Sadowski <gilles@harfang.homelinux.org> wrote:
> Hello.
>
>> As a user of the optimization algorithms I am completely confused by the change.
>> It seems different from how optimization functions are typically used and seems to be
>> creating a barrier for no reason.
>
> If you think that it's for no reason, then you probably missed some
> important point: If you can express the objective function in terms of
> "DerivativeStructure" parameters, then you get all derivatives for free!
>
> Of course, that's not always easy (e.g. if the objective function is the
> result of a sizeable amount of code).
>
>>
>> I am not clear why you can't just leave the standard interface to an optimizer be
>> a function that computes the value and the Jacobian (in the case of least squares), the
>> gradient (for quasi-Newton methods) and, if you actually have a full Newton method, also
>> the Hessian.
>
> Maybe we will; that's the discussion point I raised in this thread.
> IIUC, there are cases where it is indeed a barrier to force users
> into using "DerivativeStructure" although it does not bring any advantage
> (like when the gradient and Jacobian are only accessible through finite
> differences).
>
>>
>> If the user wants to compute the Jacobian (gradient) using finite differences they
>> can do it themselves, or wrap it into a class that you can provide them that will compute
>> finite differences using the desired algorithm.
>
> That's one of the points below: We could assume that a finite difference
> differentiator is outside the realm of sensible use of "DerivativeStructure"
> (and we keep the converters) or we figure out what necessary and sufficient
> features such a differentiator must have to cover all usages of CM (e.g.
> enabling the derivative-based algorithms to work).
>
>>
>> Also I can imagine a case where computation of the Jacobian can be sped up if the function
>> value is known, yet you have two separate functions handling the derivatives and the actual
>> function value. For example f^2(x). You can probably devise some kind of caching scheme, but
>> still.
>
> If using the "forward" formula for first-order derivative, then knowing the
> value of the function spares one function evaluation per optimized
> parameter.
>
>>
>> Maybe I am missing something, but I spent about an hour trying to figure out how to
>> change my code to adapt to your new framework. Still haven't figured it out.
>
> You are not alone. I've spent much more than an hour, and only came up with
> questions. ;)
>
>
> Regards,
> Gilles
>
>>
>> On Nov 30, 2012, at 11:11 AM, Gilles Sadowski <gilles@harfang.homelinux.org> wrote:
>>
>>> Hello.
>>>
>>> Context:
>>> 1. A user application computes the Jacobian of a multivariate vector
>>> function (the output of a simulation) using finite differences.
>>> 2. The covariance matrix is obtained from "AbstractLeastSquaresOptimizer".
>>> In the new API, the Jacobian is supposed to be "automatically" computed
>>> from the "MultivariateDifferentiableVectorFunction" objective function.
>>> 3. The converter from "DifferentiableMultivariateVectorFunction" to
>>> "MultivariateDifferentiableVectorFunction" (in "FunctionUtils") is
>>> deprecated.
>>> 4. A "FiniteDifferencesDifferentiator" operator currently exists but only
>>> for univariate functions.
>>> Unless I'm mistaken, a direct extension to multiple variables won't do:
>>> * because the implementation uses the symmetric formula, but in some
>>> cases (bounded parameter range), it will fail, and
>>> * because of the floating point representation of real values, the
>>> delta for sampling the function should depend on the magnitude
>>> of the parameter value around which the sampling is done whereas the
>>> "stepSize" is constant in the implementation.
>>>
>>> Questions:
>>> 1. Shouldn't we keep the converters so that users can keep their "home-made"
>>> (first-order) derivative computations?
>>> [Converters exist for gradient of "DifferentiableMultivariateFunction"
>>> and Jacobian of "DifferentiableMultivariateVectorFunction".]
>>> 2. Is it worth creating the multivariate equivalent of the univariate
>>> "FiniteDifferencesDifferentiator", assuming that higher orders will
>>> rarely be used because of
>>> * the loss of accuracy (as stated in the doc), and/or
>>> * the sometimes prohibitively expensive number of evaluations of the
>>> objective function? [1]
>>> 3. As current CM optimization algorithms need only the gradient or
>>> Jacobian, would it be sufficient to only provide a limited (two-point
>>> first-order) finite differences operator (with the possibility to choose
>>> either "symmetric", "forward" or "backward" formula for each parameter)?
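
For what it's worth, a limited operator of the kind described in question 3 could be
as small as the sketch below. Everything in it is an assumption made for illustration:
"Formula" and "TwoPointGradient" are made-up names (not Commons Math classes), and
scaling the step as relStep * max(|x_i|, 1) is just one common heuristic for making
the delta follow the magnitude of the parameter.

```java
// Hypothetical sketch; none of these names exist in Commons Math.
import java.util.function.ToDoubleFunction;

enum Formula { FORWARD, BACKWARD, SYMMETRIC }

final class TwoPointGradient {
    private TwoPointGradient() {}

    /**
     * Approximates the gradient of f at x with two-point first-order formulas.
     * fx is the already-known value f(x); the one-sided formulas reuse it,
     * which saves one evaluation per parameter.
     */
    static double[] gradient(ToDoubleFunction<double[]> f, double[] x, double fx,
                             Formula[] formula, double relStep) {
        double[] grad = new double[x.length];
        double[] p = x.clone();
        for (int i = 0; i < x.length; i++) {
            // Step scaled by the parameter's magnitude (assumed heuristic).
            double h = relStep * Math.max(Math.abs(x[i]), 1.0);
            switch (formula[i]) {
                case FORWARD:
                    p[i] = x[i] + h;
                    grad[i] = (f.applyAsDouble(p) - fx) / h;
                    break;
                case BACKWARD:
                    p[i] = x[i] - h;
                    grad[i] = (fx - f.applyAsDouble(p)) / h;
                    break;
                default: // SYMMETRIC: two extra evaluations, fx is not used
                    p[i] = x[i] + h;
                    double fPlus = f.applyAsDouble(p);
                    p[i] = x[i] - h;
                    double fMinus = f.applyAsDouble(p);
                    grad[i] = (fPlus - fMinus) / (2.0 * h);
                    break;
            }
            p[i] = x[i]; // restore the coordinate before moving on
        }
        return grad;
    }
}
```

A parameter sitting on its upper bound would get BACKWARD, one on its lower bound
FORWARD, and everything else SYMMETRIC, so the function is never sampled outside the
allowed range.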
>>>
>>>
>>> Best regards,
>>> Gilles
>>>
>>> [1] And this cost is somewhat "hidden" (as the "DerivativeStructure" is
>>> supposed to provide the derivatives for free, which is not true in this
>>> case).
>
> 
> To unsubscribe, email: dev-unsubscribe@commons.apache.org
> For additional commands, email: dev-help@commons.apache.org
>

