commons-issues mailing list archives

From "Luc Maisonobe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-874) New API for optimizers
Date Wed, 24 Oct 2012 13:54:12 GMT

    [ https://issues.apache.org/jira/browse/MATH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13483231#comment-13483231 ]

Luc Maisonobe commented on MATH-874:
------------------------------------

Well, in fact there is not really any new CM code here, only a small amount of glue code.
The code that really changes is user code.

What changes is how users provide the Jacobian. With the former API, the user had to provide
two interlinked implementations: an implementation of the DifferentiableMultivariateVectorFunction
interface, which itself was a means to retrieve an implementation of the MultivariateMatrixFunction
interface. These two implementations had to be in different classes, as both interfaces define
a method named "value" taking a single double[] parameter, one method returning a double[]
and the other returning a double[][]. A common way to do this was to use a top-level class
for one interface and an inner class for the second interface.
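
As a rough illustration of that constraint, here is a self-contained sketch. The two interfaces
are local stand-ins that only mimic the shape of the Commons Math ones (their names and the
LinearMap class are made up for this example). Java does not allow two methods that differ
only in their return type, so one class cannot implement both "value" methods:

```java
// Local stand-ins for the two interfaces discussed above (hypothetical names),
// declared here only so that the sketch compiles on its own.
interface MatrixFunctionSketch {
    double[][] value(double[] point);
}

interface VectorFunctionSketch {
    double[] value(double[] point);
    MatrixFunctionSketch jacobian();
}

// A single class cannot implement both interfaces: the two value(double[])
// methods would have the same signature and differ only in return type.
// Hence the "top-level class plus inner class" pattern.
final class LinearMap implements VectorFunctionSketch {
    private final double[][] a;

    LinearMap(double[][] a) {
        this.a = a;
    }

    @Override
    public double[] value(double[] x) {
        double[] y = new double[a.length];
        for (int i = 0; i < a.length; ++i) {
            for (int j = 0; j < x.length; ++j) {
                y[i] += a[i][j] * x[j];
            }
        }
        return y;
    }

    @Override
    public MatrixFunctionSketch jacobian() {
        return new Jac();
    }

    // Inner class holding the second value(double[]) method.
    private final class Jac implements MatrixFunctionSketch {
        @Override
        public double[][] value(double[] x) {
            // The Jacobian of a linear map is the matrix itself (copied row by row).
            double[][] jac = new double[a.length][];
            for (int i = 0; i < a.length; ++i) {
                jac[i] = a[i].clone();
            }
            return jac;
        }
    }
}
```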

With the newer API, users provide a single class implementing two functions. The first function
is the same as in the former API and computes the value only. The second function computes
the value and the Jacobian together, and in fact could also provide higher-order derivatives,
or derivatives with respect to other variables if this function were chained after other functions.

The optimizers handle both cases in the same way after initialization. With the former
API, the optimizer stores a reference to both user objects (the one returning double[] and
the one returning double[][]). With the newer API, the optimizer stores a reference to the user
object and a reference to a wrapper around the user object that extracts the Jacobian from
the second method. The underlying optimization engine is exactly the same.
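
To give an idea of what such a wrapper does, here is a minimal self-contained sketch. It is
not the CM glue code: Dual is a simplified, first-order-only stand-in for DerivativeStructure,
and DualVectorFunction and JacobianWrapper are hypothetical names chosen for this example.

```java
// Hypothetical first-order stand-in for DerivativeStructure: a value together
// with its partial derivatives with respect to each free variable.
final class Dual {
    final double value;
    final double[] grad;

    Dual(double value, double[] grad) {
        this.value = value;
        this.grad = grad;
    }

    // The k-th of n free variables: derivative 1 with respect to itself, 0 otherwise.
    static Dual variable(int n, int k, double value) {
        double[] g = new double[n];
        g[k] = 1.0;
        return new Dual(value, g);
    }

    Dual add(Dual o) {
        double[] g = new double[grad.length];
        for (int i = 0; i < g.length; ++i) {
            g[i] = grad[i] + o.grad[i];
        }
        return new Dual(value + o.value, g);
    }

    Dual multiply(Dual o) {
        // Product rule: d(uv) = du * v + u * dv
        double[] g = new double[grad.length];
        for (int i = 0; i < g.length; ++i) {
            g[i] = grad[i] * o.value + value * o.grad[i];
        }
        return new Dual(value * o.value, g);
    }
}

// Hypothetical counterpart of the user's second "value" method.
interface DualVectorFunction {
    Dual[] value(Dual[] point);
}

// Wrapper that turns the second method into a plain Jacobian provider.
final class JacobianWrapper {
    private final DualVectorFunction f;

    JacobianWrapper(DualVectorFunction f) {
        this.f = f;
    }

    double[][] value(double[] point) {
        int n = point.length;
        Dual[] in = new Dual[n];
        for (int k = 0; k < n; ++k) {
            in[k] = Dual.variable(n, k, point[k]);
        }
        Dual[] out = f.value(in);
        // Row i of the Jacobian is the gradient carried by output component i.
        double[][] jac = new double[out.length][];
        for (int i = 0; i < out.length; ++i) {
            jac[i] = out[i].grad.clone();
        }
        return jac;
    }
}
```

For instance, wrapping the function (x, y) -> (x * y, x + y), written as
p -> new Dual[] { p[0].multiply(p[1]), p[0].add(p[1]) }, and evaluating the wrapper at (2, 3)
gives the rows {3, 2} and {1, 1} without any hand-written derivative.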


What it means for users is the following:

* the part of user code dedicated to setting up and calling the optimizer is not changed at all
* the part of user code dedicated to computing the function value is not changed at all
* the part of user code dedicated to computing the function Jacobian is changed

For closed-form functions, the change to the Jacobian computation is in fact a simplification.
Users are not required to apply the chain rule by themselves; they simply have to change
double variables into DerivativeStructure variables and accordingly change the +, -, * ...
operators into calls to add, subtract, multiply ...

Here is an example, reworked from the unit tests:

{code:title=FormerAPI}
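import org.apache.commons.math3.analysis.DifferentiableMultivariateVectorFunction;
import org.apache.commons.math3.analysis.MultivariateMatrixFunction;

public class Brown implements DifferentiableMultivariateVectorFunction {

  // m (number of function components) and n (number of variables) are assumed
  // to be fields defined in a part of the class that is not shown in this excerpt.
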
  public double[] value(double[] variables) {
    double[] f = new double[m];
    double sum  = -(n + 1);
    double prod = 1;
    for (int j = 0; j < n; ++j) {
      sum  += variables[j];
      prod *= variables[j];
    }
    for (int i = 0; i < n; ++i) {
      f[i] = variables[i] + sum;
    }
    f[n - 1] = prod - 1;
    return f;
  }

  public MultivariateMatrixFunction jacobian() {
    return new Internal();
  }

  private class Internal implements MultivariateMatrixFunction {
    public double[][] value(double[] variables) {
      double[][] jacobian = new double[m][];
      for (int i = 0; i < m; ++i) {
        jacobian[i] = new double[n];
      }

      double prod = 1;
      for (int j = 0; j < n; ++j) {
        prod *= variables[j];
        for (int i = 0; i < n; ++i) {
          jacobian[i][j] = 1;
        }
        jacobian[j][j] = 2;
      }

      for (int j = 0; j < n; ++j) {
        double temp = variables[j];
        if (temp == 0) {
          temp = 1;
          prod = 1;
          for (int k = 0; k < n; ++k) {
            if (k != j) {
              prod *= variables[k];
            }
          }
        }
        jacobian[n - 1][j] = prod / temp;
      }

      return jacobian;

    }

  }

}
{code}
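
For reference, the hand-written Jacobian above can be read off from the function definition.
This is a sketch derived from the code itself, using 1-based indices and taking m = n as the
loops assume:

```latex
f_i(x) = x_i + \sum_{j=1}^{n} x_j - (n + 1), \qquad 1 \le i \le n - 1
f_n(x) = \prod_{j=1}^{n} x_j - 1

\frac{\partial f_i}{\partial x_j} = 1 + \delta_{ij}, \qquad 1 \le i \le n - 1
\frac{\partial f_n}{\partial x_j} = \prod_{k \ne j} x_k
```

The first n - 1 rows give the 1 and 2 entries, and the last row is what the code computes as
prod / variables[j], with a separate product over k != j when variables[j] is zero.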

{code:title=NewerAPI}
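import org.apache.commons.math3.analysis.differentiation.DerivativeStructure;
import org.apache.commons.math3.analysis.differentiation.MultivariateDifferentiableVectorFunction;

public class Brown implements MultivariateDifferentiableVectorFunction {

  // m (number of function components) and n (number of variables) are assumed
  // to be fields defined in a part of the class that is not shown in this excerpt.
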
  public double[] value(double[] variables) {
    double[] f = new double[m];
    double sum  = -(n + 1);
    double prod = 1;
    for (int j = 0; j < n; ++j) {
      sum  += variables[j];
      prod *= variables[j];
    }
    for (int i = 0; i < n; ++i) {
      f[i] = variables[i] + sum;
    }
    f[n - 1] = prod - 1;
    return f;
  }

  public DerivativeStructure[] value(DerivativeStructure[] variables) {
    DerivativeStructure[] f = new DerivativeStructure[m];
    DerivativeStructure sum  = variables[0].getField().getZero().subtract(n + 1);
    DerivativeStructure prod = variables[0].getField().getOne();
    for (int j = 0; j < n; ++j) {
      sum  = sum.add(variables[j]);
      prod = prod.multiply(variables[j]);
    }
    for (int i = 0; i < n; ++i) {
      f[i] = variables[i].add(sum);
    }
    f[n - 1] = prod.subtract(1);
    return f;
  }

} 
{code}

You can note that with the newer API, creating the second method (with DerivativeStructure)
from the first method (with double) is straightforward. It is mainly copy/paste, then changing
the variable types and fixing all operator calls (and this is what Commons Nabla attempts to
do automatically at bytecode level).
                
> New API for optimizers
> ----------------------
>
>                 Key: MATH-874
>                 URL: https://issues.apache.org/jira/browse/MATH-874
>             Project: Commons Math
>          Issue Type: Improvement
>    Affects Versions: 3.0
>            Reporter: Gilles
>            Assignee: Gilles
>            Priority: Minor
>              Labels: api-change
>             Fix For: 3.1, 4.0
>
>         Attachments: optimizers.patch
>
>
> I suggest changing the signatures of the "optimize" methods in
> * {{UnivariateOptimizer}}
> * {{MultivariateOptimizer}}
> * {{MultivariateDifferentiableOptimizer}}
> * {{MultivariateDifferentiableVectorOptimizer}}
> * {{BaseMultivariateSimpleBoundsOptimizer}}
> Currently, the arguments are
> * the allowed number of evaluations of the objective function
> * the objective function
> * the type of optimization (minimize or maximize)
> * the initial guess
> * optionally, the lower and upper bounds
> A marker interface:
> {code}
> public interface OptimizationData {}
> {code}
> would in effect be implemented by all input data so that the signature would become (for
> {{MultivariateOptimizer}}):
> {code}
> public PointValuePair optimize(MultivariateFunction f,
>                                OptimizationData... optData);
> {code}
> A [thread|http://markmail.org/message/fbmqrbf2t5pb5br5] was started on the "dev" ML.
> Initially, this proposal aimed at avoiding calls to some optimizer-specific methods. An
> example is the "setSimplex" method in "o.a.c.m.optimization.direct.SimplexOptimizer": it must
> be called before the call to "optimize". Not only does this depart from the common API, but the
> definition of the simplex also fixes the dimension of the problem; hence it would be more
> natural to pass it together with the other parameters (i.e. in "optimize") that are also
> dimension-dependent (initial guess, bounds).
> Eventually, the API will be simpler: users will
> # construct an optimizer (passing dimension-independent parameters at construction),
> # call "optimize" (passing any dimension-dependent parameters).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
