From: luc@apache.org
To: commits@commons.apache.org
Reply-To: dev@commons.apache.org
Subject: [math] added a least squares section in the user guide
Date: Mon, 27 Jul 2015 20:14:39 +0000 (UTC)
Message-Id: <49a0459ab430416091ae7359095974ed@git.apache.org>

Repository: commons-math
Updated Branches:
  refs/heads/master 09fe956a6 -> 2309f28e3

added a least squares section in the user guide

Project: http://git-wip-us.apache.org/repos/asf/commons-math/repo
Commit: http://git-wip-us.apache.org/repos/asf/commons-math/commit/2309f28e
Tree: http://git-wip-us.apache.org/repos/asf/commons-math/tree/2309f28e
Diff: http://git-wip-us.apache.org/repos/asf/commons-math/diff/2309f28e

Branch: refs/heads/master
Commit: 2309f28e3d6b81611d793bb5b583a11a48ef4a20
Parents: 09fe956
Author: Luc Maisonobe
Authored: Mon Jul 27 22:14:17 2015 +0200
Committer: Luc Maisonobe
Committed: Mon Jul 27 22:14:17 2015 +0200

----------------------------------------------------------------------
 src/site/site.xml                        |   6 +-
 src/site/xdoc/userguide/exceptions.xml   |  10 +-
 src/site/xdoc/userguide/filter.xml       |   6 +-
 src/site/xdoc/userguide/fitting.xml      |   8 +-
 src/site/xdoc/userguide/genetics.xml     |   8 +-
 src/site/xdoc/userguide/index.xml        |  64 +++--
 src/site/xdoc/userguide/leastsquares.xml | 366 ++++++++++++++++++++++++++
 src/site/xdoc/userguide/ode.xml          |  12 +-
 src/site/xdoc/userguide/optimization.xml |   7 +-
 9 files changed, 436 insertions(+), 51 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/site.xml
----------------------------------------------------------------------
diff --git a/src/site/site.xml b/src/site/site.xml
index 6f45d47..b6e0f7a 100644
--- a/src/site/site.xml
+++ b/src/site/site.xml
@@ -67,13 +67,15 @@
-
+
+
+
-
+

http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/exceptions.xml
----------------------------------------------------------------------
diff --git a/src/site/xdoc/userguide/exceptions.xml b/src/site/xdoc/userguide/exceptions.xml
index d16781a..4376116 100644
--- a/src/site/xdoc/userguide/exceptions.xml
+++ b/src/site/xdoc/userguide/exceptions.xml
@@ -23,14 +23,14 @@ The Commons Math User Guide - Exceptions
-
+
- + Commons Math defines a set of exceptions in order to convey the precise low-level cause of failure. - +

Starting from version 3.0, all exceptions generated by the Commons Math code are unchecked (i.e. they inherit from @@ -46,7 +46,7 @@

- +

The exceptions defined by Commons Math follow the Java standard hierarchies: @@ -88,7 +88,7 @@

- +
  • Localization

    http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/filter.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/filter.xml b/src/site/xdoc/userguide/filter.xml index a241bea..b298953 100644 --- a/src/site/xdoc/userguide/filter.xml +++ b/src/site/xdoc/userguide/filter.xml @@ -23,13 +23,13 @@ The Commons Math User Guide - Filters -

    - +
    +

    The filter package currently provides only an implementation of a Kalman filter.

    - +

    KalmanFilter provides a discrete-time filter to estimate http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/fitting.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/fitting.xml b/src/site/xdoc/userguide/fitting.xml index c7c8bb1..6caace8 100644 --- a/src/site/xdoc/userguide/fitting.xml +++ b/src/site/xdoc/userguide/fitting.xml @@ -25,8 +25,8 @@ -

    - +
    +

    The fitting package deals with curve fitting for univariate real functions. When a univariate real function y = f(x) does depend on some unknown parameters @@ -76,7 +76,7 @@ - +

    Fitting of specific functions are provided through the following classes: @@ -128,7 +128,7 @@ final double[] coeff = fitter.fit(obs.toList()); - +

    The AbstractCurveFitter class provides the basic functionality for implementing other http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/genetics.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/genetics.xml b/src/site/xdoc/userguide/genetics.xml index 49fb749..1455a8d 100644 --- a/src/site/xdoc/userguide/genetics.xml +++ b/src/site/xdoc/userguide/genetics.xml @@ -23,14 +23,14 @@ The Commons Math User Guide - Genetic Algorithms -

    - +
    +

    The genetics package provides a framework and implementations for genetic algorithms.

    - +

    GeneticAlgorithm provides an execution framework for Genetic Algorithms (GA). @@ -76,7 +76,7 @@

    - +

    Here is an example GA execution: http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/index.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/index.xml b/src/site/xdoc/userguide/index.xml index a80f40a..0dfd8e6 100644 --- a/src/site/xdoc/userguide/index.xml +++ b/src/site/xdoc/userguide/index.xml @@ -128,39 +128,43 @@

  • 12.4 Direct Methods
  • 12.5 General Case
-
  • 13. Ordinary Differential Equations Integration -
  • -
  • 14. Genetic Algorithms -
  • -
  • 15. Filters +
  • 13. Curve Fitting
  • -
  • 16. Exceptions +
  • 14. Least Squares
  • -
  • 17. Curve Fitting +
  • 15. Ordinary Differential Equations Integration +
  • +
  • 16. Genetic Algorithms +
  • +
  • 17. Filters
  • 18. Machine Learning @@ -170,6 +174,14 @@
  • 18.3 Implementation
  • +
  • 19. Exceptions + +
  • http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/leastsquares.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/leastsquares.xml b/src/site/xdoc/userguide/leastsquares.xml new file mode 100644 index 0000000..8da9e78 --- /dev/null +++ b/src/site/xdoc/userguide/leastsquares.xml @@ -0,0 +1,366 @@ + + + + + + + + + The Commons Math User Guide - Least squares + + + +
    + +

+ The least squares package fits a parametric model to a set of observed values by
  minimizing a cost function with a specific form. The fitting consists in finding the
  values of some free parameters pk such that a cost function
  J = sum(wi (targeti - modeli)^2) is minimized. The various (targeti - modeli(pk))
  terms are called residuals. They represent the deviation between a set of target
  values targeti and theoretical values computed from models modeli depending on the
  free parameters pk. The wi factors are weights. One classical use case is when the
  target values are experimental observations or measurements.
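As a concrete illustration of this cost function, the sketch below spells out the sum the engine minimizes; the cost method is a hypothetical helper written only to make the formula explicit, not a library API.

    // hypothetical helper making J = sum(wi (targeti - modeli)^2) explicit
    public static double cost(final double[] target, final double[] model, final double[] weights) {
        double j = 0.0;
        for (int i = 0; i < target.length; ++i) {
            final double residual = target[i] - model[i]; // targeti - modeli
            j += weights[i] * residual * residual;        // wi * residual^2
        }
        return j;
    }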

    +

    + Two engines devoted to least-squares problems are available. The first one is + based on the + Gauss-Newton method. The second one is the + Levenberg-Marquardt method. +

    +
    + + + +

    + In order to solve a least-squares fitting problem, the user must provide the following elements: +

      +
• a means to evaluate all the components of the model for a given set of parameters:
  modeli = fi(p1, p2, ... pk); this is code
    • +
    • the target (or observed) components: targeti, this is data
    • +
    • the start values for all pk parameters: sk, this is data
    • +
    • optionally a validator for the pk parameters, this is code
    • +
    • optionally the weight for sample point i: wi, this is data and defaults to 1.0 if not provided
    • +
    • a maximum number of iterations, this is data
    • +
    • a maximum number of evaluations, this is data
    • +
    • a convergence criterion, this is code
    • +
    +

    +

+ The elements of the list above can be provided as an implementation of the
  LeastSquaresProblem interface. However, this is cumbersome to do directly, so some
  helper classes are available. The first helper is a mutable builder:
  LeastSquaresBuilder. The second helper is a utility factory: LeastSquaresFactory.

    +

+ The builder class is better suited when the various elements of the least squares
  problem are set progressively, in different places in the user code. In this case,
  the user first creates an empty builder and configures it progressively by calling
  its methods (start, target, model, ...). Once the configuration is complete, calling
  the build method creates the least squares problem.
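A minimal sketch of this progressive configuration is shown below; it assumes that observations (a double array) and modelFunction have already been defined elsewhere, and it reuses the builder method names that appear in the complete example at the end of this section.

    // the builder can be configured incrementally, possibly from different places in the code
    LeastSquaresBuilder builder = new LeastSquaresBuilder();
    builder.start(new double[] { 1.0, 1.0 });      // start values for the parameters pk
    builder.target(observations);                  // target (observed) components targeti
    builder.model(modelFunction);                  // model function, e.g. a MultivariateJacobianFunction
    builder.maxEvaluations(1000);                  // safeguard on the number of model evaluations
    builder.maxIterations(1000);                   // safeguard on the number of algorithm iterations
    LeastSquaresProblem problem = builder.build(); // build the problem once fully configured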

    +

+ The factory utility is better suited when the various elements of the least squares
  problem are all known in one place and the problem can be built in a single sweep,
  by calling one of the static LeastSquaresFactory.create methods.
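A sketch of the one-sweep approach follows; it assumes model, observed, start and checker instances are already available, and that one of the create overloads accepts them in this order (the exact overload list should be checked against the LeastSquaresFactory Javadoc).

    // build the whole problem in one call when everything is known at the same place
    LeastSquaresProblem problem =
        LeastSquaresFactory.create(model,     // model function (values and Jacobian)
                                   observed,  // target components, as a RealVector
                                   start,     // start point, as a RealVector
                                   checker,   // convergence checker
                                   1000,      // maximum number of evaluations
                                   1000);     // maximum number of iterations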

    +
    + + +

    + The model function is used by the least squares engine to evaluate the model components + modeli given some test parameters pk. It is therefore a multivariate + function (it depends on the various pk) and it is vector-valued (it has several + components modeli). There must be exactly one component modeli for + each target (or observed) component targeti, otherwise some residuals to be + squared and summed could not be computed. In order for the problem to be well defined, the + number of parameters pk must be less than the number of components modeli. + Failing to ensure this may lead to the engine throwing an exception as the underlying linear + algebra operations may encounter singular matrices. It is not unusual to have a large number + of components (several thousands) and only a dozen parameters. There are no limitations on these + numbers, though. +

    +

+ As the least squares engine needs to create Jacobian matrices for the model function,
  both its value and its derivatives with respect to the pk parameters must be
  available. There are two ways to provide this: the first way is to provide two
  separate functions, one computing only the model values (a MultivariateVectorFunction)
  and one computing only the Jacobian (a MultivariateMatrixFunction); the second way is
  to provide a single function computing both the values and the Jacobian at once
  (a MultivariateJacobianFunction).
+ The first alternative is best suited for models which are not computationally
  intensive, as it allows more modularized code with one method for each type of
  computation. The second alternative is best suited for models which are
  computationally intensive, as evaluating both the values and the derivatives in one
  sweep saves a lot of work.
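A sketch of the first (separated) alternative is shown below. The computeModel and computeJacobian methods, the startPoint array and the observations array are hypothetical user code, and the two-argument model overload of the builder is an assumption to be checked against the LeastSquaresBuilder Javadoc.

    // separated alternative: one function for the values, one for the Jacobian
    MultivariateVectorFunction value = new MultivariateVectorFunction() {
        public double[] value(double[] point) {
            return computeModel(point);    // hypothetical user method computing the modeli
        }
    };
    MultivariateMatrixFunction jacobian = new MultivariateMatrixFunction() {
        public double[][] value(double[] point) {
            return computeJacobian(point); // hypothetical user method computing dmodeli/dpk
        }
    };
    LeastSquaresProblem problem = new LeastSquaresBuilder().
                                  model(value, jacobian). // assumed two-argument overload
                                  start(startPoint).
                                  target(observations).
                                  maxEvaluations(1000).
                                  maxIterations(1000).
                                  build();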

    +

    + The point parameter of the value methods in the + MultivariateVectorFunction, + MultivariateMatrixFunction, + or MultivariateJacobianFunction + interfaces will contain the parameters pk. The values will be the model components + modeli and the derivatives will be the derivatives of the model components + with respect to the parameters dmodeli/dpk. +

    +

+ There are no requirements on how to compute the value and the derivatives. The
  DerivativeStructure class may be useful to compute derivatives analytically in
  difficult cases, but this class is not mandated by the API, which only expects the
  derivatives as a Jacobian matrix containing primitive double entries.
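For example, for a model component of the form modeli = p0 sin(p1), the value and partial derivatives could be computed analytically with DerivativeStructure as in the sketch below (the parameter values 3.0 and 0.5 are arbitrary).

    // two free parameters, derivation order 1; index 0 is p0, index 1 is p1
    DerivativeStructure p0 = new DerivativeStructure(2, 1, 0, 3.0);
    DerivativeStructure p1 = new DerivativeStructure(2, 1, 1, 0.5);
    DerivativeStructure modelI = p0.multiply(p1.sin());      // modeli = p0 * sin(p1)

    double value     = modelI.getValue();                    // model component value
    double dModelDp0 = modelI.getPartialDerivative(1, 0);    // dmodeli/dp0 = sin(p1)
    double dModelDp1 = modelI.getPartialDerivative(0, 1);    // dmodeli/dp1 = p0 * cos(p1)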

    +

+ One non-obvious feature provided by both the builder and the factory is lazy
  evaluation. This feature allows deferring calls to the model functions until they are
  really needed by the engine. This can save some calls for engines that evaluate the
  value and the Jacobian in different loops (this is the case for Levenberg-Marquardt).
  However, lazy evaluation is possible only if the model functions are themselves
  separated, i.e. it can be used only with the first alternative above. Setting the
  lazyEvaluation flag to true in the builder or factory while also setting up the model
  function as a single MultivariateJacobianFunction instance will trigger an illegal
  state exception stating that the model function lacks the required functionality.
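A short sketch of a lazy configuration, reusing the separated value and jacobian functions from the previous sketch, could look as follows.

    // lazy evaluation requires the separated value/Jacobian functions; combining
    // lazyEvaluation(true) with a single MultivariateJacobianFunction is rejected
    LeastSquaresProblem problem = new LeastSquaresBuilder().
                                  model(value, jacobian).
                                  lazyEvaluation(true).
                                  start(startPoint).
                                  target(observations).
                                  maxEvaluations(1000).
                                  maxIterations(1000).
                                  build();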

    +
    + + +

+ In some cases, the model function requires parameters to lie within a specific
  domain. For example, a parameter may be used in a square root and needs to be
  positive, another parameter may represent the sine of an angle and must be between
  -1 and +1, or several parameters may need to remain in the unit circle so the sum of
  their squares must be smaller than 1. The least squares solvers available in Apache
  Commons Math currently do not allow setting up constraints on the parameters. This is
  a known missing feature. There are two ways to circumvent this.

    +

    + Both ways are achieved by setting up a + ParameterValidator + instance. The input of the value and jacobian model functions will always be the output of + the parameter validator if one exists. +

    +

    + One way to constrain parameters is to use a continuous mapping between the parameters that the + least squares solver will handle and the real parameters of the mathematical model. Using mapping + functions like logit and sigmoid, one can map a finite range to the + infinite real line. Using mapping functions based on log and exp, one + can map a semi-infinite range to the infinite real line. It is possible to use such a mapping so + that the engine will always see unbounded parameters, whereas on the other side of the mapping the + mathematical model will always see parameters mapped correctly to the expected range. Care must be + taken with derivatives as one must remember that the parameters have been mapped. Care must also + be taken with convergence status. This may be tricky. +
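As an illustration of the mapping approach, the hypothetical helper methods below map an unbounded internal parameter q to a physical parameter p constrained to an interval (min, max) using a sigmoid, and back using a logit; they are not library methods.

    // map an unbounded internal parameter q to a bounded physical parameter p in (min, max)
    public static double toBounded(final double q, final double min, final double max) {
        return min + (max - min) / (1.0 + FastMath.exp(-q));  // sigmoid
    }

    // inverse mapping, from the bounded physical parameter back to the unbounded one
    public static double toUnbounded(final double p, final double min, final double max) {
        return FastMath.log((p - min) / (max - p));           // logit
    }

    // when filling the Jacobian, the chain rule must be applied:
    // dmodel/dq = dmodel/dp * dp/dq, with dp/dq = (p - min) * (max - p) / (max - min)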

    +

+ Another way to constrain parameters is to simply truncate the parameters back to the
  domain when a search point escapes from it, without caring about derivatives. This
  works only if the solution is expected to be inside the domain and not at its
  boundary, as points out of the domain will only be temporary test points with a cost
  function higher than the real solution and will soon be dropped by the underlying
  engine. As a rule of thumb, these conditions are met only when the domain boundaries
  correspond to unrealistic values that will never be achieved (zero distances,
  negative masses, ...) but they are not met when the domain boundaries are operational
  limits (a maximum weight that can be handled by a device, a minimum temperature that
  can be sustained by an instrument, ...).
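A minimal sketch of the truncation approach is given below for a model whose parameters must all stay non-negative; it assumes the ParameterValidator interface exposes a single validate method taking and returning a RealVector, and that the builder registers it through a parameterValidator method (both to be checked against the Javadoc).

    // clamp parameters back into their domain before each model evaluation
    ParameterValidator nonNegative = new ParameterValidator() {
        public RealVector validate(final RealVector params) {
            for (int i = 0; i < params.getDimension(); ++i) {
                params.setEntry(i, FastMath.max(0.0, params.getEntry(i)));
            }
            return params;
        }
    };
    // the validator is then registered when building the problem, e.g.:
    // builder.parameterValidator(nonNegative);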

    +
    + + +

    + Among the elements to be provided to the least squares problem builder or factory + are some tuning parameters for the solver. +

    +

+ The maximum number of iterations refers to the engine algorithm main loop, whereas
  the maximum number of evaluations refers to the number of calls to the model method.
  Some algorithms (like Levenberg-Marquardt) have two embedded loops, with the
  iteration number being incremented at the outer loop level and a new evaluation being
  done at each inner loop, so the number of evaluations will be greater than the number
  of iterations. Other algorithms (like Gauss-Newton) have only one level of loops; in
  this case, the number of evaluations will be equal to the number of iterations. In
  any case, these maximum numbers are really only intended as safeguards to prevent
  infinite loops, so the exact value of the limit is not important; it is common to
  select some almost arbitrary number much larger than the expected number of
  evaluations and use it for both maxIterations and maxEvaluations. As an example, if
  the least squares solver usually finds a solution in 50 iterations, setting the
  maximum to 1000 is probably safe and prevents infinite loops. If the least squares
  solver needs several hundreds of evaluations, it would probably be safer to set the
  maximum to 10000 or even 1000000 to avoid failures in slightly more demanding cases.
  Very fine tuning of these maximum numbers is often worthless; they are only intended
  as safeguards.

    +

+ Convergence checking is delegated to a dedicated interface from the optim package:
  ConvergenceChecker, parameterized with either the specific Evaluation class used for
  least squares problems or the general PointVectorValuePair. Each time convergence is
  checked, both the previous and the current evaluations of the least squares problem
  are provided, so the checker can compare them and decide whether convergence has been
  reached or not. Several predefined convergence checker implementations are useful for
  least squares fitting.

    + Of course, users can also provide their own implementation of the + ConvergenceChecker + interface. +
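As a sketch, a custom checker declaring convergence when the RMS of two successive evaluations no longer changes significantly could look like the following (the 1.0e-6 threshold is an arbitrary value chosen for the example).

    // declare convergence when the RMS stops improving significantly
    ConvergenceChecker<LeastSquaresProblem.Evaluation> rmsChecker =
        new ConvergenceChecker<LeastSquaresProblem.Evaluation>() {
            public boolean converged(final int iteration,
                                     final LeastSquaresProblem.Evaluation previous,
                                     final LeastSquaresProblem.Evaluation current) {
                return FastMath.abs(previous.getRMS() - current.getRMS()) <= 1.0e-6;
            }
        };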

    +
    + + +

+ Once the least squares problem has been created, using either the builder or the
  factory, it is passed to an optimization engine for solving. Two engines devoted to
  least-squares problems are available. The first one is based on the Gauss-Newton
  method, the second one on the Levenberg-Marquardt method. Both for readability and to
  accommodate possible future changes in the configuration, it is recommended to use
  the fluent-style API to build and configure the optimizers. This means creating a
  first temporary version of the optimizer with a default parameterless constructor,
  and then setting up the various configuration parameters using the available withXxx
  methods, which all return a new optimizer instance. Only the final, fully configured
  instance is used. As an example, setting up a Levenberg-Marquardt optimizer with all
  configuration set to default values except the cost relative tolerance and the
  parameter relative tolerance would be done as follows:

    + + LeastSquaresOptimizer optimizer = new LevenbergMarquardtOptimizer(). + withCostRelativeTolerance(1.0e-12). + withParameterRelativeTolerance(1.0e-12); + + +

    + As another example, setting up a Gauss-Newton optimizer and forcing the decomposition to SVD (the + default is QR decomposition) would be done as follows: +

  LeastSquaresOptimizer optimizer = new GaussNewtonOptimizer().
                                    withDecomposition(GaussNewtonOptimizer.Decomposition.SVD);
    + + +

    + Solving the least squares problem is done by calling the optimize method of the + optimizer and passing the least squares problem as the single parameter: +

    + + LeastSquaresOptimizer.Optimum optimum = optimizer.optimize(leastSquaresProblem); + + +

+ The LeastSquaresOptimizer.Optimum class is a specialized Evaluation with additional
  methods to retrieve the number of evaluations and the number of iterations performed.
  The most important methods are inherited from the Evaluation class and correspond to
  the point (i.e. the parameters), cost, Jacobian, RMS, covariance ...
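For instance, once the optimum has been obtained, the results could be retrieved as sketched below (the 1.0e-10 value passed to the covariance computation is an arbitrary singularity threshold chosen for the example).

    RealVector point = optimum.getPoint();              // fitted parameters pk
    double rms       = optimum.getRMS();                // root mean square of the residuals
    double cost      = optimum.getCost();               // value of the cost function
    RealMatrix cov   = optimum.getCovariances(1.0e-10); // covariance matrix of the parameters
    int evaluations  = optimum.getEvaluations();        // number of model evaluations
    int iterations   = optimum.getIterations();         // number of algorithm iterations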

    +
    + + +

+ The following simple example shows how to find the center of a circle of known radius
  that best fits observed 2D points. It is a simplified version of one of the JUnit
  test cases; in the complete test case, both the circle center and its radius are
  fitted, whereas here the radius is fixed.

  final double radius = 70.0;
  final Vector2D[] observedPoints = new Vector2D[] {
      new Vector2D( 30.0,  68.0),
      new Vector2D( 50.0,  -6.0),
      new Vector2D(110.0, -20.0),
      new Vector2D( 35.0,  15.0),
      new Vector2D( 45.0,  97.0)
  };

  // the model function components are the distances to the current estimated center,
  // they should be as close as possible to the specified radius
  MultivariateJacobianFunction distancesToCurrentCenter = new MultivariateJacobianFunction() {
      public Pair<RealVector, RealMatrix> value(final RealVector point) {

          Vector2D center = new Vector2D(point.getEntry(0), point.getEntry(1));

          RealVector value    = new ArrayRealVector(observedPoints.length);
          RealMatrix jacobian = new Array2DRowRealMatrix(observedPoints.length, 2);

          for (int i = 0; i < observedPoints.length; ++i) {
              Vector2D o = observedPoints[i];
              double modelI = Vector2D.distance(o, center);
              value.setEntry(i, modelI);
              // derivative with respect to p0 = x center
              jacobian.setEntry(i, 0, (center.getX() - o.getX()) / modelI);
              // derivative with respect to p1 = y center
              jacobian.setEntry(i, 1, (center.getY() - o.getY()) / modelI);
          }

          return new Pair<RealVector, RealMatrix>(value, jacobian);

      }
  };

  // the target is to have all points at the specified radius from the center
  double[] prescribedDistances = new double[observedPoints.length];
  Arrays.fill(prescribedDistances, radius);

  // least squares problem to solve: modeled radius should be close to target radius
  LeastSquaresProblem problem = new LeastSquaresBuilder().
                                start(new double[] { 100.0, 50.0 }).
                                model(distancesToCurrentCenter).
                                target(prescribedDistances).
                                lazyEvaluation(false).
                                maxEvaluations(1000).
                                maxIterations(1000).
                                build();
  LeastSquaresOptimizer.Optimum optimum = new LevenbergMarquardtOptimizer().optimize(problem);
  Vector2D fittedCenter = new Vector2D(optimum.getPoint().getEntry(0), optimum.getPoint().getEntry(1));
  System.out.println("fitted center: " + fittedCenter.getX() + " " + fittedCenter.getY());
  System.out.println("RMS: " + optimum.getRMS());
  System.out.println("evaluations: " + optimum.getEvaluations());
  System.out.println("iterations: " + optimum.getIterations());
    + +
    + +
    http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/ode.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/ode.xml b/src/site/xdoc/userguide/ode.xml index d3b2eea..b2fe395 100644 --- a/src/site/xdoc/userguide/ode.xml +++ b/src/site/xdoc/userguide/ode.xml @@ -25,8 +25,8 @@ -
    - +
    +

    The ode package provides classes to solve Ordinary Differential Equations problems.

    @@ -115,7 +115,7 @@ double[] y = new double[] { 0.0, 1.0 }; // initial state dp853.integrate(ode, 0.0, y, 16.0, y); // now y contains final state at time t=16.0
    - +

    The solution of the integration problem is provided by two means. The first one is aimed towards simple use: the state vector at the end of the integration process is copied in the y array of the @@ -178,7 +178,7 @@ integrator.addStepHandler(stepHandler); automatic guess is wrong.

    - +

    ODE problems are continuous ones. However, sometimes discrete events must be taken into account. The most frequent case is the stop condition of the integrator @@ -256,7 +256,7 @@ public int eventOccurred(double t, double[] y, boolean increasing) { } - +

    The tables below show the various integrators available for non-stiff problems. Note that the implementations of Adams-Bashforth and Adams-Moulton are adaptive stepsize, not fixed stepsize @@ -288,7 +288,7 @@ public int eventOccurred(double t, double[] y, boolean increasing) {

    - +

    If in addition to state y(t) the user needs to compute the sensitivity of the final state with respect to the initial state (dy/dy0) or the sensitivity of the final state with respect to some parameters http://git-wip-us.apache.org/repos/asf/commons-math/blob/2309f28e/src/site/xdoc/userguide/optimization.xml ---------------------------------------------------------------------- diff --git a/src/site/xdoc/userguide/optimization.xml b/src/site/xdoc/userguide/optimization.xml index f691ffa..7790db1 100644 --- a/src/site/xdoc/userguide/optimization.xml +++ b/src/site/xdoc/userguide/optimization.xml @@ -27,7 +27,12 @@

    The contents of this section currently describes deprecated classes. - Please refer to the new API description. + Please refer to the new API + description. +

    +

Least squares optimizers are not in this package anymore; they have been moved to a
  dedicated least-squares sub-package described in the least squares section.