commons-dev mailing list archives

From Thomas Neidhart <thomas.neidh...@gmail.com>
Subject Re: [math] fluent/builder API for algorithms in 4.0 ?
Date Tue, 30 Apr 2013 09:11:34 GMT
+1, looks really nice.

Would like to do the same for the Clusterer interface.


On Mon, Apr 29, 2013 at 6:40 PM, Luc Maisonobe <luc@spaceroots.org> wrote:

> Hi all,
>
> Since the 2.x series, we have been struggling in several areas with
> our algorithm APIs. The latest change was about optimizers, but it is
> only one example among others (solvers, integration, ODE, and maybe
> some parts of statistics may be concerned by the proposal below).
>
> The various properties we want to preserve, and which are not always
> compatible with each other, are:
>
>  1) simple use
>  2) immutability
>  3) good OO design
>  4) compatible with reference algorithms implementations
>  5) maintainable
>  6) extensible
>  7) backward compatibility
>  8) probably many other characteristics ...
>
> 3) and 4) often don't work together. 1), 6) and 7) are difficult to
> handle at once.
>
> If we look at optimizers, some progress has been made with respect to
> extensibility and backward compatibility, but simple use was clearly
> left behind, as it is difficult to know which optimizer supports which
> features now that neither strong typing nor fixed arguments are used
> anymore. However, keeping the older API would have prevented
> extensibility, as the combinatorial explosion of arguments increases
> as features are added (and we still need to add several constraint
> types).
>
> If we look at ODE solvers, we are still using the original API from
> Mantissa, but each time we add a new feature we add more and more
> setters, thus drifting farther and farther from immutability and
> imposing some unwritten ordering between calls (for example, when we
> set up additional equations, we must also set up the initial
> additional state, and the user must set up a way to retrieve the
> final additional state).
>
> If we look at solvers, we started with some parameters set up during
> the call to solve while others were set up at construction time, but
> this split has changed over time.
>
> So I would like to suggest a new approach, largely inspired by a
> recent discussion on the [CSV] component about the builder API (see
> <http://markmail.org/thread/o3s2a5hyj6xh4nzj>), by an older discussion
> on [math] about using a fluent API for vectors (see
> <http://markmail.org/message/2gmg6wnpm5p2splb>), and by a talk Simone
> gave last year at ApacheCon Europe. The idea is to use a fluent API to
> build the algorithm progressively, adding features one at a time
> through withXxx methods defined in interfaces.
>
> As an example, consider just a few features used in optimization:
> constraints, iteration limit, evaluations limits, search interval,
> bracketing steps ... Some features are used in several optimizers, some
> are specific to univariate solvers, some can be used in a family of
> solvers ... Trying to fit everything in a single class hierarchy is
> impossible. We tried, but I don't think we succeeded.
>
> If we consider each feature separately, we could have an interface
> defined for each one, as follows:
>
>   interface Constrainable<T extends Constrainable<T>>
>             extends Optimizer {
>     /** Returns a new optimizer, handling an additional constraint.
>       * @param c the constraint to add
>       * @return a new optimizer handling the constraint
>       * (note that the instance itself is not changed)
>       */
>     T withConstraint(Constraint c);
>   }
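A second capability interface in the same style, matching the IterationLimited name used in the next example, might look as follows. This is only a sketch: the Optimizer base interface is reduced to an empty marker, and the small FixedIterOptimizer implementation is invented here so the fragment stands on its own.

```java
// Stand-in for the real Optimizer base interface (assumption: an
// empty marker is enough for this sketch).
interface Optimizer { }

// Capability interface for iteration limits, in the same self-typed
// style as Constrainable above.
interface IterationLimited<T extends IterationLimited<T>> extends Optimizer {
    /** Returns a new optimizer with the given iteration limit.
      * @param maxIter the maximum number of iterations
      * @return a new optimizer (the instance itself is not changed)
      */
    T withMaxIter(int maxIter);

    /** @return the current iteration limit */
    int getMaxIter();
}

// Minimal invented implementation used to exercise the interface:
// withMaxIter returns a fresh object, never mutates the receiver.
final class FixedIterOptimizer implements IterationLimited<FixedIterOptimizer> {
    private final int maxIter;
    FixedIterOptimizer(int maxIter) { this.maxIter = maxIter; }
    public FixedIterOptimizer withMaxIter(int maxIter) {
        return new FixedIterOptimizer(maxIter);
    }
    public int getMaxIter() { return maxIter; }
}
```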
>
> Basically, they would be used where OptimizationData is used today.
> An optimizer that supports simple bounds constraints and an iteration
> limit would be defined as:
>
>   public class TheOptimizer
>          implements Optimizer,
>                     Constrainable<TheOptimizer>,
>                     IterationLimited<TheOptimizer> {
>
>     private final int maxIter;
>     private final List<Constraint> constraints;
>
>     // internal constructor used for fluent API
>     private TheOptimizer(..., int maxIter, List<Constraint> list) {
>       ...
>       this.maxIter     = maxIter;
>       this.constraints = list;
>     }
>
>     public TheOptimizer withConstraint(Constraint c) {
>       List<Constraint> l = new ArrayList<Constraint>(constraints);
>       l.add(c);
>       return new TheOptimizer(..., maxIter, l);
>     }
>
>     public TheOptimizer withMaxIter(int m) {
>       return new TheOptimizer(..., m, constraints);
>     }
>
>  }
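Filling in the elided parts, a fully self-contained version of this pattern might look like the sketch below; SketchOptimizer and the empty Constraint stand-in are invented names for illustration, not part of [math].

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Invented stand-in for the real constraint type.
interface Constraint { }

// Minimal immutable optimizer following the proposed pattern: all
// fields are final, and every withXxx returns a fresh instance.
final class SketchOptimizer {

    private final int maxIter;
    private final List<Constraint> constraints;

    /** Public constructor with (assumed) default settings. */
    SketchOptimizer() {
        this(100, Collections.<Constraint>emptyList());
    }

    // Internal constructor used by the fluent methods.
    private SketchOptimizer(int maxIter, List<Constraint> constraints) {
        this.maxIter     = maxIter;
        this.constraints = constraints;
    }

    /** Returns a new optimizer handling one more constraint. */
    public SketchOptimizer withConstraint(Constraint c) {
        List<Constraint> l = new ArrayList<Constraint>(constraints);
        l.add(c);
        return new SketchOptimizer(maxIter, Collections.unmodifiableList(l));
    }

    /** Returns a new optimizer with a different iteration limit. */
    public SketchOptimizer withMaxIter(int m) {
        return new SketchOptimizer(m, constraints);
    }

    public int getMaxIter()         { return maxIter; }
    public int getConstraintCount() { return constraints.size(); }
}
```

Note that withConstraint copies the list before adding, so a shared "core" configuration can never be corrupted by a derived one.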
>
> So basically, the withXxx methods are sort-of setters, but they
> preserve immutability (we do not return this, we return a new object).
> It is easy to add features to existing classes and there is no need to
> shove everything into a single hierarchy: we have a forest, not a
> tree. When looking at the API, users clearly see what they can use and
> what they cannot use: if an optimizer does not support constraints,
> there will be no way to put a constraint into it. If in a later
> version constraints become available, the existing methods will not be
> changed; only new methods will appear.
>
> Of course, this creates a bunch of intermediate objects, but they are
> often quite small, and the setup phase is not the most
> computation-intensive one. It also becomes possible to run parametric
> studies on some features, using code like:
>
>   Algorithm core = new Algorithm().withA(a).withB(b).withC(c);
>   for (double d = 0.0; d < 1.0; d += 0.001) {
>     Algorithm dSpecial = core.withD(d);
>     double result = dSpecial.run();
>     System.out.println(" d = " + d + ", result = " + result);
>   }
>
> This would work for someone considering feature A a core feature that
> should be fixed and feature D a parameter, but it would work equally
> well for someone considering the opposite case: they will simply write
> the loop the other way, with the call to withD outside the loop and
> the call to withA inside the loop.
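A runnable sketch of this parametric study, using an invented two-parameter Algorithm whose run() simply combines the parameters, shows that the swept copies never disturb the core object, whichever parameter is fixed first.

```java
// Invented two-parameter algorithm: run() just adds a and d so the
// effect of fixing one parameter and sweeping the other is visible.
// Every withXxx returns a fresh object and leaves the receiver intact.
final class Algorithm {
    private final double a;
    private final double d;
    Algorithm() { this(0.0, 0.0); }
    private Algorithm(double a, double d) { this.a = a; this.d = d; }
    Algorithm withA(double v) { return new Algorithm(v, d); }
    Algorithm withD(double v) { return new Algorithm(a, v); }
    double run() { return a + d; }
}
```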
>
> A side effect is that it also becomes possible to safely copy
> algorithms by just resetting a feature, even when we don't really know
> which implementation we have. A typical example that creates problems
> for me is duplicating an ODE solver. It cannot be done currently, as
> some specific elements required at construction time depend on the
> exact type of solver used (tolerance vectors for adaptive stepsize
> integrators). So if, for example, I want to do some Monte Carlo
> analysis in parallel and need to duplicate an integrator, I would do
> it as follows:
>
>   FirstOrderIntegrator[]
>       duplicate(FirstOrderIntegrator integrator, int n) {
>     FirstOrderIntegrator[] copies = new FirstOrderIntegrator[n];
>     for (int i = 0; i < n; ++i) {
>       copies[i] =
>         integrator.withMaxEvaluations(integrator.getMaxEvaluations());
>     }
>     return copies;
>   }
>
> This kind of API could be extended to several algorithms, so it could
> be set up in a consistent way across the library. As I wrote at the
> beginning of this message, I am first thinking about root solvers,
> optimizers and ODE.
>
> What do you think?
> Luc
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>
