commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Evan Ward <>
Subject Re: [Math] Fluent API, inheritance and immutability
Date Tue, 20 Aug 2013 12:54:59 GMT
Hi all,

Sorry for the delay. I'll try switching all of the least squares package
to the new/seperable API and post the results to github. From there we
can review, discuss and then decide if it makes sense to switch the rest
of the optimizers.

On 08/15/2013 09:12 AM, Konstantin Berlin wrote:
> Hi,
> As a quick first pass, I will comment on Evans code only for now. Most
> of my comments have to do with numerical aspects and associated
> design, rather than threading.
> I like the idea of the optimize function taking in a Problem class. I
> think it can add a lot of flexibility going forward.
> First I think to make things clear, LeastSquares should be renamed to
> IterativeLeastSquares. Since for unconstrained linear least squares,
> this whole hierarchy is not correct. For one thing they don't need an
> initial guess, for another, linear least squares can factor the matrix
> before experimental data is supplied, so it can support rapid
> recomputation.

>From my background the problem is more commonly called General Least
Squares, or Non-Linear Least Squares. I shortened it to avoid (yet
another) long name. :) I don't plan to include linear least squares
problem in the new API. It seems hard to be simpler than new
QRDecomposition(A).getSolver().solve(b). :)

> There are several issues that I have with current example
> 1) While the least-squares problem now encapsulates the weights, which
> I like, for some reason the weights are explicitly used in the actual
> GaussNewtonOptimizer and probably all other least-squares optimizer. I
> think that is bad for several reasons:
> i) the actual optimizer algorithm has nothing to do with weights, so
> it logically inconsistent. It forces people who have no weights to
> still go through the actual numerical computation.
> ii) it makes it harder to maintain and update the optimizer code. Just
> think about how much cleaner it would be if weights are actually
> included in the Jacobian.
> iIi) The inclusion of weight can be really optimized inside the
> LeastSquaresProblem, where depending on the problem they might be able
> to take it more efficiently than the general approach.
> 2) All computations inside the optimizers should be done using linear
> algebra library (or is the library so bad?). Not only does it make the
> code easier to follow, it can make it a lot faster. Imagine a
> not-so-hypothetical example where the Jacobian is actually sparse
> (I know you guys are removing sparse linear algebra, but lets say it
> gets put back at some point later). Forming the Hessian from the
> Jacobian is a very expensive operation (J^T*J), but if J is sparse the
> speedup could be automatically propagated through the algorithm.

I agree with 1 and 2. You seem to have more experience with optimizers
than me; would you like to update GaussNewton.doOptimize() to assume the
weights are already included?

> 3) Speaking of forming the Hessian, the actual computation of the
> descent step should be logically and explicitly separated into two
> separate function. I think the first function could actually be moved
> to the LeastSquaresProblem (need more thought on this)
>  i) The first function will compute the step size p, where H*p = -g, H
> is the Hessian and g is the gradient. The reason for this is there are
> several ways this problem can be solved, and that should be deferred
> to a specific implementation later. For example, in a very large
> least-squares problem (or any large problem) you cannot actually form
> J^T*J (or H), and instead you solve the problem using conjugate
> gradient iterative method (which you already have in the library),
> where you compute J^T*J*x as J^T*(J*x). Again, here having the
> Jacobian be a matrix would really help future proof things. In the
> case of general non-linear optimizers you might want to support in the
> future a very important quasi-Newton algorithms BFGS or L-BFGS, were
> this step is done very differently.
> ii) Once the step size is computed, a line search or trust region
> method is used to potentially modify the step. Here also bound
> constraints can be enforced, so an optimizer that supports it can
> overwrite this function. This part should also be exposed, at least in
> the abstract classes, from which the final least-squares optimizer
> should be built.

I'm not sure I follow. Are you suggesting that the LSP class solve the
normal equations? Solving the normal equations and computing the next
step seems like the job of the "optimizer".

> Thanks,
> Konstantin

Ajo Fod wrote:
> I like the immutability idea. I wonder if there are any guidelines on when
> the performance hit because of immutability is high enough to want mutable
> objects and synchronization?
> -Ajo

For me the dividing line is copying large matrices and arrays. I try to
avoid copying the residuals and the Jacobian because they can be
megabytes in size.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message