Hi all,
Sorry for the delay. I'll try switching all of the least squares package
to the new/separable API and post the results to GitHub. From there we
can review, discuss and then decide if it makes sense to switch the rest
of the optimizers.
On 08/15/2013 09:12 AM, Konstantin Berlin wrote:
> Hi,
>
> As a quick first pass, I will comment on Evan's code only for now. Most
> of my comments have to do with numerical aspects and associated
> design, rather than threading.
>
> I like the idea of the optimize function taking in a Problem class. I
> think it can add a lot of flexibility going forward.
>
> First, to make things clear, I think LeastSquares should be renamed to
> IterativeLeastSquares, since for unconstrained linear least squares
> this whole hierarchy is not correct. For one thing, they don't need an
> initial guess; for another, linear least squares can factor the matrix
> before the experimental data is supplied, so it can support rapid
> recomputation.
From my background the problem is more commonly called General Least
Squares, or Nonlinear Least Squares. I shortened it to avoid (yet
another) long name. :) I don't plan to include the linear least squares
problem in the new API. It seems hard to be simpler than new
QRDecomposition(A).getSolver().solve(b). :)
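To illustrate why the linear case doesn't fit an iterative API: no initial
guess, no iteration, just one solve. A hypothetical self-contained sketch
(hand-rolled 2x2 normal equations rather than the QR path above; the class
and method names are made up for illustration):

```java
// Hypothetical sketch: closed-form linear least squares for y ~= a + b*x.
// No initial guess and no iteration -- the design matrix alone determines
// the factorization, unlike the iterative (nonlinear) case.
public class LinearFit {
    /** Returns {a, b} minimizing sum (a + b*x[i] - y[i])^2 via the 2x2 normal equations. */
    public static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sx += x[i]; sy += y[i]; sxx += x[i] * x[i]; sxy += x[i] * y[i];
        }
        double det = n * sxx - sx * sx;          // determinant of A^T A
        double a = (sxx * sy - sx * sxy) / det;  // intercept
        double b = (n * sxy - sx * sy) / det;    // slope
        return new double[] { a, b };
    }
}
```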
>
> There are several issues that I have with current example
>
> 1) While the least-squares problem now encapsulates the weights, which
> I like, for some reason the weights are explicitly used in the actual
> GaussNewtonOptimizer and probably all the other least-squares
> optimizers. I think that is bad for several reasons:
> i) the actual optimizer algorithm has nothing to do with weights, so
> it is logically inconsistent. It forces people who have no weights to
> still go through the extra numerical computation.
> ii) it makes the optimizer code harder to maintain and update. Just
> think about how much cleaner it would be if the weights were already
> included in the Jacobian.
> iii) The inclusion of the weights can be really optimized inside the
> LeastSquaresProblem, where, depending on the problem, they might be
> applied more efficiently than by the general approach.
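Point ii) above can be made concrete: with a diagonal weight matrix W, the
weighted problem is equivalent to an unweighted one with r' = sqrt(W)*r and
J' = sqrt(W)*J, so the optimizer core never needs to see W. A hypothetical
sketch (names are made up, not the current API):

```java
// Hypothetical sketch: fold a diagonal weight matrix W into the residuals
// and Jacobian once, inside the LeastSquaresProblem, so the optimizer core
// is weight-free: r'_i = sqrt(w_i) * r_i, J'_ij = sqrt(w_i) * J_ij.
// Minimizing (r')^T r' is then identical to minimizing r^T W r.
public class WeightFold {
    public static void foldInPlace(double[] w, double[] r, double[][] jac) {
        for (int i = 0; i < w.length; i++) {
            double s = Math.sqrt(w[i]);
            r[i] *= s;
            for (int j = 0; j < jac[i].length; j++) {
                jac[i][j] *= s;
            }
        }
    }
}
```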
>
> 2) All computations inside the optimizers should be done using the
> linear algebra library (or is the library so bad?). Not only does it
> make the code easier to follow, it can make it a lot faster. Imagine a
> not-so-hypothetical example where the Jacobian is actually sparse
> (I know you guys are removing sparse linear algebra, but let's say it
> gets put back at some point later). Forming the Hessian from the
> Jacobian is a very expensive operation (J^T*J), but if J is sparse the
> speedup could be automatically propagated through the algorithm.
I agree with 1 and 2. You seem to have more experience with optimizers
than I do; would you like to update GaussNewton.doOptimize() to assume
the weights are already included?
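For reference, here is roughly the shape I have in mind for a weight-free
Gauss-Newton step: solve (J^T J) p = -J^T r. This is a hypothetical
hand-rolled sketch for a 2-parameter problem; real code would delegate the
assembly and solve to the linear algebra library, which is exactly where a
sparse J would pay off:

```java
// Hypothetical sketch of one weight-free Gauss-Newton step: solve
// (J^T J) p = -J^T r. For a 2-parameter problem the 2x2 normal system is
// solved in closed form; in the library the same shape would be a matrix
// multiply plus a DecompositionSolver call.
public class GaussNewtonStep {
    /** Returns the step p with (J^T J) p = -J^T r, for a Jacobian with 2 columns. */
    public static double[] step(double[][] jac, double[] r) {
        double a = 0, b = 0, c = 0, g0 = 0, g1 = 0;
        for (int i = 0; i < r.length; i++) {
            double j0 = jac[i][0], j1 = jac[i][1];
            a += j0 * j0;    // (J^T J)[0][0]
            b += j0 * j1;    // (J^T J)[0][1] == (J^T J)[1][0]
            c += j1 * j1;    // (J^T J)[1][1]
            g0 += j0 * r[i]; // (J^T r)[0]
            g1 += j1 * r[i]; // (J^T r)[1]
        }
        double det = a * c - b * b; // assumes full-rank J
        return new double[] { -(c * g0 - b * g1) / det, -(a * g1 - b * g0) / det };
    }
}
```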
>
> 3) Speaking of forming the Hessian, the actual computation of the
> descent step should be logically and explicitly separated into two
> separate functions. I think the first function could actually be moved
> to the LeastSquaresProblem (this needs more thought).
> i) The first function will compute the step p, where H*p = g, H is
> the Hessian and g is the gradient. The reason for this is that there
> are several ways this problem can be solved, and that choice should be
> deferred to a specific implementation later. For example, in a very
> large least-squares problem (or any large problem) you cannot actually
> form J^T*J (or H); instead you solve the problem using the conjugate
> gradient iterative method (which you already have in the library),
> where you compute J^T*J*x as J^T*(J*x). Again, having the Jacobian be
> a matrix would really help future-proof things. In the case of general
> nonlinear optimizers you might want to support in the future the very
> important quasi-Newton algorithms BFGS or L-BFGS, where this step is
> done very differently.
>
> ii) Once the step is computed, a line search or trust region method
> is used to potentially modify it. Here bound constraints can also be
> enforced, so an optimizer that supports them can override this
> function. This part should also be exposed, at least in the abstract
> classes, from which the final least-squares optimizers are built.
I'm not sure I follow. Are you suggesting that the LSP class solve the
normal equations? Solving the normal equations and computing the next
step seems like the job of the "optimizer".
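For concreteness, my reading of 3 i) is something like the following
hypothetical sketch: conjugate gradient only ever needs the product
(J^T J)*v, which can be evaluated as J^T*(J*v), so J^T J is never formed.
No library types assumed, names made up, bare-bones CG with no
preconditioning:

```java
// Hypothetical sketch: solve (J^T J) p = g by conjugate gradient, where
// each matrix-vector product is evaluated as J^T (J v) -- J^T J is never
// formed. With a sparse J both inner products stay cheap.
public class MatrixFreeCg {
    /** Computes J^T (J v) without forming J^T J. */
    static double[] applyJtJ(double[][] jac, double[] v) {
        int m = jac.length, n = v.length;
        double[] jv = new double[m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) jv[i] += jac[i][j] * v[j];
        double[] out = new double[n];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++) out[j] += jac[i][j] * jv[i];
        return out;
    }

    /** Plain CG on (J^T J) p = g; assumes J has full column rank. */
    public static double[] solve(double[][] jac, double[] g, int maxIter, double tol) {
        int n = g.length;
        double[] p = new double[n], res = g.clone(), dir = g.clone();
        double rr = dot(res, res);
        for (int k = 0; k < maxIter && Math.sqrt(rr) > tol; k++) {
            double[] ad = applyJtJ(jac, dir);        // the only "matrix" access
            double alpha = rr / dot(dir, ad);
            for (int i = 0; i < n; i++) { p[i] += alpha * dir[i]; res[i] -= alpha * ad[i]; }
            double rrNew = dot(res, res);
            double beta = rrNew / rr;
            for (int i = 0; i < n; i++) dir[i] = res[i] + beta * dir[i];
            rr = rrNew;
        }
        return p;
    }

    static double dot(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }
}
```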
>
> Thanks,
> Konstantin
>
Ajo Fod wrote:
>
> I like the immutability idea. I wonder if there are any guidelines on when
> the performance hit because of immutability is high enough to want mutable
> objects and synchronization?
>
> Ajo
For me the dividing line is copying large matrices and arrays. I try to
avoid copying the residuals and the Jacobian because they can be
megabytes in size.
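A hypothetical illustration of that dividing line: an immutable-by-convention
holder that shares the large arrays instead of cloning them, trading a
defensive copy of possibly-megabyte data for a documented no-mutation
contract. The class is made up for illustration:

```java
// Hypothetical sketch: an "immutable" evaluation that shares its large
// arrays rather than defensively copying them. Safe only under the
// documented contract that neither side mutates the arrays after
// construction; a strictly immutable version would clone in the
// constructor and accessors, at the cost of copying megabytes.
public final class Evaluation {
    private final double[] residuals;   // shared, not cloned
    private final double[][] jacobian;  // shared, not cloned

    public Evaluation(double[] residuals, double[][] jacobian) {
        this.residuals = residuals;
        this.jacobian = jacobian;
    }

    /** Cheap accessor; returns the shared array, not a copy. */
    public double[] residuals() { return residuals; }

    public double[][] jacobian() { return jacobian; }
}
```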
Regards,
Evan

To unsubscribe, email: dev-unsubscribe@commons.apache.org
For additional commands, email: dev-help@commons.apache.org
