On Tue, 20 Aug 2013 08:54:59 0400, Evan Ward wrote:
> Hi all,
>
> Sorry for the delay. I'll try switching all of the least squares
> package
> to the new/seperable API and post the results to github. From there
> we
> can review, discuss and then decide if it makes sense to switch the
> rest
> of the optimizers.
Instead (or in addition to that), please open an "Improvement" request
on
the Commons projects' bugtracking system:
https://issues.apache.org/jira/browse/MATH
[It's a pity that we can't seem to open a "guest space" on Apache's SVN
repository to enable smoother collaborative work, and would have to
resort
to an external platform!]
You should then attach the files to the report page. But please do that
only after running
mvn site
on your computer, and checking that no problem is reported by
"CheckStyle"
and "FindBugs" (the reports are created in directory "target").
Also, please do not forget the unit test classes. Yes, I know that this
is a lot of work... :}
>
> On 08/15/2013 09:12 AM, Konstantin Berlin wrote:
>> Hi,
>>
>> As a quick first pass, I will comment on Evans code only for now.
>> Most
>> of my comments have to do with numerical aspects and associated
>> design, rather than threading.
>>
>> I like the idea of the optimize function taking in a Problem class.
>> I
>> think it can add a lot of flexibility going forward.
>>
>> First I think to make things clear, LeastSquares should be renamed
>> to
>> IterativeLeastSquares. Since for unconstrained linear least squares,
>> this whole hierarchy is not correct. For one thing they don't need
>> an
>> initial guess, for another, linear least squares can factor the
>> matrix
>> before experimental data is supplied, so it can support rapid
>> recomputation.
>
> From my background the problem is more commonly called General Least
> Squares, or NonLinear Least Squares. I shortened it to avoid (yet
> another) long name. :) I don't plan to include linear least squares
> problem in the new API. It seems hard to be simpler than new
> QRDecomposition(A).getSolver().solve(b). :)
>
>>
>> There are several issues that I have with current example
>>
>> 1) While the leastsquares problem now encapsulates the weights,
>> which
>> I like, for some reason the weights are explicitly used in the
>> actual
>> GaussNewtonOptimizer and probably all other leastsquares optimizer.
>> I
>> think that is bad for several reasons:
>> i) the actual optimizer algorithm has nothing to do with weights, so
>> it logically inconsistent. It forces people who have no weights to
>> still go through the actual numerical computation.
>> ii) it makes it harder to maintain and update the optimizer code.
>> Just
>> think about how much cleaner it would be if weights are actually
>> included in the Jacobian.
>> iIi) The inclusion of weight can be really optimized inside the
>> LeastSquaresProblem, where depending on the problem they might be
>> able
>> to take it more efficiently than the general approach.
>>
>> 2) All computations inside the optimizers should be done using
>> linear
>> algebra library (or is the library so bad?). Not only does it make
>> the
>> code easier to follow, it can make it a lot faster. Imagine a
>> notsohypothetical example where the Jacobian is actually sparse
>> (I know you guys are removing sparse linear algebra, but lets say it
>> gets put back at some point later). Forming the Hessian from the
>> Jacobian is a very expensive operation (J^T*J), but if J is sparse
>> the
>> speedup could be automatically propagated through the algorithm.
>
> I agree with 1 and 2. You seem to have more experience with
> optimizers
> than me; would you like to update GaussNewton.doOptimize() to assume
> the
> weights are already included?
We already discussed that some time ago.
Let's not mix different issues. Removing the handling of weights can
dealt
with later.
>
>>
>> 3) Speaking of forming the Hessian, the actual computation of the
>> descent step should be logically and explicitly separated into two
>> separate function. I think the first function could actually be
>> moved
>> to the LeastSquaresProblem (need more thought on this)
>> i) The first function will compute the step size p, where H*p = g,
>> H
>> is the Hessian and g is the gradient. The reason for this is there
>> are
>> several ways this problem can be solved, and that should be deferred
>> to a specific implementation later. For example, in a very large
>> leastsquares problem (or any large problem) you cannot actually
>> form
>> J^T*J (or H), and instead you solve the problem using conjugate
>> gradient iterative method (which you already have in the library),
>> where you compute J^T*J*x as J^T*(J*x). Again, here having the
>> Jacobian be a matrix would really help future proof things. In the
>> case of general nonlinear optimizers you might want to support in
>> the
>> future a very important quasiNewton algorithms BFGS or LBFGS, were
>> this step is done very differently.
>>
>> ii) Once the step size is computed, a line search or trust region
>> method is used to potentially modify the step. Here also bound
>> constraints can be enforced, so an optimizer that supports it can
>> overwrite this function. This part should also be exposed, at least
>> in
>> the abstract classes, from which the final leastsquares optimizer
>> should be built.
>
> I'm not sure I follow. Are you suggesting that the LSP class solve
> the
> normal equations? Solving the normal equations and computing the next
> step seems like the job of the "optimizer".
>
>>
>> Thanks,
>> Konstantin
>>
>
> Ajo Fod wrote:
>>
>> I like the immutability idea. I wonder if there are any guidelines
>> on when
>> the performance hit because of immutability is high enough to want
>> mutable
>> objects and synchronization?
>>
>> Ajo
>
> For me the dividing line is copying large matrices and arrays. I try
> to
> avoid copying the residuals and the Jacobian because they can be
> megabytes in size.
Copying is either because of the weights or because of attempts to be
threadsafe.
The former will be solved if weights are ultimately removed.
The latter could become an option when the data is offloaded to the
"problem" class (and interface instances with different properties can
be
built by alternate factory methods).
Thanks,
Gilles

To unsubscribe, email: devunsubscribe@commons.apache.org
For additional commands, email: devhelp@commons.apache.org
