commons-dev mailing list archives

From Ole Ersoy <ole.er...@gmail.com>
Subject Re: [Math] LeastSquaresOptimizer Design
Date Fri, 25 Sep 2015 14:36:18 GMT


On 09/25/2015 06:55 AM, Gilles wrote:
> On Thu, 24 Sep 2015 21:41:10 -0500, Ole Ersoy wrote:
>> On 09/24/2015 06:01 PM, Gilles wrote:
>>> On Thu, 24 Sep 2015 17:02:15 -0500, Ole Ersoy wrote:
>>>> On 09/24/2015 03:23 PM, Luc Maisonobe wrote:
>>>>> Le 24/09/2015 21:40, Ole Ersoy a écrit :
>>>>>> Hi Luc,
>>>>>>
>>>>>> I gave this some more thought, and I think I may have tapped out
>>>>>> too soon, even though you are absolutely right about what an
>>>>>> exception does in terms of bubbling execution to a point where it
>>>>>> stops or we handle it.
>>>>>>
>>>>>> Suppose we have an Optimizer and an Optimizer observer. The optimizer
>>>>>> will emit one of four events in the process of stepping through
>>>>>> to the max number of iterations it is allotted:
>>>>>> - SOLUTION_FOUND
>>>>>> - COULD_NOT_CONVERGE_FOR_REASON_1
>>>>>> - COULD_NOT_CONVERGE_FOR_REASON_2
>>>>>> - END (Max iterations reached)
>>>>>>
>>>>>> So we have the observer interface:
>>>>>>
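>>>>>> interface OptimizerObserver {
>>>>>>
>>>>>>      void success(Solution solution);
>>>>>>      // "enum" is a reserved word in Java, so the parameter is named "event"
>>>>>>      void update(Enum<?> event, Optimizer optimizer);
>>>>>>      void end(Optimizer optimizer);
>>>>>> }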
>>>>>>
>>>>>> So if the Optimizer notifies the observer of `success`, then the
>>>>>> observer does what it needs to with the results and moves on.  If the
>>>>>> observer gets an `update` notification, that means that given the
>>>>>> current [constraints, numbers of iterations, data] the optimizer cannot
>>>>>> finish.  But the update method receives the optimizer, so it can adapt
>>>>>> it, and tell it to continue or just trash it and try something
>>>>>> completely different.  If the `END` event is reached then the Optimizer
>>>>>> could not finish given the number of allotted iterations. The Optimizer
>>>>>> is passed back via the callback interface so the observer could allow
>>>>>> more iterations if it wants to...perhaps based on some metric indicating
>>>>>> how close the optimizer is to finding a solution.
>>>>>>
>>>>>> What this could do is allow the implementation of the observer to throw
>>>>>> the exception if 'All is lost!', in which case the Optimizer does not
>>>>>> need an exception.  Totally understand that this may not work
>>>>>> everywhere, but it seems like it could work in this case.
>>>>>>
>>>>>> WDYT?
>>>>> With this version, you should also pass the optimizer in case of
>>>>> success. In most cases, the observer will just ignore it, but in some
>>>>> cases it may try to solve another problem, or to solve again with
>>>>> stricter constraints, using the previous solution as the start point
>>>>> for the more stringent problem. Another case would be to go from a
>>>>> simple problem to a more difficult problem using some kind of
>>>>> homotopy.
>>>> Great - whoooh - glad you like this version a little better - for a
>>>> sec I thought I had completely lost it :).
>>>
>>> IIUC, I don't like it: it looks like "GOTO"...
>>
>> Inside the optimizer it would work like this:
>>
>> while (!done) {
>>    if (cannotConverge) {
>>        observer.update(Enum.CANT_CONVERGE, this);
>>    }
>> }
>
> That's fine. What I don't like is to have provision for changing the
> optimizer's settings and reuse the same instance.
If the design of the optimizer allows for this, then the interface for the Observer would
facilitate it.  The person implementing the interface could throw an exception when they get
the Enum.CANT_CONVERGE message, in which case the semantics are the same as they are now.

On the other hand if the optimizer is not designed for reuse, perhaps for the reason that
it causes more complexity than it's worth, the Observer interface could just exclude this
aspect.
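
To make the two options concrete, here is a minimal sketch.  Every name in it (Event, SketchOptimizer, the integer "strictness" setting, the two observers) is made up for illustration and is not part of Commons Math.  One observer keeps the current semantics by throwing, the other uses the optional reuse path and relaxes a setting so the loop can continue.

```java
// Hypothetical sketch: none of these types exist in Commons Math.
enum Event { CANT_CONVERGE }

interface OptimizerObserver {
    void update(Event event, SketchOptimizer optimizer);
}

/** Stand-in optimizer: it "converges" only once its strictness is low enough. */
class SketchOptimizer {
    int strictness = 3;                       // assumed tunable setting
    private final OptimizerObserver observer;

    SketchOptimizer(OptimizerObserver observer) {
        this.observer = observer;
    }

    /** Returns the iteration at which the run stopped. */
    int optimize(int maxIterations) {
        for (int i = 1; i <= maxIterations; i++) {
            if (strictness <= 1) {
                return i;                     // pretend a solution was found
            }
            observer.update(Event.CANT_CONVERGE, this);
        }
        return maxIterations;
    }
}

/** Keeps today's semantics: a convergence failure becomes an exception. */
class FailFastObserver implements OptimizerObserver {
    @Override
    public void update(Event event, SketchOptimizer optimizer) {
        throw new IllegalStateException("Optimizer reported " + event);
    }
}

/** Uses the reuse path: relax the setting and let the loop continue. */
class AdaptingObserver implements OptimizerObserver {
    @Override
    public void update(Event event, SketchOptimizer optimizer) {
        optimizer.strictness--;
    }
}
```

If the optimizer is not designed for reuse, the update method would simply not receive the optimizer, and only the FailFastObserver style would remain.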

> The optimizer should be instantiated at the lowest possible level; it
> will report everything to the observer, but the "report" is not to be
> confused with the "optimizer".
The design of the observer is flexible.  It gives the person implementing the interface the
ability to change the state of what is being observed.  It's a bit like warming up leftovers.
 You are the observer.  You grab yesterday's pizza and throw it in the microwave.  The microwave
is the optimizer.  You hit the 30 second button and check on the pizza.  If you like it, you
take it out; otherwise you hit 30 seconds again, or you throw the whole thing out because you
just realized that the Pizza Rat took a chunk out:
https://www.youtube.com/watch?v=UPXUG8q4jKU

>
>>
>> Then in the update method either modify the optimizer's parameters or
>> throw an exception.
>
> If I'm referring to Luc's example of a high-level code "H" call to some
> mid-level code "M" itself calling CM's optimizer "CM", then "M" may not
> have enough info to know whether it's OK to retry "CM", but on the other
> hand, "H" might not even be aware that "M" is using "CM".
So in this case the person implementing the Observer interface would keep the semantics that
we have now.  There is one important distinction though.  The person uses the Enum parameter,
indicating the root cause of the message, to throw their own (Meaningful to them) exception.

>
> As I tried to explain several times along the years (but failed to
> convince) is that the same problem exists with the exceptions: however
> detailed the message, it might not make sense to the person that reads
> the console because he is at level "H" and may have no idea that "CM"
> is used deep down.
Great point!  This is why I like receiving a lightweight Enum indicating a root cause, and
then either adapting to it or throwing my own exception that will trigger a simple explanation
for my client (a person using the app, or a remote computer receiving a message).
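
A rough sketch of that translation step, assuming a hypothetical RootCause enum and a hypothetical application exception (neither comes from Commons Math, and the messages are invented):

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical root causes; not an actual Commons Math enum.
enum RootCause { CANT_CONVERGE, MAX_ITERATIONS_REACHED }

/** Application-level exception whose message is written for the end client. */
class ForecastUnavailableException extends RuntimeException {
    private final RootCause rootCause;

    ForecastUnavailableException(String clientMessage, RootCause rootCause) {
        super(clientMessage);
        this.rootCause = rootCause;
    }

    RootCause getRootCause() {
        return rootCause;
    }
}

/** Maps each lightweight root cause to an explanation the client understands. */
class RootCauseTranslator {
    private static final Map<RootCause, String> MESSAGES = new EnumMap<>(RootCause.class);

    static {
        MESSAGES.put(RootCause.CANT_CONVERGE,
                "The data could not be fitted; please check the input.");
        MESSAGES.put(RootCause.MAX_ITERATIONS_REACHED,
                "The calculation took too long; please try again.");
    }

    static ForecastUnavailableException translate(RootCause cause) {
        return new ForecastUnavailableException(MESSAGES.get(cause), cause);
    }
}
```

The observer (or the mid-level code "M") would call RootCauseTranslator.translate(...) and throw the result, so the person at level "H" never sees a message that mentions the optimizer.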

> Having a specific exception which "M" can catch, extract info from, and
> raise a more meaningful exception (and/or translate the message!) is a
> much more flexible solution IMO.
Indeed.  One option here, assuming a callback interface is not an option, is to move to a
one-to-one mapping between the class doing the math and the corresponding exception.  The
exception would then encode root causes using an Enum that the receiver could use to map the
root cause to their own way of handling it.

Because it is designed this way, the exception handler can get access to the entire object
that caused the exception, and use it for the exception handling.  This eliminates most of
the thinking around how the exception should be designed, what it needs to communicate, how
it should be distinguished from all of the other contexts that can throw the exception, where
it belongs in the hierarchy, etc.
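
As an illustration only, the one-to-one pairing could look something like this.  CurveFitter, its Failure enum, and CurveFitterException are hypothetical names, and the fitter never converges so that the sketch stays short.

```java
// Hypothetical class doing the math; it never converges, to keep the sketch short.
class CurveFitter {
    enum Failure { SINGULAR_MATRIX, MAX_ITERATIONS }

    private int iterations;

    int getIterations() {
        return iterations;
    }

    void fit(int maxIterations) {
        while (iterations < maxIterations) {
            iterations++;                 // a real fitter would test convergence here
        }
        throw new CurveFitterException(Failure.MAX_ITERATIONS, this);
    }
}

/** The one exception type that belongs to CurveFitter. */
class CurveFitterException extends RuntimeException {
    private final CurveFitter.Failure failure;
    private final CurveFitter fitter;

    CurveFitterException(CurveFitter.Failure failure, CurveFitter fitter) {
        super("CurveFitter failed: " + failure);
        this.failure = failure;
        this.fitter = fitter;
    }

    CurveFitter.Failure getFailure() {
        return failure;
    }

    CurveFitter getFitter() {
        return fitter;
    }
}

/** The receiver maps each root cause to its own way of handling it. */
class FitFailureHandler {
    static String describe(CurveFitterException e) {
        switch (e.getFailure()) {
            case MAX_ITERATIONS:
                return "Gave up after " + e.getFitter().getIterations() + " iterations";
            case SINGULAR_MATRIX:
                return "The problem is ill-conditioned";
            default:
                return "Unknown failure";
        }
    }
}
```

The handler reads the iteration count straight from the object that threw, so the exception itself does not need to anticipate what the caller wants to know.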

> [Well, if all iterative algorithms are rewritten within the "observer"
> paradigm, then the logging can indeed be left at the caller's level (since
> the optimizer will report "everything")...  Going that route is an option
> to be mentioned in the issue of allowing "slf4j" or not (see below).]
I already started a new mail thread, but I will bring it up, if there are objections.  It
may be a poor fit in terms of developer productivity, since now everyone has to implement
the logging statements again.

>
>>>> The Optimizer could publish information deemed
>>>> interesting on each ITERATION event.
>>>
>>> If we'd go for an "OptimizerObserver" that gets called at every
>>> iteration,
>>> there shouldn't be any overlap between it and "Optimizer":
>> So inside the Optimizer we could have:
>>
>> while (!done) {
>>     ...
>>     if (observer.notifyOnIncrement())
>>     {
>>         observer.increment(this);
>>     }
>> }
>>
>> Which would give us an opportunity to cancel the run if, for example,
>> it's not converging fast enough.
>
> Providing ways to assess "too slow convergence" would be a very
> interesting feature, I think.
>
>> In that case we set done to true in
>> the observer, and then allow the Optimizer to get to the point where
>> it checks if it's done, calls the END notification on the observer,
>> and then the observer takes it from there.
>>
>>>
>>> iteration limit should be dealt with by the observer, the iterative
>>> algorithm would just run "forever" until the observer is satisfied
>>> with the current state (solution is good enough or the allotted
>>> resources - be they time, iterations, evaluations, ... - are
>>> exhausted).
>>
>> It's possible to do it that way, although I think it's better if that
>> code stays on the algorithm such that the Observer interface (The
>> client / person using CM implements the Observer) is as simple as
>> possible to implement.
>
> By definition, the iteration concept is also present in the "Observer".
> (via "notifyOnIncrement()", IIUC).
> If the observer is notified, it should act according to the caller's
> policy (e.g. call "optimizer.stop()").
> [Since the optimizer was stopped before completing the assignment (vs
> finding a solution within the tolerance settings), it should not be in
> charge of further action (e.g. "return" something).]
Right - once the optimizer is stopped, it's stopped.  However, the semantics for stopping
work like this (because when the observer calls optimizer.done(), the optimizer still has
some code to run):

The observer tells the optimizer that it is done.

INSIDE OBSERVER:
optimizer.done();

Then the optimizer keeps going until it exits the loop that it is in, because done is now true.
 At that point it notifies the observer again.

INSIDE OPTIMIZER:
observer.end(Enum.CANT_CONVERGE);

So the above case is the model for when the optimizer does not find a solution.

If it finds a solution then it will naturally exit the loop that it is in and make the final
call:

observer.success(solution, optimizer) or just:
observer.success(optimizer) // In case the solution is bound to the optimizer

The argument list is flexible.  Once the observer.success is called the optimizer is done.
 It has no code left to run.
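
Putting the two paths together, a minimal sketch of the flow might look like the following.  The names (Outcome, RunObserver, StoppableOptimizer, BudgetObserver) and the integer "residual" that halves on each pass are assumptions made purely for illustration.

```java
// Hypothetical sketch of the done()/end()/success() flow; not Commons Math API.
enum Outcome { CANT_CONVERGE }

interface RunObserver {
    void increment(StoppableOptimizer optimizer);
    void end(Outcome outcome);
    void success(StoppableOptimizer optimizer);
}

class StoppableOptimizer {
    private boolean done;
    private boolean solved;
    private int iteration;
    private int residual = 1024;          // toy error measure, halved each pass

    void done() {
        done = true;                      // only sets the flag; the loop exits on its own
    }

    int getIteration() {
        return iteration;
    }

    void run(RunObserver observer) {
        while (!done) {
            iteration++;
            residual /= 2;
            if (residual <= 1) {
                solved = true;
                done = true;
            } else {
                observer.increment(this); // the observer may call done() here
            }
        }
        // The loop has exited: make the single final call.
        if (solved) {
            observer.success(this);
        } else {
            observer.end(Outcome.CANT_CONVERGE);
        }
    }
}

/** Stops the run once an iteration budget chosen by the caller is used up. */
class BudgetObserver implements RunObserver {
    private final int budget;
    String result = "running";

    BudgetObserver(int budget) {
        this.budget = budget;
    }

    @Override
    public void increment(StoppableOptimizer optimizer) {
        if (optimizer.getIteration() >= budget) {
            optimizer.done();
        }
    }

    @Override
    public void end(Outcome outcome) {
        result = "ended: " + outcome;
    }

    @Override
    public void success(StoppableOptimizer optimizer) {
        result = "solved in " + optimizer.getIteration() + " iterations";
    }
}
```

With a budget of 3 the observer calls done() and later receives end(CANT_CONVERGE); with a generous budget the optimizer leaves the loop by itself and calls success.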

>
>
>>>> The observer could then be wired
>>>> with SLF4J and perform the same type of logging that the Optimizer
>>>> would perform.  So CM could declare SLF4J as a test dependency, and
>>>> unit tests could log iterations using it.
>>>
>>> As a "user", I'm interested in how the algorithms behave on my problem,
>>> not in the CM unit tests.
>> You could still do that.  I usually take my problem, simplify it down
>> to a data set that I think covers all corner cases, and then run it
>> through my unit tests while looking at the logging output to get an
>> idea of how my algorithm is behaving.
>
> When you "simplify", you don't see how the (production) code really
> behaves.
> Not even mentioning that it takes a lot of time to "simplify", and might
> be impossible (e.g. if the production code runs in another environment).
Very true.  As you point out, I could be logging in my tests using the observer, but now I
have to reimplement the same logging pattern in my production code.

>
>>> The question remains unanswered: why not use slf4j directly?
>>
>> FWIU class path dependency conflicts for SLF4J are easily solved by
>> excluding logging dependencies that other libraries bring in and then
>> directly depending on the logging implementation that you want to use.
>> So people do run into issues, but I think they are solvable:
>>
>> http://stackoverflow.com/questions/8921382/maven-slf4j-version-conflict-when-using-two-different-dependencies-that-requi
>
> Then, could you please raise the question in a separate thread?
Done.

>
>>>> Lombok also has a @SLF4J annotation that's pretty sweet.  Saves the
>>>> SLF4J boilerplate.
>>>
>>> I understand that using annotations can be a time-saver, but IMO not
>>> so much for a library like CM; so in this case, the risk of depending
>>> on another library must be weighed against the advantages.
>> Lombok is compile time only, so there should be few drawbacks:
>> http://stackoverflow.com/questions/6107197/how-does-lombok-work
>
> Yes, I've just been wondering about that.
> So, could you please raise the question in a separate thread?
Done.


>
>> I'll demo it on the LevenbergMarquardtOptimizer experiment, and we
>> can see the level of code reduction we are able to achieve.  I think
>> it's going to be fairly significant.
>
> Great!
Sweet! :)

Cheers,
Ole



