commons-dev mailing list archives

From Ole Ersoy <ole.er...@gmail.com>
Subject Re: [Math] LeastSquaresOptimizer Design
Date Tue, 22 Sep 2015 00:55:15 GMT
Hola,

On 09/21/2015 04:15 PM, Gilles wrote:
> Hi.
>
> On Sun, 20 Sep 2015 15:04:08 -0500, Ole Ersoy wrote:
>> On 09/20/2015 05:51 AM, Gilles wrote:
>>> On Sun, 20 Sep 2015 01:12:49 -0500, Ole Ersoy wrote:
>>>> Wanted to float some ideas for the LeastSquaresOptimizer (Possibly
>>>> General Optimizer) design.  For example with the
>>>> LevenbergMarquardtOptimizer we would do:
>>>> `LevenbergMarquardtOptimizer.optimize(OptimizationContext c);`
>>>>
>>>> Rough optimize() outline:
>>>> public static void optimize(OptimizationContext c) {
>>>>     //perform the optimization
>>>>     //If successful
>>>>     c.notify(LevenbergMarquardtResultsEnum.SUCCESS, solution);
>>>>     //If not successful
>>>>     c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_COST_RELATIVE_TOLERANCE,
>>>>              diagnostic);
>>>>     //or
>>>>     c.notify(LevenbergMarquardtResultsEnum.TOO_SMALL_PARAMETERS_RELATIVE_TOLERANCE,
>>>>              diagnostic);
>>>>     //etc
>>>> }
>>>>
>>>> The diagnostic, when turned on, will contain a trace of the last N
>>>> iterations leading up to the failure.  When turned off, the Diagnostic
>>>> instance only contains the parameters used to detect failure. The
>>>> diagnostic could be viewed as an indirect way to log optimizer
>>>> iterations.
>>>>
>>>> WDYT?
>>>
>>> I'm wary of having several different ways to convey information to the
>>> caller.
>> It would just be one way.
>
> One way for optimizer, one way for solvers, one way for ...

Yes, I see what you mean, but I think on the whole it will be worth it to add the extra sugar
code that removes the need for exceptions.

>
>> But the caller may not be the receiver
>> (It could be).  The receiver would be an observer attached to the
>> OptimizationContext that implements an interface allowing it to observe
>> the optimization.
>
> I'm afraid that it will add to the questions of what to put in the
> code and how.  [We already had sometimes heated discussions just for
> the IMHO obvious (e.g. code formatting, documentation, exception...).]

Hehe.  Yes, I remember some of these discussions.  I wonder how much time was spent debating
the exceptions alone?  Surely everyone must have had this feeling in the pit of their stomach
that there's got to be a better way.  On the exception topic, these are some of the issues:

I18N
===================
If you are new to Commons Math and thinking about designing a Commons Math compatible exception,
you should probably understand the I18N stuff that's bound to the exception (and wonder why it's
bound to the exception).  Grab a coffee and spend a few hours, unless you are fairly new to
Java, like some of the people posting for help.  In that case, when the exception occurs,
there is going to be a lot of tutoring going on on the users list.

Number of Exceptions
===================
Before you actually design a new exception, you should probably see if there is an exception
that already fits the category of what you are doing.  So you start reading.  Exception1...nope...
Exception2...nope...Exception3...Exception999...but I think I'm getting warmer.  OK - did not
find it...but I'm fairly certain that there is an elegant place for it somewhere in the exception
hierarchy...


Handling of Exceptions
===================
If our app uses several of the Commons Math classes (that throw exceptions of the same type),
and one of those classes throws an exception, what is the app supposed to do?

I think most developers would find that question somewhat challenging.  There are numerous
strategies.  Catch all exceptions and log what happened, etc.  But what if the requirement
is that if an exception is thrown, the organization that receives it has 0 seconds to get
to the root cause of it and understand the dynamics. Is this doable?  (Yes obviously, but
how hard is it...?).


>>> It seems that the reporting interfaces could quickly overwhelm
>>> the "actual" code (one type of context per algorithm).
>> There would be one type of Observer interface per algorithm.  It would
>> act on the solution and what are currently exceptions, although these
>> would be translated into enums.
>
> Unless I'm mistaken, the most common use-case for codes implemented
> in a library such as CM is to provide a correct answer or bail out
> in a non-equivocal way.
Most Java developers are used to synchronous coding: call the method, get the response, catch
the exception if needed.  This is changing with JDK8, and as we evolve and start using lambdas,
we become more accustomed to the functional callback style of programming.  Personally I want
to be able to use an API that gives me what I need when everything works as expected, allows
me to resolve unexpected issues with minimal effort, and is as simple, fluid, and lightweight
as possible.

>
> It would make the code more involved to handle a minority of
> (undefined) cases. [Actual examples would be welcome in order to
> focus the discussion.]

Rough Outline (I've evolved the concept and moved away from the OptimizationContext in the
process of writing):

interface LevenbergMarquardtObserver {

     //Called when the optimization succeeds.
     void hola(Solution s);

     //Called when the optimization fails; rt says why.
     void sugarHoneyIceTea(ResultType rt, Diagnostic d);
}

public class LMObserver implements LevenbergMarquardtObserver {

    private final Application application;

    public LMObserver(Application application) {
        this.application = application;
    }

    @Override
    public void hola(Solution s) {
        application.next(s);
    }

    @Override
    public void sugarHoneyIceTea(ResultType rt, Diagnostic d) {
        if (rt == ResultType.I_GOT_THIS_ONE) {
            //I looked at the commons unit tests for this algorithm evaluating
            //the diagnostics that shows how this failure can occur
            //I'm totally fixing this!  Steps aside!
        } else if (rt == ResultType.REALLY_COMPLICATED_STUFF) {
            //We need our best engineers...call India.
        }
    }
}

public class Application {

     public void run() {
         //Note nothing is returned.
         LevenbergMarquardtOptimizer.setObserver(new LMObserver(this))
             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
             .start();
     }

     public void next(Solution solution) {
         //Do cool stuff.
     }
}

Or an asynchronous variation:

public class Application {

     public void run() {
         //This call will not block because async is true
         LevenbergMarquardtOptimizer.setAsync(true)
             .setObserver(new LMObserver(this))
             .setLeastSquaresProblem(new ClassThatImplementsTheProblem())
             .start();

         //Do more stuff right away.
     }

     public void next(Solution solution) {
         //When the thread running the optimization is done, this method is called back.
         //Do whatever comes next
     }
}

The above would start the optimization in a separate thread that does not / SHOULD NOT share
data with the main thread.
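
A minimal sketch of how the asynchronous variation could be wired up with the JDK8 `CompletableFuture` API. Everything here is an assumption made for illustration: `OptimizationObserver`, `AsyncOptimizer`, the `Supplier` standing in for the solver, and the convention that a `null` result means failure are all hypothetical and not part of Commons Math. Note that in this sketch the callbacks run on the worker thread, so an observer that touches application state would still need to hand the result over safely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical names throughout; none of these types exist in Commons Math.
enum ResultType { SUCCESS, TOO_SMALL_COST_RELATIVE_TOLERANCE }

interface OptimizationObserver {
    void hola(double[] solution);                             // success callback
    void sugarHoneyIceTea(ResultType rt, String diagnostic);  // failure callback
}

final class AsyncOptimizer {
    /**
     * Runs the stand-in solver on a pool thread and reports through the observer.
     * The returned future only signals completion; results travel via the callbacks.
     * Assumption for this sketch: a null result means the solver did not converge.
     */
    static CompletableFuture<Void> start(Supplier<double[]> solver,
                                         OptimizationObserver observer) {
        return CompletableFuture.runAsync(() -> {
            double[] solution = solver.get();
            if (solution != null) {
                observer.hola(solution);
            } else {
                observer.sugarHoneyIceTea(
                    ResultType.TOO_SMALL_COST_RELATIVE_TOLERANCE, "no convergence");
            }
        });
    }
}

class AsyncDemo {
    public static void main(String[] args) {
        OptimizationObserver observer = new OptimizationObserver() {
            @Override public void hola(double[] s) {
                System.out.println("solved, first parameter = " + s[0]);
            }
            @Override public void sugarHoneyIceTea(ResultType rt, String d) {
                System.out.println(rt + ": " + d);
            }
        };
        CompletableFuture<Void> done =
            AsyncOptimizer.start(() -> new double[] {1.0, 2.0}, observer);
        // The caller is free to do other work here.
        done.join(); // wait only so the demo does not exit before the callback runs
    }
}
```

The solver hands back a fresh array, so nothing is shared with the calling thread until the observer is notified.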

>
>>> The current reporting is based on exceptions, and assumes that if no
>>> exception was thrown, then the user's request completed successfully.
>> Sure - personally I'd much rather deal with something similar to an
>> HTTP status code in a callback, than an exception.  I think the code
>> is cleaner and the callback makes it more elegant to apply an adaptive
>> approach to handling the response, like slightly relaxing constraints,
>> convergence parameters, etc.  Also by getting rid of the exceptions,
>> we no longer depend on the I18N layer that they are tied to and now
>> the messages can be more informative, since they target the root
>> cause.  The observer can also run in the 'main' thread while the
>> optimization can run asynchronously.  Also WRT JDK9 and modules,
>> losing the exceptions would mean one less dependency when the library
>> is broken up into JDK9 modules...which would be more in line with this
>> philosophy:
>> https://github.com/substack/browserify-handbook#module-philosophy
>
> I'm not sure I fully understood the philosophy from the text in this
> short paragraph.
> But I do not agree with the idea that the possibility to quickly find
> some code is more important than standards and best practices.

If you go to npmjs.org and type in Neural Network you will get 56 results, all linked to GitHub
repositories.

In addition there's metadata indicating the number of downloads in the last day, last month,
etc.  Try typing in cosine.  Odds are you will find a package that does just what you want
and nothing else.  This is very underwhelming and refreshing in terms of cloning off of GitHub
and getting familiar with tests etc.  Also eye opening.  How many of us knew that we could
do that much stuff with cosine! :).

>
>>> I totally agree that in some circumstances, more information on the
>>> inner working of an algorithm would be quite useful.
>> ... Algorithm iterations become unit testable.
>>>
>>> But I don't see the point in devoting resources to reinvent the wheel:
>> You mean pimping the wheel?  Big pimpin.
>
> I think that logging statements are easy to add, not disruptive at all,
> and come in handy to understand a code's unexpected behaviour.
> Assuming that a "logging" feature is useful, it can be added *now* using
> a dependency towards a weight-less (!) framework such as "slf4j".
> IMO, it would be a waste of time to implement a new communication layer
> that can do that, and more, if it would be used for logging only in 99%
> of the cases.
SLF4J is used by almost every other framework, so why not use it? Logging and the diagnostic
could be used together.  The primary purpose of the diagnostic though is to collect data that
will be useful in `sugarHoneyIceTea`.
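
One way the "trace of the last N iterations" idea could look, as a rough sketch only. `IterationTrace` is a hypothetical class invented here, and representing an iterate as a plain `double[]` is an assumption. It keeps a bounded window of the most recent points, so a failure handler such as `sugarHoneyIceTea` could inspect what led up to the failure without the optimizer logging every step.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical class: a bounded trace of the most recent optimizer iterates.
final class IterationTrace {
    private final int capacity;
    private final Deque<double[]> lastPoints = new ArrayDeque<>();

    IterationTrace(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Records one iterate, evicting the oldest once the trace is full. */
    void record(double[] point) {
        if (lastPoints.size() == capacity) {
            lastPoints.removeFirst();
        }
        lastPoints.addLast(point.clone()); // defensive copy of the caller's array
    }

    /** Returns the retained iterates, oldest first. */
    List<double[]> snapshot() {
        return new ArrayList<>(lastPoints);
    }
}

class TraceDemo {
    public static void main(String[] args) {
        IterationTrace trace = new IterationTrace(3);
        for (int i = 1; i <= 5; i++) {
            trace.record(new double[] {i});
        }
        for (double[] p : trace.snapshot()) {
            System.out.println(p[0]); // prints 3.0, 4.0, 5.0 on separate lines
        }
    }
}
```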

>
>>>
>>> I longed several times for the use of a logging library.
>>> The only show-stopper has been the informal "no-dependency" policy...
>> JDK9 Jigsaw should solve dependency hell, so the less coupling
>> between commons math classes the better.
>
> I wouldn't call "coupling" the dependency towards exception classes:
> they are little utilities that can make sense in various parts of the
> library.

If, for example, the Simplex solver is broken off into its own module, then it has to be coupled
to the exceptions, unless it is exception free.

>
> [Unless one wants to embark on yet another discussion about exceptions;
> whether there should be one class for each of the "messages" that exist
> in "LocalizedFormats"; whether localization should be done in CM;
> etc.]

I think it would be best to just eliminate the exceptions.

>
>> Anyways I'm obviously
>> interested in playing with this stuff, so when I get something up into
>> a repository I'll do a callback :).
>
> If you are interested in big overhauls, there is one that gathered
> relative consensus: rewrite the algorithms in a "multithread-friendly"
> way.
I think that's a tall order that will take us into JDK88 :).  But using callbacks and making
potentially long-running computations asynchronous could be a middle ground that would allow
simple multithreaded use without fiddling around under the hood...

>
> Some ideas were floated (cf. ML archive) but no implementation or
> experiment...  Perhaps with a well-defined goal such as performance
> improvement, your design suggestions will become clearer to more people.
>
> AFAIK, only the classes in the "o.a.c.m.neuralnet" package are currently
> ready to be used with the "java.util.concurrent" framework.
FWIU neural nets are a great fit for concurrency.  I think for the others we will end up having
discussions around how users would control the number of threads, etc., which again makes some
of us nervous.  An asynchronous operation that runs in one separate thread is easier to reason
about.  If we want to test 10 neural net configurations, and we have 10 cores, then we can
start each by itself by doing something like:

Network.setAsync(true).addNeurons().connectNeurons().addObserver(observer).start();
//Now do 9 more
//If the observer is shared then notifications should be thread safe.
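
A rough sketch of the "10 configurations, shared observer" scenario using only `java.util.concurrent`. The names `SharedObserver`, `ParallelConfigsDemo`, `runAll`, and the `train` stand-in are hypothetical and do not correspond to anything in the `o.a.c.m.neuralnet` package. The point is only that a shared observer stays safe if it collects notifications in a concurrent collection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical observer shared by all runs; the concurrent queue keeps notifications thread safe.
final class SharedObserver {
    private final ConcurrentLinkedQueue<String> results = new ConcurrentLinkedQueue<>();

    void done(int configId, double score) {
        results.add("config " + configId + " -> " + score);
    }

    List<String> results() {
        return new ArrayList<>(results);
    }
}

class ParallelConfigsDemo {
    /** Stand-in for training one network configuration; not a real Commons Math call. */
    static double train(int configId) {
        return 1.0 / (configId + 1);
    }

    /** Starts every configuration on its own thread and waits for all of them to report. */
    static SharedObserver runAll(int configs) {
        SharedObserver observer = new SharedObserver();
        ExecutorService pool = Executors.newFixedThreadPool(configs);
        for (int i = 0; i < configs; i++) {
            final int id = i;
            pool.execute(() -> observer.done(id, train(id)));
        }
        pool.shutdown(); // accept no new tasks; already submitted ones still run
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                pool.shutdownNow();
            }
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return observer;
    }

    public static void main(String[] args) {
        System.out.println(runAll(10).results().size() + " runs reported");
    }
}
```

Each run only touches its own data plus the shared observer, so the thread-safety question is confined to one small class.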

Cheers,
- Ole

P.S. Dang that was a long email.  If I write one more of these, ban me :)

>
>
> Best regards,
> Gilles
>
>>
>> Cheers,
>> Ole
>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

