commons-dev mailing list archives

From Konstantin Berlin <kber...@gmail.com>
Subject Re: [math] major problem with new released version 3.1
Date Sat, 29 Dec 2012 23:53:48 GMT
Hi,

In my opinion, the whole weights fiasco is a consequence of improper design, as much
as anything else. All components should be as simple as possible, with any add-ons,
like weights, not built into the base implementation but instead provided as extensions
of these classes. If this had been done, all the optimization packages would have been just fine,
and only the extension classes would need fixing.
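
To make the design point concrete, here is a minimal sketch of weights as an extension rather than part of the base code. All names (ResidualFunction, WeightedResiduals, LeastSquares) are hypothetical and are not the Commons Math API; I am assuming the optimizer only ever needs a vector of residuals.

```java
/** Hypothetical base abstraction: maps parameters to a residual vector. */
interface ResidualFunction {
    double[] residuals(double[] params);
}

/** Extension that scales residuals by sqrt(weight); the base code never sees weights. */
final class WeightedResiduals implements ResidualFunction {
    private final ResidualFunction base;
    private final double[] sqrtWeights;

    WeightedResiduals(ResidualFunction base, double[] weights) {
        this.base = base;
        this.sqrtWeights = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] < 0) {
                throw new IllegalArgumentException("negative weight at index " + i);
            }
            sqrtWeights[i] = Math.sqrt(weights[i]);
        }
    }

    @Override
    public double[] residuals(double[] params) {
        double[] r = base.residuals(params);
        if (r.length != sqrtWeights.length) {
            throw new IllegalArgumentException("weight/residual length mismatch");
        }
        double[] out = new double[r.length];
        for (int i = 0; i < r.length; i++) {
            out[i] = sqrtWeights[i] * r[i];
        }
        return out;
    }
}

/** Stand-in for "the optimizer": it only knows about plain sums of squares. */
final class LeastSquares {
    private LeastSquares() {}

    static double cost(ResidualFunction f, double[] params) {
        double sum = 0;
        for (double r : f.residuals(params)) {
            sum += r * r;
        }
        return sum;
    }
}
```

With this layout, a bug in the weight handling is confined to WeightedResiduals, and the unweighted path never allocates an array of ones or multiplies by one.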

I am strongly against having a correlation matrix be an input to the basic optimizers. The eigendecomposition
of the matrix is an O(N^3) operation, which could actually be more expensive than the whole
optimization. In addition, you are redoing this decomposition every time you call the optimize method.
If you are trying multiple starting points, you are forcing the decomposition to be done
each time.
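
As a rough sketch of the alternative, the factorization can be done once, outside the optimizer, and reused for every starting point. Note that I am swapping in a Cholesky factorization here instead of an eigendecomposition-based square root, and I am assuming the weight matrix W is symmetric positive definite. The Whitener class is hypothetical, not part of Commons Math.

```java
/** Hypothetical helper: factors W = L * L^T once, then whitens residuals cheaply. */
final class Whitener {
    private final double[][] lower; // lower-triangular Cholesky factor L

    Whitener(double[][] w) {
        this.lower = cholesky(w); // O(N^3), paid once at construction
    }

    /** Plain Cholesky factorization; only the lower triangle of w is read. */
    static double[][] cholesky(double[][] w) {
        int n = w.length;
        double[][] l = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j <= i; j++) {
                double sum = w[i][j];
                for (int k = 0; k < j; k++) {
                    sum -= l[i][k] * l[j][k];
                }
                if (i == j) {
                    if (sum <= 0) {
                        throw new IllegalArgumentException("matrix is not positive definite");
                    }
                    l[i][i] = Math.sqrt(sum);
                } else {
                    l[i][j] = sum / l[j][j];
                }
            }
        }
        return l;
    }

    /** Returns L^T * r in O(N^2), so that ||L^T r||^2 equals r^T W r. */
    double[] apply(double[] r) {
        int n = lower.length;
        if (r.length != n) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] out = new double[n];
        for (int j = 0; j < n; j++) {
            double s = 0;
            for (int i = j; i < n; i++) {
                s += lower[i][j] * r[i];
            }
            out[j] = s;
        }
        return out;
    }
}
```

The user builds one Whitener, wraps the residual computation with apply, and then calls the unweighted optimizer as many times as needed. The O(N^3) cost is paid once instead of on every optimize call.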


On Dec 29, 2012, at 7:29 AM, Gilles Sadowski <gilles@harfang.homelinux.org> wrote:

> On Sat, Dec 29, 2012 at 10:22:20AM +0100, Dimitri Pourbaix wrote:
>> Gilles,
>> 
>>> Handling weighted observations must take correlations into account, i.e. use
>>> a _matrix_.
>>> There is the _practical_ problem of memory. Solving it correctly is by
>>> using a sparse implementation (and this is actually an implementation
>>> _detail_).
>> 
>> The problem is where something becomes a detail!  You are right that the
>> general least square problem copes with a matrix of weights ... but the
>> way it is implemented is a detail.
> 
> That's what I said above, although I suspect that we don't mean the same
> thing. OO programming allows one to define types that represent the
> "real" concepts: in this case, if the problem is expressed in terms of
> a (mathematical) matrix, the algorithm should use a "Matrix" (type).
> This is not an implementation detail; the goal is for the code to be as
> close as possible to the mathematical description of the procedure
> (self-documenting code).
> 
> The implementation detail is how the matrix type stores its data internally;
> and this can be the subject of any necessary efficiency improvements,
> independently of the matrix concept used at a higher level (e.g. in the
> optimization algorithms).
> 
>> As already pointed out, even the
>> vector of weights API allows for a complicated matrix of weights.  The user
>> premultiplies by the 'square root' of that matrix and sets all the
>> components of the weight vector to 1.  So, your enthusiasm to generalise the
>> vector of weights to a matrix was a detail to make the life of very few
>> users easier ... without adding any functionality.
> 
> This is a backward description of my change.
> In reality:
> 1. The handling of weights was there.
> 2. Assuming that people wanted to keep it, I added the functionality to
>   handle correlated observations.
> 
> If indeed the weight feature is independent of the optimization procedure,
> then _all_ references to weights should be banned.
> [If only because keeping an array of "ones" and doing loops that "multiply
> by one" are obviously not going to improve clarity or performance.]
> Eventually, this seems to be the accepted compromise now (IIUC).
> 
>> There are so many different configurations (e.g. block diagonal, ...), I
>> doubt you can handle all of them in the most efficient way
> 
> Actually, my "Weight" class trivially handles _any_ "RealMatrix" (thanks to
> inheritance!).
> 
>> so it is likely
>> preferable to have the user taking care of them.
> 
> This is exactly what "Weight" does.
> The problem is that CM does not provide efficient implementations for
> matrix forms suited for this context (symmetric, sparse, diagonal).[1]
> 
> Above and in the previous post, I agreed that this would not be a problem if
> we entirely drop the support for weights in the optimizers.
> 
>> It is however true that simple weights (i.e. vector form) are a very usual
>> situation ... which is also very common in fitting tools.  So, I think CM
>> should offer that approach as well.
> 
> Where? In the fitting tools or in the optimizers?
> We just said that weights could be handled independently from the
> optimization procedure. But we could indeed put weights back where they are
> most useful (e.g. in the curve fitting) without dragging them everywhere (where
> most of the time they'd be equal to one...).
> 
>> In conclusion: the old CM 3.0 API was enough! :)
> 
> If that's so, then people can just copy/paste the source code of that
> version and not care about subsequent versions of CM.
> 
> 
> Cordially,
> Gilles
> 
> [1] Actually, the problem is that some people complain that we don't do
>    enough to their taste: In the past, at least 3 persons raised issues
>    with matrix implementations, but without providing any useful help,
>    unfortunately (to be clear, I'm not talking of current contributors!).
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

