commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Luc Maisonobe <Luc.Maison...@free.fr>
Subject Re: [math] major problem with new released version 3.1
Date Sat, 29 Dec 2012 09:08:06 GMT
Le 29/12/2012 04:09, Gilles Sadowski a écrit :
> Hi.

Hi Gilles,

> 
>> [...]
> 
>>> third is just bug-fix.  Does the fix for the issue that started this
>>> thread change the API?  If so, we would need to cut 3.2 in any case.
> 
> The current fix does change the usage (one needs to call optimize with an
> argument of a different type). IIUC, it also silently removes the handling
> of uncorrelated observations.
> 
>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
>> and committed the fix as of r1426616.
>>
>> Could someone please review what I have done?
> 
> I don't like the fix...

Thanks for reviewing.

> 
> Handling weighted observations must take correlations into account, i.e. use
> a _matrix_.
> There is the _practical_ problem of memory. Solving it correctly is by using
> a sparse implementation (and this is actually an implementation _detail_).

Yes.

> If we _need_ such an implementation to solve the practical problem, I
> strongly suggest that we focus on creating it (or fixing what CM already
> has, or accept that some inconsistency will be present), rather than
> reducing the code applicability (i.e. allowing only uncorrelated data).
> If the observations are not correlated, the matrix is a diagonal matrix, not
> a vector.

It's fine with me. I simply thought it wouldn't be that easy. You proved
me wrong.

> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
> which all would go some way to solving several practical problems of the same
> kind as the current one without sacrificing generality.

Yes, we have known that for years.

> 
> Now, and several years ago, it was noticed that CM does not _have_ to
> provide the "weights" feature because users can handle this before they
> call CM code. [IIRC, no CM unit test actually use weights different from 1.]
> IMO, the other valid option is thus to have a simpler version of the
> algorithm, but still a correct one.
> This would also have the advantage that we won't have the urgent need to
> keep the sparse matrix implementation.
> [Then, if at some point we include helper code to handle weights
> (_including_ correlations), we should also do it in a "preprocessing" step,
> without touching the optimization algorithms.]

So what do you suggest: keep the current support (with proper handling
as you did) or drop it?

Since several people asked for removing it (Dimitri, Konstantin and now
you), we can do that. Unitl now, this feature was a convenience that was
really useful for some cases, and it was simple. There were some errors
in it and Dimitri solved them in 2010, but no other problems appeared
since them, so it made sense simply keeping it as is. Now we are hit by
a second problem, and it seems it opens a can of worm. Dropping it
completely as a not so useful convenience which is tricky to set up
right would be fair.

> 
>> I also think (but did not test it), that there may be a problem with
>> missing OptimizationData. If someone call the optimizer but forget to
>> set the weight (or the target, or some other mandatory parameters), then
>> I'm not sure we fail with an appropriate error. Looking for example at
>> the private checkParameters method in the MultivariateVectorOptimizer
>> abstract class, I guess we may have a NullPointer Exception if either
>> Target or Weight/NonCorrelatedWeight has not been parsed in the
>> parseOptimizationData method. Could someone confirm this?
> 
> Yes.
> And this (not checking for missing data) _could_ be considered a feature, as
> I stressed several times on this ML, and in the code documentation
> (eliciting zero constructive comment).
> We also _agreed_ that users not passing needed data will result in NPE.
> [I imagined that applications would check that valid and complete input is
> passed to the lower level "optimize(OptimizationData ...)" methods.]

Yes, abd I agree with that. However, I found the javadoc to be
ambiguous. It says "The following data will be looked for:" followed by
a list. There is no distinction between optional and required
parameters. As an example, simple bounds are optional whereas initial
guess or weight are required, but there is nothing to tell it to user.
So in this case, either we should provide proper exception or proper
documentation. I am OK with both.

> 
> It is however straightforward to add a "checkParameters()" method that would
> raise more specific exceptions. [Although that would contradict the
> conclusion of the previous discussion about NPE in CM. And restart it,
> without even getting a chance to go forward with what had been decided!]

As long as we identify the parameters that are optional (and hence user
can deduce the other one are mandatory and will raise an NPE), this
would be fine. I don't ask to restart this tiring discussion, just make
sure users have the proper way to understand why they get an NPE when
the forget weight and why they don't get one when the forget simple bounds.

Also weight should really be optional and have a fallback to identity
matrix, but this is another story.

> 
> 
> Hence, to summarize:
>  * The fix, in a 3.2 release, should be to replace the matrix implementation
>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.

+1

>  * We must decide wether to deprecate the weights feature in that release
>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>    un-deprecate "OpenMapRealMatrix"[2]).

+1 to deprecate.

best regards
Luc

> 
> 
> Best regards,
> Gilles
> 
> [1] The inconsistencies that led to the deprecation will have no bearing
>     on usage within the optimizers.
> [2] The latter option seems likely to entail a fair amount of work to
>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>     for some operations, e.g. "multiply").
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message