commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From S├ębastien Brisard <sebastien.bris...@m4x.org>
Subject Re: [math] major problem with new released version 3.1
Date Sat, 29 Dec 2012 09:04:35 GMT
Hi,


2012/12/29 Gilles Sadowski <gilles@harfang.homelinux.org>

> Hi.
>
> > [...]
>
> > > third is just bug-fix.  Does the fix for the issue that started this
> > > thread change the API?  If so, we would need to cut 3.2 in any case.
>
> The current fix does change the usage (one needs to call optimize with an
> argument of a different type). IIUC, it also silently removes the handling
> of uncorrelated observations.
>
> > Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
> > and committed the fix as of r1426616.
> >
> > Could someone please review what I have done?
>
> I don't like the fix...
>
> Handling weighted observations must take correlations into account, i.e.
> use
> a _matrix_.
> There is the _practical_ problem of memory. Solving it correctly is by
> using
> a sparse implementation (and this is actually an implementation _detail_).
> If we _need_ such an implementation to solve the practical problem, I
> strongly suggest that we focus on creating it (or fixing what CM already
> has, or accept that some inconsistency will be present), rather than
> reducing the code applicability (i.e. allowing only uncorrelated data).
> If the observations are not correlated, the matrix is a diagonal matrix,
> not
> a vector.
> CM also lacks implementations for symmetric, triangular, and diagonal
> matrix,
> which all would go some way to solving several practical problems of the
> same
> kind as the current one without sacrificing generality.
>
> Now, and several years ago, it was noticed that CM does not _have_ to
> provide the "weights" feature because users can handle this before they
> call CM code. [IIRC, no CM unit test actually use weights different from
> 1.]
> IMO, the other valid option is thus to have a simpler version of the
> algorithm, but still a correct one.
> This would also have the advantage that we won't have the urgent need to
> keep the sparse matrix implementation.
> [Then, if at some point we include helper code to handle weights
> (_including_ correlations), we should also do it in a "preprocessing" step,
> without touching the optimization algorithms.]
>
> > I also think (but did not test it), that there may be a problem with
> > missing OptimizationData. If someone call the optimizer but forget to
> > set the weight (or the target, or some other mandatory parameters), then
> > I'm not sure we fail with an appropriate error. Looking for example at
> > the private checkParameters method in the MultivariateVectorOptimizer
> > abstract class, I guess we may have a NullPointer Exception if either
> > Target or Weight/NonCorrelatedWeight has not been parsed in the
> > parseOptimizationData method. Could someone confirm this?
>
> Yes.
> And this (not checking for missing data) _could_ be considered a feature,
> as
> I stressed several times on this ML, and in the code documentation
> (eliciting zero constructive comment).
> We also _agreed_ that users not passing needed data will result in NPE.
> [I imagined that applications would check that valid and complete input is
> passed to the lower level "optimize(OptimizationData ...)" methods.]
>
> It is however straightforward to add a "checkParameters()" method that
> would
> raise more specific exceptions. [Although that would contradict the
> conclusion of the previous discussion about NPE in CM. And restart it,
> without even getting a chance to go forward with what had been decided!]
>
>
> Hence, to summarize:
>  * The fix, in a 3.2 release, should be to replace the matrix
> implementation
>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1]
> or
>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
>  * We must decide wether to deprecate the weights feature in that release
>    (and thus remove it in 4.0) or to keep it in its general form (and thus
>    un-deprecate "OpenMapRealMatrix"[2]).
>
>
> Best regards,
> Gilles
>
> [1] The inconsistencies that led to the deprecation will have no bearing
>     on usage within the optimizers.
> [2] The latter option seems likely to entail a fair amount of work to
>     improve the performance of "OpenMapRealMatrix" (which is quite poor
>     for some operations, e.g. "multiply").
>
> I have never been happy with deprecating the sparse implementations of
vectors and matrices. I also think we should "undeprecate" them and work on
them, with some help!
Regarding [1], my choice would be to point at the problematic cases in the
Javadoc (and possibly the user's guide), and live with them.
Regarding [2], would there be any performance gain if we designed an
immutable sparse vector/matrix. I guess it would. The question is: do we
need these objects to be mutable?

S├ębastien

> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message