commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gilles Sadowski <gil...@harfang.homelinux.org>
Subject Re: [math] major problem with new released version 3.1
Date Sat, 29 Dec 2012 11:34:30 GMT
On Sat, Dec 29, 2012 at 10:43:11AM +0100, Luc Maisonobe wrote:
> Le 29/12/2012 10:08, Luc Maisonobe a écrit :
> > Le 29/12/2012 04:09, Gilles Sadowski a écrit :
> >> Hi.
> > 
> > Hi Gilles,
> > 
> >>
> >>> [...]
> >>
> >>>> third is just bug-fix.  Does the fix for the issue that started this
> >>>> thread change the API?  If so, we would need to cut 3.2 in any case.
> >>
> >> The current fix does change the usage (one needs to call optimize with an
> >> argument of a different type). IIUC, it also silently removes the handling
> >> of uncorrelated observations.
> >>
> >>> Yes, this fixes the issue. I have created/resolved the issue (MATH-924)
> >>> and committed the fix as of r1426616.
> >>>
> >>> Could someone please review what I have done?
> >>
> >> I don't like the fix...
> > 
> > Thanks for reviewing.
> > 
> >>
> >> Handling weighted observations must take correlations into account, i.e. use
> >> a _matrix_.
> >> There is the _practical_ problem of memory. Solving it correctly is by using
> >> a sparse implementation (and this is actually an implementation _detail_).
> > 
> > Yes.
> > 
> >> If we _need_ such an implementation to solve the practical problem, I
> >> strongly suggest that we focus on creating it (or fixing what CM already
> >> has, or accept that some inconsistency will be present), rather than
> >> reducing the code applicability (i.e. allowing only uncorrelated data).
> >> If the observations are not correlated, the matrix is a diagonal matrix, not
> >> a vector.
> > 
> > It's fine with me. I simply thought it wouldn't be that easy. You proved
> > me wrong.
> > 
> >> CM also lacks implementations for symmetric, triangular, and diagonal matrix,
> >> which all would go some way to solving several practical problems of the same
> >> kind as the current one without sacrificing generality.
> > 
> > Yes, we have known that for years.
> > 
> >>
> >> Now, and several years ago, it was noticed that CM does not _have_ to
> >> provide the "weights" feature because users can handle this before they
> >> call CM code. [IIRC, no CM unit test actually use weights different from 1.]
> >> IMO, the other valid option is thus to have a simpler version of the
> >> algorithm, but still a correct one.
> >> This would also have the advantage that we won't have the urgent need to
> >> keep the sparse matrix implementation.
> >> [Then, if at some point we include helper code to handle weights
> >> (_including_ correlations), we should also do it in a "preprocessing" step,
> >> without touching the optimization algorithms.]
> > 
> > So what do you suggest: keep the current support (with proper handling
> > as you did) or drop it?
> > 
> > Since several people asked for removing it (Dimitri, Konstantin and now
> > you), we can do that. Unitl now, this feature was a convenience that was
> > really useful for some cases, and it was simple. There were some errors
> > in it and Dimitri solved them in 2010, but no other problems appeared
> > since them, so it made sense simply keeping it as is. Now we are hit by
> > a second problem, and it seems it opens a can of worm. Dropping it
> > completely as a not so useful convenience which is tricky to set up
> > right would be fair.
> > 
> >>
> >>> I also think (but did not test it), that there may be a problem with
> >>> missing OptimizationData. If someone call the optimizer but forget to
> >>> set the weight (or the target, or some other mandatory parameters), then
> >>> I'm not sure we fail with an appropriate error. Looking for example at
> >>> the private checkParameters method in the MultivariateVectorOptimizer
> >>> abstract class, I guess we may have a NullPointer Exception if either
> >>> Target or Weight/NonCorrelatedWeight has not been parsed in the
> >>> parseOptimizationData method. Could someone confirm this?
> >>
> >> Yes.
> >> And this (not checking for missing data) _could_ be considered a feature, as
> >> I stressed several times on this ML, and in the code documentation
> >> (eliciting zero constructive comment).
> >> We also _agreed_ that users not passing needed data will result in NPE.
> >> [I imagined that applications would check that valid and complete input is
> >> passed to the lower level "optimize(OptimizationData ...)" methods.]
> > 
> > Yes, abd I agree with that. However, I found the javadoc to be
> > ambiguous. It says "The following data will be looked for:" followed by
> > a list. There is no distinction between optional and required
> > parameters. As an example, simple bounds are optional whereas initial
> > guess or weight are required, but there is nothing to tell it to user.
> > So in this case, either we should provide proper exception or proper
> > documentation. I am OK with both.
> > 
> >>
> >> It is however straightforward to add a "checkParameters()" method that would
> >> raise more specific exceptions. [Although that would contradict the
> >> conclusion of the previous discussion about NPE in CM. And restart it,
> >> without even getting a chance to go forward with what had been decided!]
> > 
> > As long as we identify the parameters that are optional (and hence user
> > can deduce the other one are mandatory and will raise an NPE), this
> > would be fine. I don't ask to restart this tiring discussion, just make
> > sure users have the proper way to understand why they get an NPE when
> > the forget weight and why they don't get one when the forget simple bounds.
> > 
> > Also weight should really be optional and have a fallback to identity
> > matrix, but this is another story.
> > 
> >>
> >>
> >> Hence, to summarize:
> >>  * The fix, in a 3.2 release, should be to replace the matrix implementation
> >>    with one that does not exhaust the memory, e.g. "OpenMapRealMatrix"[1] or
> >>    "DiagonalMatrix" (see my patch for MATH-924), but not change the API.
> > 
> > +1
> 
> I forgot to ask: do you want me to first revert my change before you
> commit yours

Yes, please.

> or will you do both at the same time?

I wouldn't know how to do that.
What is the svn command to revert a list a files to their state in a given
revision?

Thanks,
Gilles

> 
> Luc
> 
> > 
> >>  * We must decide wether to deprecate the weights feature in that release
> >>    (and thus remove it in 4.0) or to keep it in its general form (and thus
> >>    un-deprecate "OpenMapRealMatrix"[2]).
> > 
> > +1 to deprecate.
> > 
> > best regards
> > Luc
> > 
> >>
> >>
> >> Best regards,
> >> Gilles
> >>
> >> [1] The inconsistencies that led to the deprecation will have no bearing
> >>     on usage within the optimizers.
> >> [2] The latter option seems likely to entail a fair amount of work to
> >>     improve the performance of "OpenMapRealMatrix" (which is quite poor
> >>     for some operations, e.g. "multiply").
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >> For additional commands, e-mail: dev-help@commons.apache.org
> >>
> >>
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> > 
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message