commons-dev mailing list archives

From Sébastien Brisard <sebastien.bris...@m4x.org>
Subject Re: [math] Deprecating "guessParametersErrors"
Date Fri, 04 May 2012 13:42:29 GMT
2012/5/4 Gilles Sadowski <gilles@harfang.homelinux.org>:
> Hello.
>
>> I'm obviously missing something in my literature review. I did a new
>> MC simulation, with a much smaller number of observation points
>> (namely 3, to fit a straight line!!!). It turns out that the formula
>> you are advocating for is the best estimate of the standard deviation
>> of the parameters. Could you please explain why this formula differs
>> from formulas (34) and (35) in
>> http://mathworld.wolfram.com/LeastSquaresFitting.html?
>
> Independently of the explanation to be provided by Dimitri, I think that
> there are code design arguments in favour of deprecating (and later,
> deleting) the "guessParametersErrors" method, as follows.
>
> In the context of the "optimization.general" package, one assumes that a
> Jacobian matrix is available. From there, the code in "AbstractLeastSquares"
> computes the covariance matrix, from which one can readily extract the
> "sigma".
> This can be done without computing the chi-square! [While, as you have
> probably noticed, the "guessParametersErrors" will not behave nicely if you
> don't call "updateResidualsAndCost()" beforehand.]
>
> For the class to be self-consistent, the story can end here: Any additional
> utilities can lead to wrong expectations from different types of users (as
> we've demonstrated here).
> Indeed, confidence intervals refer to additional variables (as Dimitri
> wrote: "By how much can a parameter change before the normalized chi2
> changes by <some number>?"). Being able to answer those questions also
> involves the correlations between the parameters (cf. the plot I've attached
> to MATH-784), whereas "guessParametersErrors" does not take them into
> account.
>
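To make the point about correlations concrete: the off-diagonal terms of
the covariance matrix are exactly what a per-parameter sigma discards. Here
is a minimal sketch in plain Java that normalizes a covariance matrix into
a correlation matrix, corr[i][j] = cov[i][j] / (sigma[i] * sigma[j]). The
name CorrelationSketch is hypothetical (it is not part of Commons Math),
and the sample matrix assumes a straight line sampled at x = 0, 1, 2 with
unit weights.

```java
final class CorrelationSketch {
    /** Normalizes a covariance matrix so that its diagonal becomes 1. */
    static double[][] correlations(double[][] cov) {
        final int p = cov.length;
        final double[][] corr = new double[p][p];
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                // sigma[i] * sigma[j] == sqrt(cov[i][i] * cov[j][j])
                corr[i][j] = cov[i][j] / Math.sqrt(cov[i][i] * cov[j][j]);
            }
        }
        return corr;
    }

    public static void main(String[] args) {
        // (J^T J)^-1 for the rows (1, x) at x = 0, 1, 2 (illustrative values).
        final double[][] cov = {{5.0 / 6.0, -0.5}, {-0.5, 0.5}};
        final double[][] corr = correlations(cov);
        System.out.println("corr(intercept, slope) = " + corr[0][1]);
    }
}
```

In this example the intercept and the slope are strongly anti-correlated,
so two independent "plus or minus sigma" intervals would misrepresent the
joint uncertainty.
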
>> I hope I'm not bothering you too much. I really would like to
>> understand, so that we could write an accurate javadoc and possibly
>> rename the method appropriately.
>
> For clarity's sake (design-wise), I propose to remove the
> "guessParametersErrors" method, and add a "getSigma" (as syntactic sugar).
>
I'm OK with that. As a first step, we deprecate it and state in the
javadoc that getSigma() should be used instead, emphasizing that the
two methods do not return exactly the same value. Similarly, I
propose that the javadoc of getSigma() state exactly what it returns
(namely, sqrt(cov[i][i])). Finally, do you think it's worth calling
this method getSigmaParameters() in order to avoid confusion with the
standard deviations of the observations (which are implicitly assumed
by the weights in the chi-square)?
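
For the javadoc, here is a minimal sketch of the two quantities, in plain
Java so that it is self-contained. SigmaSketch and its method names are
hypothetical; this is not the actual optimizer code. It assumes unit
weights, so that cov = (J^T J)^-1, and it assumes (if I read the current
code correctly) that guessParametersErrors() scales sqrt(cov[i][i]) by
sqrt(chi2 / (n - p)), with n observations and p parameters.

```java
final class SigmaSketch {
    /** Returns (J^T J)^-1 for an n x p Jacobian, assuming unit weights. */
    static double[][] covariances(double[][] jac) {
        final int n = jac.length;
        final int p = jac[0].length;
        // Augmented matrix [J^T J | I], reduced in place by Gauss-Jordan.
        final double[][] a = new double[p][2 * p];
        for (int i = 0; i < p; i++) {
            for (int j = 0; j < p; j++) {
                double sum = 0;
                for (int k = 0; k < n; k++) {
                    sum += jac[k][i] * jac[k][j];
                }
                a[i][j] = sum;
            }
            a[i][p + i] = 1;
        }
        for (int c = 0; c < p; c++) {
            // Partial pivoting: pick the largest entry in column c.
            int piv = c;
            for (int r = c + 1; r < p; r++) {
                if (Math.abs(a[r][c]) > Math.abs(a[piv][c])) {
                    piv = r;
                }
            }
            if (Math.abs(a[piv][c]) < 1e-14) {
                throw new ArithmeticException("singular normal matrix");
            }
            final double[] tmp = a[c];
            a[c] = a[piv];
            a[piv] = tmp;
            final double d = a[c][c];
            for (int j = 0; j < 2 * p; j++) {
                a[c][j] /= d;
            }
            for (int r = 0; r < p; r++) {
                if (r != c) {
                    final double f = a[r][c];
                    for (int j = 0; j < 2 * p; j++) {
                        a[r][j] -= f * a[c][j];
                    }
                }
            }
        }
        final double[][] cov = new double[p][p];
        for (int i = 0; i < p; i++) {
            System.arraycopy(a[i], p, cov[i], 0, p);
        }
        return cov;
    }

    /** What getSigma() would return: sqrt(cov[i][i]), no chi-square needed. */
    static double[] sigma(double[][] cov) {
        final double[] s = new double[cov.length];
        for (int i = 0; i < s.length; i++) {
            s[i] = Math.sqrt(cov[i][i]);
        }
        return s;
    }

    /** Assumed behaviour of the deprecated method: sigma * sqrt(chi2 / (n - p)). */
    static double[] scaledErrors(double[][] cov, double chiSquare, int n) {
        final int p = cov.length;
        if (n <= p) {
            throw new IllegalArgumentException("need more observations than parameters");
        }
        final double c = Math.sqrt(chiSquare / (n - p));
        final double[] s = sigma(cov);
        for (int i = 0; i < s.length; i++) {
            s[i] *= c;
        }
        return s;
    }

    public static void main(String[] args) {
        // Straight line y = a + b * x observed at x = 0, 1, 2: rows are (1, x).
        final double[][] jac = {{1, 0}, {1, 1}, {1, 2}};
        final double[][] cov = covariances(jac);
        System.out.println(java.util.Arrays.toString(sigma(cov)));
        // The chi-square value 4.0 is made up, only to show the scaling.
        System.out.println(java.util.Arrays.toString(scaledErrors(cov, 4.0, jac.length)));
    }
}
```

The two results differ only by the common factor sqrt(chi2 / (n - p)),
which is what the javadoc of the deprecated method should make explicit.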

The tests I've recently added (NIST data) must be altered a bit, so
maybe I could take care of the whole thing if you want. Then we could
consider that MATH-784 is resolved.

>
> If you want to dig further into the confidence interval issues in order to
> provide the related functionality (similar to, but not limited to, the
> current "guessParametersErrors"), I propose that that code be located in the
> "stat" package (where, by the way, some of the utilities might already
> exist!).
>
Maybe I will dig into this problem in the future; I find it very
interesting!!! For the time being, I'd rather concentrate on other
issues such as the bugs currently reported in the distribution package
(I've been working a lot on this recently, even if I have not yet come
up with a patch...).
> What do you think?
>
>
> Best regards,
> Gilles
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>



