commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Sterijevski <gsterijev...@gmail.com>
Subject Re: [math] SimpleRegression
Date Wed, 24 Aug 2011 02:06:19 GMT
Alright Phil!  Thunder and Lightening! I like it. ;-)

On Mon, Aug 22, 2011 at 10:55 PM, Phil Steitz <phil.steitz@gmail.com> wrote:

> On 8/22/11 8:42 PM, Greg Sterijevski wrote:
> > If no one has objections, I would like to harmonize simpleregression with
> > the Regression Interfaces. What is the best way to proceed?
>
> JFDI - go ahead and take a stab at a patch to do it.
>
> Phil
> >
> > On Sun, Aug 21, 2011 at 9:25 PM, Greg Sterijevski <
> gsterijevski@gmail.com>wrote:
> >
> >> Opened a ticket. Submitted patches. -Greg
> >>
> >>
> >> On Sat, Aug 20, 2011 at 4:47 PM, Phil Steitz <phil.steitz@gmail.com
> >wrote:
> >>
> >>> On 8/12/11 9:30 PM, Phil Steitz wrote:
> >>>> On 8/12/11 7:16 PM, Greg Sterijevski wrote:
> >>>>> Hello All,
> >>>>>
> >>>>> Before I chum the water with more JIRA tickets, I thought I would
see
> >>>>> whether people thought this was important.
> >>>>>
> >>>>> In the SimpleRegression you have two methods:
> >>>>>
> >>>>> public void addData(double x, double y) {
> >>>>>   ...some code that is not germane to discussion......
> >>>>>
> >>>>>         if (n > 2) {
> >>>>>             distribution = new TDistributionImpl(n - 2);
> >>>>>         }
> >>>>>     }
> >>>>>
> >>>>>     public void removeData(double x, double y) {
> >>>>>   ...some code that is not germane......
> >>>>>
> >>>>>             if (n > 2) {
> >>>>>                 distribution = new TDistributionImpl(n - 2);
> >>>>>             }
> >>>>>         }
> >>>>>     }
> >>>>>
> >>>>> >From the perspective of a user, you are likely to call add/remove
> >>> repeatedly
> >>>>> before you ever need to check for statistical significance. Wouldn't
> it
> >>> be
> >>>>> better to instantiate the TDistribution when it is needed?
> >>>>>
> >>>>> So you would have to make the following two methods a bit more
> >>> complicated:
> >>>>>  public double getSlopeConfidenceInterval(double alpha)
> >>>>>         throws MathException {
> >>>>>         if (alpha >= 1 || alpha <= 0) {
> >>>>>             throw new
> >>>>> OutOfRangeException(LocalizedFormats.SIGNIFICANCE_LEVEL,
> >>>>>                                           alpha, 0, 1);
> >>>>>         }
> >>>>>         if( distribution == null ||
> distribution.getDegreesOfFreedom()
> >>> !=
> >>>>> n-2){
> >>>>>           distribution = new TDistributionImpl(n - 2);
> >>>>> }
> >>>>>         return getSlopeStdErr() *
> >>>>>             distribution.inverseCumulativeProbability(1d - alpha
/
> 2d);
> >>>>>     }
> >>>>>
> >>>>> Similarly getSignificance() would have to be modified with the check
> of
> >>> the
> >>>>> degrees of freedom of the distribution.
> >>>>>
> >>>>> There is nothing wrong with the current code, but making the change
> >>> means
> >>>>> faster updates.
> >>>> Slightly, yes.  There is not much code at all in the distribution
> >>>> constructor, but you are correct.  Moreover, I can see now that the
> >>>> "immutability-everywhere" changes in trunk have made the code in the
> >>>> class a little funny.  In versions <=2.2, the TDistribution used
by
> >>>> the class was pluggable - i.e., there was a constructor that took at
> >>>> TDistribution as an argument, so if for some reason you wanted to
> >>>> use an impl different from TDistributionImpl, you could do that.
> >>>> There was also a setter for the distribution. In addition,
> >>>> TDistributionImpl itself was mutable, exposing a setDegreesOfFreedom
> >>>> method.  So the distribution member was set at construction time and
> >>>> the data update methods called the setter for DF on the
> >>>> distribution.  We decided to make the distributions immutable in 3.0
> >>>> (search the archives for discussion), so the current mods were done
> >>>> to basically accomplish that.  But the code should be cleaned up.
> >>>> The constructor taking an int is meaningless and should be
> >>>> deprecated or removed (unfortunately, we added that in 2.2 and
> >>>> advertised it as a deprecation replacement for the version that took
> >>>> a distribution as parameter.  We should have realized then that it
> >>>> was meaningless.  My bad for missing that. I would favor dropping it
> >>>> in 3.0, since even if anyone is using it, it isn't doing anything
> >>>> meaningful for them.)  Given that constructing a TDistributionImpl
> >>>> is trivial, we might as well eliminate the member field altogether
> >>>> and just create one when needed.  If you agree and no one else
> >>>> objects, I will make these changes.  Thanks for reviewing the code.
> >>> Tracked as MATH-648, fixed in r1159918.
> >>>
> >>> Phil
> >>>> Phil
> >>>>> Thoughts?
> >>>>>
> >>>>> -Greg
> >>>>>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> >>> For additional commands, e-mail: dev-help@commons.apache.org
> >>>
> >>>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message