commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Sterijevski <gsterijev...@gmail.com>
Subject Re: [math] SimpleRegression
Date Tue, 23 Aug 2011 03:42:41 GMT
If no one has objections, I would like to harmonize simpleregression with
the Regression Interfaces. What is the best way to proceed?

On Sun, Aug 21, 2011 at 9:25 PM, Greg Sterijevski <gsterijevski@gmail.com>wrote:

> Opened a ticket. Submitted patches. -Greg
>
>
> On Sat, Aug 20, 2011 at 4:47 PM, Phil Steitz <phil.steitz@gmail.com>wrote:
>
>> On 8/12/11 9:30 PM, Phil Steitz wrote:
>> > On 8/12/11 7:16 PM, Greg Sterijevski wrote:
>> >> Hello All,
>> >>
>> >> Before I chum the water with more JIRA tickets, I thought I would see
>> >> whether people thought this was important.
>> >>
>> >> In the SimpleRegression you have two methods:
>> >>
>> >> public void addData(double x, double y) {
>> >>   ...some code that is not germane to discussion......
>> >>
>> >>         if (n > 2) {
>> >>             distribution = new TDistributionImpl(n - 2);
>> >>         }
>> >>     }
>> >>
>> >>     public void removeData(double x, double y) {
>> >>   ...some code that is not germane......
>> >>
>> >>             if (n > 2) {
>> >>                 distribution = new TDistributionImpl(n - 2);
>> >>             }
>> >>         }
>> >>     }
>> >>
>> >> >From the perspective of a user, you are likely to call add/remove
>> repeatedly
>> >> before you ever need to check for statistical significance. Wouldn't it
>> be
>> >> better to instantiate the TDistribution when it is needed?
>> >>
>> >> So you would have to make the following two methods a bit more
>> complicated:
>> >>
>> >>  public double getSlopeConfidenceInterval(double alpha)
>> >>         throws MathException {
>> >>         if (alpha >= 1 || alpha <= 0) {
>> >>             throw new
>> >> OutOfRangeException(LocalizedFormats.SIGNIFICANCE_LEVEL,
>> >>                                           alpha, 0, 1);
>> >>         }
>> >>         if( distribution == null || distribution.getDegreesOfFreedom()
>> !=
>> >> n-2){
>> >>           distribution = new TDistributionImpl(n - 2);
>> >> }
>> >>         return getSlopeStdErr() *
>> >>             distribution.inverseCumulativeProbability(1d - alpha / 2d);
>> >>     }
>> >>
>> >> Similarly getSignificance() would have to be modified with the check of
>> the
>> >> degrees of freedom of the distribution.
>> >>
>> >> There is nothing wrong with the current code, but making the change
>> means
>> >> faster updates.
>> > Slightly, yes.  There is not much code at all in the distribution
>> > constructor, but you are correct.  Moreover, I can see now that the
>> > "immutability-everywhere" changes in trunk have made the code in the
>> > class a little funny.  In versions <=2.2, the TDistribution used by
>> > the class was pluggable - i.e., there was a constructor that took at
>> > TDistribution as an argument, so if for some reason you wanted to
>> > use an impl different from TDistributionImpl, you could do that.
>> > There was also a setter for the distribution. In addition,
>> > TDistributionImpl itself was mutable, exposing a setDegreesOfFreedom
>> > method.  So the distribution member was set at construction time and
>> > the data update methods called the setter for DF on the
>> > distribution.  We decided to make the distributions immutable in 3.0
>> > (search the archives for discussion), so the current mods were done
>> > to basically accomplish that.  But the code should be cleaned up.
>> > The constructor taking an int is meaningless and should be
>> > deprecated or removed (unfortunately, we added that in 2.2 and
>> > advertised it as a deprecation replacement for the version that took
>> > a distribution as parameter.  We should have realized then that it
>> > was meaningless.  My bad for missing that. I would favor dropping it
>> > in 3.0, since even if anyone is using it, it isn't doing anything
>> > meaningful for them.)  Given that constructing a TDistributionImpl
>> > is trivial, we might as well eliminate the member field altogether
>> > and just create one when needed.  If you agree and no one else
>> > objects, I will make these changes.  Thanks for reviewing the code.
>>
>> Tracked as MATH-648, fixed in r1159918.
>>
>> Phil
>> >
>> > Phil
>> >> Thoughts?
>> >>
>> >> -Greg
>> >>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> For additional commands, e-mail: dev-help@commons.apache.org
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message