commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bradford Cross" <bradford.n.cr...@gmail.com>
Subject Re: [Math] rolling calculations with lag
Date Thu, 25 Oct 2007 05:01:41 GMT
New patch for rolling stats is attached to ticket.  Notes are included.
Please review and get back to me.

On 10/13/07, Bradford Cross <bradford.n.cross@gmail.com> wrote:
>
> I am fixing these issues tonight and finishing up the ranking statistics.
>
>
> I still need to optimize the sorted deque structure that I created for
> RollingPercentile that uses a linked list and hte ResizeableDoubleArray.  I
> will create a patch thhat includes the first version while I work an a more
> performant and less hackish implimentaiton of htis data structure - when I
> submit the patch please have a look and give me any suggestions.
>
> I notice that the Moment statistics, FirstMoment, SecondMoment and their
> wrappers Mean, Variance, etc. all extend from
> AbstractStatelessUnivariateStatistic and generally use the increment()
> calcualtion to produce the value returned by the evaluate() function.  This
> is right in line with the abstractions I am talking about in my last email -
> we have a few different kinds of calculations; static (pass in an array, get
> back a value), incremental-accumulating (keep passing in values, keep
> updating calcualtion), incremental-rolling (keep passing in values, removing
> data outside the lag window, and updating calculation.)  I propose to
> continue the abstractions in this direction, maintain backward
> compatability, and change increment() from void to double return type as I
> mention in the previous email.
>
> In the case of ranking statistics, there are currently no impliemntations
> for incrent() - I propose to add those.  Also, while I was working on the
> rank statistics tonight I realized that we have two different cases for
> incremental rank statistics. 1) You want to use Percentile and specify a
> quantile in the constructor, you are interested in repeated calls to
> increment() for the same quantile ( f.ex. min, max, median.)  2) You want
> to use Perceltile with incremental updates but without specifying a quantile
> in the constructor, by executing repeated calls on increment() and calling
> getValue with a quantile argument; this is useful if you need, f.ex. both
> min and max of the same data, there is no need to create multiple instances
> of Percentile.  This si implimented in the current Percentile with
> setQuantile() ... again we can keep taht for backward compatability and just
> have a convienience method for getValue(double quantile).
>
> Thoughts?
>
>
>
> On 10/12/07, Bradford Cross <bradford.n.cross@gmail.com> wrote:
> >
> >
> >
> > On 10/12/07, Phil Steitz <phil.steitz@gmail.com> wrote:
> > >
> > > Thanks!
> > >
> > > Had a quick look and have a couple of comments.
> > >
> > > First, have a look at the developers guide (linked on the [math]
> > > website) for style and other guidlines and if you are not set up yet
> > > with maven, let me know if you  need help getting set up so you can
> > > get checkstyle and pmd reports.
> >
> >
> > >>>Sure.
> >
> > We should probably talk a little about the API - i.e., what the method
> > > names actually mean - and this all needs to be specified in the method
> > > javadoc.  Commons math is as much about the API as the
> > > implementations, so we try to be pretty careful about getting the
> > > names and semantics right.  Like other commons components, we have an
> > > implicit commitment to maintain backward compatibility, so that makes
> > > it even more important to get the APIs right.  So can you describe
> > > what exactly the public methods mean and why they are named as they
> > > are?
> >
> >
> > >>>I agree.  calculate() can be renamed to increment() for consistency
> > with the rest of the code base.  I also suggest that all increment() methods
> > return the double value of the incremental calculation rather than void,
> > which forces the client code to do statistic.increment(), and
> > statistic.getResult() - the change doesn't violate backward
> > compatibility and makes client code use of the API simpler and more
> > intuitive.  Also note that evaluate() methods return double, so this change
> > would make calculation methods consistent.  On this note, I also think it is
> > better to decompose each method of calculation into a different object.
> > There is static calculation on a collection via evaluate, incremental
> > accumulated calculate, and incremental rolling calculate.  This way we can
> > simplify the main statistic classes by making them decorators, keep
> > backwards compatibility in the API for all the statistics, and delegate the
> > three different calculation types to specialized objects rather than
> > cluttering things up.  Maybe there are also better names ... like
> > staticEvaluate,  accumulatedEvaluate,  rollingEvaluate..or likewise for
> > ___Calculate ...bBut that would violate backward compatibility.
> >
> > One more patch-generation point.  I notice that the patch includes
> > > some variable name changes and other stylistic changes to existing
> > > code.  While there is nothing wrong with suggesting this kind of
> > > change, we try to separate the style / formatting changes from the
> > > changes that introduce new features or fix bugs.  That makes the diffs
> > > and change logs easier to read.  So it would be good to remove those
> > > changes from the patch.
> > >
> > > Checkstyle will flag this, but two other little things I noticed are
> > > the presence of tabs (we use spaces in place of tabs) and if - then -
> > > else with no braces (we like braces).
> >
> >
> >
> > >>>No problem, I will make these changes, add javadoc and resubmit the
> > patch.
> >
> > Thanks again for your interest and contributions!
> > >
> > > Phil
> > >
> > > On 10/11/07, Bradford Cross <bradford.n.cross@gmail.com > wrote:
> > > > Cool - first patch finally submitted. :-)
> > > >
> > > > On 10/6/07, Phil Steitz <phil.steitz@gmail.com> wrote:
> > > > >
> > > > > On 10/3/07, Bradford Cross < bradford.n.cross@gmail.com> wrote:
> > > > > > OK, I have created a patch...I tried to follow the instructions
> > > to file
> > > > > a
> > > > > > bug on bugzilla but i can't seem to find the right place to
file
> > > a new
> > > > > bug
> > > > > > to either commons or commons math.
> > > > > >
> > > > > > I wonder if someone could help me out.
> > > > > >
> > > > >
> > > > > Sorry for the response latency and sorry if you were led to
> > > Bugzilla
> > > > > by the incorrect link that I just noticed on
> > > > > http://commons.apache.org/math/developers.html .  I will fix that.
> > > > >
> > > > > We now use Jira for issue tracking / patch submission.  Here is a
> > > link
> > > > > to the commons math issues page:
> > > > >
> > > > > http://commons.apache.org/math/issue-tracking.html
> > > > >
> > > > > Phil
> > > > > >
> > > > >
> > > > >
> > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > > > For additional commands, e-mail: dev-help@commons.apache.org
> > > > >
> > > > >
> > > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message