commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bradford Cross" <bradford.n.cr...@gmail.com>
Subject Re: [Math] rolling calculations with lag
Date Sun, 14 Oct 2007 04:49:33 GMT
I am fixing these issues tonight and finishing up the ranking statistics.

I still need to optimize the sorted deque structure that I created for
RollingPercentile that uses a linked list and hte ResizeableDoubleArray.  I
will create a patch thhat includes the first version while I work an a more
performant and less hackish implimentaiton of htis data structure - when I
submit the patch please have a look and give me any suggestions.

I notice that the Moment statistics, FirstMoment, SecondMoment and their
wrappers Mean, Variance, etc. all extend from
AbstractStatelessUnivariateStatistic and generally use the increment()
calcualtion to produce the value returned by the evaluate() function.  This
is right in line with the abstractions I am talking about in my last email -
we have a few different kinds of calculations; static (pass in an array, get
back a value), incremental-accumulating (keep passing in values, keep
updating calcualtion), incremental-rolling (keep passing in values, removing
data outside the lag window, and updating calculation.)  I propose to
continue the abstractions in this direction, maintain backward
compatability, and change increment() from void to double return type as I
mention in the previous email.

In the case of ranking statistics, there are currently no impliemntations
for incrent() - I propose to add those.  Also, while I was working on the
rank statistics tonight I realized that we have two different cases for
incremental rank statistics. 1) You want to use Percentile and specify a
quantile in the constructor, you are interested in repeated calls to
increment() for the same quantile (f.ex. min, max, median.)  2) You want to
use Perceltile with incremental updates but without specifying a quantile in
the constructor, by executing repeated calls on increment() and calling
getValue with a quantile argument; this is useful if you need, f.ex. both
min and max of the same data, there is no need to create multiple instances
of Percentile.  This si implimented in the current Percentile with
setQuantile() ... again we can keep taht for backward compatability and just
have a convienience method for getValue(double quantile).

Thoughts?



On 10/12/07, Bradford Cross <bradford.n.cross@gmail.com> wrote:
>
>
>
> On 10/12/07, Phil Steitz <phil.steitz@gmail.com> wrote:
> >
> > Thanks!
> >
> > Had a quick look and have a couple of comments.
> >
> > First, have a look at the developers guide (linked on the [math]
> > website) for style and other guidlines and if you are not set up yet
> > with maven, let me know if you  need help getting set up so you can
> > get checkstyle and pmd reports.
>
>
> >>>Sure.
>
> We should probably talk a little about the API - i.e., what the method
> > names actually mean - and this all needs to be specified in the method
> > javadoc.  Commons math is as much about the API as the
> > implementations, so we try to be pretty careful about getting the
> > names and semantics right.  Like other commons components, we have an
> > implicit commitment to maintain backward compatibility, so that makes
> > it even more important to get the APIs right.  So can you describe
> > what exactly the public methods mean and why they are named as they
> > are?
>
>
> >>>I agree.  calculate() can be renamed to increment() for consistency
> with the rest of the code base.  I also suggest that all increment() methods
> return the double value of the incremental calculation rather than void,
> which forces the client code to do statistic.increment(), and
> statistic.getResult() - the change doesn't violate backward compatibility
> and makes client code use of the API simpler and more intuitive.  Also note
> that evaluate() methods return double, so this change would make calculation
> methods consistent.  On this note, I also think it is better to decompose
> each method of calculation into a different object.  There is static
> calculation on a collection via evaluate, incremental accumulated calculate,
> and incremental rolling calculate.  This way we can simplify the main
> statistic classes by making them decorators, keep backwards compatibility in
> the API for all the statistics, and delegate the three different calculation
> types to specialized objects rather than cluttering things up.  Maybe there
> are also better names ... like staticEvaluate,  accumulatedEvaluate,
> rollingEvaluate..or likewise for ___Calculate ...bBut that would violate
> backward compatibility.
>
> One more patch-generation point.  I notice that the patch includes
> > some variable name changes and other stylistic changes to existing
> > code.  While there is nothing wrong with suggesting this kind of
> > change, we try to separate the style / formatting changes from the
> > changes that introduce new features or fix bugs.  That makes the diffs
> > and change logs easier to read.  So it would be good to remove those
> > changes from the patch.
> >
> > Checkstyle will flag this, but two other little things I noticed are
> > the presence of tabs (we use spaces in place of tabs) and if - then -
> > else with no braces (we like braces).
>
>
>
> >>>No problem, I will make these changes, add javadoc and resubmit the
> patch.
>
> Thanks again for your interest and contributions!
> >
> > Phil
> >
> > On 10/11/07, Bradford Cross <bradford.n.cross@gmail.com> wrote:
> > > Cool - first patch finally submitted. :-)
> > >
> > > On 10/6/07, Phil Steitz <phil.steitz@gmail.com> wrote:
> > > >
> > > > On 10/3/07, Bradford Cross < bradford.n.cross@gmail.com> wrote:
> > > > > OK, I have created a patch...I tried to follow the instructions to
> > file
> > > > a
> > > > > bug on bugzilla but i can't seem to find the right place to file
a
> > new
> > > > bug
> > > > > to either commons or commons math.
> > > > >
> > > > > I wonder if someone could help me out.
> > > > >
> > > >
> > > > Sorry for the response latency and sorry if you were led to Bugzilla
> >
> > > > by the incorrect link that I just noticed on
> > > > http://commons.apache.org/math/developers.html.  I will fix that.
> > > >
> > > > We now use Jira for issue tracking / patch submission.  Here is a
> > link
> > > > to the commons math issues page:
> > > >
> > > > http://commons.apache.org/math/issue-tracking.html
> > > >
> > > > Phil
> > > > >
> > > >
> > > >
> > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > > For additional commands, e-mail: dev-help@commons.apache.org
> > > >
> > > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message