commons-issues mailing list archives

From "Phil Steitz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-578) Decrease DescriptiveStatistics performance from 2.0 to 2.2
Date Mon, 16 May 2011 15:35:47 GMT

    [ https://issues.apache.org/jira/browse/MATH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034061#comment-13034061 ]

Phil Steitz commented on MATH-578:
----------------------------------

Thanks for reporting this.  I assume the timings include the percentile calculation, right?


This could be related to the changes to the Percentile implementation in 2.2. If isolating
the timing to just the percentile calculation shows that the latency difference comes from there,
we should reopen MATH-417.  The changes there were meant to improve Percentile performance, which
they do in most cases.  The first two results above are disturbing, however.  If your data
is largely constant and this creates a problem in your application, you can, as a workaround,
provide an alternative Percentile implementation to DescriptiveStatistics using setPercentileImpl.
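
To illustrate the kind of computation such an alternative implementation could perform, here is a minimal, self-contained sketch of a sort-based percentile. It is not the Commons Math code: the class name SortPercentile and its static method are hypothetical, and the interpolation rule (position p * (n + 1) / 100 in the sorted data) is an assumption chosen to resemble the estimate described in the Percentile Javadoc. Sorting a copy costs O(n log n) whether or not the data contains many repeated values.

```java
import java.util.Arrays;

/**
 * Hypothetical helper, not part of Commons Math: estimates a percentile by
 * sorting a copy of the data and interpolating between neighbouring values.
 */
final class SortPercentile {

    private SortPercentile() {
    }

    /**
     * @param values the sample (left unmodified)
     * @param p      the requested percentile, with 0 < p <= 100
     * @return the estimated percentile, or NaN for an empty sample
     */
    static double percentile(double[] values, double p) {
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("p must be in (0, 100]: " + p);
        }
        if (values == null || values.length == 0) {
            return Double.NaN;
        }
        // Work on a sorted copy so the caller's array keeps its order.
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        if (n == 1) {
            return sorted[0];
        }
        // Assumed rule: 1-based position of the estimate in the sorted data.
        double pos = p * (n + 1) / 100.0;
        if (pos < 1) {
            return sorted[0];      // below the first element: use the minimum
        }
        if (pos >= n) {
            return sorted[n - 1];  // at or beyond the last element: use the maximum
        }
        double floorPos = Math.floor(pos);
        double fraction = pos - floorPos;
        double lower = sorted[(int) floorPos - 1];
        double upper = sorted[(int) floorPos];
        // Linear interpolation between the two neighbouring sorted values.
        return lower + fraction * (upper - lower);
    }
}
```

To plug something like this into DescriptiveStatistics, the logic would have to be wrapped in a UnivariateStatistic. To my understanding, setPercentileImpl in 2.x also expects the implementation to expose a setQuantile(double) method; please check the Javadoc of the version you use before relying on that.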

> Decrease DescriptiveStatistics performance from 2.0 to 2.2
> ----------------------------------------------------------
>
>                 Key: MATH-578
>                 URL: https://issues.apache.org/jira/browse/MATH-578
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 2.2
>         Environment: Linux
>            Reporter: Paolo Repele
>            Assignee: Mikkel Meyer Andersen
>            Priority: Minor
>
> After switching from commons-math 2.0 to 2.2, we noticed that the performance of
> DescriptiveStatistics.addValue(double) has decreased.
> I tested with 2 million values.
> DescriptiveStatistics ds = new DescriptiveStatistics();
> for (int i = 0; i < 2 * 1000 * 1000; i++) { // 2 million values
>     ds.addValue(v); // v: the test value (0, Math.random(), or a mix; see below)
> }
> ds.getPercentile(50);
> It seems that the time taken depends on the values inserted into DescriptiveStatistics:
> * with a single value (0)
> ** 2.0 -> takes ~500 ms
> ** 2.2 -> takes more than 10 minutes
> * with 50% fixed value (0) and 50% Math.random()
> ** 2.0 -> takes ~500 ms
> ** 2.2 -> takes ~250000 ms (~250 seconds)
> * with 100% Math.random()
> ** 2.0 -> takes ~500 ms
> ** 2.2 -> takes ~70 ms

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
