commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thomas Neidhart (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-805) Percentile calculation is very slow when input data are constants
Date Wed, 18 Jul 2012 21:43:35 GMT

    [ https://issues.apache.org/jira/browse/MATH-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417717#comment-13417717
] 

Thomas Neidhart commented on MATH-805:
--------------------------------------

The evaluation of the percentile uses a partition method to sort the data around a pivot element
(element that are smaller are before the pivot element, and elements that are larger are behind
it).

Now in case of an array with lots of equal values, this can lead to some kind of staircase
situation, where the pivot element is just slowly increased from the bottom index and the
full array of values has to be evaluated in each iteration. The problem is in the partition
method of Percentile.java:

{noformat}
    while (i < j) {
       while ((i < j) && (work[j] >= value)) {
           --j;
       }
       while ((i < j) && (work[i] <= value)) {
           ++i;
       }
    ...
{noformat}

The comparisons here use >= and <= while it should be > and <.
                
> Percentile calculation is very slow when input data are constants
> -----------------------------------------------------------------
>
>                 Key: MATH-805
>                 URL: https://issues.apache.org/jira/browse/MATH-805
>             Project: Commons Math
>          Issue Type: Improvement
>    Affects Versions: 3.0
>            Reporter: Benoit de Rancourt
>            Assignee: Thomas Neidhart
>            Priority: Minor
>              Labels: performance, test
>
> I use the Percentile class to calculate quantile on a big array (10^6 entries). When
I have to test the performance of my code, I notice that the calculation of quantile is at
least 100x slower when my data are constants (10^6 of the same nomber). Maybe the Percentile
calculation can be improved for this special case.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message