commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Adam Stelmaszczyk (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MATH-1169) Faster Percentile
Date Thu, 13 Nov 2014 23:00:33 GMT

     [ https://issues.apache.org/jira/browse/MATH-1169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Adam Stelmaszczyk updated MATH-1169:
------------------------------------
    Description: 
Percentile class is now using Hoare selection algorithm (aka Quickselect) with median of 3
for choosing a pivot and caching pivots. Quickselect expected runtime is about 3.4N + O(N).

The constant can be improved to 1.5n by different pivot strategy involving sampling, yielding
the Floyd–Rivest algorithm, which has average complexity of 1.5N + O(N) for median, with
other k being faster.

I implemented Floyd–Rivest algorithm without caching pivots following wikipedia and benchmarked
it with Caliper (https://code.google.com/p/caliper/).

Array size - runtime in seconds for Floyd-Rivest - runtime for current Percentile - % faster
100000 - 6.907 - 7.083 - 2.5% [link|https://microbenchmarks.appspot.com/runs/a0d947ee-57fc-4636-b687-b4bc5170f1d7#r:scenario.benchmarkSpec.methodName]
1000000 - 70.3 - 75.4 - 6.8% [link|https://microbenchmarks.appspot.com/runs/77291c2e-6dbb-4666-bf37-c174e4b53f2e#r:scenario.benchmarkSpec.methodName]
10000000 - 708 - 868 - 18.4% [link|https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName]

  was:
Percentile class is now using Hoare selection algorithm (aka Quickselect) with median of 3
for choosing a pivot and caching pivots. Quickselect expected runtime is about 3.4N + O(N).

The constant can be improved to 1.5n by a bit complicated pivot strategy, yielding the Floyd–Rivest
algorithm, which has average complexity of 1.5N + O(N) for median, with other k being faster.

I implemented Floyd–Rivest algorithm without caching pivots following wikipedia and benchmarked
it with Caliper (https://code.google.com/p/caliper/).

Array size - runtime in seconds for Floyd-Rivest - runtime for current Percentile - % faster
100000 - 6.907 - 7.083 - 2.5% [link|https://microbenchmarks.appspot.com/runs/a0d947ee-57fc-4636-b687-b4bc5170f1d7#r:scenario.benchmarkSpec.methodName]
1000000 - 70.3 - 75.4 - 6.8% [link|https://microbenchmarks.appspot.com/runs/77291c2e-6dbb-4666-bf37-c174e4b53f2e#r:scenario.benchmarkSpec.methodName]
10000000 - 708 - 868 - 18.4% [link|https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName]


> Faster Percentile
> -----------------
>
>                 Key: MATH-1169
>                 URL: https://issues.apache.org/jira/browse/MATH-1169
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Adam Stelmaszczyk
>            Priority: Minor
>         Attachments: FloydRivest.java, Main.java, PercentileFloydRivest.java
>
>
> Percentile class is now using Hoare selection algorithm (aka Quickselect) with median
of 3 for choosing a pivot and caching pivots. Quickselect expected runtime is about 3.4N +
O(N).
> The constant can be improved to 1.5n by different pivot strategy involving sampling,
yielding the Floyd–Rivest algorithm, which has average complexity of 1.5N + O(N) for median,
with other k being faster.
> I implemented Floyd–Rivest algorithm without caching pivots following wikipedia and
benchmarked it with Caliper (https://code.google.com/p/caliper/).
> Array size - runtime in seconds for Floyd-Rivest - runtime for current Percentile - %
faster
> 100000 - 6.907 - 7.083 - 2.5% [link|https://microbenchmarks.appspot.com/runs/a0d947ee-57fc-4636-b687-b4bc5170f1d7#r:scenario.benchmarkSpec.methodName]
> 1000000 - 70.3 - 75.4 - 6.8% [link|https://microbenchmarks.appspot.com/runs/77291c2e-6dbb-4666-bf37-c174e4b53f2e#r:scenario.benchmarkSpec.methodName]
> 10000000 - 708 - 868 - 18.4% [link|https://microbenchmarks.appspot.com/runs/c0f65e35-44c0-458e-b6e0-634b4e6fa68b#r:scenario.benchmarkSpec.methodName]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message