[ https://issues.apache.org/jira/browse/MATH1169?page=com.atlassian.jira.plugin.system.issuetabpanels:commenttabpanel&focusedCommentId=14211419#comment14211419
]
Adam Stelmaszczyk edited comment on MATH1169 at 11/13/14 11:05 PM:

{{Main.java}} should be run, it needs Caliper JAR: https://code.google.com/p/caliper/downloads/detail?name=caliper1.0beta1all.jar
Every call to evaluate() I'm creating a new double[] array filled with different random numbers.
{{FloydRivest.java}} contatins implementation of algorithm basing on pseudocode from its Wikipedia
page.
{{PercentileFloydRivest.java}} is Percentile.java with modified evaluate(), so it calls FloydRivest.select()
instead of typical select(). pivotsHeap was removed for simplicity (maybe it can improve efficiency
even more).
was (Author: adams):
Main class should be run, it needs Caliper JAR: https://code.google.com/p/caliper/downloads/detail?name=caliper1.0beta1all.jar
Every call to evaluate() I'm creating a new double[] array filled with different random numbers.
FloydRivest contatins implementation of algorithm basing on pseudocode from its Wikipedia
page.
PercentileFloydRivest.java is Percentile.java with modified evaluate(), so it calls FloydRivest.select()
instead of typical select(). pivotsHeap was removed for simplicity (maybe it can improve efficiency
even more).
> Faster Percentile
> 
>
> Key: MATH1169
> URL: https://issues.apache.org/jira/browse/MATH1169
> Project: Commons Math
> Issue Type: Improvement
> Reporter: Adam Stelmaszczyk
> Priority: Minor
> Attachments: FloydRivest.java, Main.java, PercentileFloydRivest.java
>
>
> Percentile class is now using Hoare selection algorithm (aka [Quickselecthttp://en.wikipedia.org/wiki/Quickselect])
with median of 3 for choosing a pivot and caching pivots. Quickselect expected runtime is
about 3.4N + O(N).
> The constant can be improved to 1.5n by different pivot strategy involving sampling,
yielding the [Floydâ€“Rivest algorithmhttp://en.wikipedia.org/wiki/Floyd%E2%80%93Rivest_algorithm],
which has average complexity of 1.5N + O(N) for median, with other k being faster.
> I implemented Floydâ€“Rivest algorithm without caching pivots following Wikipedia and
benchmarked it with [Caliperhttps://code.google.com/p/caliper/].
> Array size  runtime in seconds for FloydRivest  runtime for current Percentile  %
faster
> 100000  6.907  7.083  2.5% [linkhttps://microbenchmarks.appspot.com/runs/a0d947ee57fc4636b687b4bc5170f1d7#r:scenario.benchmarkSpec.methodName]
> 1000000  70.3  75.4  6.8% [linkhttps://microbenchmarks.appspot.com/runs/77291c2e6dbb4666bf37c174e4b53f2e#r:scenario.benchmarkSpec.methodName]
> 10000000  708  868  18.4% [linkhttps://microbenchmarks.appspot.com/runs/c0f65e3544c0458eb6e0634b4e6fa68b#r:scenario.benchmarkSpec.methodName]

This message was sent by Atlassian JIRA
(v6.3.4#6332)
