lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API
Date Fri, 23 Oct 2009 16:58:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12769295#action_12769295 ]

Michael McCandless commented on LUCENE-1997:
--------------------------------------------


32 bit 1.5 JRE:

JAVA:
java version "1.5.0_19"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_19-b02)
Java HotSpot(TM) Server VM (build 1.5.0_19-b02, mixed mode)


OS:
SunOS rhumba 5.11 snv_111b i86pc i386 i86pc Solaris

||Source||Seg size||Query||Tot hits||Sort||Top N||QPS old||QPS new||Pct change||
|wiki|log|1|318481|title|10|97.31|92.69|{color:red}-4.7%{color}|
|wiki|log|1|318481|title|25|96.74|92.09|{color:red}-4.8%{color}|
|wiki|log|1|318481|title|50|98.57|90.03|{color:red}-8.7%{color}|
|wiki|log|1|318481|title|100|97.20|103.72|{color:green}6.7%{color}|
|wiki|log|1|318481|title|500|84.14|78.23|{color:red}-7.0%{color}|
|wiki|log|1|318481|title|1000|77.84|63.62|{color:red}-18.3%{color}|
|wiki|log|<all>|1000000|title|10|114.99|136.86|{color:green}19.0%{color}|
|wiki|log|<all>|1000000|title|25|114.63|125.92|{color:green}9.8%{color}|
|wiki|log|<all>|1000000|title|50|113.33|130.58|{color:green}15.2%{color}|
|wiki|log|<all>|1000000|title|100|115.36|111.81|{color:red}-3.1%{color}|
|wiki|log|<all>|1000000|title|500|107.30|86.16|{color:red}-19.7%{color}|
|wiki|log|<all>|1000000|title|1000|98.07|55.39|{color:red}-43.5%{color}|
|random|log|<all>|1000000|rand string|10|115.55|140.86|{color:green}21.9%{color}|
|random|log|<all>|1000000|rand string|25|125.66|137.15|{color:green}9.1%{color}|
|random|log|<all>|1000000|rand string|50|123.58|133.82|{color:green}8.3%{color}|
|random|log|<all>|1000000|rand string|100|115.51|134.82|{color:green}16.7%{color}|
|random|log|<all>|1000000|rand string|500|102.73|93.24|{color:red}-9.2%{color}|
|random|log|<all>|1000000|rand string|1000|88.70|65.09|{color:red}-26.6%{color}|
|random|log|<all>|1000000|country|10|113.92|139.72|{color:green}22.6%{color}|
|random|log|<all>|1000000|country|25|113.44|131.36|{color:green}15.8%{color}|
|random|log|<all>|1000000|country|50|122.88|128.62|{color:green}4.7%{color}|
|random|log|<all>|1000000|country|100|121.88|135.58|{color:green}11.2%{color}|
|random|log|<all>|1000000|country|500|96.94|79.38|{color:red}-18.1%{color}|
|random|log|<all>|1000000|country|1000|82.01|62.31|{color:red}-24.0%{color}|
|random|log|<all>|1000000|rand int|10|124.58|134.20|{color:green}7.7%{color}|
|random|log|<all>|1000000|rand int|25|123.46|134.82|{color:green}9.2%{color}|
|random|log|<all>|1000000|rand int|50|117.96|128.61|{color:green}9.0%{color}|
|random|log|<all>|1000000|rand int|100|113.92|122.09|{color:green}7.2%{color}|
|random|log|<all>|1000000|rand int|500|105.49|38.92|{color:red}-63.1%{color}|
|random|log|<all>|1000000|rand int|1000|92.27|53.14|{color:red}-42.4%{color}|
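
The "Pct change" column appears to be the relative difference between "QPS new" and "QPS old". A minimal sketch of that arithmetic, assuming this definition (the function name `pct_change` is hypothetical and not taken from sortBench.py), checked against the first and last rows of the table:

```python
def pct_change(qps_old: float, qps_new: float) -> float:
    """Relative change of qps_new versus qps_old, in percent (assumed definition)."""
    return 100.0 * (qps_new - qps_old) / qps_old


# First row of the table: wiki, query 1, sort by title, top 10.
print(f"{pct_change(97.31, 92.69):.1f}%")  # -4.7%

# Last row of the table: random, sort by rand int, top 1000.
print(f"{pct_change(92.27, 53.14):.1f}%")  # -42.4%
```

A negative value means the "new" run was slower than the "old" run for that combination.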


> Explore performance of multi-PQ vs single-PQ sorting API
> --------------------------------------------------------
>
>                 Key: LUCENE-1997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1997
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch
>
>
> Spinoff from recent "lucene 2.9 sorting algorithm" thread on java-dev,
> where a simpler (non-segment-based) comparator API is proposed that
> gathers results into multiple PQs (one per segment) and then merges
> them at the end.
> I started from John's multi-PQ code and worked it into
> contrib/benchmark so that we could run perf tests.  Then I generified
> the Python script I use for running search benchmarks (in
> contrib/benchmark/sortBench.py).
> The script first creates indexes with 1M docs (based on
> SortableSingleDocSource, and based on wikipedia, if available).  Then
> it runs various combinations:
>   * Index with 20 balanced segments vs index with the "normal" log
>     segment size
>   * Queries with different numbers of hits (only for wikipedia index)
>   * Different top N
>   * Different sorts (by title, for wikipedia, and by random string,
>     random int, and country for the random index)
> For each test, 7 search rounds are run and the best QPS is kept.  The
> script runs singlePQ then multiPQ, records the best QPS for each, and
> produces a table (in Jira format) as output.
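
The two strategies compared in the issue description can be illustrated with a small sketch. This is not Lucene's comparator API or the code in the attached patches: it assumes each segment is just a list of integer sort values, and the function names `top_n_single_pq` and `top_n_multi_pq` are hypothetical.

```python
import heapq
import itertools
import random


def top_n_single_pq(segments, n):
    """Keep the n smallest values in one queue shared across all segments."""
    heap = []  # max-heap emulated by storing negated values; at most n entries
    for seg in segments:
        for value in seg:
            if len(heap) < n:
                heapq.heappush(heap, -value)
            elif value < -heap[0]:
                # New value beats the current worst entry, so swap it in.
                heapq.heapreplace(heap, -value)
    return sorted(-v for v in heap)


def top_n_multi_pq(segments, n):
    """Keep the n smallest values per segment, then merge them at the end."""
    per_segment = [heapq.nsmallest(n, seg) for seg in segments]  # each is sorted
    return list(itertools.islice(heapq.merge(*per_segment), n))


# 20 equally sized segments, loosely mirroring the "balanced segments" setup.
rng = random.Random(0)
segments = [[rng.randrange(1_000_000) for _ in range(1000)] for _ in range(20)]
assert top_n_single_pq(segments, 10) == top_n_multi_pq(segments, 10)
```

Both functions return the same top N. The multi-queue variant holds up to N entries per segment and pays for a final merge, so its relative cost can be expected to grow with N and with the number of segments.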

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

