lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API
Date Fri, 23 Oct 2009 06:03:59 GMT


Uwe Schindler commented on LUCENE-1997:

bq. That's my impression too - Java 1.6 is mainly just a bug-fix and performance release and
has been out for a while, so it's usually the choice I've seen. Sounds like Uwe thinks it's
more buggy though, so who knows if that's a good idea

Because of this, for Lucene 3.0 we should say it's a Java 1.5 compatible release. As Mark
said, Java 6 does not contain anything really new that is usable for Lucene, so we are fine with
staying on 1.5. If somebody wants to use 1.5 or 1.6, it's their choice, but we should not force
people to use 1.6. As long as at least one developer develops on 1.5, we are protected against
accidentally using methods added to core classes in 1.6 (like String.isEmpty() - a common
problem, because it was added in 1.6 and many developers reach for it intuitively).
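To illustrate the pitfall: code using String.isEmpty() compiles fine against a Java 6 JDK but breaks the 1.5 compatibility goal. A minimal sketch of the 1.5-safe idiom (the helper name here is hypothetical, not part of Lucene):

```java
public class EmptyCheck {
    // Hypothetical helper: the Java 1.5-compatible equivalent of
    // String.isEmpty(), which only exists from Java 6 onward.
    static boolean isEmpty(String s) {
        return s.length() == 0; // compiles on 1.5; s.isEmpty() would not
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(""));    // true
        System.out.println(isEmpty("abc")); // false
    }
}
```

Compiling with `javac -source 1.5` (or developing on a 1.5 JDK, as suggested above) catches such accidental uses at build time.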

Even though 1.5 is EOLed by Sun, they recently added a new release, 1.5.0_21. I was also wondering
about that, but it seems that Sun is still providing "support" for it.

About the stability: maybe it is better now, but I have seen so many crashed JVMs in the earlier
versions <= _12, so I stayed on 1.5. But we are also thinking of switching here at some point.

> Explore performance of multi-PQ vs single-PQ sorting API
> --------------------------------------------------------
>                 Key: LUCENE-1997
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-1997.patch, LUCENE-1997.patch
> Spinoff from recent "lucene 2.9 sorting algorithm" thread on java-dev,
> where a simpler (non-segment-based) comparator API is proposed that
> gathers results into multiple PQs (one per segment) and then merges
> them in the end.
> I started from John's multi-PQ code and worked it into
> contrib/benchmark so that we could run perf tests.  Then I generified
> the Python script I use for running search benchmarks (in
> contrib/benchmark/
> The script first creates indexes with 1M docs (based on
> SortableSingleDocSource, and based on wikipedia, if available).  Then
> it runs various combinations:
>   * Index with 20 balanced segments vs index with the "normal" log
>     segment size
>   * Queries with different numbers of hits (only for wikipedia index)
>   * Different top N
>   * Different sorts (by title, for wikipedia, and by random string,
>     random int, and country for the random index)
> For each test, 7 search rounds are run and the best QPS is kept.  The
> script runs singlePQ then multiPQ, records the resulting best QPS
> for each, and produces a table (in Jira format) as output.
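The multi-PQ idea described in the issue - one priority queue per segment, merged at the end - can be sketched roughly as follows. This is a simplified illustration, not the actual LUCENE-1997 patch; the Hit class and merge helper are hypothetical stand-ins for Lucene's ScoreDoc and collector machinery:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of the multi-PQ approach: hits are collected into one queue
// per segment, then the per-segment queues are merged into a single
// top-N result at the end of the search.
public class MultiPQSketch {
    static class Hit implements Comparable<Hit> {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
        // Ascending by score, so a bounded per-segment queue keeps
        // its weakest hit on top (the usual PQ-of-top-N pattern).
        public int compareTo(Hit o) { return Float.compare(score, o.score); }
    }

    // Merge the per-segment queues into a single list of the topN
    // best hits overall, ordered best-score first.
    static List<Hit> merge(List<PriorityQueue<Hit>> perSegment, int topN) {
        PriorityQueue<Hit> merged = new PriorityQueue<Hit>(
            Math.max(1, topN),
            new Comparator<Hit>() {
                public int compare(Hit a, Hit b) {
                    return Float.compare(b.score, a.score); // descending
                }
            });
        for (PriorityQueue<Hit> pq : perSegment) {
            merged.addAll(pq); // drain each segment's queue into the merge PQ
        }
        List<Hit> out = new ArrayList<Hit>();
        for (int i = 0; i < topN && !merged.isEmpty(); i++) {
            out.add(merged.poll());
        }
        return out;
    }
}
```

The appeal of this layout, per the thread, is that each segment's comparator never has to be "switched" across segments; the cost is the extra merge step and the memory for N queues, which is what the benchmark above is measuring.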

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

