lucene-dev mailing list archives

From "Mark Miller (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (LUCENE-1997) Explore performance of multi-PQ vs single-PQ sorting API
Date Tue, 03 Nov 2009 00:48:59 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12772792#action_12772792 ]

Mark Miller edited comment on LUCENE-1997 at 11/3/09 12:48 AM:
---------------------------------------------------------------

bq. 100th page at the same time index is at 100 segments? How many very's would you give it?

I'm not claiming 100th page with many segments - I have no info on that, and I agree it would
be more rare. But it has come to my attention that 100th page is more common than I would
have thought. (sorry - I wasn't very clear on that in my last comment - I am just referring
to the deep paging - I previously would have thought it's more rare than I do now - though
even before, it's something I wouldn't want to see a huge perf drop on)

In any case - no one is saying this change won't happen. Just that it's not likely to happen
soon.

*edit*

Let me answer the question though - based on my experience with the mergefactors people like
to use, and the cost of optimizing, I would say 100 segments deserves no very. At best, it
might be semi rare. Mixed with the 100 page req, I'd take it to rare. But that's just me guessing
based on my Lucene/Solr experience - so it's not worth a whole ton.

  
> Explore performance of multi-PQ vs single-PQ sorting API
> --------------------------------------------------------
>
>                 Key: LUCENE-1997
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1997
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>         Attachments: LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch,
>                      LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch, LUCENE-1997.patch,
>                      LUCENE-1997.patch
>
>
> Spinoff from recent "lucene 2.9 sorting algorithm" thread on java-dev,
> where a simpler (non-segment-based) comparator API is proposed that
> gathers results into multiple PQs (one per segment) and then merges
> them in the end.
> I started from John's multi-PQ code and worked it into
> contrib/benchmark so that we could run perf tests.  Then I generified
> the Python script I use for running search benchmarks (in
> contrib/benchmark/sortBench.py).
> The script first creates indexes with 1M docs (based on
> SortableSingleDocSource, and based on wikipedia, if available).  Then
> it runs various combinations:
>   * Index with 20 balanced segments vs index with the "normal" log
>     segment size
>   * Queries with different numbers of hits (only for wikipedia index)
>   * Different top N
>   * Different sorts (by title, for wikipedia, and by random string,
>     random int, and country for the random index)
> For each test, 7 search rounds are run and the best QPS is kept.  The
> script runs singlePQ then multiPQ, and records the resulting best QPS
> for each and produces a table (in Jira format) as output.
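
For illustration only (this is not the code in LUCENE-1997.patch), the
two approaches described above can be sketched in a few lines of Python.
The function names, the plain integers used as sort keys, and the
largest-first ordering are all assumptions made for this sketch; Lucene's
actual collectors and comparators differ in detail.

```python
import heapq
import itertools


def top_n_single_pq(segments, n):
    """Hypothetical single-PQ collector: one bounded heap shared by all segments."""
    if n <= 0:
        return []
    heap = []  # min-heap holding the n largest keys seen so far
    for segment in segments:
        for key in segment:
            if len(heap) < n:
                heapq.heappush(heap, key)
            elif key > heap[0]:
                # key beats the current weakest entry, so swap it in
                heapq.heapreplace(heap, key)
    return sorted(heap, reverse=True)


def top_n_multi_pq(segments, n):
    """Hypothetical multi-PQ collector: one top-n list per segment, merged at the end."""
    # heapq.nlargest returns each segment's top n, sorted largest first
    per_segment = [heapq.nlargest(n, segment) for segment in segments]
    # merge the already-sorted lists lazily and stop after n results
    merged = heapq.merge(*per_segment, reverse=True)
    return list(itertools.islice(merged, n))


if __name__ == "__main__":
    segments = [[5, 1, 9], [7, 3], [8, 2, 6]]
    print(top_n_single_pq(segments, 4))  # [9, 8, 7, 6]
    print(top_n_multi_pq(segments, 4))   # [9, 8, 7, 6]
```

Both functions return the same top n. What matters for the deep-paging
discussion is queue size: assuming 10 hits per page, the 100th page needs
n = 1000, so with 100 segments the multi-PQ variant in this sketch can hold
up to 100 * 1000 = 100,000 entries before the merge, versus 1000 for the
single queue.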

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

