lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jake Mannix <jake.man...@gmail.com>
Subject Re: lucene 2.9 sorting algorithm
Date Thu, 15 Oct 2009 17:06:06 GMT
On Thu, Oct 15, 2009 at 9:12 AM, Yonik Seeley <yonik@lucidimagination.com>wrote:

> On Thu, Oct 15, 2009 at 11:53 AM, Yonik Seeley
> <yonik@lucidimagination.com> wrote:
> > And it seems like a PQ per segment simply delays many of the slow
> > lookups until the end where the PQs must be merged.
>
> Actually, I'm wrong about that part - one can simply merge on
> values... there will be lots of string comparisons (and a new merge
> queue) but no binary searches.
>

The number of string comparisons in the final merge of queues is
O(K log(M) ) if you do as you're describing (have basically a PQ of sorted
lists, the same way BooleanScorer works), where K is going to be roughly
what, 10, 100 (numResultsToReturn), and M is 10-50?  This is like a couple
thousand string compares, tops, and independent of the overall size of the
segments - it's measured in microseconds of CPU time.

  -jake

Mime
View raw message