lucene-java-user mailing list archives

From Peter Keegan <peterlkee...@gmail.com>
Subject Re: Lucene 2.9 RC2 now available for testing
Date Wed, 09 Sep 2009 14:18:11 GMT
>Is it possible that skipTo is very costly with your custom scorer?

It's no more expensive than 'next'. The scorer's 'skipTo' and 'next' methods
call termdocs.skipTo or termdocs.next to get the next 'candidate' doc. This
just checks a BitVector to find the next non-deleted doc. But the scorer
must visit some metadata (RAM resident) for each candidate doc, which is the
most expensive part. So, reducing the number of calls to the scorer's 'next'
and 'skipTo' methods is always a win. Since the custom scorer is the
bottleneck with both versions of Lucene (according to JProfiler), the
improvements here are going to weigh much more.
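
A minimal sketch of the cost model described above, for illustration only. It
does not use the Lucene API: the class name, the int[] standing in for
TermDocs, the java.util.BitSet standing in for the deleted-docs BitVector, and
the boolean[] metadata table are all assumptions. It only shows why fewer
'next'/'skipTo' calls mean fewer expensive metadata visits.

```java
import java.util.BitSet;

/** Hypothetical model of a scorer whose cost is the per-candidate metadata check. */
class CandidateScorerSketch {
    private final int[] postings;        // sorted doc ids for one term (stands in for TermDocs)
    private final BitSet deleted;        // stands in for the deleted-docs BitVector
    private final boolean[] metadataOk;  // RAM-resident per-doc metadata (hypothetical)
    private int pos = -1;
    private int doc = -1;
    int metadataVisits = 0;              // counts the expensive step

    CandidateScorerSketch(int[] postings, BitSet deleted, boolean[] metadataOk) {
        this.postings = postings;
        this.deleted = deleted;
        this.metadataOk = metadataOk;
    }

    /** Moves to the next matching doc; returns false when exhausted. */
    boolean next() {
        while (++pos < postings.length) {
            int candidate = postings[pos];
            if (deleted.get(candidate)) {
                continue;                // cheap: one bit test
            }
            metadataVisits++;            // expensive: metadata lookup per candidate
            if (metadataOk[candidate]) {
                doc = candidate;
                return true;
            }
        }
        doc = Integer.MAX_VALUE;
        return false;
    }

    /** Moves to the first matching doc at or after target. */
    boolean skipTo(int target) {
        // Candidates below target are skipped without touching metadata.
        while (pos + 1 < postings.length && postings[pos + 1] < target) {
            pos++;
        }
        return next();
    }

    int doc() {
        return doc;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(3);
        boolean[] ok = new boolean[10];
        ok[5] = true;
        ok[9] = true;
        int[] postings = {1, 3, 5, 7, 9};

        CandidateScorerSketch byNext = new CandidateScorerSketch(postings, deleted, ok);
        while (byNext.next()) {
            System.out.println("next   -> doc " + byNext.doc());
        }
        System.out.println("metadata visits via next:   " + byNext.metadataVisits);

        CandidateScorerSketch bySkip = new CandidateScorerSketch(postings, deleted, ok);
        if (bySkip.skipTo(8)) {
            System.out.println("skipTo -> doc " + bySkip.doc());
        }
        System.out.println("metadata visits via skipTo: " + bySkip.metadataVisits);
    }
}
```

In this toy model the deleted-doc test is a single bit lookup, while every
surviving candidate costs one metadata visit, so the visit counter tracks the
number of candidates the caller forces the scorer to examine.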

Peter


On Wed, Sep 9, 2009 at 9:57 AM, Michael McCandless <
lucene@mikemccandless.com> wrote:

> Right, BooleanQuery will now try to use BooleanScorer (does "out of
> order" collection, which does not use skipTo/advance at all, I think)
> when possible, instead of BooleanScorer2.
>
> This only applies for boolean queries that have only SHOULD clauses,
> and up to 32 MUST_NOT clauses (if there's even 1 MUST clause,
> BooleanScorer2 will be used).
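
The rule quoted above can be sketched as a small predicate. This is an
illustration of the rule as stated in the message, not Lucene's actual code:
the class, enum, and method names are hypothetical, and any further conditions
behind "when possible" (such as what the collector accepts) are not modeled.

```java
import java.util.List;

/** Hypothetical restatement of the quoted rule; not the Lucene implementation. */
class ScorerChoiceSketch {
    enum Occur { MUST, SHOULD, MUST_NOT }

    // Limit on MUST_NOT clauses as quoted in the message.
    static final int MAX_PROHIBITED = 32;

    /** Returns true if the clause mix fits the quoted rule for BooleanScorer. */
    static boolean mayUseBooleanScorer(List<Occur> clauses) {
        int prohibited = 0;
        for (Occur occur : clauses) {
            if (occur == Occur.MUST) {
                return false;            // even one MUST clause means BooleanScorer2
            }
            if (occur == Occur.MUST_NOT) {
                prohibited++;
            }
        }
        return prohibited <= MAX_PROHIBITED;
    }
}
```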
>
> But: it's interesting that you see such gains (2x-10x) from this.  In
> simple boolean queries (where the clauses were TermQuery) I saw maybe
> a ~30% speedup, I think.
>
> Is it possible that skipTo is very costly with your custom scorer?
> That could explain the gains.
>
> Mike
>
> On Wed, Sep 9, 2009 at 9:44 AM, Mark Miller<markrmiller@gmail.com> wrote:
> > How about the new score in-order/out-of-order stuff? It was an option
> > before, but I think now it uses what's best by default? And pairs with
> > the collector? I didn't follow any of that closely though.
> >
> > - Mark
> >
> > Peter Keegan wrote:
> >> IndexSearcher.search is calling my custom scorer's 'next' and 'doc'
> >> methods 64% fewer times. I see no 'advance' method in any of the hot
> >> spots. I am getting the same number of hits from the custom scorer.
> >> Has the BooleanScorer2 logic changed?
> >>
> >> Peter
> >>
> >> On Wed, Sep 9, 2009 at 9:17 AM, Yonik Seeley <
> >> yonik.seeley@lucidimagination.com> wrote:
> >>
> >>
> >>> On Wed, Sep 9, 2009 at 8:57 AM, Peter Keegan<peterlkeegan@gmail.com>
> >>> wrote:
> >>>
> >>>> Using JProfiler, I observe that the improvement
> >>>> is due to a huge reduction in the number of calls to TermDocs.next and
> >>>> TermDocs.skipTo (about 65% fewer calls).
> >>>>
> >>> Indexes are searched per-segment now (i.e. MultiTermDocs isn't normally
> >>> used).
> >>> Off the top of my head, I'm not sure how this can lead to fewer
> >>> TermDocs.skipTo() calls though.  Are you sure you weren't also
> >>> counting Scorer.skipTo()... which would now be Scorer.advance()?
> >>> Have you verified that your custom scorer is working correctly with
> >>> 2.9 and that you're getting the same number of hits on the overall
> >>> query as you were with previous versions?
> >>>
> >>> -Yonik
> >>> http://www.lucidimagination.com
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>
> >>>
> >>>
> >>
> >>
> >
> >
> > --
> > - Mark
> >
> > http://www.lucidimagination.com