lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: search performance enhancement
Date Fri, 19 Aug 2005 18:17:16 GMT
On Friday 19 August 2005 18:09, John Wang wrote:
> Hi Paul:
> 
>       Thanks for the pointer.
> 
>        How would I extend from the patch you submitted to filter out
> more documents not using a Filter. e.g.
> 
>       have a class to skip documents based on a docID: boolean
> isValid(int docID)
> 
>       My problem is I want to discard documents at query time without
> having to construct a BitSet via filter. I have my own memory
> structure to help me skip documents based the query and a docid.

Basically, you need to implement the next() and skipTo(int targetDoc)
methods from Scorer on your memory structure. They are somewhat
redundant (skipTo(doc() + 1) has the same semantics as next(), except
initially), but that is for a mix of historical and performance reasons.
Have a look at how this is done in the posted FilteredQuery class
for SortedVIntList and BitSet.
With only isValid(docId) these next() and skipTo() methods would
have to count over the document numbers, which is less than ideal.

When you use the posted code, iirc it is only necessary to implement the
SkipFilter interface on your memory structure. One can use that interface
to build/cache  such memory structures using an IndexReader, and
from there the DocNrSkipper interface will do the rest (of the top of
my head).
One slight problem with the current Lucene implementation is that
java.lang.BitSet is not interface.

Regards,
Paul Elschot.

> Thanks
> 
> -John
> 
> On 8/16/05, Paul Elschot <paul.elschot@xs4all.nl> wrote:
> > Hi John,
> > 
> > On Wednesday 17 August 2005 04:46, John Wang wrote:
> > > Hi:
> > >
> > >    I posted a bug (36147) a few days ago and didn't hear anything, so
> > > I thought I'd try my luck on this list.
> > >
> > >    The idea is to avoid score calculations on documents to be filtered
> > > out anyway. (e.g. via Filter object passed to the searcher class)
> > >
> > >    This seems to be an easy change.
> > 
> > Have a look here:
> > http://issues.apache.org/bugzilla/show_bug.cgi?id=32965
> > 
> > >    Also it would be nice to expose a method to return a score given a
> > > docid, e.g.
> > >
> > >    float getScore(int docid)
> > >
> > >    on the Scorer class.
> > 
> > skipTo(int docid) and score() will do that.
> > 
> > >    I am gonna make the change locally and do some performance analysis
> > > on it and will post some numbers later.
> > 
> > The default score computations are mostly table lookups, and pretty fast.
> > So, unless you have large indexes, this will probably not make
> > much difference, but any performance improvement is wellcome.
> > In larger indexes, it helps to use skipTo() while searching.
> > 
> > Regards,
> > Paul Elschot
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message