lucene-java-user mailing list archives

From Matt Ronge <mro...@theronge.com>
Subject Re: Pre-filtering for expensive query
Date Sat, 30 Aug 2008 16:22:50 GMT

On Aug 30, 2008, at 6:13 AM, Paul Elschot wrote:

> On Saturday 30 August 2008 03:34:01, Matt Ronge wrote:
>> Hi all,
>>
>> I am working on implementing a new Query, Weight and Scorer that is
>> expensive to run. I'd like to limit the number of documents I run
>> this query on by first building a candidate set of documents with a
>> boolean query. Once I have that candidate set, I was hoping I could
>> build a filter off of it, and issue that along with my expensive
>> query. However, after reading the code I see that filtering is done
>> during the search, and not beforehand.
>
> Correct. I suppose you mean the filtering code in IndexSearcher?

Yes, that's exactly what I mean.

>
>> So my initial boolean query
>> won't help in limiting the number of documents scored by my expensive
>> query.
>
> The trick of filtering is the use of skipTo() on both the filter and
> the scorer to skip superfluous work as much as possible.
> So when you make your scorer implement skipTo() efficiently,
> filtering it should reduce the amount of scoring done.
>
> Implementing skipTo() efficiently is normally done by using
> TermScorer.skipTo() on the leaves of a scorer structure. So,
> in case you implement your own TermScorer, take a serious
> look at TermScorer.skipTo().
>
> Normally, score value computations are not the bottleneck,
> but accessing the index is, and this is where skipTo() does
> the real work. At the moment avoiding score value computations
> is a nice extra.
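
To make the skipTo() idea above concrete, here is a minimal, self-contained
sketch of how a filter and a scorer can leapfrog over each other. It does not
use the Lucene API: DocIterator, ArrayIterator, NO_MORE_DOCS and intersect()
are hypothetical names made up for illustration, and their signatures need not
match Lucene's. Both sides are assumed to yield doc ids in increasing order.

```java
import java.util.ArrayList;
import java.util.List;

final class LeapfrogSketch {
    // Sentinel meaning "iterator exhausted" (an assumption of this sketch).
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical iterator over strictly increasing doc ids. */
    interface DocIterator {
        /** Moves to the first doc id >= target and returns it, or NO_MORE_DOCS. */
        int skipTo(int target);
    }

    /** Stand-in for a filter or scorer, backed by a sorted array of doc ids. */
    static final class ArrayIterator implements DocIterator {
        private final int[] docs;
        private int pos = 0;

        ArrayIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        @Override
        public int skipTo(int target) {
            // Binary search in docs[pos..] so skipped entries are never visited.
            int lo = pos;
            int hi = docs.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (docs[mid] < target) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            pos = lo;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }
    }

    /** Returns the doc ids accepted by both the filter and the scorer. */
    static List<Integer> intersect(DocIterator filter, DocIterator scorer) {
        List<Integer> hits = new ArrayList<>();
        int f = filter.skipTo(0);
        int s = scorer.skipTo(0);
        while (f != NO_MORE_DOCS && s != NO_MORE_DOCS) {
            if (f < s) {
                f = filter.skipTo(s);      // filter is behind: jump it forward
            } else if (s < f) {
                s = scorer.skipTo(f);      // scorer is behind: jump it forward
            } else {
                hits.add(s);               // both agree: only here would scoring happen
                f = filter.skipTo(s + 1);  // then move the filter past the match
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        DocIterator filter = new ArrayIterator(new int[] {3, 10, 50});
        DocIterator scorer = new ArrayIterator(new int[] {1, 2, 3, 4, 5, 10, 11, 49, 51});
        System.out.println(intersect(filter, scorer)); // prints [3, 10]
    }
}
```

In this sketch the scorer is only ever positioned on docs the filter could
accept, so the work saved depends on how cheaply the scorer's skipTo() can
jump ahead.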

I was not aware of this. Where can I find the code that uses the
filter to determine what values to feed to skipTo() (I'm trying to get a
better understanding of the Lucene source)?

>
>
>> Or should I just implement something myself in a custom scorer?
>
> If you have a better way than skipTo(), or something that improves
> on this issue, which would allow a Filter as a clause of a BooleanQuery:
> https://issues.apache.org/jira/browse/LUCENE-1345
> let us know.


Thanks. If the skipTo() approach doesn't work, I'll take a look at this.

--
Matt
