lucene-java-user mailing list archives

From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Pre-filtering for expensive query
Date Sat, 30 Aug 2008 20:01:16 GMT
On Saturday 30 August 2008 18:19:09, Matt Ronge wrote:
> On Aug 30, 2008, at 4:43 AM, Karl Wettin wrote:
> > Can you tell us a bit more about what your custom query does?
> > Perhaps you can build the "candidate filter" and reuse it over and
> > over again?
>
> I cannot reuse it. The candidate filter would be constructed by first
> running a boolean query with a number of SHOULD clauses. So then I
> know what docs at least contain the terms I'm looking for. Once I have
> this set, I will look at the ordering of the matches (it's a bit more
> sophisticated than just a phrase query) and find the final matches.

Sounds like you may want to take a look at SpanNearQuery.
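
To illustrate the kind of ordered proximity match meant here, below is a
minimal sketch in plain Java. It is not Lucene code: the class and method
names are hypothetical, and the slop definition (number of non-matching
positions inside the matched span) is a simplifying assumption, not
necessarily the exact rule SpanNearQuery applies.

```java
/** Illustration only: hypothetical names, not Lucene classes. */
final class OrderedNearSketch {

    /**
     * positions[t] holds the sorted token positions of the t-th query term
     * within one document. Returns true if the terms occur in query order
     * with at most `slop` other positions inside the matched span
     * (assumed, simplified definition of slop).
     */
    static boolean matchesInOrder(int[][] positions, int slop) {
        for (int start : positions[0]) {
            int prev = start;
            boolean complete = true;
            // Greedily take the earliest occurrence of each following term;
            // this gives the shortest span for the chosen start position.
            for (int t = 1; t < positions.length; t++) {
                int next = firstGreaterThan(positions[t], prev);
                if (next < 0) { complete = false; break; }
                prev = next;
            }
            if (complete) {
                int gaps = (prev - start + 1) - positions.length;
                if (gaps <= slop) return true;
            }
        }
        return false;
    }

    /** Returns the first value in `sorted` that is > value, or -1 if none. */
    static int firstGreaterThan(int[] sorted, int value) {
        for (int p : sorted) {
            if (p > value) return p;
        }
        return -1;
    }
}
```

For a document "quick brown fox", the terms "quick" (position 0) and "fox"
(position 2) match in that order with slop 1 but not with slop 0, and they
never match in the order "fox", "quick".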

> Since my boolean clauses are different for each query I can't reuse
> the filter.

With (a variation of) SpanNearQuery you may end up not needing
any filtering at all, because it already uses skipTo() where possible.
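
The reason no separate filter is needed can be sketched in plain Java.
This is an illustration of the general skip-ahead ("leapfrog") idea
behind skipTo(), not Lucene's implementation; LeapfrogSketch,
PostingCursor, advance() and the expensiveCheck parameter are
hypothetical names. The expensive check only runs on documents that
contain every term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Illustration only: hypothetical names, not Lucene classes. */
final class LeapfrogSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Cursor over one term's doc ids (sorted ascending, no duplicates). */
    static final class PostingCursor {
        private final int[] docs;
        private int idx = -1; // -1 means "not positioned yet"

        PostingCursor(int... docs) { this.docs = docs; }

        int doc() {
            if (idx < 0) return -1;
            return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        /** Moves forward to the first doc id >= target, like a skipTo() call. */
        int advance(int target) {
            int from = idx + 1;
            if (from >= docs.length) { idx = docs.length; return NO_MORE_DOCS; }
            int pos = Arrays.binarySearch(docs, from, docs.length, target);
            idx = pos >= 0 ? pos : -pos - 1; // insertion point if target is absent
            return doc();
        }
    }

    /** Runs expensiveCheck only on docs present in every cursor (needs >= 1 cursor). */
    static List<Integer> search(IntPredicate expensiveCheck, PostingCursor... cursors) {
        List<Integer> hits = new ArrayList<>();
        int candidate = cursors[0].advance(0);
        while (candidate != NO_MORE_DOCS) {
            boolean allAgree = true;
            for (PostingCursor c : cursors) {
                int d = c.doc() < candidate ? c.advance(candidate) : c.doc();
                if (d != candidate) { // this cursor is further ahead: jump there
                    candidate = d;
                    allAgree = false;
                    break;
                }
            }
            if (allAgree) {
                if (expensiveCheck.test(candidate)) hits.add(candidate);
                candidate = cursors[0].advance(candidate + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Integer> hits = search(doc -> true,
                new PostingCursor(1, 4, 7, 9, 20),
                new PostingCursor(4, 9, 15, 20),
                new PostingCursor(2, 4, 20, 33));
        System.out.println(hits); // prints [4, 20]
    }
}
```

In this example the check is invoked for only two documents (4 and 20),
although the three lists together mention nine distinct doc ids.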

In case you are looking for documents that contain partial phrases
from an input query that has more than 2 words, have a look at Nutch.

Regards,
Paul Elschot


>
>
> --
> Matt
>
> >> Hi all,
> >>
> >> I am working on implementing a new Query, Weight and Scorer that
> >> is expensive to run. I'd like to limit the number of documents I
> >> run this query on by first building a candidate set of documents
> >> with a boolean query. Once I have that candidate set, I was hoping
> >> I could build a filter off of it, and issue that along with my
> >> expensive query. However, after reading the code I see that
> >> filtering is done during the search, and not beforehand. So my
> >> initial boolean query won't help in limiting the number of
> >> documents scored by my expensive query.
> >>
> >> Has anyone done any work on restricting the set of docs that a
> >> query operates on?
> >> Or should I just implement something myself in a custom scorer?
> >>
> >> Thanks in advance,
> >> --
> >> Matt Ronge
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
