lucene-java-user mailing list archives

From Todd VanderVeen <...@part.net>
Subject Re: Query Tuning
Date Mon, 21 Feb 2005 19:43:25 GMT
Runde, Kevin wrote:

>Hi All,
>
>How does Lucene handle multi term queries? Does it use short circuiting?
>So if a user entered:
>(a OR b) AND c
>But my program knew testing for "c" is cheaper than testing for "(a OR
>b)" and I rewrote the query as:
>c AND (a OR b)
>Would the query run faster?
>
>Sorry if this has already been answered, but for some reason the Archive
>search is not working for me today.
>
>Thanks,
>Kevin
>
Not sure about what is in CVS, but look at BooleanQuery.scorer(). If all
of the clauses of the BooleanQuery are required and none of the clauses
are themselves BooleanQueries, a ConjunctionScorer is returned that
offers the optimizations you seek. In the example you gave, there is a
clause that is boolean, (a OR b), which will have to be evaluated
independently with a BooleanScorer. This will be done regardless of the
ordering. (BooleanScorer doesn't preserve document order when it returns
results, and hence it can't use the optimal algorithm provided by
ConjunctionScorer.) Others have been down this path, as evidenced by the
sigh in the javadoc.
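
To make the point about document order concrete, here is a rough
sketch in plain Java of the kind of skip-ahead intersection a
conjunction can use when every clause returns doc ids in ascending
order. It is not Lucene's ConjunctionScorer code; the class name
ConjunctionSketch and the int[] postings are assumptions made up for
illustration.

```java
import java.util.Arrays;

// Hypothetical illustration only, not Lucene source.
class ConjunctionSketch {

    // Intersects doc-id lists that are each sorted in ascending order.
    // Because every list is ordered, a mismatch lets us skip ahead
    // instead of testing every document against every clause.
    static int[] intersect(int[]... postings) {
        if (postings.length == 0) {
            return new int[0];
        }
        int[] pos = new int[postings.length];   // cursor into each list
        int[] out = new int[postings[0].length];
        int n = 0;

        outer:
        while (pos[0] < postings[0].length) {
            int candidate = postings[0][pos[0]];
            for (int i = 1; i < postings.length; i++) {
                // Advance list i to the first doc id >= candidate.
                while (pos[i] < postings[i].length
                        && postings[i][pos[i]] < candidate) {
                    pos[i]++;
                }
                if (pos[i] == postings[i].length) {
                    break outer;                // list i is exhausted
                }
                if (postings[i][pos[i]] > candidate) {
                    // candidate is missing from list i: move list 0
                    // forward to list i's current doc and start over.
                    int target = postings[i][pos[i]];
                    while (pos[0] < postings[0].length
                            && postings[0][pos[0]] < target) {
                        pos[0]++;
                    }
                    continue outer;
                }
            }
            out[n++] = candidate;               // present in every list
            pos[0]++;
        }
        return Arrays.copyOf(out, n);
    }
}
```

If one of the inputs arrived out of order, the skip-ahead steps above
would silently miss matches, which is why a clause that does not
preserve document order cannot take part in this kind of evaluation.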

If calculating (a OR b) is expensive and the docFreq of a is much less
than the union of a and b, you might consider rewriting it to
(a AND c) OR (b AND c) using the distributive law. Expansion like this
isn't always beneficial and can't be applied blindly. As far as I can
tell, there is no query planning/optimization aside from the merging of
related clauses and attempts to rewrite to simpler queries.

Cheers,
Todd VanderVeen
