lucene-java-user mailing list archives

From Simon Willnauer <>
Subject Re: Filter and query precedence, boolean query
Date Sun, 23 Oct 2011 20:08:12 GMT
hey josh,

On Sun, Oct 23, 2011 at 5:39 PM, Josh Devins <> wrote:
> Hi folks,
> I'm hoping someone can shed some light on how filters and boolean queries
> work under the hood. As I understand it, the following two queries are
> functionally equivalent:
> boolean must, term query: foo, boolean must, term query: bar
> term query: foo, term filter: bar

Their result sets are the same, but if you score you might get different scores.

> What I'd like to understand is:
> 1) How are boolean queries run by Lucene? Are both queries (term query: foo,
> term query: bar) run and then set operation intersection performed to find
> the final document set? Or is it a staged query where term query: foo runs
> first, then term query: bar run on the subset returned from the first query
> for foo?

Lucene does document-at-a-time retrieval, so both TermQueries are
evaluated at the same time. The BooleanScorer advances both
TermQueries until it finds a document containing both terms, and so on.
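
As a rough sketch of the document-at-a-time idea (plain Java, not Lucene's
actual BooleanScorer; the class name, method name and the sorted int arrays
standing in for postings lists are all made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
final class ConjunctionSketch {

    /** Returns the doc ids that occur in both sorted postings lists. */
    static List<Integer> intersect(int[] fooDocs, int[] barDocs) {
        List<Integer> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        // Both lists are walked together, one candidate document at a time.
        while (i < fooDocs.length && j < barDocs.length) {
            if (fooDocs[i] == barDocs[j]) {
                hits.add(fooDocs[i]); // doc contains both terms
                i++;
                j++;
            } else if (fooDocs[i] < barDocs[j]) {
                i++; // "foo" is behind, move it forward
            } else {
                j++; // "bar" is behind, move it forward
            }
        }
        return hits;
    }
}
```

Neither list is fully evaluated before the other is touched; there is no
separate set-intersection pass afterwards.
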
> 2) When running the above query+filter, which is run first? Specifically, if
> documents with the term 'foo' are an order of magnitude larger than the
> documents with the term 'bar', should they be swapped in the above query so
> that the results of the query are as small as possible before running the
> filter? Or does the query run against the results of the filter?

At the Lucene level, if you specify a filter, the filter's DocIdSet is
pulled before the query is executed. However, it depends on the
implementation whether the set is built ahead of time or during
evaluation. Once the filter is created we use a leapfrog approach:
initially we advance both the filter and the query to their first doc.
If the docs match, we score the doc. If the filter's doc is greater than
the query's doc, the query is advanced to the next doc greater than or
equal to the filter's current doc; otherwise the roles are swapped and
the filter is advanced to the query's doc. If you use something like
QueryWrapperFilter (a filter created from a query), the query used to
build the filter runs first. This applies to Lucene 3.x; in 4.0 we are
currently changing how filters work, though.
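
The leapfrog loop could be sketched roughly like this. It is plain Java and
not the actual Lucene 3.x source: DocIterator, search and the array-backed
doc lists are hypothetical stand-ins, and the only thing borrowed from
Lucene is the idea of a NO_MORE_DOCS sentinel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names do not match Lucene's API.
final class LeapfrogSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Minimal iterator over a sorted array of doc ids. */
    static final class DocIterator {
        private final int[] docs;
        private int pos = -1;

        DocIterator(int[] sortedDocs) {
            this.docs = sortedDocs;
        }

        /** Moves to the next doc, or returns NO_MORE_DOCS when exhausted. */
        int nextDoc() {
            pos++;
            return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        }

        /** Moves forward to the first doc >= target (linear scan for brevity). */
        int advance(int target) {
            int doc;
            do {
                doc = nextDoc();
            } while (doc < target);
            return doc;
        }
    }

    /** Returns the docs matched by the query that also pass the filter. */
    static List<Integer> search(DocIterator filter, DocIterator query) {
        List<Integer> hits = new ArrayList<>();
        int filterDoc = filter.nextDoc();
        int queryDoc = query.nextDoc();
        while (filterDoc != NO_MORE_DOCS && queryDoc != NO_MORE_DOCS) {
            if (filterDoc == queryDoc) {
                hits.add(queryDoc); // a real searcher would score the doc here
                filterDoc = filter.nextDoc();
                queryDoc = query.nextDoc();
            } else if (filterDoc > queryDoc) {
                queryDoc = query.advance(filterDoc); // query catches up
            } else {
                filterDoc = filter.advance(queryDoc); // filter catches up
            }
        }
        return hits;
    }
}
```

Whichever side is behind gets advanced, so in this sketch swapping 'foo'
and 'bar' changes which side is scored but not which docs are visited.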

> Hopefully this makes sense :)

same here :)

> Thanks,
> Josh
