lucene-java-user mailing list archives

From Simon Willnauer <simon.willna...@googlemail.com>
Subject Re: "AND" Query "under the hood" ?
Date Tue, 25 Oct 2011 12:58:18 GMT
On Tue, Oct 25, 2011 at 2:18 PM, sol myr <solmyr72@yahoo.com> wrote:
> Hi,
>
> Could I please ask another question regarding Lucene "under the hood" / performance.
>
> I wondered how "AND" queries are implemented?
> Say we query for "+hello +world".
> Would Lucene simply find 2 lists of documents ("documents containing HELLO", and "documents containing WORLD"),
> and then intersect them (yielding documents with both words)?
> Or does Lucene do smarter tricks?

This is basically what happens under the hood. On trunk we optimize
BooleanQuery (BQ) instances that are composed only of TermQueries:
Lucene uses the term with the lowest document frequency to lead the
intersection (LUCENE-3328). For other queries we try to predict the
sparseness of each scorer / query and use that to pick which one leads
the intersection.
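
To make the idea concrete, here is a minimal sketch of a two-list
intersection where the shorter list leads. It is not Lucene's actual
scorer code: the class name PostingIntersection, the methods advance and
intersect, and the sample doc ids are all made up for illustration, and
plain sorted int arrays stand in for real posting lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Lucene's actual scorer code.
class PostingIntersection {

    // Returns the index of the first element >= target at or after 'from',
    // or postings.length if there is none. 'postings' must be sorted
    // ascending and contain no duplicates.
    static int advance(int[] postings, int from, int target) {
        int idx = Arrays.binarySearch(postings, from, postings.length, target);
        return idx >= 0 ? idx : -idx - 1;
    }

    // Intersects two sorted doc-id lists, letting the shorter one lead.
    static List<Integer> intersect(int[] a, int[] b) {
        int[] lead = a.length <= b.length ? a : b;
        int[] other = (lead == a) ? b : a;
        List<Integer> hits = new ArrayList<>();
        int pos = 0;
        for (int doc : lead) {
            // Skip ahead in the longer list instead of scanning it linearly.
            pos = advance(other, pos, doc);
            if (pos == other.length) {
                break; // the longer list is exhausted, no more matches
            }
            if (other[pos] == doc) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        int[] hello = {1, 3, 5, 8, 13, 21, 34}; // docs containing "hello"
        int[] world = {5, 21, 40};              // docs containing "world"
        System.out.println(intersect(hello, world)); // prints [5, 21]
    }
}
```

The point of leading with the rarer term is that the loop runs once per
entry of the short list, and the long list is only probed at the
positions the short list asks for.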
>
>
> And in regards to performance, is there any importance to query order ("+hello +world" as opposed to "+world +hello")?

There might be situations where our sparseness predictions fail, and
in those cases the order might matter. For TermQueries on trunk it
doesn't.
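
A rough sketch of why clause order stops mattering once the clauses are
sorted by document frequency before intersecting. Everything here
(ClauseOrdering, Clause, byAscendingDocFreq, leadTerm, and the
frequencies 7 and 3) is hypothetical and only illustrates the idea; it
is not how the Lucene classes are actually laid out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; these names are not Lucene APIs.
class ClauseOrdering {

    // A required term plus the number of documents that contain it.
    static final class Clause {
        final String term;
        final int docFreq;

        Clause(String term, int docFreq) {
            this.term = term;
            this.docFreq = docFreq;
        }
    }

    // Returns a copy of the clauses ordered from rarest to most common term.
    static List<Clause> byAscendingDocFreq(List<Clause> clauses) {
        List<Clause> sorted = new ArrayList<>(clauses);
        sorted.sort(Comparator.comparingInt((Clause c) -> c.docFreq));
        return sorted;
    }

    // The term that would lead the intersection: the rarest one.
    // Assumes the list holds at least one clause.
    static String leadTerm(List<Clause> clauses) {
        return byAscendingDocFreq(clauses).get(0).term;
    }

    public static void main(String[] args) {
        Clause hello = new Clause("hello", 7);
        Clause world = new Clause("world", 3);
        // Same lead term regardless of the order the query was written in.
        System.out.println(leadTerm(Arrays.asList(hello, world))); // prints world
        System.out.println(leadTerm(Arrays.asList(world, hello))); // prints world
    }
}
```

If two terms have the same document frequency, the sort keeps them in
the order they were written, but either one leads at the same cost.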

simon
>
>
> Thanks :)
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
