lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Using filters to speed up queries
Date Sat, 23 Oct 2010 23:19:09 GMT
Yes, it has some heuristics for deciding which query "drives" the execution:
the query whose first hit has the larger docid is the driving one. The
other one is then only seeked (advanced) to those docids.
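
To illustrate the "driving" behaviour described above, here is a minimal
sketch of a two-clause conjunction over sorted docid lists. It is not
Lucene's ConjunctionScorer: DocIdCursor and Leapfrog are hypothetical names
made up for this example, and a binary search stands in for the skip lists a
real index would use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a scorer: iterates a sorted array of docids. */
final class DocIdCursor {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;   // sorted ascending, no duplicates
    private int pos = -1;       // -1 means "not started yet"
    int advanceCalls = 0;       // how often this cursor was seeked (illustration only)

    DocIdCursor(int... docs) {
        this.docs = docs;
    }

    int docID() {
        if (pos < 0) return -1;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    int nextDoc() {
        pos++;
        return docID();
    }

    /** Moves to the first docid >= target without visiting the docs in between. */
    int advance(int target) {
        advanceCalls++;
        if (docID() >= target) return docID();
        int idx = Arrays.binarySearch(docs, pos + 1, docs.length, target);
        pos = idx >= 0 ? idx : -idx - 1;   // insertion point when target is absent
        return docID();
    }
}

final class Leapfrog {
    /** Returns the docids present in both cursors. */
    static List<Integer> intersect(DocIdCursor a, DocIdCursor b) {
        List<Integer> hits = new ArrayList<>();
        int da = a.nextDoc();
        int db = b.nextDoc();
        while (da != DocIdCursor.NO_MORE_DOCS && db != DocIdCursor.NO_MORE_DOCS) {
            if (da == db) {
                hits.add(da);
                da = a.nextDoc();
                db = b.nextDoc();
            } else if (da < db) {
                da = a.advance(db);   // b is ahead, so b drives and a is seeked
            } else {
                db = b.advance(da);   // a is ahead, so a drives and b is seeked
            }
        }
        return hits;
    }
}
```

In this sketch the clause that is further ahead always dictates where the
other one jumps to, so the order in which the clauses are passed in does not
change the result, and an empty clause ends the loop before any seeking
happens.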

 

With filters this is not the case (this may change in the future, when filters
also use ConjunctionScorer). In general, queries are now mostly faster than
filters; you only get an improvement from filters when you cache them.
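
As a rough sketch of why caching is what makes a filter pay off: the
expensive part is building the set of matching docids, so it helps to build
it once per user id and reuse it across searches. The class below is a
hypothetical illustration using only the JDK (UserFilterCache and
userIdByDoc are invented names); in Lucene itself, CachingWrapperFilter is
the class intended for this kind of reuse.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-user filter cache: one bit per document, built once per user. */
final class UserFilterCache {
    private final int[] userIdByDoc;   // userIdByDoc[docId] = user id that owns the document
    private final Map<Integer, BitSet> cache = new HashMap<>();
    int builds = 0;                    // number of full scans performed (illustration only)

    UserFilterCache(int[] userIdByDoc) {
        this.userIdByDoc = userIdByDoc;
    }

    /** Returns the bit set of documents owned by userId, scanning only on a cache miss. */
    BitSet bitsFor(int userId) {
        return cache.computeIfAbsent(userId, id -> {
            builds++;
            BitSet bits = new BitSet(userIdByDoc.length);
            for (int doc = 0; doc < userIdByDoc.length; doc++) {
                if (userIdByDoc[doc] == id) {
                    bits.set(doc);
                }
            }
            return bits;
        });
    }
}
```

With about 500 users and 3 million documents, as in the original question,
an uncompressed bit set per user is roughly 3,000,000 / 8 = 375 KB, so
caching all of them would take on the order of 190 MB; whether that is
acceptable depends on the deployment.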

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Khash Sajadi [mailto:khash@sajadi.co.uk] 
Sent: Sunday, October 24, 2010 12:52 AM
To: dev@lucene.apache.org
Subject: Re: Using filters to speed up queries

 

On the topic of BooleanQuery: would the order in which the queries are added
matter? Is it clever enough to skip the second query when the first one
returns nothing and is a MUST?

On 23 October 2010 23:47, Khash Sajadi <khash@sajadi.co.uk> wrote:

Thanks. Will try it. Been thinking about separate indexes but have one
worry: memory and file handle issues.

 

I'm worried that in some scenarios I might end up with thousands of
IndexReaders/IndexWriters open in the process (it is Windows). How is that
going to play out with memory?

 

On 23 October 2010 23:44, Mark Harwood <markharw00d@yahoo.co.uk> wrote:

Look at BooleanQuery with 2 "must" clauses - one for the query, one for a
ConstantScoreQuery wrapping the filter.
BooleanQuery should then automatically use skips when reading matching
docs from the main query, skipping ahead to the next docs identified by the
filter.
Give it a try; otherwise you may be looking at using separate indexes.



On 23 Oct 2010, at 23:18, Khash Sajadi wrote:

> My index contains documents for different users. Each document has the
> user id as a field on it.
>
> There are about 500 different users with 3 million documents.
>
> Currently I'm calling Search with the query (parsed from user) and
> FieldCacheTermsFilter for the user id.
>
> It works but the performance is not great.
>
> Ideally, I would like to perform the search only on the documents that are
> relevant, this should make it much faster. However, it seems Search(Query,
> Filter) runs the query first and then applies the filter.
>
> Is there a way to improve this? (i.e. run the query only on a subset of
> documents)
>
> Thanks




 

 

