lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: RangeFilter performance problem using MultiReader
Date Fri, 10 Apr 2009 20:13:05 GMT
You have already received a lot of answers and questions about your index
structure. Here is another idea that may help you speed up your RangeFilter:

What type of range do you want to query? From your index statistics, it
looks like a numeric/date field on which you filter very large ranges. If
the values are very fine-grained, so that the range hits a lot of terms,
you might consider using TrieRangeFilter, which is a new contrib module in
the as yet unreleased Lucene 2.9:
http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apache/lucene/search/trie/package-summary.html

The name and API may change before release (if it moves to core), but you
can try it out: it is stable and already runs on production websites. It
works for int, long, double, float and Date values (the latter if encoded
as long using Date.getTime()).
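
To give a rough feeling for why a trie-style range touches so few terms,
here is a minimal, self-contained Java sketch. It is not the contrib
module's code and does not use its API; the class name TrieRangeSketch, the
method cover() and the step parameter are made up for illustration, and it
assumes non-negative long values (for example Date.getTime() for dates after
1970). It covers a range with aligned blocks, where each block stands for
one lower-precision prefix term:

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Illustration only: a simplified stand-in for the trie idea, not Lucene code.
class TrieRangeSketch {

    /**
     * Covers the inclusive range [lo, hi] of non-negative longs with aligned
     * blocks whose sizes are powers of 2^step. Each entry is {shift, prefix}:
     * the block holds every value v with (v >>> shift) == prefix.
     */
    static List<long[]> cover(long lo, long hi, int step) {
        List<long[]> blocks = new ArrayList<>();
        long cur = lo;
        while (cur <= hi) {
            int shift = 0;
            // Grow the block while it stays aligned at cur and inside [cur, hi].
            while (shift + step < 63) {
                int next = shift + step;
                long size = 1L << next;
                boolean aligned = (cur & (size - 1)) == 0;
                boolean fits = hi - cur >= size - 1;
                if (!aligned || !fits) {
                    break;
                }
                shift = next;
            }
            blocks.add(new long[] {shift, cur >>> shift});
            long last = cur + ((1L << shift) - 1);
            if (last == hi) {
                break; // done; also avoids overflow when hi is Long.MAX_VALUE
            }
            cur = last + 1;
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Small example: the 18 values 3..20 are covered by 6 blocks.
        System.out.println(cover(3, 20, 2).size());

        // Dates encoded as long via Date.getTime(): one year of milliseconds.
        long from = new Date(0L).getTime();
        long to = from + 365L * 24 * 60 * 60 * 1000;
        System.out.println(cover(from, to, 8).size() + " blocks for "
                + (to - from + 1) + " possible values");
    }
}
```

A plain range over a fine-grained field has to visit every distinct term
between the bounds, while the block cover only grows with the number of
precision levels. A smaller step gives fewer blocks per query but would
require indexing more prefix terms per value.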

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Raf [mailto:r.ventaglio@gmail.com]
> Sent: Friday, April 10, 2009 4:38 PM
> To: java-user@lucene.apache.org
> Subject: RangeFilter performance problem using MultiReader
> 
> Hi,
> we are experiencing some problems using RangeFilters and we think there
> are some performance issues caused by MultiReader.
> 
> We have more or less 3M documents in 24 indexes and we read all of them
> using a MultiReader.
> If we do a search using only terms, there are no problems, but if we add
> to the same search a RangeFilter that extracts a large subset of the
> documents (e.g. 500K), it takes a long time to execute (about 15s).
> 
> In order to identify the problem, we have tried to consolidate the index,
> so now we have the same 3M docs in a single 10GB index.
> If we repeat the same search using this index, it takes only a small
> fraction of the previous time (about 2s).
> 
> Is there something we can do to improve search performance using
> RangeFilters with MultiReader, or is the only solution to have a single
> big index?
> 
> Thanks,
> Raf

