lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Raf <r.ventag...@gmail.com>
Subject Re: RangeFilter performance problem using MultiReader
Date Sat, 11 Apr 2009 07:21:57 GMT
Thanks Uwe,
I had already read about TrieRangeFilter on this mailing list and I thought
it could be useful to solve my problem.
I think I will trie it for test purposes.

Unfortunately, I have now to solve the problem in a production system and I
would like to avoid using a yet unreleased version. So I think that my best
choice, at the moment, is to consolidate my indexes and waiting until this
interesting new feature will be available in the official release.

Thanks a lot to all of you,
Raf


On Fri, Apr 10, 2009 at 10:13 PM, Uwe Schindler <uwe@thetaphi.de> wrote:

> You got a lot of answers and questions about your index structure. Now
> another idea, maybe this helps you to speed up your RangeFilter:
>
> What type of range do you want to query? From your index statistics, it
> looks like a numeric/date field from which you filter very large ranges. If
> the values are very fine-grained and so you hit a lot of terms for the
> range, you might consider using TrieRangeFilter, which is a new contrib
> module in the yet unreleased Lucene 2.9:
>
> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apach
> e/lucene/search/trie/package-summary.html<http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/all/org/apach%0Ae/lucene/search/trie/package-summary.html>
>
> The name and API may change before release (if it moves to core), but you
> can try it out, it is working stable and currently runs in productive
> websites! It works for int, long, double, float and Date values (if encoded
> using Date.getTime() as long).
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
> > -----Original Message-----
> > From: Raf [mailto:r.ventaglio@gmail.com]
> > Sent: Friday, April 10, 2009 4:38 PM
> > To: java-user@lucene.apache.org
> > Subject: RangeFilter performance problem using MultiReader
> >
> > Hi,
> > we are experiencing some problems using RangeFilters and we think there
> > are
> > some performance issues caused by MultiReader.
> >
> > We have more or less 3M documents in 24 indexes and we read all of them
> > using a MultiReader.
> > If we do a search using only terms, there are no problems, but it if we
> > add
> > to the same search terms a RangeFilter that extracts a large subset of
> the
> > documents (e.g. 500K), it takes a lot of time to execute (about 15s).
> >
> > In order to identify the problem, we have tried to consolidate the index:
> > so
> > now we have the same 3M docs in a single 10GB index.
> > If we repeat the same search using this index, it takes only a small
> > fraction of the previous time (about 2s).
> >
> > Is there something we can do to improve search performance using
> > RangeFilters with MultiReader or the only solution is to have only a
> > single
> > big index?
> >
> > Thanks,
> > Raf
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message