lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From comparis.ch - Roman Baeriswyl <roman.baeris...@comparis.ch>
Subject RE: Filter Performance
Date Thu, 20 Jan 2011 13:48:43 GMT
Thanks for the answer. That does make sense.
It first gets thru all (not only those which could pass the filter) terms available and investigates
all terms which match any of the wildcard queries. And that could take quite some time if
I got leading wildcard queries.

Guess I'll try another approach then with "reverse" indexing the fields which can be searched
and then transform the leading wildcard queries to following wildcards by reversing them too.

-----Original Message-----
From: Uwe Schindler [mailto:uwe@thetaphi.de]
Sent: Donnerstag, 20. Januar 2011 14:29
To: java-user@lucene.apache.org
Subject: RE: Filter Performance

The reason for this is that the filters and other boolean clauses are
applied during result collection. But wildcard query first needs to
investigate all terms that match and this is done before the results are
collected. And this step takes the time (especially before Lucene 4.0).

There is no way to change this.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: comparis.ch - Roman Baeriswyl [mailto:roman.baeriswyl@comparis.ch]
> Sent: Thursday, January 20, 2011 10:50 AM
> To: 'java-user@lucene.apache.org'
> Subject: Filter Performance
>
> Hi all
>
> I've got an Index with a few 100k documents and I want to run a rather
> complex wildcard (incl. leading wildcards) query on it.
> The wildcard query takes about 2 seconds to complete.
> Now, I want to limit the items on which the wildcard query will be
executed.
> Let's say, I want to limit the items to those which have the field
> "ProductName" set to "milk" (this query itself runs in less than 5
milliseconds
> and returns about 100 items).
> So, I tried different things but everything resulted in having the exact
same
> execution time like without this  filter. Even if I run the query multiple
times
> in a row with the same Query and Filter items.
>
> Here's what I tried:
>
> -          Adding "+ProductName:milk" to the Query
>
> -          Added FieldCacheTermsFilter("ProductName ", new String[] {
"milk" })
> as Filter
>
> -          Wrapped the Filter in CachingWrapperFilter
>
> -          Used  BooleanQuery filterQuery = new BooleanQuery();
> filterQuery.add(new TermQuery(new Term("ProductName", "milk ")),
> BooleanClause.Occur.MUST); and wrapped it with a QueryWrapperFilter and
> also tried wrapping it in a FilteredQuery
>
> Nothing improved the searchspeed, it had always the same speed as when
> he parsed thru all documents without any pre-filtering.
>
> Is this pre-filtering different than I thought? Am I doing something
wrong?
> Does the index need to be build somehow special for this to work?
>
> Thanks for your help
> //Roman
>
> ________________________________
> Holen Sie die besten Elektronik-Aktionen direkt auf Ihr Facebook-Profil:
> http://www.facebook.com/pages/Preissturz/218831069608
>
> Die besten Elektronik-Aktionen auf Twitter: http://twitter.com/preissturz1


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Holen Sie die besten Elektronik-Aktionen direkt auf Ihr Facebook-Profil: http://www.facebook.com/pages/Preissturz/218831069608

Die besten Elektronik-Aktionen auf Twitter: http://twitter.com/preissturz1

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message