lucene-java-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: A lot of short documents, optimal query?
Date Wed, 09 Nov 2005 19:08:32 GMT

: (
: +(
: (+raimonds +marschan)
: (+raimonds +marschol)
: (+raimonds +marschel)
: (+raimonds +marschalfr)
: (+raimonds +marschalek)
: (+raimonds +marscha)
: ...
: )
: +(ZIPS:22* ZIPS:21* ZIPS:20* ZIPS:23* ZIPS:245*
: ZIPS:246* ZIPS:247* ZIPS:240* ZIPS:241* ZIPS:242*
: ZIPS:243* ZIPS:254* ZIPS:253* ZIPS:255* ZIPS:256*
: ZIPS:257* ZIPS:295* ZIPS:296* ZIPS:273* ZIPS:274*
: ZIPS:275* ZIPS:276* ZIPS:192* ZIPS:190*)
: )

Regardless of how short or long your documents are, using RangeFilters on
your ZIPS field is going to be more efficient than PrefixQueries ... I'd
bet money it will even be more efficient than making a two-character
prefix_ZIPS field and doing a TermQuery on it -- and there's no reason not
to use a Filter if you don't care about the score.
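
To make the prefix-to-range idea concrete, here is a minimal sketch in plain
Java. It is not Lucene code: the class ZipPrefixRanges and its methods are
hypothetical names invented for illustration, and it assumes non-empty
prefixes. It turns each prefix clause (ZIPS:22*) into a half-open term range
[lower, upper) and merges ranges that touch, so that each merged range could
then back a single RangeFilter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: converts prefix clauses such as
// ZIPS:22* into half-open term ranges [lower, upper) and merges neighbours.
class ZipPrefixRanges {

    // Smallest string that sorts after every term starting with `prefix`:
    // bump the last character by one ("22" -> "23", "29" -> "2:").
    // Assumes a non-empty prefix.
    static String upperBound(String prefix) {
        int last = prefix.length() - 1;
        return prefix.substring(0, last) + (char) (prefix.charAt(last) + 1);
    }

    // Returns merged ranges as {lowerInclusive, upperExclusive} pairs,
    // in lexicographic order of their lower bounds.
    static List<String[]> toRanges(String... prefixes) {
        String[] sorted = prefixes.clone();
        Arrays.sort(sorted);
        List<String[]> ranges = new ArrayList<>();
        for (String prefix : sorted) {
            String lower = prefix;
            String upper = upperBound(prefix);
            if (!ranges.isEmpty()) {
                String[] prev = ranges.get(ranges.size() - 1);
                if (prev[1].compareTo(lower) >= 0) {
                    // Overlapping or touching: extend the previous range.
                    if (upper.compareTo(prev[1]) > 0) {
                        prev[1] = upper;
                    }
                    continue;
                }
            }
            ranges.add(new String[] {lower, upper});
        }
        return ranges;
    }
}
```

With this sketch, toRanges("22", "21", "20", "23") yields the single range
["20", "24"), and the 24 ZIPS prefixes in the quoted query collapse to eight
ranges.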

Take a look at RangeFilter in SVN; even if you are using 1.4.3 it should
be compatible.  Also take a look at ChainedFilter as a way to compose lots
of individual RangeFilters...

http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/miscellaneous/src/java/org/apache/lucene/misc/ChainedFilter.java
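
As a rough illustration of what composing range filters with OR amounts to,
the sketch below uses plain java.util.BitSet with one bit per document. It
does not use the Lucene API: the class ZipRangeBits, its methods, and the
zipByDoc array are hypothetical, and scanning an in-memory array is a
simplification standing in for walking the index's terms.

```java
import java.util.BitSet;

// Hypothetical illustration, not Lucene code: each "filter" is a BitSet
// with one bit per document id.
class ZipRangeBits {

    // Sets the bit for every doc whose zip falls in [lower, upper),
    // compared lexicographically as index terms would be.
    static BitSet rangeBits(String[] zipByDoc, String lower, String upper) {
        BitSet bits = new BitSet(zipByDoc.length);
        for (int doc = 0; doc < zipByDoc.length; doc++) {
            String zip = zipByDoc[doc];
            if (zip.compareTo(lower) >= 0 && zip.compareTo(upper) < 0) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // OR-composition: a doc passes if any of the individual ranges accepts it.
    static BitSet orAll(BitSet... parts) {
        BitSet result = new BitSet();
        for (BitSet part : parts) {
            result.or(part);
        }
        return result;
    }

    // Applies the combined zip filter to the docs matched by the name query.
    // No scoring is involved; the filter only keeps or drops documents.
    static BitSet restrict(BitSet nameMatches, BitSet zipFilter) {
        BitSet kept = (BitSet) nameMatches.clone();
        kept.and(zipFilter);
        return kept;
    }
}
```

The point of the design is that the zip restriction is computed once as a
bit set and then intersected with whatever the name clauses match, rather
than being expanded into scored query clauses.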

You can probably get additional speedups by using Filters on whatever
your default name search field is; search for "lucene Hoss SetFilter" to
see a previous discussion where I suggested something similar.

-Hoss


