lucene-java-user mailing list archives

From eks dev <eks...@yahoo.co.uk>
Subject Re: A lot of short documents, optimal query?
Date Fri, 11 Nov 2005 07:07:55 GMT
Thanks Hoss,

I've looked into it and you were absolutely right,
it could not be simpler.

Two quick questions on the same topic (for my own
education):

- What is the purpose of the hashCode and equals methods
in XxxFilter? (This is a question about how Lucene
actually uses them, not about Java basics :)

- What would be the recommended usage of the SetFilter?
a) use search(Query, SetFilter, HitCollector)
or
b) use search(Query, HitCollector) and do the AND myself
inside the HitCollector

Option a) could be faster, because inside the collect()
method of a HitCollector there is no way to skip ahead to
the first set bit of the SetFilter (the bitset of the
Query is not visible there).
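
The difference between the two options can be sketched without Lucene itself, modelling both the query's matches and the SetFilter as java.util.BitSet. All names below (FilterPlacementSketch, collectThenCheck, intersectWithSkipping) are hypothetical, and this is not how Lucene's searcher is implemented. It only illustrates why seeing both bit sets permits skipping, while a check inside collect() has to be made once per hit.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical model, not Lucene code: doc ids are bit positions.
class FilterPlacementSketch {

    // Models option b): every query hit is handed to the collector,
    // which can only test the filter bit for that one document.
    static List<Integer> collectThenCheck(BitSet queryHits, BitSet filter) {
        List<Integer> accepted = new ArrayList<>();
        for (int doc = queryHits.nextSetBit(0); doc >= 0; doc = queryHits.nextSetBit(doc + 1)) {
            if (filter.get(doc)) {
                accepted.add(doc);
            }
        }
        return accepted;
    }

    // Models option a): with both bit sets visible, each side can jump
    // straight to the other's next candidate instead of visiting every hit.
    static List<Integer> intersectWithSkipping(BitSet queryHits, BitSet filter) {
        List<Integer> accepted = new ArrayList<>();
        int doc = filter.nextSetBit(0);
        while (doc >= 0) {
            int hit = queryHits.nextSetBit(doc);
            if (hit < 0) {
                break; // no query hits at or after this filter bit
            }
            if (hit == doc) {
                accepted.add(doc);
                doc = filter.nextSetBit(doc + 1);
            } else {
                doc = filter.nextSetBit(hit); // skip filter bits before the next hit
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        BitSet queryHits = new BitSet();
        for (int d : new int[] {1, 4, 7, 9}) queryHits.set(d);
        BitSet filter = new BitSet();
        for (int d : new int[] {4, 5, 9, 12}) filter.set(d);

        System.out.println(collectThenCheck(queryHits, filter));
        System.out.println(intersectWithSkipping(queryHits, filter));
    }
}
```

Both methods return the same documents; they differ only in how many candidates are touched, which is what the testing mentioned below would have to measure.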

When we are done with testing (it takes a while), I will
post the results. Also, if anybody is interested in
SetFilter, I have no problem posting it back.

Again, many thanks for the useful tip!



--- Chris Hostetter <hossman_lucene@fucit.org> wrote:

> 
> : (
> : +(
> : (+raimonds +marschan)
> : (+raimonds +marschol)
> : (+raimonds +marschel)
> : (+raimonds +marschalfr)
> : (+raimonds +marschalek)
> : (+raimonds +marscha)
> : ...
> : )
> : +(ZIPS:22* ZIPS:21* ZIPS:20* ZIPS:23* ZIPS:245*
> : ZIPS:246* ZIPS:247* ZIPS:240* ZIPS:241* ZIPS:242*
> : ZIPS:243* ZIPS:254* ZIPS:253* ZIPS:255* ZIPS:256*
> : ZIPS:257* ZIPS:295* ZIPS:296* ZIPS:273* ZIPS:274*
> : ZIPS:275* ZIPS:276* ZIPS:192* ZIPS:190*)
> : )
> 
> independent of how short/long your documents are, using RangeFilters on
> your ZIPS field is going to be more efficient than PrefixQueries ... I'd
> bet money it will even be more efficient than making a two character
> prefix_ZIPS field and doing a TermQuery on it -- and there's no reason not
> to use a Filter if you don't care about the score.
> 
> take a look at RangeFilter in SVN, even if you are using 1.4.3 it should
> be compatible.  Also take a look at ChainedFilter as a way to compose lots
> of individual RangeFilters...
> 
> http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/miscellaneous/src/java/org/apache/lucene/misc/ChainedFilter.java
> 
> You can probably get additional speed ups by using Filters on whatever
> your default name search field is, google for "lucene Hoss SetFilter" to
> see a previous discussion where I suggested something similar.
> 
> -Hoss
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
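
As a rough illustration of the advice quoted above, the ZIPS prefixes can be treated as half-open ranges over a sorted term dictionary, with the per-range bit sets OR-ed together. This is a self-contained model using a TreeMap, not Lucene's RangeFilter or ChainedFilter; the class ZipRangeSketch, its methods, and the sample ZIP values are made up for the example. Note that adjacent prefixes such as 20*, 21*, 22* and 23* collapse into the single range ["20", "24").

```java
import java.util.BitSet;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model, not Lucene code: a sorted map stands in for the
// term dictionary of the ZIPS field, mapping each term to its doc ids.
class ZipRangeSketch {

    // Sets a bit for every document whose term t satisfies lower <= t < upper.
    static BitSet rangeBits(TreeMap<String, int[]> termToDocs, String lower, String upper) {
        BitSet bits = new BitSet();
        SortedMap<String, int[]> slice = termToDocs.subMap(lower, upper);
        for (int[] docs : slice.values()) {
            for (int doc : docs) {
                bits.set(doc);
            }
        }
        return bits;
    }

    // ORs several per-range bit sets into one combined filter.
    static BitSet orAll(BitSet... parts) {
        BitSet combined = new BitSet();
        for (BitSet part : parts) {
            combined.or(part);
        }
        return combined;
    }

    public static void main(String[] args) {
        // Made-up sample data: ZIP term -> ids of the documents containing it.
        TreeMap<String, int[]> zips = new TreeMap<>();
        zips.put("19055", new int[] {0});
        zips.put("20095", new int[] {1});
        zips.put("22767", new int[] {2, 5});
        zips.put("24103", new int[] {3});
        zips.put("24568", new int[] {4});

        BitSet allowed = orAll(
                rangeBits(zips, "20", "24"),    // covers 20*, 21*, 22*, 23*
                rangeBits(zips, "240", "244"),  // covers 240* .. 243*
                rangeBits(zips, "245", "248")); // covers 245* .. 247*
        System.out.println(allowed);
    }
}
```

Each range costs one walk over the terms it contains, and the result carries no score, which matches the point that a filter is enough when scoring does not matter.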



		