lucene-java-user mailing list archives

From "Len Takeuchi" <ltakeu...@jostleme.com>
Subject RE: Using HitCollector to Collect First N Hits
Date Mon, 24 Aug 2009 16:13:44 GMT
Hi Simon,

> that is what my first guess was and I'm pretty sure that the long time
> is taken before the documents get scored. A short prefix can easily
> expand to thousands of terms, do you encounter
> TooManyClausesExceptions and in turn do you set
> BooleanQuery#setMaxClauseCount() to a higher value than 1024?
> I wonder if BooleanQuery#setAllowDocsOutOfOrder(true) would give you
> any performance hit if you don't care about the order of how the docs
> come in. Any idea how many terms your prefix query expands to?


I looked into it, and the prefix query we are finding to be slow expands
to about 150 terms. That is well under the 1024-clause limit, which is
why we're not getting TooManyClausesExceptions.
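
For anyone following along, here is a rough sketch of what "expands to N
terms" means. It does not use the Lucene API: a sorted in-memory set stands
in for the index's term dictionary, and the class and method names are made
up for illustration. The only assumption is that terms are stored in sorted
order, so every term sharing a prefix sits in one contiguous run.

```java
import java.util.Arrays;
import java.util.NavigableSet;
import java.util.TreeSet;

public class PrefixExpansionCount {

    // Counts how many terms in a sorted dictionary start with the prefix.
    // The TreeSet is a hypothetical stand-in for a real term dictionary.
    static int countExpansions(NavigableSet<String> terms, String prefix) {
        int count = 0;
        // tailSet(prefix, true) begins at the first term >= prefix, which is
        // where the run of matching terms starts in sorted order.
        for (String term : terms.tailSet(prefix, true)) {
            if (!term.startsWith(prefix)) {
                break; // left the contiguous run of matching terms
            }
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(Arrays.asList(
                "jos", "joseph", "jostle", "jostleme", "jot", "kite"));
        System.out.println(countExpansions(terms, "jos"));
    }
}
```

Each counted term would end up as one clause in the rewritten query, so this
count is the number to compare against the clause limit.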


> one more thing... while I have no idea about your use case, if you don't
> care about the score you could expand the terms yourself just like
> PrefixQuery does.

I tried out a few things, and if term expansion is limited to about 20
terms the performance becomes acceptable for us.  I will have to find
some way to limit the expansion for some queries, and I'll look into
expanding the terms myself as you suggest.
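
A minimal sketch of the capped expansion I have in mind is below. Again this
is not Lucene code: the sorted set is a stand-in for the term dictionary, and
the class name, method name, and maxTerms parameter are all hypothetical. In
a real index each returned term would presumably become one TermQuery clause
in a BooleanQuery.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

public class CappedPrefixExpansion {

    // Returns at most maxTerms terms that start with the prefix, in
    // dictionary order. The TreeSet stands in for a real term dictionary.
    static List<String> expand(NavigableSet<String> terms, String prefix, int maxTerms) {
        List<String> expanded = new ArrayList<>();
        for (String term : terms.tailSet(prefix, true)) {
            // Stop once the cap is reached or the matching run has ended.
            if (expanded.size() >= maxTerms || !term.startsWith(prefix)) {
                break;
            }
            expanded.add(term);
        }
        return expanded;
    }

    public static void main(String[] args) {
        NavigableSet<String> terms = new TreeSet<>(Arrays.asList(
                "jos", "joseph", "jostle", "jostleme", "jot", "kite"));
        System.out.println(expand(terms, "jos", 2));
    }
}
```

Stopping at the first maxTerms terms in dictionary order is an arbitrary
choice that silently drops the remaining matches. Choosing the terms that
occur in the most documents might suit some queries better.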

Thanks for your help,
Len 



