lucene-java-user mailing list archives

From "Len Takeuchi"
Subject RE: Using HitCollector to Collect First N Hits
Date Mon, 24 Aug 2009 16:13:44 GMT
Hi Simon,

> that is what my first guess was and I'm pretty sure that the long time
> is taken before the documents get scored. A short prefix can easily
> expand to thousands of terms, do you encounter
> TooManyClausesExceptions and in turn do you set
> BooleanQuery#setMaxClauseCount() to a higher value than 1024?
> I wonder if BooleanQuery#setAllowDocsOutOfOrder(true) would give you
> any performance hit if you don't care about the order of how the docs
> come in. Any idea how many terms your prefix query expands to?

I looked into it and the prefix query we are finding to be slow expands
to about 150 terms (and hence we're not getting

> one more thing... while I have no idea about your use case, if you don't
> care about the score you could expand the terms yourself just like
> PrefixQuery does.

I was trying out a few things and if term expansion is limited to about
20 or so the performance becomes okay for us.  I will have to try to
find some way to limit the expansion for some queries.  I'll look into
expanding the terms myself as you suggest.
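
As a rough illustration of what capping the expansion could look like, here
is a minimal sketch. It does not use Lucene at all: a sorted in-memory set
stands in for the index's term dictionary, and the class name
PrefixExpansion, the method expand, and the maxTerms parameter are all
hypothetical. In a real index the same walk would presumably go over the
TermEnum returned by IndexReader.terms(new Term(field, prefix)), stopping at
the first term that no longer starts with the prefix or once the cap is hit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: a sorted set plays the role of
// the term dictionary for a single field.
class PrefixExpansion {

    // Returns at most maxTerms terms that start with prefix, in sorted order.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxTerms) {
        List<String> expanded = new ArrayList<String>();
        // tailSet(prefix) seeks to the first term >= prefix; every term that
        // shares the prefix sits in one contiguous run starting there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix) || expanded.size() >= maxTerms) {
                break; // left the prefix range, or reached the cap
            }
            expanded.add(term);
        }
        return expanded;
    }

    public static void main(String[] args) {
        SortedSet<String> terms = new TreeSet<String>(
                Arrays.asList("apple", "apply", "apt", "banana", "app"));
        // Cap the expansion of the prefix "app" at 2 terms.
        System.out.println(expand(terms, "app", 2));
    }
}
```

The capped list could then be turned into one TermQuery per term and added
to a BooleanQuery as optional clauses. Note that a cap like this keeps the
first terms in sort order, not the most frequent or most relevant ones, so
it changes which documents can match.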

Thanks for your help,
