lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Hostetter <>
Subject Re: search timeout
Date Fri, 16 Mar 2007 18:40:51 GMT

: Nutch recently added a search query timeout (NUTCH-308).  Are there any
: plans to add such functionality to the Lucene HitCollector directly?  Or
: is there some reason that this is a bad idea?

Quickly skimming the patch in that Issue, Nutch seems to have done what
has been discussed previously on this list: using a HitCollector which
throws a RuntimeException if a certain amount of time has elapsed.

this is something anyone using the Lucene API can do as long as they use a
HitCollector ... the Nutch impl seems to ctually spin up a seperate thread
for each request rather then comparing timestamps after each doc is
colelcted (and interesting choice that both frightens and excites me)

a HitCollector like this could easily be "promoted" up in the Lucene code
base ... the real question is would we want timeout info like this to be
exposed in some of the simpler Searcher APIs (ie: that return TopDocs or
Hits) and if so how do you signal that it's only a partial result set?

: I'm using Solr which doesn't seem to support search timeouts.  It seems
: that it would make sense to add the feature at the Lucene level rather
: than implement the feature in each derivative.

even if it were added to the Lucene core, some fairly heavy questions
would have to be answered before encorporating this into Solr: mainly what
do do about Caching ... do you cache the partial results? do you attempt
to continue searching after returning the partial results? the next time a
request comes in and partial results are cached, do you try to pick up
where you left off since you've got a threshold of time available?

because of these issues, Solr may need a custom solution to this situation.


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message