lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Design guidance - search strategy
Date Thu, 04 Dec 2008 13:36:42 GMT
It's generally a bad idea to iterate over a Hits object. In fact, Hits
is deprecated in recent versions of Lucene. The underlying
problem is that the query is re-executed every 100 hits
or so as you walk through the results.

First suggestion: create a Filter by iterating over your
docid field, and use that in your searches. Several of the
Searcher.search variants accept a Filter.
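
To illustrate the idea behind the filter, here is a minimal sketch in plain
Java. It deliberately does not use the Lucene API: the int[] array is a
hypothetical stand-in for reading the docid field of each document in the
index, and the class and method names are made up for this example. The
point is that the 5000 selected ids are turned into a bit set once, and each
candidate document is then checked with a cheap bit lookup.

```java
import java.util.BitSet;
import java.util.Set;

// Sketch only: plain Java, not the Lucene Filter API.
class SelectedDocsFilterSketch {

    // docidByPosition[i] is assumed to hold the stored "docid" value of the
    // document at internal index position i (hypothetical stand-in for
    // walking the docid field in the index).
    static BitSet buildAllowedBits(int[] docidByPosition, Set<Integer> selected) {
        BitSet bits = new BitSet(docidByPosition.length);
        for (int pos = 0; pos < docidByPosition.length; pos++) {
            if (selected.contains(docidByPosition[pos])) {
                bits.set(pos); // this document may appear in results
            }
        }
        return bits;
    }

    // During a search, a document is kept only if its bit is set.
    static boolean accept(BitSet allowed, int pos) {
        return allowed.get(pos);
    }
}
```

The bit set only needs rebuilding when the user's selection (or the index)
changes, so it can be reused across searches.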

Second suggestion: use one of the collector classes rather than
Hits, e.g. TopDoc*, TopFieldDoc*, whichever suits.
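
As a rough picture of what a top-N collector does, here is a small sketch in
plain Java. It is not the Lucene collector API, and every name in it is
hypothetical. Each matching document is offered once together with its
score, and a bounded min-heap keeps only the best N, so the query runs a
single time no matter how many documents match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Sketch only: plain Java, not the Lucene collector classes.
class TopNCollectorSketch {

    static final class ScoredDoc {
        final int doc;
        final float score;
        ScoredDoc(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int n;
    // Min-heap ordered by score: the weakest kept hit is always at the head.
    private final PriorityQueue<ScoredDoc> heap =
            new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopNCollectorSketch(int n) {
        this.n = n;
    }

    // Called once per matching document.
    void collect(int doc, float score) {
        if (n <= 0) {
            return; // nothing to keep
        }
        if (heap.size() < n) {
            heap.add(new ScoredDoc(doc, score));
        } else if (score > heap.peek().score) {
            heap.poll(); // drop the current weakest hit
            heap.add(new ScoredDoc(doc, score));
        }
    }

    // Returns the kept hits, best score first.
    List<ScoredDoc> topDocs() {
        List<ScoredDoc> out = new ArrayList<>(heap);
        out.sort((a, b) -> Float.compare(b.score, a.score));
        return out;
    }
}
```

Memory stays proportional to N rather than to the number of matches, which
is the main practical difference from walking every hit.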


Best
Erick

On Thu, Dec 4, 2008 at 7:59 AM, Ian Vink <ianvink@gmail.com> wrote:

> I have documents with this simple schema in Lucene which I can not change.
> docid: (int)
> contents: (text)
>
> The user is given a list of 10,000 documents in a tree which they select to
> search, usually they select 5000 or so.
>
> I only want to search those 5000 documents. I have the 'id' fields. That is
> all.
>
> I do this now:
>
> Get the 'Hits' for all documents.
> Loop through all Hits looking for any 'docid' that is in the 5000 selected
> by the user
> Add found docs to a collection of found documents and return that to the
> UI.
>
>
> Is there a better way of doing this?
>
