lucene-java-user mailing list archives

From "Cam Bazz" <camb...@gmail.com>
Subject Re: IndexSearcher.search
Date Tue, 16 Sep 2008 03:34:44 GMT
In cases where we don't know the possible number of hits, and since I want
to test the new 2.4 way of doing things,

could I use custom HitCollectors for everything? Is there any performance
penalty for this?
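
To make the question concrete, here is a minimal sketch of such a collector,
one that records every matching doc id without knowing the hit count up
front. The HitCollector class below is only a local stand-in that mirrors the
collect(int doc, float score) method of Lucene 2.4's HitCollector, so the
sketch compiles without the Lucene jar; AllDocsCollector is a hypothetical
name, not part of Lucene.

```java
import java.util.Arrays;

// Stand-in with the same shape as Lucene 2.4's
// org.apache.lucene.search.HitCollector (assumption: declared locally
// only so this sketch compiles without the Lucene jar).
abstract class HitCollector {
    public abstract void collect(int doc, float score);
}

// Hypothetical collector: stores every matching doc id in a growable array.
class AllDocsCollector extends HitCollector {
    private int[] docs = new int[16]; // small initial capacity, grows on demand
    private int size = 0;

    @Override
    public void collect(int doc, float score) {
        if (size == docs.length) {
            // Double the capacity so appends stay amortized O(1).
            docs = Arrays.copyOf(docs, docs.length * 2);
        }
        docs[size++] = doc;
    }

    public int getTotalHits() {
        return size;
    }

    // Returns a copy trimmed to the number of hits actually collected.
    public int[] getDocs() {
        return Arrays.copyOf(docs, size);
    }
}
```

With the real library, the stand-in would be dropped in favour of the Lucene
class and the collector passed to searcher.search(query, collector). Since
collect is called once per matching document, the work done per hit should
stay cheap; in particular, loading stored documents inside collect is
probably best avoided.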

From what I understand, TopDocCollector and TopDocs will both try to
allocate an array of Integer.MAX_VALUE entries if I specify that as the page
size. Is that right?
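
If the queue really is sized up front, one way around it is a top-N
collector whose heap grows only as hits arrive, so a very large limit costs
nothing until it is used. The sketch below uses java.util.PriorityQueue for
that and assumes a Java 8 or later compiler; GrowableTopDocs and Hit are
hypothetical names, and this is not how TopDocCollector itself is
implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical top-N collector backed by a heap that grows on demand.
class GrowableTopDocs {
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Orders the worst hit first: lowest score, then highest doc id on ties.
    private static final Comparator<Hit> WORST_FIRST = (a, b) -> {
        int c = Float.compare(a.score, b.score);
        return c != 0 ? c : Integer.compare(b.doc, a.doc);
    };

    private final int maxHits;
    // Starts at the default small capacity regardless of maxHits.
    private final PriorityQueue<Hit> heap = new PriorityQueue<>(WORST_FIRST);
    private int totalHits = 0;

    GrowableTopDocs(int maxHits) {
        if (maxHits < 1) {
            throw new IllegalArgumentException("maxHits must be at least 1");
        }
        this.maxHits = maxHits;
    }

    void collect(int doc, float score) {
        totalHits++;
        Hit hit = new Hit(doc, score);
        if (heap.size() < maxHits) {
            heap.add(hit);
        } else if (WORST_FIRST.compare(hit, heap.peek()) > 0) {
            // The new hit beats the current worst one, so replace it.
            heap.poll();
            heap.add(hit);
        }
    }

    int getTotalHits() {
        return totalHits;
    }

    // Returns the retained hits, best first.
    List<Hit> topHits() {
        List<Hit> out = new ArrayList<>(heap);
        out.sort(WORST_FIRST.reversed());
        return out;
    }
}
```

Constructing this with Integer.MAX_VALUE allocates only the default small
heap; memory use then follows the number of hits actually collected, which
can of course still be large for a query that matches most of the index.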

Best.

On Tue, Sep 16, 2008 at 5:59 AM, Daniel Noll <daniel@nuix.com> wrote:
> Otis Gospodnetic wrote:
>>
>> Hi,
>>
>> Check the Hits javadoc:
>>
>>  * @deprecated Hits will be removed in Lucene 3.0. <p>
>>  * Instead e. g. {@link TopDocCollector} and {@link TopDocs} can be
>> used:<br>
>>  * <pre>
>>  *   TopDocCollector collector = new TopDocCollector(hitsPerPage);
>>  *   searcher.search(query, collector);
>>  *   ScoreDoc[] hits = collector.topDocs().scoreDocs;
>>  *   for (int i = 0; i < hits.length; i++) {
>>  *     int docId = hits[i].doc;
>>  *     Document d = searcher.doc(docId);
>>  *     // do something with current hit
>>  *     ...
>>  *   }
>>  * </pre>
>
> Related topic: what if we need all the hits and not just the first 100?
>
> TopDocCollector has a couple of drawbacks. One is that you need to know the
> number of hits before the query, and you can't overallocate, as that will run
> you out of memory.  I take it this is the suggested workaround for that:
>
>  TopDocCollector collector = new TopDocCollector(PAGE_SIZE);
>  searcher.search(query, collector);
>  TopDocCollector collector2 = new TopDocCollector(
>    collector.getTotalHits());
>  searcher.search(query, collector2);
>
> And then wrap it up such that the initial search's results are immediately
> available and the rest load on demand?  Sounds almost exactly like Hits to
> me though, with the drawback being we have to write it ourselves instead of
> it being part of Lucene.
>
> Or do we make a replacement for TopDocCollector which doesn't have this
> drawback, and uses an alternative for PriorityQueue which allows its array
> to grow?
>
> Daniel
>
>
> --
> Daniel Noll
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
