lucene-java-user mailing list archives

From Michael Sokolov <msoko...@safaribooksonline.com>
Subject Re: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward compatible)
Date Fri, 01 Mar 2013 12:41:18 GMT
On 2/28/2013 5:05 PM, Uwe Schindler wrote:
> ...  Collector instead of HitCollector (like your ancient Lucene from 2.4), you have
> to respect the new semantics that are *different* to old HitCollector. Collector works with
> low-level atomic readers (also in Lucene 3.x), the calls to the "collect(int)" method are
> *not* using global document IDs, so using a IndexReader from outside does not work and will
> never work - PERIOD: The document IDs are only *relative* to the atomic reader that was passed
> to the collector by setNextReader() before a sequence of collect() calls. To make global docIds
> out of it, you may use readerContext.docBase, but this is slower than using the low-level
> atomic reader.
>
Uwe, thanks for this lucid explanation!  Would you mind elaborating a
bit on the slowdown you mention when using docBase to turn
reader-relative docIDs into global ones?  I have a use case where I need
to pass control to my caller, allowing them to *pull* results, so I
don't know in advance how many I will need.  In the case where documents
are returned in docID order, the code is actually pretty
straightforward: I iterate over the atomic readers and pull results from
each in turn.  Are you saying that is slower because it prevents
multi-threading, or is there some other reason?
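
To make the pull approach concrete, here is a minimal stand-alone sketch
of what I mean.  It deliberately uses no Lucene classes: Segment is a
hypothetical stand-in for an atomic reader context together with the
reader-relative docIDs that matched in it, and globalDocIds is an
illustrative name, not an existing API.  It assumes the segments are
listed in ascending docBase order and each segment's hits are ascending,
so the global docIDs come out in docID order.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Stand-alone model of pulling hits segment by segment. Nothing here is a
 * Lucene class; all names are hypothetical and for illustration only.
 */
public class PullDocIds {

    /** Hypothetical stand-in: a segment's docBase plus its matching local docIDs (ascending). */
    static final class Segment {
        final int docBase;
        final int[] localHits;

        Segment(int docBase, int... localHits) {
            this.docBase = docBase;
            this.localHits = localHits;
        }
    }

    /** Lazily yields global docIDs; the caller decides how many to pull. */
    static Iterator<Integer> globalDocIds(final List<Segment> segments) {
        return new Iterator<Integer>() {
            private int seg = 0; // index of the segment currently being read
            private int pos = 0; // position within that segment's hits

            @Override
            public boolean hasNext() {
                // Skip segments that are exhausted or have no hits at all.
                while (seg < segments.size() && pos >= segments.get(seg).localHits.length) {
                    seg++;
                    pos = 0;
                }
                return seg < segments.size();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                Segment s = segments.get(seg);
                // Global docID = segment-relative docID + the segment's docBase.
                return s.docBase + s.localHits[pos++];
            }
        };
    }

    public static void main(String[] args) {
        List<Segment> segments = Arrays.asList(
                new Segment(0, 1, 4),   // two hits in the first segment
                new Segment(10),        // no hits in the second segment
                new Segment(25, 0, 3)); // two hits in the third segment
        Iterator<Integer> it = globalDocIds(segments);
        // Pull only three results; the fourth hit is never touched.
        for (int i = 0; i < 3 && it.hasNext(); i++) {
            System.out.println(it.next()); // prints 1, 4, 25 on separate lines
        }
    }
}
```

The only per-hit cost of the docBase conversion in this model is one
integer addition, and the iteration is strictly sequential across
segments, which is why I am asking where the slowdown comes from.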

-Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

