lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: TopDocCollector vs TopScoreDocCollector (semantics changed in 4.0, not backward comptabile)
Date Thu, 28 Feb 2013 22:05:19 GMT
Hi,

This is not a bug in Lucene 4.0. This behavior has been unchanged since Lucene 2.9/3.0; you
just don't read the javadocs, and you don't seem to understand the changes made since Lucene 2.9.

I will repeat it one final time: Collector is a low-level search component in Lucene and was
introduced in Lucene 2.9 to replace the old "HitCollector". So if you upgrade your code to
use Collector instead of HitCollector (as in your ancient Lucene 2.4 code), you have to respect
the new semantics, which are *different* from those of the old HitCollector. Collector works
with low-level atomic readers (also in Lucene 3.x). The calls to the "collect(int)" method do
*not* use global document IDs, so using an IndexReader from outside does not work and will
never work - PERIOD. The document IDs are only *relative* to the atomic reader that was passed
to the collector by setNextReader() before a sequence of collect() calls. To make global docIds
out of them, you may add readerContext.docBase to each ID, but this is slower than using the
low-level atomic reader.
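
To illustrate why a segment-relative ID returns the wrong document when it is handed to a
top-level reader, here is a minimal toy model in plain Java. It is not Lucene code: the names
DocBaseDemo, Segment, buildIndex and topLevelDocument are hypothetical and exist only for this
sketch. The only thing taken from the explanation above is the idea that each segment has a
docBase and that collect(int) receives IDs relative to the current segment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a segmented index. NOT Lucene code; all names are hypothetical. */
final class DocBaseDemo {

    /** Stand-in for one atomic (per-segment) reader: doc IDs start at 0 in every segment. */
    static final class Segment {
        final int docBase;       // global ID of this segment's first document
        final List<String> docs; // stored documents, indexed by segment-relative ID

        Segment(int docBase, List<String> docs) {
            this.docBase = docBase;
            this.docs = docs;
        }
    }

    /** Builds segments whose docBase values follow each other without gaps. */
    static List<Segment> buildIndex(List<List<String>> perSegmentDocs) {
        List<Segment> segments = new ArrayList<>();
        int nextBase = 0;
        for (List<String> docs : perSegmentDocs) {
            segments.add(new Segment(nextBase, docs));
            nextBase += docs.size();
        }
        return segments;
    }

    /** Stand-in for a top-level reader lookup: expects a *global* doc ID. */
    static String topLevelDocument(List<Segment> segments, int globalDoc) {
        for (Segment s : segments) {
            int local = globalDoc - s.docBase;
            if (local >= 0 && local < s.docs.size()) {
                return s.docs.get(local);
            }
        }
        throw new IllegalArgumentException("no such document: " + globalDoc);
    }

    public static void main(String[] args) {
        List<Segment> index = buildIndex(Arrays.asList(
                Arrays.asList("a0", "a1", "a2"),
                Arrays.asList("b0", "b1")));

        Segment current = index.get(1); // the segment announced before the collect() calls
        int doc = 1;                    // segment-relative ID, as collect(int) would receive it

        // Correct: resolve the ID against the segment it belongs to.
        System.out.println(current.docs.get(doc));                          // b1
        // Wrong: treat the relative ID as global and ask the top-level reader.
        System.out.println(topLevelDocument(index, doc));                   // a1
        // Also correct, with an extra lookup step: rebase the ID first.
        System.out.println(topLevelDocument(index, current.docBase + doc)); // b1
    }
}
```

In a real Collector the analogous step would be to keep what setNextReader() passes in and
either read from that atomic reader or add its docBase before calling the top-level reader;
check the javadocs of your Lucene version for the exact signatures.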

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: saisantoshi [mailto:saisantoshi76@gmail.com]
> Sent: Thursday, February 28, 2013 10:55 PM
> To: java-user@lucene.apache.org
> Subject: RE: TopDocCollector vs TopScoreDocCollector (semantics changed in
> 4.0, not backward comptabile)
> 
> Thanks a lot. Really appreciate your help here.
> 
> I have read through the document and understand that the IndexReader
> uses sub-readers (to look into the index files) and AtomicReader does not.
> But how does this affect things from the search standpoint? I think search
> results should be consistent for both readers.
> 
> In my case, the search was behaving weirdly (returning incorrect
> Documents) while I was using the IndexReader, and it started to work
> fine when I changed it back to "AtomicReader". I am not sure whether
> changing it to AtomicReader has really solved the problem. This seems
> to be a bug in the IndexReader in 4.0.
> 
> // indexReader.document(doc) is giving incorrect result in 4.0
> 
> // atomicReader.document(doc) is giving the correct result.
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/TopDocCollector-vs-
> TopScoreDocCollector-semantics-changed-in-4-0-not-backward-comptabile-
> tp4035806p4043788.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org



