lucene-dev mailing list archives

From Dmitry Serebrennikov
Subject Re: Getting word count
Date Fri, 19 Oct 2001 20:59:59 GMT
>>You cannot simply count the number of times the method collect() is
>>called on your collector because some queries may result in the same
>>document being selected more than once and so you'd end up with a
>>double-count. (Can anyone confirm that this is the case?)
>It should not be the case.  The collect() method should be called at most
>once per document.
This is good news! This would make counting that much more efficient. 
My main concern was the BooleanScorer, and I just verified that I was 
worrying needlessly - it maintains its own hashtable to avoid double 
counting. On a related issue, are there any guarantees about the order 
of document numbers in the calls to collect()?
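
To make the idea concrete, here is a minimal sketch of counting hits by counting calls to collect(). The Collector interface below is a hypothetical stand-in for the Lucene callback, not the real class, and the run() helper only simulates a searcher feeding document numbers. The collector also records whether document numbers arrived in ascending order and whether any repeated, since I do not know what the real scorers guarantee.

```java
import java.util.BitSet;

public class CountingCollectorDemo {

    /** Hypothetical stand-in for the collector callback (assumption, not Lucene's class). */
    interface Collector {
        void collect(int doc, float score);
    }

    /** Counts hits by counting calls, assuming at most one call per document. */
    static class CountingCollector implements Collector {
        int count = 0;               // number of collect() calls seen
        int lastDoc = -1;            // previous document number
        boolean ascending = true;    // false once a doc arrives out of order
        boolean duplicate = false;   // true if the same doc is collected twice
        private final BitSet seen = new BitSet();

        public void collect(int doc, float score) {
            count++;
            if (doc <= lastDoc) {
                ascending = false;
            }
            lastDoc = doc;
            if (seen.get(doc)) {
                duplicate = true;    // would indicate a double count
            }
            seen.set(doc);
        }
    }

    /** Simulates a search that reports the given document numbers in order. */
    static CountingCollector run(int[] docs) {
        CountingCollector c = new CountingCollector();
        for (int d : docs) {
            c.collect(d, 1.0f);
        }
        return c;
    }

    public static void main(String[] args) {
        CountingCollector c = run(new int[] {7, 2, 5});
        System.out.println(c.count + " hits, ascending=" + c.ascending
                + ", duplicate=" + c.duplicate);
    }
}
```

If the at-most-once behaviour holds, the BitSet check is redundant and count alone is the hit count. The ascending flag is only a way to observe ordering empirically, not a statement of what is guaranteed.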
