lucene-java-user mailing list archives

From Roman Chyla <roman.ch...@gmail.com>
Subject TermEnum.docFreq() includes deleted docs
Date Tue, 17 Jul 2012 16:44:56 GMT
Hi,

My tests show that TermEnum.docFreq() counts every document that
contains the term, including deleted ones, which seems to (indirectly)
contradict the javadoc.

This frequency count is used to build the uninverted index
(DocTermOrds.uninvert()). The code goes like:

      final int df = te.docFreq();
      if (df <= maxTermDocFreq) {


So, if I happen to have many deleted documents and maxTermDocFreq is
low, the term will be excluded even when its frequency among the live
docs is within the limit. Most likely, the cache will be incomplete.
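
To make the effect concrete, here is a minimal, self-contained sketch. It
does not use Lucene at all: the class DocFreqSketch, its methods, and the
int[] / BitSet stand-ins for a postings list and the live-docs bits are
hypothetical names I made up for illustration. It only shows how a raw
count and a live-only count can fall on opposite sides of the threshold.

```java
import java.util.BitSet;

// Hypothetical stand-in, not Lucene code: a term's postings are modeled as
// an int[] of doc ids, and live (non-deleted) docs as a BitSet.
class DocFreqSketch {

    // Raw document frequency: every posting counts, deleted or not.
    static int rawDocFreq(int[] postings) {
        return postings.length;
    }

    // Live document frequency: count only doc ids whose bit is set.
    static int liveDocFreq(int[] postings, BitSet liveDocs) {
        int count = 0;
        for (int docId : postings) {
            if (liveDocs.get(docId)) {
                count++;
            }
        }
        return count;
    }

    // Same shape as the check quoted above: df <= maxTermDocFreq.
    static boolean passesThreshold(int df, int maxTermDocFreq) {
        return df <= maxTermDocFreq;
    }

    public static void main(String[] args) {
        int[] postings = {0, 1, 2, 3, 4};   // term occurs in five docs
        BitSet liveDocs = new BitSet(5);
        liveDocs.set(0);                    // only docs 0 and 3 are live,
        liveDocs.set(3);                    // the other three are deleted
        int maxTermDocFreq = 3;             // assumed threshold for the demo

        int raw = rawDocFreq(postings);
        int live = liveDocFreq(postings, liveDocs);
        System.out.println("raw df=" + raw
                + " passes=" + passesThreshold(raw, maxTermDocFreq));
        System.out.println("live df=" + live
                + " passes=" + passesThreshold(live, maxTermDocFreq));
    }
}
```

With these made-up numbers the raw count fails the check while the live
count passes it, which is the situation I am describing.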

Can it be considered a feature? Or is it a bug?

Thanks,

  roman
