lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Terms.getSumTotalTermFreq() in Lucene 4.0
Date Fri, 04 Jan 2013 14:02:20 GMT
The problem is that the TermVectorsFormat for the default codec
(Lucene40TermVectorsFormat) does not store this statistic
per-document, currently.  We could in theory fix this ... maybe open
an issue / make a patch if it's important?

A return value of -1 is actually "valid": it means this statistic is not available.
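
As a rough illustration of the numbers in the question below, here is a minimal plain-Java sketch that does not use Lucene at all. It models Document 1 as a map of per-term frequencies and shows how the caller could treat -1 as "not available" and sum the per-term frequencies instead. The class and method names (TermVectorStatsSketch, termFreqs, resolve) are hypothetical, and whitespace tokenization is an assumption standing in for whatever analyzer was used. In Lucene itself, the analogous fallback would presumably walk the term vector's TermsEnum and add up totalTermFreq() per term, but that is an assumption and not something confirmed in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TermVectorStatsSketch {

    // Hypothetical helper: counts occurrences of each whitespace-separated term
    // in a single document (a stand-in for a real analyzer).
    static Map<String, Long> termFreqs(String text) {
        Map<String, Long> freqs = new LinkedHashMap<>();
        for (String term : text.trim().split("\\s+")) {
            freqs.merge(term, 1L, Long::sum);
        }
        return freqs;
    }

    // In a single-document view every term has docFreq 1,
    // so the sum of docFreqs equals the number of unique terms.
    static long sumDocFreq(Map<String, Long> freqs) {
        return freqs.size();
    }

    // Sum of the per-term frequencies within the document.
    static long sumTotalTermFreq(Map<String, Long> freqs) {
        long sum = 0;
        for (long f : freqs.values()) {
            sum += f;
        }
        return sum;
    }

    // Treat -1 as "statistic not available" and fall back to summing manually.
    static long resolve(long reported, Map<String, Long> freqs) {
        return reported == -1 ? sumTotalTermFreq(freqs) : reported;
    }

    public static void main(String[] args) {
        Map<String, Long> freqs = termFreqs("learning perl learning java learning ruby");
        System.out.println(freqs);              // {learning=3, perl=1, java=1, ruby=1}
        System.out.println(sumDocFreq(freqs));  // 4
        System.out.println(resolve(-1, freqs)); // 6
    }
}
```

The point of resolve is only that -1 is a sentinel, not a count, so it should be checked before the value is used in any scoring arithmetic.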

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 4, 2013 at 2:39 AM, 장용석 <need4spd@gmail.com> wrote:
> Hello.
> I have some questions.
>
> Document 1 : "learning perl learning java learning ruby"
> Document 2 : "perl test"
>
> I have indexed these documents with StoreTermVectors(true) and
> IndexOptions.DOCS_AND_FREQS.
> Field name is "f".
>
> And I executed this code.
>
> IndexReader ir = IndexReader.open(dir);
> Terms terms = ir.getTermVector(0, "f");
>
> System.out.println(terms.getDocCount()); -> 1
> System.out.println(terms.getSumDocFreq()); -> 4
> System.out.println(terms.getSumTotalTermFreq()); -> -1
>
> I think this terms instance acts like a single-document inverted index.
>
> So getDocCount is 1 (single document), and getSumDocFreq is 4. (because
> each term's docFreq is 1)
> Is this right?
>
> But I can't understand why the getSumTotalTermFreq method returns -1.
> In the javadoc, getSumTotalTermFreq is the sum of
> TermsEnum.totalTermFreq.
>
> I think in Document 1, the totalTermFreq values of the terms are
> [learning, 3], [java, 1], [perl, 1], [ruby, 1].
> So the result of getSumTotalTermFreq should be 6, not -1.
>
> Why does terms.getSumTotalTermFreq() return -1?
>
>
> Thanks in advance.
> --
> DEV용식
> http://devyongsik.tistory.com
