lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Terms.getSumTotalTermFreq() in Lucene 4.0
Date Sat, 05 Jan 2013 13:23:31 GMT
Hi,

The next version won't have a fix for this unless someone opens an
issue / posts a patch.

Mike McCandless

http://blog.mikemccandless.com

On Fri, Jan 4, 2013 at 7:59 PM, 장용석 <need4spd@gmail.com> wrote:
> Hello Mike.
> Thanks for your reply.
>
> It's not an important issue.
> I'll wait for the next release that includes this patch.
>
> Thanks.
>
> 2013/1/4 Michael McCandless <lucene@mikemccandless.com>
>
>> The problem is that the TermVectorsFormat for the default codec
>> (Lucene40TermVectorsFormat) does not store this statistic
>> per-document, currently.  We could in theory fix this ... maybe open
>> an issue / make a patch if it's important?
>>
>> A return value of -1 is actually "valid": it means this statistic is
>> not available.
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>> On Fri, Jan 4, 2013 at 2:39 AM, 장용석 <need4spd@gmail.com> wrote:
>> > Hello.
>> > I have some questions.
>> >
>> > Document 1 : "learning perl learning java learning ruby"
>> > Document 2 : "perl test"
>> >
>> > I have indexed these documents with StoreTermVectors(true) and
>> > IndexOptions.DOCS_AND_FREQS.
>> > Field name is "f".
>> >
>> > And I executed this code.
>> >
>> > IndexReader ir = IndexReader.open(dir);
>> > // Term vector of document 0 for field "f"
>> > Terms terms = ir.getTermVector(0, "f");
>> >
>> > System.out.println(terms.getDocCount());         // prints 1
>> > System.out.println(terms.getSumDocFreq());       // prints 4
>> > System.out.println(terms.getSumTotalTermFreq()); // prints -1
>> >
>> > I think this terms instance acts like a single-document inverted index.
>> >
>> > So getDocCount is 1 (a single document), and getSumDocFreq is 4
>> > (four distinct terms, each with a docFreq of 1).
>> > Is this right?
>> >
>> > But I can't understand why the getSumTotalTermFreq method returns -1.
>> > According to the javadoc, getSumTotalTermFreq is the sum of
>> > TermsEnum.totalTermFreq over all terms.
>> >
>> > I think that in Document 1, each term's totalTermFreq is
>> > [learning, 3], [java, 1], [perl, 1], [ruby, 1].
>> > So I expected getSumTotalTermFreq to return 6, not -1.
>> >
>> > Why does terms.getSumTotalTermFreq() return -1?
>> >
>> >
>> > Thanks in advance.
>> > --
>> > DEV용식
>> > http://devyongsik.tistory.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
> --
> DEV용식
> http://devyongsik.tistory.com
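
The arithmetic in the original question (1 document, sumDocFreq 4, expected sumTotalTermFreq 6) can be checked with a small plain-Java sketch. This is only a model of the statistics and does not call Lucene: the class TermVectorStats and its methods are hypothetical names, and whitespace splitting stands in for whatever analyzer was actually used. It illustrates the value a caller could compute by summing per-term frequencies itself when the stored statistic is reported as -1 (not available).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of the single-document statistics discussed in this thread.
// Hypothetical helper, not part of Lucene.
final class TermVectorStats {

    // Count how often each whitespace-separated term occurs in one document.
    static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                freqs.merge(term, 1, Integer::sum);
            }
        }
        return freqs;
    }

    // With a single document, every distinct term has docFreq 1,
    // so the sum of docFreq equals the number of distinct terms.
    static long sumDocFreq(Map<String, Integer> freqs) {
        return freqs.size();
    }

    // Sum of the per-term occurrence counts, i.e. the total token count.
    static long sumTotalTermFreq(Map<String, Integer> freqs) {
        long sum = 0;
        for (int freq : freqs.values()) {
            sum += freq;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs =
                termFreqs("learning perl learning java learning ruby");
        System.out.println(freqs);                   // {learning=3, perl=1, java=1, ruby=1}
        System.out.println(sumDocFreq(freqs));       // 4
        System.out.println(sumTotalTermFreq(freqs)); // 6
    }
}
```

For Document 2 ("perl test") the same helper gives 2 for both sums, since each term occurs once.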
