lucene-java-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com.INVALID>
Subject Re: Tf and Df in lucene
Date Mon, 15 Jun 2015 13:54:32 GMT
Hi Hummel,

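Regarding df:

// Assumes an open IndexReader `reader` and an IndexSearcher `searcher` built on it.
Term term = new Term(field, word);
TermStatistics termStatistics =
    searcher.termStatistics(term, TermContext.build(reader.getContext(), term));
// totalTermFreq: occurrences of the term across all documents.
// docFreq: number of documents that contain the term, i.e. df(t).
System.out.println(word + "\t totalTermFreq \t " + termStatistics.totalTermFreq());
System.out.println(word + "\t docFreq \t " + termStatistics.docFreq());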

Regarding tf:

Term term = new Term(field, word);
Bits bits = MultiFields.getLiveDocs(reader);  // live (non-deleted) documents
PostingsEnum postingsEnum = MultiFields.getTermDocsEnum(reader, bits, field, term.bytes());

if (postingsEnum == null) return;  // the field or term does not exist

while (postingsEnum.nextDoc() != PostingsEnum.NO_MORE_DOCS) {
    final int freq = postingsEnum.freq();    // tf(t, d) for the current document
    final int docID = postingsEnum.docID();  // the current document d
    System.out.println(docID + "\t freq \t " + freq);
}
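
For reference, the two quantities themselves can be illustrated without Lucene. The following is only a minimal plain-Java sketch over a toy in-memory corpus; the class name TfDfSketch, the lowercasing, and the whitespace tokenization are assumptions made for illustration and are not part of the Lucene API or of how an analyzer would tokenize text.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only (not a Lucene class).
final class TfDfSketch {

    // tf(t, d): how many times the (lowercase) term t occurs in document d.
    // Assumption: tokens are separated by whitespace and compared lowercased.
    static int tf(String term, String doc) {
        int count = 0;
        for (String token : doc.toLowerCase().split("\\s+")) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    // df(t): how many documents in the collection contain t at least once.
    static int df(String term, List<String> docs) {
        int count = 0;
        for (String doc : docs) {
            if (tf(term, doc) > 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "lucene is a search library",
                "lucene indexes documents",
                "solr builds on lucene lucene");

        System.out.println(tf("lucene", docs.get(2))); // prints 2
        System.out.println(df("lucene", docs));        // prints 3
        System.out.println(df("solr", docs));          // prints 1
    }
}
```

In these terms, postingsEnum.freq() plays the role of tf(t, d) for each matching document, and termStatistics.docFreq() plays the role of df(t).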


Ahmet




On Monday, June 15, 2015 9:12 AM, Shay Hummel <shay.hummel@gmail.com> wrote:
Hi

I was wondering, what is the easiest way to get the term frequency of a
term t in document d, namely tf(t,d)?
In the same spirit, what is the easiest way to get the document
frequency of a term in the collection, i.e. how many documents contain
the term t, namely df(t)?

Regards,
Shay


